I have a list 'r' like this:
[["", 1], ["this is a text line", 2], ["this is a text line", 3], ["this is a text line", 4], ["", 5], ["", 6], ["this is a text line", 7],["this is a text line", 8], ["this is a text line", 9], ["this is a text line", 10], ["", 11], ["this is a text line", 12], ["this is a text line", 13], ["this is a text line", 14], ["", 15], ["this is a text line", 16], ["this is a text line", 17], ["this is a text line", 18], ["", 19]]
To find out which lines are empty and which contain text, I filter my list:
empty = [x[1] for x in r if regex.search(r"^\s*$", x[0])]
text = [x[1] for x in r if regex.search(r"\S", x[0])]
Output:
empty = [1, 5, 6, 11, 15, 19]
text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]
What I want to do is group the numbers in text into runs of consecutive values, i.e. wherever text[i+1] - text[i] == 1 (in order to define the paragraphs):
finaltext = [[2, 3, 4], [7, 8, 9, 10], [12, 13, 14], [16, 17, 18]]
finaltext including empty = [[2, 3, 4, 5, 6], [7, 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
How can I group elements in a list based on a condition?
Pure Python solution without any modules:
This can be done using modules, such as with numpy and groupby, but I thought it would be cool to attempt it without them, just with plain Python. Here is my solution:
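text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]
s = 0  # index in text where the current run starts
finaltext = []
for i in range(len(text)-1):
    # a gap between neighbours ends the current run
    if text[i] + 1 != text[i+1]:
        finaltext.append(text[s:i+1])
        s = i+1
# the last run is never closed inside the loop
finaltext.append(text[s:])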
which gives finaltext as:
[[2, 3, 4], [7, 8, 9, 10], [12, 13, 14], [16, 17, 18]]
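For comparison, the groupby route mentioned above can be sketched as follows. This is only an illustration, not part of the solution above: it assumes text is sorted and has no duplicates, and the helper name group_consecutive is made up here. The idea is that value minus index stays constant inside a run of consecutive integers, so it can serve as the grouping key.

```python
from itertools import groupby


def group_consecutive(numbers):
    """Split a sorted list of distinct ints into runs of consecutive values."""
    runs = []
    # Within a run of consecutive integers, value - index is constant,
    # so groupby starts a new group exactly where a gap occurs.
    for _, run in groupby(enumerate(numbers), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in run])
    return runs


text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]
print(group_consecutive(text))
# [[2, 3, 4], [7, 8, 9, 10], [12, 13, 14], [16, 17, 18]]
```

The result is the same as with the plain loop; the trade-off is one import in exchange for not having to track the start index by hand.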
Update
To get both of the lists (not sure why you would want to), you can use the following:
empty = [1, 5, 6, 11, 15, 19]
text = [2, 3, 4, 7, 8, 9, 10, 12, 13, 14, 16, 17, 18]
s = 0  # index in text where the current run starts
finaltext = []
finaltext_including_empty = []
for i in range(len(text)-1):
    if text[i] + 1 != text[i+1]:
        finaltext.append(text[s:i+1])
        # every line number from the start of this run up to the start of the next
        finaltext_including_empty.append(list(range(text[s], text[i+1])))
        s = i+1
finaltext.append(text[s:])
# the last group extends to the final line, whether that line is empty or text
finaltext_including_empty.append(list(range(text[s],max(empty[-1]+1, text[-1]+1))))
which gives finaltext the same as before and finaltext_including_empty as:
[[2, 3, 4, 5, 6], [7, 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
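If the only goal is the second list, it may be simpler to build it in one pass directly from r, without the intermediate empty and text lists. The following is a minimal sketch under a few assumptions: the line numbers in r are in ascending order with no gaps, str.strip() stands in for the regex test, and the function name paragraphs_with_trailing_blanks is hypothetical. Empty lines before the first paragraph are dropped, which matches the expected output in the question.

```python
def paragraphs_with_trailing_blanks(rows):
    """rows is a list of [line_text, line_number] pairs, in line order."""
    groups = []
    previous_blank = True  # treat the start of the input like a blank line
    for line, number in rows:
        is_blank = not line.strip()
        if not is_blank and previous_blank:
            # a text line right after a blank one opens a new paragraph
            groups.append([number])
        elif groups:
            # later text lines and trailing blanks join the current paragraph
            groups[-1].append(number)
        previous_blank = is_blank
    return groups


# Rebuild the list r from the question in compact form.
blank_numbers = {1, 5, 6, 11, 15, 19}
r = [["" if n in blank_numbers else "this is a text line", n] for n in range(1, 20)]

print(paragraphs_with_trailing_blanks(r))
# [[2, 3, 4, 5, 6], [7, 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
```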