I have a huge list of strings (similar to strs given below, but much larger). Each row's time stamp follows a 'time' entry, and the remaining entries alternate between column names and values.
I'd like to efficiently convert it to a table (a NumPy array, a pandas DataFrame, or similar) like the one below.
strs = ['time', 'stamp1', 'a', '1', 'b', '2', 'c', '3',
        'time', 'stamp2', 'a', '11', 'b', '22', 'd', '4',
        'time', 'stamp3', 'a', '111', 'b', '222', 'c', '333',
        'time', 'stamp4', 'a', '1111', 'b', '2222', 'c', '3333', 'd', '444']
time | a | b | c | d |
---|---|---|---|---|
stamp1 | 1 | 2 | 3 | |
stamp2 | 11 | 22 | | 4 |
stamp3 | 111 | 222 | 333 | |
stamp4 | 1111 | 2222 | 3333 | 444 |
You could do:
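import pandas as pd

records = []
# The first two items are the "time" key and the first time stamp.
record = {strs[0]: strs[1]}
# Walk the remaining items as (key, value) pairs.
for key, value in zip(strs[2::2], strs[3::2]):
    if key == "time":
        # A new "time" key closes the current record and starts the next one.
        records.append(record)
        record = {key: value}
    else:
        record[key] = value
# Append the last record, which no later "time" key closes.
records.append(record)

table = pd.DataFrame(records)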
Result:
     time     a     b     c    d
0  stamp1     1     2     3  NaN
1  stamp2    11    22   NaN    4
2  stamp3   111   222   333  NaN
3  stamp4  1111  2222  3333  444
Or do it via a generator:
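import pandas as pd

def records(lst):
    # The first two items are the "time" key and the first time stamp.
    record = {lst[0]: lst[1]}
    for key, value in zip(lst[2::2], lst[3::2]):
        if key == "time":
            # A new "time" key closes the current record.
            yield record
            record = {key: value}
        else:
            record[key] = value
    # Yield the last record, which no later "time" key closes.
    yield record

table = pd.DataFrame(records(strs))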
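Both versions leave every value as a string and keep time as an ordinary column. If numeric columns indexed by time stamp are wanted, a self-contained sketch along the following lines might do. Note that the marker parameter and the empty-input handling are additions for illustration, not part of the answer above, and the numeric conversion assumes every value parses as a number.

```python
import pandas as pd

strs = ['time', 'stamp1', 'a', '1', 'b', '2', 'c', '3',
        'time', 'stamp2', 'a', '11', 'b', '22', 'd', '4',
        'time', 'stamp3', 'a', '111', 'b', '222', 'c', '333',
        'time', 'stamp4', 'a', '1111', 'b', '2222', 'c', '3333', 'd', '444']


def records(lst, marker="time"):
    """Yield one dict per record; `marker` (hypothetical name) starts a record."""
    record = {}
    # Pair keys (even positions) with values (odd positions).
    for key, value in zip(lst[::2], lst[1::2]):
        if key == marker and record:
            # A new marker closes the record collected so far.
            yield record
            record = {}
        record[key] = value
    # Yield the final record; an empty input yields nothing.
    if record:
        yield record


# Use the time stamps as the row index.
table = pd.DataFrame(records(strs)).set_index("time")
# Values arrive as strings; convert each column to a numeric dtype.
# Assumption: every value is numeric, otherwise pd.to_numeric raises.
table = table.apply(pd.to_numeric)
```

Columns with missing entries (c and d here) come out as floats, because NaN has no integer representation in the default dtypes.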