Pandas dataframe - filter list of tuples

blue-sky

I'm attempting to modify a dataframe whose column values contain lists of tuples, such that if an 'off' followed by an 'on' is encountered in a list of tuples, those tuples are removed from the dataframe.

Here is the dataframe prior to processing :

import pandas as pd
import numpy as np

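# dtype=object keeps each list of tuples as a single cell; recent NumPy versions reject this ragged input without it
array = np.array([[1, [('on',1),('off',1),('off',1),('on',1)]], [2,[('off',1),('on',1),('on',1),('off',1)]]], dtype=object)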
index_values = ['first', 'second']
column_values = ['id', 'l']
df = pd.DataFrame(data = array, 
                  index = index_values, 
                  columns = column_values)

which renders :

[image: the dataframe before processing]

I'm attempting to produce this dataframe :

[image: the desired dataframe]

Here is my attempt :

updated_col = []
for d in df['l'] : 
    for index, value in enumerate(d) : 
        if len(value) == index : 
            break 
        elif value[index] == 'off' and value[index + 1] == 'on' : 
            updated_col.append(value)

The variable updated_col is empty. Can a lambda be used to process the column and remove values where a sequence of 'off' and 'on' is found?

Edit :

Custom pairwise function :

this seems to do the trick :

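def pairwise(x) : 
    # Pair each element with its successor: (x0, x1), (x1, x2), ...
    # itertools.combinations(x, 2) would also yield non-adjacent pairs.
    return list(zip(x, x[1:]))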
BeRT2me
from itertools import pairwise
# Or (Depending on python version)
from more_itertools import pairwise

df.l = df.l.apply(lambda v: [x for x in pairwise(v)
                             if x == (('on', 1), ('off', 1))][0]).map(list)

Output:

       id                    l
first   1  [(on, 1), (off, 1)]
second  2  [(on, 1), (off, 1)]
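
The snippet above keeps the first ('on', 1), ('off', 1) pair and raises an IndexError for a row with no such pair. As a rough alternative sketch, not part of the answer itself, the helper below follows the question's wording more literally: it drops every 'off' tuple that is immediately followed by an 'on' tuple and keeps the rest. The name drop_off_on is hypothetical, and it assumes only the first element of each tuple matters for the match.

```python
def drop_off_on(pairs):
    """Return a copy of pairs without any 'off' tuple immediately followed by an 'on' tuple."""
    kept = []
    i = 0
    while i < len(pairs):
        is_off_on = (
            i + 1 < len(pairs)
            and pairs[i][0] == 'off'
            and pairs[i + 1][0] == 'on'
        )
        if is_off_on:
            i += 2  # skip both tuples of the off/on sequence
        else:
            kept.append(pairs[i])
            i += 1
    return kept


# The two rows from the question, as plain lists
rows = [
    [('on', 1), ('off', 1), ('off', 1), ('on', 1)],
    [('off', 1), ('on', 1), ('on', 1), ('off', 1)],
]
cleaned = [drop_off_on(row) for row in rows]
```

With the dataframe from the question, this could presumably be applied as df['l'] = df['l'].apply(drop_off_on). A row without any off/on sequence is returned unchanged rather than raising an error.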
