Making a list out of values in a DataFrame depending on values in another column

Mi.

I have a pandas DataFrame as shown here. The frame has many more columns, but they are not relevant to the task.

id    pos      value       sente
1     a         I           21
2     b         have        21
3     b         a           21
4     a         cat         21
5     d         !           21
1     a         My          22
2     a         cat         22
3     b         is          22
4     a         cute        22
5     d         .           22
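
For anyone who wants to reproduce this, the sample above could be rebuilt roughly as follows. This is only a minimal sketch: it includes just the four columns shown, and the dtypes (integers for id and sente, strings for pos and value) are an assumption, since the real frame is not available.

```python
import pandas as pd

# Sample data copied from the table above; the real frame has more columns.
# Assumption: id and sente are integers, pos and value are strings.
df = pd.DataFrame({
    "id":    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "pos":   ["a", "b", "b", "a", "d", "a", "a", "b", "a", "d"],
    "value": ["I", "have", "a", "cat", "!", "My", "cat", "is", "cute", "."],
    "sente": [21] * 5 + [22] * 5,
})

print(df)
```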

I would like to make a list out of certain columns, so that the first sentence (sente=21), and every other sentence, looks something like this. That is, every sentence gets its own list.

`[('I', 'a', '1'), ..., ('!','d','5')]`

I already have a function that does this for one sentence, but I cannot figure out how to do it for all sentences in the frame (a sentence being the rows that share the same sente value).

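`class SentenceGetter(object):
    def __init__(self, data):
        self.n_sent = 1
        self.data = data
        self.empty = False

    def get_next(self):
        try:
            # Select the rows of a single sentence (hard-coded to sente == 21)
            s = self.data[self.data["sente"] == 21]
            self.n_sent += 1
            # Parentheses are required for a return value spanning several lines
            return (
                s["id"].values.tolist(),
                s["pos"].values.tolist(),
                s["value"].values.tolist(),
            )
        except KeyError:
            # One of the expected columns is missing
            self.empty = True
            return None, None, None

foo = SentenceGetter(df)
sent, pos, token = foo.get_next()
# "in" is a reserved word in Python, so the result needs another name
triples = list(zip(token, pos, sent))
`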

Because my frame is very large, constructions like this are not an option:

df.loc[((df["sente"] == df["sente"].shift(-1)) & (df["sente"] == df["sente"].shift(+1))), ["pos","value","id"]]

Any ideas?

jpp

If you are open to using the standard library, collections.defaultdict offers an O(n) solution:

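from collections import defaultdict

# Map each sentence number to its list of (value, pos, id) tuples
d = defaultdict(list)

# itertuples() yields (index, sente, value, pos, id); the index is discarded
for _, num, *data in df[['sente', 'value', 'pos', 'id']].itertuples():
    d[num].append(tuple(data))  # star-unpacking gives a list, so convert it to a tuple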

Result:

defaultdict(list,
            {21: [('I', 'a', 1),
                  ('have', 'b', 2),
                  ('a', 'b', 3),
                  ('cat', 'a', 4),
                  ('!', 'd', 5)],
             22: [('My', 'a', 1),
                  ('cat', 'a', 2),
                  ('is', 'b', 3),
                  ('cute', 'a', 4),
                  ('.', 'd', 5)]})
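
As a usage sketch, the snippet below runs the defaultdict approach end to end on the sample data and looks up a single sentence. The frame construction is an assumption reconstructed from the question's table. The `groupby` variant at the end is an alternative that does not come from the answer above. It is included only for comparison, and which one is faster on a very large frame would need to be measured.

```python
from collections import defaultdict

import pandas as pd

# Assumed sample frame, reconstructed from the table in the question
df = pd.DataFrame({
    "id":    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "pos":   ["a", "b", "b", "a", "d", "a", "a", "b", "a", "d"],
    "value": ["I", "have", "a", "cat", "!", "My", "cat", "is", "cute", "."],
    "sente": [21] * 5 + [22] * 5,
})

# Single pass over the rows: collect (value, pos, id) tuples per sentence number
d = defaultdict(list)
for _, num, *data in df[["sente", "value", "pos", "id"]].itertuples():
    d[num].append(tuple(data))

# Look up one sentence by its sente value
first_sentence = d[21]
print(first_sentence)

# Alternative (not from the answer above): let pandas do the grouping
grouped = {
    num: list(zip(g["value"], g["pos"], g["id"]))
    for num, g in df.groupby("sente")
}
print(grouped[22])
```

Both variants give one list per sente value, so `d[21]` and `grouped[21]` hold the same tuples.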
