Making a list out of values in a DataFrame depending on values in another column

Mi. Published at Dev

Mi.

I have a pandas DataFrame as shown here. The frame has many more columns, but they are not relevant to the task.

id    pos      value       sente
1     a         I           21
2     b         have        21
3     b         a           21
4     a         cat         21
5     d         !           21
1     a         My          22
2     a         cat         22
3     b         is          22
4     a         cute        22
5     d         .           22

I would like to make a list out of certain columns so that the first sentence (sente=21), and every other sentence, looks something like the following, meaning that every sentence has a unique entry of its own.

`[('I', 'a', '1'), ..., ('!','d','5')]`

I already have a function that does this for one sentence, but I cannot figure out how to do it for all sentences (rows that share the same sente value) in the frame.

class SentenceGetter(object):
    def __init__(self, data):
        self.n_sent = 1
        self.data = data
        self.empty = False

    def get_next(self):
        try:
            # Select the rows of one sentence (hard-coded to sente == 21)
            s = self.data[self.data["sente"] == 21]
            self.n_sent += 1
            # Parentheses keep the multi-line return a single tuple
            return (
                s["id"].values.tolist(),
                s["pos"].values.tolist(),
                s["value"].values.tolist(),
            )
        except Exception:
            self.empty = True
            return None, None, None

foo = SentenceGetter(df)
sent, pos, token = foo.get_next()
# "in" is a reserved word in Python, so the result needs another name
result = list(zip(token, pos, sent))

As my frame is very large, constructions like this are not an option:

df.loc[((df["sente"] == df["sente"].shift(-1)) & (df["sente"] == df["sente"].shift(+1))), ["pos","value","id"]]

Any ideas?

jpp

If you are open to using the standard library, collections.defaultdict offers an O(n) solution:

from collections import defaultdict

d = defaultdict(list)

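# itertuples() yields (index, sente, value, pos, id); the index is discarded
for _, num, *data in df[['sente', 'value', 'pos', 'id']].itertuples():
    d[num].append(tuple(data))  # *data unpacks to a list, so convert it to a tuple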

Result:

defaultdict(list,
            {21: [('I', 'a', 1),
                  ('have', 'b', 2),
                  ('a', 'b', 3),
                  ('cat', 'a', 4),
                  ('!', 'd', 5)],
             22: [('My', 'a', 1),
                  ('cat', 'a', 2),
                  ('is', 'b', 3),
                  ('cute', 'a', 4),
                  ('.', 'd', 5)]})
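
For comparison, the same grouping can probably also be done with pandas' own `groupby`. The following is only a minimal sketch, not part of the answer above: the sample frame is rebuilt from the table in the question, and the helper name `sentences_by_id` is made up for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (assumed dtypes: int id/sente, str pos/value)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "pos": ["a", "b", "b", "a", "d", "a", "a", "b", "a", "d"],
    "value": ["I", "have", "a", "cat", "!", "My", "cat", "is", "cute", "."],
    "sente": [21, 21, 21, 21, 21, 22, 22, 22, 22, 22],
})


def sentences_by_id(frame):
    """Hypothetical helper: map each sente value to a list of (value, pos, id) tuples."""
    return {
        sente: list(zip(group["value"], group["pos"], group["id"]))
        # sort=False keeps the sentences in their order of first appearance
        for sente, group in frame.groupby("sente", sort=False)
    }


sentences = sentences_by_id(df)
print(sentences[21])
```

Looking up `sentences[21]` then gives the list for the first sentence, and iterating over `sentences.values()` walks through all sentences in turn. Whether this is faster or slower than the `defaultdict` loop on a very large frame would need to be measured.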


Edited at 2020-10-28
