How do I filter a pandas DataFrame based on value counts?

uchuujin :

I'm working in Python with a pandas DataFrame of video games, each with a genre. I'm trying to remove any video game whose genre appears fewer than some number of times in the DataFrame, but I have no clue how to go about this. I did find a Stack Overflow question that seems to be related, but I can't decipher the solution at all (possibly because I've never heard of R and my memory of functional programming is rusty at best).

Help?

Andy Hayden :

Use groupby with filter, which keeps only the groups for which the function returns True:

In [10]: import pandas as pd

In [11]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])

In [12]: df
Out[12]:
   A  B
0  1  2
1  1  4
2  5  6

In [13]: df.groupby("A").filter(lambda x: len(x) > 1)
Out[13]:
   A  B
0  1  2
1  1  4
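
Applied to the case in the question, the same idea might look like the sketch below. The column names (title, genre), the sample rows, and the threshold min_count are all assumptions made up for illustration; none of them come from the original post.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumed, not from the question.
games = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "genre": ["RPG", "RPG", "Puzzle", "RPG", "Racing", "Racing"],
})

# Assumed threshold: keep genres that appear at least this many times.
min_count = 2

# filter() calls the lambda once per genre group and keeps every row
# of the groups for which it returns True.
kept = games.groupby("genre").filter(lambda g: len(g) >= min_count)
print(kept)
```

Here the single "Puzzle" row is dropped, while the "RPG" and "Racing" rows are kept.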

I recommend reading the split-apply-combine section of the docs.
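
Since the question is phrased in terms of value counts, here is a sketch of two alternatives that build a boolean mask instead of calling a Python function per group. This is not part of the answer above. The column names, sample data, and min_count threshold are again assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumed.
games = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "genre": ["RPG", "RPG", "Puzzle", "RPG", "Racing", "Racing"],
})
min_count = 2  # assumed threshold

# Option 1: look up each row's genre in the value_counts() result,
# giving a per-row count that can be compared against the threshold.
counts = games["genre"].map(games["genre"].value_counts())
kept_vc = games[counts >= min_count]

# Option 2: transform("size") broadcasts each group's size back onto its rows.
sizes = games.groupby("genre")["genre"].transform("size")
kept_tf = games[sizes >= min_count]
```

Both variants should select the same rows as the groupby filter approach. On frames with many distinct groups they may be faster, since no Python lambda runs per group, but that is worth measuring on your own data.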

