Is there a default Pandas method for removing null or missing values when they are represented by a custom value like "?" or "Unknown"?

Gerardo Zinno

In the dataset I'm working on, the Adult dataset, the missing values are indicated with the "?" string, and I want to discard the rows containing missing values.

The documentation for df.dropna() lists no argument for passing a custom value to be interpreted as the null/missing value.

I know I can simply solve the problem with something like:

df_str = df.select_dtypes(['object']) # get the columns containing the strings
for col in df_str.columns:
    df = df[df[col] != '?']

but I was wondering whether there is a standard way of achieving this with the Pandas API, one that is possibly more flexible and also faster.

DocZerø

If you're importing the data from a CSV file, for example, you can use the na_values parameter of pd.read_csv to define additional strings to recognise as NA/NaN.

Example:

import pandas as pd
from io import StringIO

data = \
"""
A;B;C
1;2;?
4;?;6
?;8;9
"""

df = pd.read_csv(StringIO(data),
                 delimiter=';', 
                 na_values='?')

The resulting dataframe looks like this:

     A    B    C
0  1.0  2.0  NaN
1  4.0  NaN  6.0
2  NaN  8.0  9.0
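Once the placeholders are parsed as NaN, the rows can be discarded with the regular dropna(), which is what the question asks for. The sketch below shows this, plus two options for a dataframe that is already in memory with "?" still stored as a string. The sample data, the extra "Unknown" placeholder and the variable names are assumptions made for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd
from io import StringIO

# Hypothetical sample data: "?" and "Unknown" both stand for missing values.
data = """A;B;C
1;2;?
4;?;6
?;8;9
7;Unknown;3
10;11;12
13;14;15
"""

placeholders = ["?", "Unknown"]

# Option 1: treat the placeholders as NaN while reading, then drop incomplete rows.
df_read = pd.read_csv(StringIO(data), delimiter=";", na_values=placeholders)
clean_from_read = df_read.dropna()

# Simulate a dataframe that was loaded without na_values (placeholders kept as strings).
df_raw = pd.read_csv(StringIO(data), delimiter=";", dtype=str, keep_default_na=False)

# Option 2: convert the placeholders to NaN after loading, then drop incomplete rows.
clean_from_replace = df_raw.replace(placeholders, np.nan).dropna()

# Option 3: build one boolean mask instead of filtering column by column.
has_placeholder = df_raw.isin(placeholders).any(axis=1)
clean_from_mask = df_raw[~has_placeholder]

print(clean_from_read)
```

In all three cases only the last two rows survive. Option 1 is probably the most convenient when you control the import, because the numeric columns are parsed as numbers directly. Options 2 and 3 leave the remaining values as strings, so a conversion such as pd.to_numeric may still be needed afterwards.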
