Normalize multiple columns of list/tuple data

Sp_95

I have a dataframe with multiple columns of tuple data. I'm trying to normalize the data within each tuple, row by row, for every such column. The example below uses lists, but the same idea should apply to tuples:

import numpy as np
import pandas as pd
from sklearn import preprocessing
df = pd.DataFrame(np.random.randn(5, 10), columns=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])
df['arr1'] = df[['a', 'b', 'c', 'd', 'e']].values.tolist()
df['arr2'] = df[['f', 'g', 'h', 'i', 'j']].values.tolist()

To normalize each row's list for just a few columns, I would do this:

df['arr1'] = [preprocessing.scale(row) for row in df['arr1']]
df['arr2'] = [preprocessing.scale(row) for row in df['arr2']]

However, since I have about 100 such columns in my original dataset, I obviously don't want to manually normalize per column. How can I loop across all columns?

Steven Rouk

You can loop through the columns of a DataFrame like this to process each one:

for col in df.columns:
    df[col] = [preprocessing.scale(row) for row in df[col]]

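Of course, this only works if every column in the DataFrame holds list or tuple data. preprocessing.scale expects an array-like rather than a single number, so scalar columns such as a through j in the example would make the loop fail. If you only want a subset, you could create a list of columns first, or you could drop the other columns.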
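With around 100 columns, typing the list by hand gets tedious, so one option is to build it by checking what each column contains. The sketch below is only an illustration: `list_like_columns` and `zscore` are hypothetical helpers, not part of pandas or scikit-learn. `zscore` stands in for `preprocessing.scale` so the snippet runs with NumPy and pandas alone; like `scale` with its default arguments, it centers each row to zero mean and divides by the population standard deviation. It also assumes that looking at the first row is enough to tell what a column holds.

```python
import numpy as np
import pandas as pd


def zscore(values):
    """Center to zero mean and scale to unit (population) standard deviation."""
    arr = np.asarray(values, dtype=float)
    std = arr.std()  # ddof=0, i.e. the population standard deviation
    # Constant rows have std == 0; divide by 1 instead so they become zeros.
    return ((arr - arr.mean()) / (std if std > 0 else 1.0)).tolist()


def list_like_columns(frame):
    """Return names of columns whose first value is a list, tuple, or array.

    Hypothetical helper: it inspects only the first row, so it assumes each
    column holds the same kind of value throughout.
    """
    return [
        col for col in frame.columns
        if len(frame) > 0
        and isinstance(frame[col].iloc[0], (list, tuple, np.ndarray))
    ]


# Same layout as the example in the question, with a seeded generator.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 10)), columns=list("abcdefghij"))
df["arr1"] = df[list("abcde")].values.tolist()
df["arr2"] = df[list("fghij")].values.tolist()

cols_to_process = list_like_columns(df)  # picks up arr1 and arr2 only
for col in cols_to_process:
    df[col] = [zscore(row) for row in df[col]]
```

If you prefer to keep scikit-learn, swapping `zscore(row)` for `preprocessing.scale(row)` inside the loop should give the same numbers, returned as NumPy arrays instead of lists.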

# Here's an example where you manually specify the columns
cols_to_process = ["arr1", "arr2"]

for col in cols_to_process:
    df[col] = [preprocessing.scale(row) for row in df[col]]


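# Here's an example where you drop the unwanted columns first
# (every column that does not hold lists; here, the scalar columns a through j)
cols_to_drop = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
df = df.drop(columns=cols_to_drop)

for col in df.columns:
    df[col] = [preprocessing.scale(row) for row in df[col]]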


# Or, if you didn't want to actually drop the columns
# from the original DataFrame you could do it like this
# (iterating over a DataFrame yields its column names):
cols_to_drop = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
for col in df.drop(columns=cols_to_drop):
    df[col] = [preprocessing.scale(row) for row in df[col]]
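If every list in a column has the same length, the per-row list comprehension can also be replaced by a single NumPy pass over the whole column. This is a rough sketch rather than a drop-in replacement: `scale_rows` is a hypothetical helper, it assumes equal-length rows, and it uses the population standard deviation, which is also what `preprocessing.scale` uses by default.

```python
import numpy as np
import pandas as pd


def scale_rows(column):
    """Z-score every row of a column of equal-length lists or tuples."""
    matrix = np.array(column.tolist(), dtype=float)  # shape (n_rows, row_len)
    mean = matrix.mean(axis=1, keepdims=True)
    std = matrix.std(axis=1, keepdims=True)  # population std (ddof=0)
    std[std == 0] = 1.0  # constant rows become all zeros instead of NaN
    return ((matrix - mean) / std).tolist()


# Small made-up frame: one column of lists, one column of tuples.
df = pd.DataFrame({
    "arr1": [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]],
    "arr2": [(0.0, 4.0), (2.0, 6.0)],
})

for col in ["arr1", "arr2"]:
    df[col] = scale_rows(df[col])
```

Note that tuples come back as lists here. With scikit-learn, calling `preprocessing.scale(matrix, axis=1)` on the stacked 2-D array should produce the same row-wise result.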
