How to assign Pandas.Series.str.extractall() result back to original dataset? (TypeError: incompatible index of inserted column with frame index)

jaeseongpark

Dataset brief overview

dete_resignations['cease_date'].head()

gives

result

dete_resignations['cease_date'].value_counts()

gives

result of the code above


What I tried

I was trying to extract only the year (e.g. 05/2012 -> 2012) from dete_resignations['cease_date'] using pandas.Series.str.extractall() and assign the result back to the original dataframe. However, since not all rows contain a string in that format (e.g. 05/2012), an error occurred.

Here is the code I wrote.

pattern = r"(?P<month>[0-1][0-9])/?(?P<year>[0-2][0-9]{3})"
years = dete_resignations['cease_date'].str.extractall(pattern)
dete_resignations['cease_date_'] = years['year']

'TypeError: incompatible index of inserted column with frame index'


I thought 'years' shared the same index as dete_resignations['cease_date']. So even though the two indexes are not identical, I expected pandas to align them automatically and assign the values to the right rows. But it didn't.
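For context, here is a minimal sketch of what the index of an extractall result looks like, using made-up sample data (the Series below is hypothetical, not the real dataset). extractall returns one row per match, so its result carries a two-level index (the original row label plus a 'match' counter) rather than the frame's own index. Taking the first match per row with xs and then reindexing is one possible way, among others, to get back a Series that aligns with the original rows.

```python
import pandas as pd

# Hypothetical sample data standing in for dete_resignations['cease_date'].
cease_date = pd.Series(["05/2012", "2013", "Not Stated", "07/2014"])

pattern = r"(?P<month>[0-1][0-9])/?(?P<year>[0-2][0-9]{3})"
matches = cease_date.str.extractall(pattern)

# extractall indexes its result by (original row label, match number),
# so the index has two levels and the last one is named 'match'.
index_levels = matches.index.nlevels
index_names = list(matches.index.names)

# Keep only the first match of each row; xs drops the 'match' level,
# leaving the original row labels as the index.
first_match = matches.xs(0, level="match")

# Rows without any match are restored as NaN by reindexing.
year = first_match["year"].reindex(cease_date.index)
print(year)
```

Only rows that matched the pattern appear in the extractall result, which is why the reindex step is needed before the values line up with every row of the original column.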

Can anyone help solve this issue?

Much appreciated if someone can enlighten me!

Quang Hoang

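If you only want the year, don't capture the month in the pattern, and use extract instead of extractall. extract returns one row per input row (NaN where there is no match), so its index matches the original frame: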

# the $ anchors the match at the end of the string
# \d is equivalent to [0-9]
# the pattern extracts the trailing run of digits
# (raw string, so \d is not treated as a string escape)
pattern = r'(?P<year>\d+)$'
years = dete_resignations['cease_date'].str.extract(pattern)
dete_resignations['cease_date_'] = years['year']
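As a quick check of the approach above, here is a small sketch on made-up data. The frame df, its values, and the cease_year column name are assumptions for illustration only. It also tightens the pattern to exactly four trailing digits and converts the result to numbers, which is an optional extra step, not part of the answer itself.

```python
import pandas as pd

# Hypothetical sample frame; the real dete_resignations data is not shown here.
df = pd.DataFrame({"cease_date": ["05/2012", "2013", "Not Stated", "07/2014"]})

# Capture exactly four digits at the end of the string as the year.
years = df["cease_date"].str.extract(r"(?P<year>\d{4})$")

# extract keeps one row per input row, so the assignment aligns directly.
# to_numeric turns the matched strings into numbers; non-matches stay NaN.
df["cease_year"] = pd.to_numeric(years["year"])
print(df)
```

Because of the NaN rows the new column is floating point; whether to keep it that way or convert to a nullable integer type depends on how the column is used afterwards.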
