As a follow-up to my previous question, Search for value in all DataFrame columns (except first column !) and add new column with matching column name (where I used a static keyword):
I'd like to check whether the string in the first column is contained in any of the other columns in the same row, and then add a new column with the matching column name(s). I need the names of all columns whose value matches, not just the first one.
At the moment I'm using this with a static keyword:
import re

keyword = '123'
# Stringify each value, drop literal dots, then test for the keyword (case-insensitive)
f = lambda row: row.apply(str).str.replace(".", "", regex=False).str.contains(keyword, na=False, flags=re.IGNORECASE)
df1 = df.iloc[:, 1:].apply(f, axis=1)
df.insert(loc=1, column='Matching_Columns', value=df1.dot(df.columns[1:] + ', ').str.strip(', '))
Sample:
Input:
key | col_B | col_C | col_D | col_E
------------------------------------
123 | abcd | 12345 | fght | 7890
567 | tdfe | 6353 | 0567 | 56789
Output:
key | match | col_B | col_C | col_D | col_E
-------------------------------------------------
123 | col_C | abcd | 12345 | fght | 7890
567 | col_D,col_E | tdfe | 6353 | 0567 | 56789
Any help much appreciated!
>>> df
to_find col1 col2
0 a ab ac
1 b aa ba
2 c bc ee
>>> df['found_in'] = df.apply(lambda x: ' '.join(x.iloc[1:][x.iloc[1:].str.contains(str(x['to_find']))].index), axis=1)
>>> df
to_find col1 col2 found_in
0 a ab ac col1 col2
1 b aa ba col2
2 c bc ee col1
For better readability, the same logic can be moved into a named function:
>>> def get_columns(x):
... y = x.iloc[1:]
... return y.index[y.str.contains(str(x['to_find']))]
...
>>> df['found_in'] = df.apply(lambda x: ' '.join(get_columns(x)), axis=1)
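The snippets above assume every cell is a string and pass the key to str.contains as a regular expression. Below is a minimal sketch of a variant applied to the question's sample data. It assumes the sample values are stored as strings, casts each row to str in case some columns are numeric, and uses regex=False so the key is matched as a literal substring. The function name matching_columns is hypothetical, and the new column is named match as in the question's expected output.

```python
import pandas as pd

# Sample data from the question; storing the values as strings is an assumption.
df = pd.DataFrame({
    "key": ["123", "567"],
    "col_B": ["abcd", "tdfe"],
    "col_C": ["12345", "6353"],
    "col_D": ["fght", "0567"],
    "col_E": ["7890", "56789"],
})

def matching_columns(row):
    """Return the names of the non-key columns whose value contains the row's key."""
    key = str(row.iloc[0])
    # Cast to str so numeric cells can be searched as text.
    others = row.iloc[1:].astype(str)
    # regex=False treats the key as a literal substring, not a pattern.
    mask = others.str.contains(key, regex=False).to_numpy(dtype=bool)
    return ", ".join(others.index[mask])

# Compute the matches first, then insert them right after the key column.
matches = df.apply(matching_columns, axis=1)
df.insert(loc=1, column="match", value=matches)
print(df)
```

With the sample data, the match column holds col_C for the first row and col_D, col_E for the second. If the dot-stripping and case-insensitive matching from the question are also needed, they would have to be added to the helper.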