Applying regex to dataframe column based on value in another column

Emily K

I have the regex_func helper function below, which has been working well for extracting a match from a df column using map and a lambda.

def regex_func(regex_compile, x, item=0, return_list=False):
    """Handle the list returned by re.findall().

    Returns the match at index `item` (the first match by default).
    If there is no such match, returns an empty string.
    If return_list is True, returns the whole list instead."""
    match_list = regex_compile.findall(x)
    if return_list:
        match = match_list
    elif match_list:
        try:
            match = match_list[item]
        except IndexError:
            # Fewer matches than the requested index
            match = ""
    else:
        match = ""
    return match

# Working example
import re

regex_1 = re.compile(r'(?i)(?<=\()[^ ()]+')
df['colB'] = df['colA'].map(lambda x: regex_func(regex_1, x))

I am having trouble with a similar task: I want to build the regex from a value in another column and then apply it. One approach I tried that did not work:

# Regex should be based on value in col1
# Extracting that value and prepping to input into my regex_func()
value_list = df['col1'].tolist()
value_list = ['(?i)(?<=' + d + ' )[^ ]+' for d in value_list]
value_list =  [re.compile(d) for d in value_list]
# Adding prepped list back into df as col2
df.insert(1,'col2',value_list)
#Trying to create col4, based on applying my re.compile in col 2 to a value in col3.
df.insert(2,'col4', df['col3'].map(lambda x: df['col2'],x)

I understand why the above doesn't work, but have not been able to find a solution.

a_guest

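You can zip the columns and build the regex on the fly. Since regex_func calls .findall() on its first argument, each pattern has to be compiled before it is passed in:

df['colB'] = [regex_func(re.compile('(?i)(?<=' + y + ' )[^ ]+'), x)
              for x, y in zip(df['colA'], df['col1'])]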
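
For reference, here is a minimal self-contained sketch of the zip approach. The sample data and the first_match helper are made up for illustration (first_match is a trimmed-down stand-in for regex_func), and the col1/col3/col4 names follow the question rather than the snippet above. It also assumes the values in col1 are meant as literal text, so they are passed through re.escape; that keeps characters like "." from being read as regex syntax and keeps the lookbehind fixed-width, which Python's re module requires.

```python
import re

import pandas as pd


def first_match(pattern, text, item=0):
    """Return the item-th match of a compiled pattern in text, or "" if absent."""
    matches = pattern.findall(text)
    return matches[item] if len(matches) > item else ""


# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "col1": ["id", "code", "ref."],
    "col3": ["order ID 123 shipped", "promo Code XY9 applied", "no marker here"],
})

# Build one pattern per row from col1 and apply it to the same row's col3.
# re.escape treats the col1 value as literal text (an assumption about the data).
df["col4"] = [
    first_match(re.compile("(?i)(?<=" + re.escape(key) + " )[^ ]+"), text)
    for text, key in zip(df["col3"], df["col1"])
]

print(df["col4"].tolist())  # ['123', 'XY9', '']
```

If the values in col1 are supposed to be regex fragments rather than literal text, drop re.escape, but then each fragment has to match a fixed number of characters for the lookbehind to compile.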
