I am using the apply function to create a new column, ERROR_TV_TIC, in a dataframe based on the values of the existing columns TV_TIC and ERRORS. I am not sure what I am doing wrong: with some conditions it works, and with others it throws an error.
DataFrame:
ERRORS|TV_TIC
|2.02101E+41
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']|nan
['Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
Code when it works:
    def validate_tv_tic(trades):
        tv_tiv_errors = list()
        if pd.isnull(trades['TV_TIC']):
            tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
        if pd.notnull(trades['TV_TIC']) and len(trades['TV_TIC']) != 42:
            tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
        return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

    trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
Code when it doesn't work: here the condition involves two columns of the row, and I made sure to use "&" rather than "and".
    def validate_tv_tic(trades):
        tv_tiv_errors = list()
        if pd.isnull(trades['ERRORS']) & pd.isnull(trades['TV_TIC']):
            tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
        if pd.isnull(trades['ERRORS']) & pd.notnull(trades['TV_TIC']) & len(trades['TV_TIC']) != 42:
            tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
        return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

    trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
Error I am getting: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 3')
[Image: error description when using "and"]
[Image: error description when using "&"]
My gut feeling is that pd.isnull is causing the problem somewhere, but I am not sure.
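There was no problem with the code; the problem was the data inside the dataframe.
The ERRORS column held lists of strings, and the error was thrown whenever a cell's list contained more than one item. pd.isnull applied to a list returns an element-wise boolean array, and the truth value of such an array is ambiguous once it has more than one element. So I was getting the error for lines 3 and 4 below: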
ERRORS
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']
['Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
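The behaviour described above can be reproduced in isolation. This is a minimal sketch: the helper name is_ambiguous is made up for illustration, and the two lists are copied from the ERRORS column shown above.

```python
import numpy as np
import pandas as pd

single = ['Future Option Indicator is missing']
double = ['Trade Id is missing', 'Future Option Indicator is missing']

# pd.isnull on a list checks each element and returns a boolean array,
# not a single True/False.
single_mask = pd.isnull(single)   # one-element array
double_mask = pd.isnull(double)   # two-element array


def is_ambiguous(value):
    """Hypothetical helper: True if `if pd.isnull(value):` would raise."""
    try:
        bool(pd.isnull(value))
    except ValueError:
        # "The truth value of an array with more than one element is ambiguous"
        return True
    return False


print(is_ambiguous(np.nan))   # scalar cell: no error
print(is_ambiguous(single))   # one-item list: no error
print(is_ambiguous(double))   # two-item list: raises inside `if`
```

Combining the mask with another condition through "&" still yields an array of the same length, so switching between "and" and "&" does not remove the ambiguity.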
After finding the root cause, I changed the list to a single string whose elements are joined by a non-comma separator, and that works for me.
I changed the return statement of the function validate_tv_tiv from
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan
to
    return ' & '.join(errors) if len(errors) > 0 else np.nan
and this produced my dataframe column ERRORS as below:
ERRORS
Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)
Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
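If the ERRORS column has to stay as lists, an alternative is to test the cell explicitly instead of passing it to pd.isnull. The following is only a sketch under assumptions: the helper has_no_errors and the sample rows are invented for illustration, and TV_TIC is converted with str() before measuring its length. It also nests the checks rather than chaining them with "&", because "&" binds more tightly than "!=" in Python, so `a & b & len(x) != 42` is evaluated as `(a & b & len(x)) != 42`.

```python
import numpy as np
import pandas as pd


def has_no_errors(value):
    """Hypothetical helper: True when an ERRORS cell is empty (NaN/None or an empty list)."""
    if isinstance(value, (list, tuple)):
        return len(value) == 0
    return bool(pd.isnull(value))


def validate_tv_tic(row):
    errors = []
    # Only check TV_TIC when the earlier validations reported nothing.
    if has_no_errors(row['ERRORS']):
        tv_tic = row['TV_TIC']
        if pd.isnull(tv_tic):
            errors.append("Initial validations passed still TV_TIC missing")
        elif len(str(tv_tic)) != 42:
            errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return ' & '.join(errors) if errors else np.nan


# Made-up sample data in the same shape as the dataframe in the question.
trades = pd.DataFrame({
    'ERRORS': [
        np.nan,
        ['Future Option Indicator is missing'],
        ['Trade Id is missing', 'Future Option Indicator is missing'],
        np.nan,
        np.nan,
    ],
    'TV_TIC': ['2' * 42, np.nan, np.nan, np.nan, 'ABC'],
})

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
print(trades['ERROR_TV_TIC'].tolist())
```

With this approach the multi-item lists in rows 2 and 3 no longer reach pd.isnull, so apply runs over every row.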