How to properly apply a lambda function to a pandas data frame column


I have a pandas data frame, sample, with one of the columns called pr. I am applying a lambda function to it as follows:

sample['pr'] = sample['pr'].apply(lambda x: nan if x < 90) 

I get the following syntax error message:

sample['pr'] = sample['pr'].apply(lambda x: nan if x < 90)
                                                         ^
SyntaxError: invalid syntax

What am I doing wrong?

You need mask:

sample['pr'] = sample['pr'].mask(sample['pr'] < 90, np.nan) 

Another solution is loc with boolean indexing:

sample.loc[sample['pr'] < 90, 'pr'] = np.nan 

Sample:

import pandas as pd
import numpy as np

sample = pd.DataFrame({'pr': [10, 100, 40]})
print(sample)
    pr
0   10
1  100
2   40

sample['pr'] = sample['pr'].mask(sample['pr'] < 90, np.nan)
print(sample)
      pr
0    NaN
1  100.0
2    NaN

sample.loc[sample['pr'] < 90, 'pr'] = np.nan
print(sample)
      pr
0    NaN
1  100.0
2    NaN
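
The output above shows 100 turning into 100.0. NaN is a floating-point value, so an integer column that receives it is converted to float. A minimal sketch that checks the dtype before and after (the names dtype_before, dtype_after and masked are only illustrative):

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({'pr': [10, 100, 40]})
dtype_before = sample['pr'].dtype   # an integer dtype

# mask returns a new Series; values where the condition holds become NaN
masked = sample['pr'].mask(sample['pr'] < 90, np.nan)
dtype_after = masked.dtype          # float64, because NaN is a float

print(dtype_before, '->', dtype_after)
```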

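Edit:

Solution with apply (the original lambda raised a syntax error because a conditional expression needs an else branch):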

sample['pr'] = sample['pr'].apply(lambda x: np.nan if x < 90 else x) 
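
A small sketch of the same idea with a named function instead of a lambda, checked against the mask result. The function name to_nan_below and its threshold parameter are made up for illustration and are not part of pandas:

```python
import numpy as np
import pandas as pd

def to_nan_below(x, threshold=90):
    # A conditional expression needs both parts: the value and the else branch.
    return np.nan if x < threshold else x

sample = pd.DataFrame({'pr': [10, 100, 40]})

via_apply = sample['pr'].apply(to_nan_below)
via_mask = sample['pr'].mask(sample['pr'] < 90, np.nan)

# Series.equals treats NaN values in the same positions as equal.
print(via_apply.equals(via_mask))
```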

Timings for len(df)=300k:

sample = pd.concat([sample]*100000).reset_index(drop=True)

In [853]: %timeit sample['pr'].apply(lambda x: np.nan if x < 90 else x)
10 loops, best of 3: 102 ms per loop

In [854]: %timeit sample['pr'].mask(sample['pr'] < 90, np.nan)
The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 3.71 ms per loop
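
%timeit only works inside IPython. As a rough sketch, a similar comparison can be run as a plain script with the standard timeit module. The helper names with_apply and with_mask are illustrative, and the numbers will vary by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

sample = pd.DataFrame({'pr': [10, 100, 40]})
sample = pd.concat([sample] * 100000).reset_index(drop=True)  # 300k rows

def with_apply():
    # Calls the Python lambda once per element
    return sample['pr'].apply(lambda x: np.nan if x < 90 else x)

def with_mask():
    # Vectorized: the comparison and replacement run on the whole column
    return sample['pr'].mask(sample['pr'] < 90, np.nan)

# Best of 3 repeats with 5 calls each, reported as time per call
t_apply = min(timeit.repeat(with_apply, number=5, repeat=3)) / 5
t_mask = min(timeit.repeat(with_mask, number=5, repeat=3)) / 5

print(f"apply: {t_apply * 1e3:.2f} ms per call")
print(f"mask:  {t_mask * 1e3:.2f} ms per call")
```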
