pyeasyeda.clean_up

Module Contents

Functions

clean_up(df)

Takes a dataframe object and returns a cleaned version

pyeasyeda.clean_up.clean_up(df)

Takes a dataframe object and returns a cleaned version in which every row containing a NaN value has been dropped. Also inspects the cleaned dataframe and prints a list of potential outliers for each explanatory variable, based on a threshold distance of 3 standard deviations.

df : dataframe

dataframe to be cleaned

df_clean

the same dataframe with all rows containing NaN values removed

>>> df_clean = clean_up(df)

The following potential outliers were detected:
Variable X: [ 300, 301, 500, 1000 ]
Variable Y: [ 6.42, 6.44, 58.52, 60.22 ]
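
The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not the pyeasyeda source: the names `find_outliers_sketch` and `clean_up_sketch`, the `threshold` parameter, the restriction to numeric columns, and the choice to measure distance from each column's mean using the sample standard deviation are all assumptions made for illustration.

```python
import pandas as pd


def find_outliers_sketch(df_clean, threshold=3.0):
    """Hypothetical helper: map each numeric column to its potential outliers.

    Assumption: a value is a potential outlier when it lies more than
    `threshold` sample standard deviations away from the column mean.
    """
    outliers = {}
    for col in df_clean.select_dtypes(include="number").columns:
        values = df_clean[col]
        # Absolute distance of every value from the column mean.
        distance = (values - values.mean()).abs()
        # Series.std() uses ddof=1 (sample standard deviation) by default.
        outliers[col] = values[distance > threshold * values.std()].tolist()
    return outliers


def clean_up_sketch(df, threshold=3.0):
    """Hypothetical stand-in for clean_up: drop NaN rows, report outliers."""
    # Drop every row that contains at least one NaN value.
    df_clean = df.dropna()

    print("The following potential outliers were detected:")
    for col, found in find_outliers_sketch(df_clean, threshold).items():
        print(f"Variable {col}: {found}")

    return df_clean
```

Splitting the outlier check into its own helper keeps the return value of `clean_up_sketch` the same as documented for `clean_up` (the cleaned dataframe only), while still letting the detected values be inspected programmatically.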