pyeasyeda.clean_up
Module Contents
Functions
clean_up(df): Takes a dataframe object and returns a cleaned version
- pyeasyeda.clean_up.clean_up(df)
Takes a dataframe and returns a cleaned copy in which every row containing at least one NaN value has been dropped. It then inspects the cleaned dataframe and prints, for each explanatory variable, a list of potential outliers: values that lie more than 3 standard deviations from that variable's mean.
Parameters
- df : dataframe
dataframe to be cleaned
Returns
- df_clean
the same dataframe with all rows containing NaN values removed
>>> from pyeasyeda.clean_up import clean_up
>>> df_clean = clean_up(df)
The following potential outliers were detected:
Variable X: [ 300, 301, 500, 1000 ]
Variable Y: [ 6.42, 6.44, 58.52, 60.22 ]
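
The behaviour described above can be approximated with a few lines of pandas. The sketch below is not the library's source code. The names `find_outliers_sketch` and `clean_up_sketch`, the `threshold` parameter, and the restriction to numeric columns are assumptions made for illustration, and the sample standard deviation (pandas' default) is assumed for the 3-standard-deviation rule.

```python
import pandas as pd


def find_outliers_sketch(df, threshold=3.0):
    """Hypothetical helper: map each numeric column to its potential outliers."""
    outliers = {}
    # Assumption: only numeric columns are checked, since a mean and a
    # standard deviation are not defined for other column types.
    for col in df.select_dtypes(include="number").columns:
        values = df[col]
        # Absolute distance of every value from the column mean.
        distance = (values - values.mean()).abs()
        # Keep values farther than `threshold` sample standard deviations.
        flagged = values[distance > threshold * values.std()]
        if not flagged.empty:
            outliers[col] = flagged.tolist()
    return outliers


def clean_up_sketch(df):
    """Hypothetical stand-in for clean_up: drop NaN rows, report outliers."""
    # Drop every row that contains at least one NaN value.
    df_clean = df.dropna()
    print("The following potential outliers were detected:")
    for col, vals in find_outliers_sketch(df_clean).items():
        print(f"Variable {col}: {vals}")
    return df_clean


# Illustrative data: one extreme value in "x" and one missing value in "y".
example = pd.DataFrame({
    "x": [10] * 20 + [1000, 10],
    "y": [1.0] * 21 + [float("nan")],
})
example_clean = clean_up_sketch(example)
```

With a sample standard deviation, a single value can only exceed the 3-standard-deviation threshold when the column has at least 12 rows, so very small dataframes will never report outliers under this rule.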