For a project I'm working on, I need to visualise and detect outliers. Not finding anything suitable in pandas, scipy or statsmodels, I spent a couple of days learning how this is done and then writing some tools for my work.
Specifically, I wrote three general functions to:
- Detect outliers using IQR, MAD and z-score after detrending the data.
- Replace detected outliers with NaN or interpolated values.
- Plot a time series overlaying a trend line (linear), a 2 SD interval guide, and points for outliers found using one of the detection methods.
These work well, and I've created a notebook showing their use.
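To give a rough idea of the detection and replacement steps, here is a simplified sketch (not the notebook code itself). The function name `detect_outliers`, its parameters and the default thresholds (1.5 for IQR, 3.5 for MAD, 3 for z-score) are illustrative assumptions, as is the use of a least-squares linear fit for detrending:

```python
import numpy as np
import pandas as pd


def detect_outliers(series, method="iqr", threshold=None):
    """Return a boolean Series flagging outliers in the linearly detrended data.

    Hypothetical helper for illustration; thresholds are common rule-of-thumb values.
    """
    y = series.to_numpy(dtype=float)
    x = np.arange(len(y), dtype=float)
    ok = ~np.isnan(y)

    # Detrend: subtract a least-squares straight line fitted to the non-missing values.
    slope, intercept = np.polyfit(x[ok], y[ok], 1)
    resid = y - (slope * x + intercept)

    if method == "iqr":
        k = 1.5 if threshold is None else threshold
        q1, q3 = np.nanpercentile(resid, [25, 75])
        iqr = q3 - q1
        flags = (resid < q1 - k * iqr) | (resid > q3 + k * iqr)
    elif method == "mad":
        k = 3.5 if threshold is None else threshold
        med = np.nanmedian(resid)
        mad = np.nanmedian(np.abs(resid - med))
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        # for normally distributed data. Assumes mad != 0.
        flags = np.abs(0.6745 * (resid - med) / mad) > k
    elif method == "zscore":
        k = 3.0 if threshold is None else threshold
        flags = np.abs((resid - np.nanmean(resid)) / np.nanstd(resid)) > k
    else:
        raise ValueError(f"unknown method: {method!r}")

    return pd.Series(flags, index=series.index)


# Example: a noisy upward trend with one injected spike at position 40.
rng = np.random.default_rng(0)
s = pd.Series(0.5 * np.arange(100) + rng.normal(0, 1, 100))
s.iloc[40] += 25

flags = detect_outliers(s, method="mad")
with_nan = s.mask(flags)            # replace detected outliers with NaN
cleaned = with_nan.interpolate()    # or fill them by linear interpolation
```

Detrending first matters here: without it, the spread of the raw values is dominated by the trend itself, so a spike in the middle of the series can sit well inside the overall range and go undetected.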
My question is: are these general enough and useful enough for inclusion in pandas?