Introducing pandas-profiling: Create beautiful HTML profiling reports from pandas DataFrame objects


Jos Polfliet

Jan 26, 2016, 9:45:28 PM
to PyData
Profiling data always consists of the same basic steps. A few weeks ago I decided to automate the process and summarize everything you want to know in one interactive report.

The source code can be found on GitHub. Installation is as simple as "pip install pandas-profiling"

Click here to see a live demo.

Let me know if you have any comments, bug reports, suggestions for improvement, or anything else!

Ivan Ogasawara

Jan 26, 2016, 9:50:56 PM
to pyd...@googlegroups.com

Very interesting! I will try it soon :)
Thank you to share :)


Ivan Ogasawara

Jan 26, 2016, 9:52:41 PM
to pyd...@googlegroups.com


>
> Very interesting! I will try it soon :)
> Thank you to share :)
>

* thank you for share

Eraldo Pomponi

Jan 27, 2016, 4:10:51 AM
to pyd...@googlegroups.com
Useful package ... Thanks for sharing! 

Cheers,
Eraldo  

dartdog

Jan 27, 2016, 10:27:20 AM
to PyData
Very nice. I just ran it on a test set and found it quite helpful.

Paul Hobson

Jan 27, 2016, 4:38:24 PM
to pyd...@googlegroups.com
This is absolutely fantastic. Nice work!


Brenda So

Jan 27, 2016, 5:17:01 PM
to PyData
I am just starting out with pandas and using it to analyze weather data. I thought pandas could already generate bar charts and tables based on the data given. How is pandas-profiling different from pandas? Also, I am interested in expanding the functionality of pandas-profiling to display nice graphics. What do you think? :)

jos.po...@gmail.com

Jan 27, 2016, 6:05:11 PM
to pyd...@googlegroups.com
pandas is a complete data analysis toolkit that does a lot of different things, including plotting and data manipulation. Among other things, it offers a standardized, easily accessible structure for storing and accessing data, called a DataFrame.
pandas_profiling does only one thing: it generates a report on the variables in a DataFrame. It differs from pandas in that it automates a set of steps that I believe are repetitive, and it does nothing else, while pandas does a million different things.
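
To give a rough idea of the repetitive steps meant here, below is a minimal sketch in plain pandas. It is only an illustration, not the pandas_profiling code: the helper name summarize_columns, the choice of statistics, and the small weather-style sample data are all assumptions made up for this example.

```python
import pandas as pd


def summarize_columns(df):
    """Hand-rolled per-column summary.

    Hypothetical helper for illustration; not part of pandas or pandas_profiling.
    """
    rows = []
    for name in df.columns:
        col = df[name]
        row = {
            "column": name,
            "dtype": str(col.dtype),
            "n_missing": int(col.isnull().sum()),  # count of NaN/None values
            "n_unique": int(col.nunique()),        # distinct values, NaN excluded
        }
        # Numeric columns get a few extra statistics; other columns leave them empty.
        if pd.api.types.is_numeric_dtype(col):
            row["mean"] = float(col.mean())
            row["min"] = float(col.min())
            row["max"] = float(col.max())
        rows.append(row)
    return pd.DataFrame(rows).set_index("column")


# Made-up sample data, loosely modelled on a weather table.
df = pd.DataFrame({
    "temp_c": [12.5, 14.0, None, 9.5],
    "station": ["A", "B", "A", "A"],
})

summary = summarize_columns(df)
print(summary)
```

Writing something like this by hand for every new dataset is the part that gets tedious; a profiling report bundles this kind of per-variable summary into a single page.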

Not sure which graphics you would want to add, but of course you're always welcome to post ideas or fork the code. If you are interested in nice graphics, check out the seaborn package, which is a graphical layer on top of matplotlib, the standard for plotting in Python.


