Deciphering HTML


Will

Jun 21, 2012, 2:31:30 PM
to crunchb...@googlegroups.com
Hey all, 

Let's say I wanted to pull all the information the API returns for a company's profile page, but do it from the page's HTML code instead. Does anyone know an easy way to separate out those pieces of the HTML? CSV wouldn't work because there are commas within the company descriptions. I'm not sure my explanation captures the problem well, so let me know if anything is unclear.
I'd like to go from the HTML code to something like the layout below:
                    link     CEO    phone   description  employees  etc.
Facebook

Anthony

Jun 21, 2012, 2:55:44 PM
to crunchb...@googlegroups.com
Hi Will,

It sounds like you want to screen-scrape the HTML code on CrunchBase profile pages. I don't see a reason to, since all the profile information is already accessible via the API and neatly formatted as JSON. Here's an example of using the API with jQuery: https://gist.github.com/2921495. However, if you feel the need to use an HTML parser, then I would suggest either nokogiri or hpricot. Commas in the descriptions also don't rule out CSV: a proper CSV writer wraps any field that contains a comma in quotes, so the columns stay intact.
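
For the flat layout in the question, here is a minimal sketch in Python (not the jQuery approach from the gist) of turning JSON profile data into CSV rows. The field names and the sample record are hypothetical; they mirror the columns listed in the question and are not the actual CrunchBase API schema. The standard `csv` module quotes fields that contain commas, so the description column comes back in one piece.

```python
import csv
import io
import json

# Hypothetical payload: keys mirror the columns in the question,
# not the real CrunchBase API response format.
SAMPLE_JSON = """
[
  {
    "name": "Example Co",
    "link": "http://example.com",
    "ceo": "Jane Doe",
    "phone": "555-0100",
    "description": "Builds widgets, gadgets, and other things",
    "employees": 42
  }
]
"""

COLUMNS = ["name", "link", "ceo", "phone", "description", "employees"]


def profiles_to_csv(json_text):
    """Flatten a JSON list of profile objects into CSV text."""
    profiles = json.loads(json_text)
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in COLUMNS.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for profile in profiles:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({col: profile.get(col, "") for col in COLUMNS})
    return buffer.getvalue()


csv_text = profiles_to_csv(SAMPLE_JSON)

# Reading the CSV back shows the comma-laden description is still one field.
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows[0]["description"])
```

If quoting is still a concern for whatever reads the file, passing `delimiter="\t"` to `csv.DictWriter` gives tab-separated output instead.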

--Anthony