missing license numbers when scraping - what to do?


smcl

Dec 22, 2014, 7:16:07 AM
to opencorporat...@googlegroups.com
There's a handful of missions where you are meant to retrieve a list of financial licenses from relatively easy-to-parse pages/PDFs, but where there's no indication of any license number anywhere on the linked site, even after checking additional pages on the site or digging around in the sites of the linked companies.

If there's no financial license number listed, should this be marked as "I think this is impossible", or is there an alternative where we can submit it with "LOW" confidence and some sort of value indicating that there may not be a license number?

- Sean

peter...@opencorporates.com

Dec 22, 2014, 8:39:43 AM
to opencorporat...@googlegroups.com
Hi Sean, 

Licence number is not currently a required field for the Simple licence schema (our documentation needs revising there), so if no licence numbers are available, there does not need to be a field for them. The currently required fields for the Simple licence schema are: "source_url", "sample_date", "company_name", "company_jurisdiction".
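
As a rough illustration of the point above, a scraper could emit only the four required fields and leave the licence number out when the source does not list one. This is a minimal sketch, not the official bot template: the `build_record` and `missing_fields` helpers, the "licence_number" key name, the ISO date format and the placeholder values are all assumptions made for the example.

```python
import json
from datetime import date

# Required fields as listed in the reply above.
REQUIRED_FIELDS = ("source_url", "sample_date", "company_name", "company_jurisdiction")


def build_record(source_url, company_name, company_jurisdiction,
                 sample_date=None, licence_number=None):
    """Build one licence record, omitting the licence number when it is unknown."""
    record = {
        "source_url": source_url,
        # Assumption: dates are written as ISO strings (YYYY-MM-DD).
        "sample_date": sample_date or date.today().isoformat(),
        "company_name": company_name,
        "company_jurisdiction": company_jurisdiction,
    }
    # "licence_number" is a hypothetical key name; only add it when a value exists.
    if licence_number:
        record["licence_number"] = licence_number
    return record


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


if __name__ == "__main__":
    record = build_record(
        source_url="http://example.com/licensed-firms",  # placeholder URL
        company_name="Example Finance Ltd",              # placeholder company
        company_jurisdiction="United Kingdom",           # placeholder value
        sample_date="2014-12-22",
    )
    # Fail loudly if a required field is missing before emitting the record.
    assert missing_fields(record) == []
    print(json.dumps(record))
```

Leaving the key out entirely, rather than submitting an empty or made-up value, keeps the record consistent with the required-field list quoted above.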

I hope I've answered your question, and do feel free to get in touch.

- Peter