More Efficient File Structure and Data Accessibility

eric_...@jsi.com

Mar 7, 2014, 1:51:53 PM

to np...@googlegroups.com

File Structure:

The taxonomy code fields and the Other Provider Identifier fields should be in separate tables that link to the provider record via the NPI number. The current flat structure is very inefficient: locating the records with a desired profile requires lengthy, duplicated logic.
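As a rough illustration of the split suggested above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (provider, taxonomy, other_identifier, and so on) are hypothetical and are not taken from any NPPES schema; the point is only that a profile lookup becomes one short query once the repeated fields live in their own tables keyed by NPI.

```python
import sqlite3


def build_schema(conn):
    """Create three hypothetical tables linked by NPI (names are illustrative only)."""
    conn.executescript(
        """
        CREATE TABLE provider (
            npi  TEXT PRIMARY KEY,
            name TEXT
        );
        CREATE TABLE taxonomy (
            npi        TEXT REFERENCES provider(npi),
            code       TEXT,
            is_primary INTEGER
        );
        CREATE TABLE other_identifier (
            npi        TEXT REFERENCES provider(npi),
            identifier TEXT,
            issuer     TEXT
        );
        """
    )


def npis_with_taxonomy(conn, code):
    """Return the NPIs that carry the given taxonomy code, in sorted order."""
    rows = conn.execute(
        "SELECT DISTINCT npi FROM taxonomy WHERE code = ? ORDER BY npi",
        (code,),
    )
    return [row[0] for row in rows]
```

With this layout, a provider with several taxonomies simply has several rows in the taxonomy table, so there is no need to repeat the same test against each numbered column.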

Data Accessibility:

Create a query interface on the web that limits downloads by state, and perhaps also by basic provider characteristics or taxonomies, so that users do not have to download the entire file.
GIS: Even better, or in addition, create a shared GIS data layer (ArcGIS Online, IMS server, etc.) with the records geocoded and all of the associated attribute data available behind the records. This would allow the data to be easily added to GIS-enabled projects and extracted for a particular area. The HRSA Geospatial Data Warehouse is a good example of using an IMS layer in this way, and ArcGIS Online is increasingly becoming the norm for sharing such data.
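Until something like the query interface described above exists, the same state filter can be approximated locally after downloading the file. The sketch below streams a CSV and keeps only rows for one state. The column name in STATE_COLUMN is an assumption about the public file's header and should be checked against the actual download; the function name is hypothetical.

```python
import csv
import io

# Assumed header name for the practice-location state; verify against the real file.
STATE_COLUMN = "Provider Business Practice Location Address State Name"


def filter_by_state(source, dest, state):
    """Copy rows whose state column matches `state` from source to dest.

    `source` and `dest` are open text file objects. Rows are streamed one at a
    time, so the whole file never has to fit in memory. Returns the row count kept.
    """
    reader = csv.DictReader(source)
    writer = csv.DictWriter(dest, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if (row.get(STATE_COLUMN) or "").strip().upper() == state.upper():
            writer.writerow(row)
            kept += 1
    return kept
```

For real use, open the downloaded file with open(path, newline="") and pass it as source.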

Alan Viars

Mar 10, 2014, 2:44:13 PM

to np...@googlegroups.com

I'm working on what you are asking for here. Yes, the current file structure is difficult to parse. Multiple tables are needed for sure.

Alan Viars

Mar 10, 2014, 2:44:49 PM

to np...@googlegroups.com

Would you like to give the API a dry run? Email me for credentials and instructions.

Alan

Chad Kopcak

Mar 19, 2014, 12:57:01 PM

to np...@googlegroups.com

Hi Alan, so is the NPPES data not currently in a relational format? I had assumed it was, and that it was simply exported to a flat file. The number of columns in the file is a bit daunting at first, and the column headers (field names) are very verbose. Thankfully I was able to develop a process that brings the file in, keeps only the applicable columns, and makes the data somewhat relational and normalized for my purposes.

Alan Viars

Mar 25, 2014, 12:31:50 PM

to np...@googlegroups.com

Hi Chad:

I too have built a process for "flattening" certain data from the public file: for example, one taxonomy per row instead of separate _1, _2, _3, etc. columns.
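For anyone who wants to try the one-taxonomy-per-row idea themselves, here is a minimal sketch of that kind of reshaping. It is not the process mentioned above. The column prefix in TAXONOMY_PREFIX and the default of 15 numbered slots are assumptions about the public file's header, and the function and output field names are hypothetical.

```python
import csv
import io

# Assumed header prefix for the numbered taxonomy columns; verify against the real file.
TAXONOMY_PREFIX = "Healthcare Provider Taxonomy Code_"


def flatten_taxonomies(rows, max_slots=15):
    """Yield one dict per non-empty taxonomy slot for each input row.

    `rows` is any iterable of dicts keyed by column header, such as a
    csv.DictReader over the public file. `max_slots` is an assumed upper bound
    on the numbered columns.
    """
    for row in rows:
        for slot in range(1, max_slots + 1):
            code = (row.get(f"{TAXONOMY_PREFIX}{slot}") or "").strip()
            if code:
                yield {"NPI": row["NPI"], "slot": slot, "taxonomy_code": code}
```

The output rows can be written straight to a new CSV with csv.DictWriter, or loaded into a taxonomy table keyed by NPI.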

If you are interested in these flattened CSV files, let me know.

Alan

Alan Viars

May 8, 2014, 3:52:15 PM

to np...@googlegroups.com

Here you go:

https://github.com/hhsidealab/provider-data-tools

See post.

Alan

Darrell DeVeaux

May 28, 2014, 9:21:33 AM

to np...@googlegroups.com

No question this is needed, and it is good that Alan is working on it. I think we have about 10 normalized tables to hold the data (address, phone, other IDs, name, etc.).

Alan Viars

Jun 2, 2014, 12:00:51 PM

to np...@googlegroups.com

The current NPPES system uses a relational database. The new NPPES also uses a relational database. You can see the tables in the "models.py" files in the source code. I've also built a read-only data dissemination API using MongoDB.

Hope that helps,

Best,

Alan
