[Mismatch] Lobbying Data (Bulk) vs Web (opensecrets.org/lobby/)


dshin

Aug 29, 2018, 5:02:12 PM
to OpenSecrets Open Data
Hi there, 

I am a recent user of the OpenSecrets lobbying database. First of all, thank you very much to everyone who contributes to it.

To obtain the lobbying data for a particular firm, year and industry, I downloaded the bulk lobbying data available at https://www.opensecrets.org/bulk-data/downloads#lobbying.
Then I compared the information from the bulk data set with the information from the OpenSecrets lobby lookup page at https://www.opensecrets.org/lobby/lookup.php.

I noticed a slight difference in dollar amounts between the two sources. Below is a specific example.

I want the lobbying information for the client Merck (Merck & Co) in 1998 in the pharmaceutical industry (Catcode: H4300).
From the bulk data, the total lobbying amount for these criteria is $5,791,731 spent on 33 lobbyists, whereas on the website, for the same criteria (Client: Merck & Co, 1998 summary), the total expenditure is $5,000,000 spent on 12 lobbyists.

Could someone please offer some insight into this difference?
Is it merely because of reporting differences between the website and the bulk data? Or is there something that I am missing?

Thank you so much for your time. 



Daniel Auble

Aug 30, 2018, 9:39:28 AM
to OpenSecrets Open Data
Hello, 
Getting totals by organization requires the [ind] and [use] variables. For spending, include only records where [ind]='y'; to count lobbyists, include only records where [use]='y'. These criteria make sure you don't count reports that were later amended and don't double count money reported both in a client's self filing and in the reports filed by the lobbying firms it hired.
Best,
Dan Auble
Senior Researcher
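
For readers following along, here is a minimal sketch of the filtering described above. It is not OpenSecrets code. The dictionary keys (`uniqid`, `client`, `year`, `amount`, `ind`, `use`, `lobbyist`), the idea of linking lobbyists to reports through a shared report id, and all of the rows are assumptions made up for illustration; check the bulk data documentation for the real table layout before adapting it.

```python
from typing import Iterable, Mapping

Row = Mapping[str, str]


def is_yes(value: str) -> bool:
    """Treat 'y' or 'Y' (with stray whitespace) as a set flag."""
    return value.strip().lower() == "y"


def total_spending(reports: Iterable[Row], client: str, year: str) -> float:
    """Sum amounts for one client and year, keeping only rows with ind='y'."""
    return sum(
        float(r["amount"])
        for r in reports
        if r["client"] == client and r["year"] == year and is_yes(r["ind"])
    )


def count_lobbyists(
    reports: Iterable[Row], lobbyists: Iterable[Row], client: str, year: str
) -> int:
    """Count distinct lobbyists listed on reports with use='y'."""
    usable_ids = {
        r["uniqid"]
        for r in reports
        if r["client"] == client and r["year"] == year and is_yes(r["use"])
    }
    return len({l["lobbyist"] for l in lobbyists if l["uniqid"] in usable_ids})


# Made-up rows: the flag values are illustrative, not real filings.
reports = [
    {"uniqid": "A1", "client": "Example Corp", "year": "1998",
     "amount": "300000", "ind": "y", "use": "y"},
    {"uniqid": "A2", "client": "Example Corp", "year": "1998",
     "amount": "50000", "ind": "", "use": "y"},
    {"uniqid": "A3", "client": "Example Corp", "year": "1998",
     "amount": "280000", "ind": "", "use": ""},
]
lobbyists = [
    {"uniqid": "A1", "lobbyist": "Doe, Jane"},
    {"uniqid": "A2", "lobbyist": "Roe, Richard"},
    {"uniqid": "A2", "lobbyist": "Doe, Jane"},
    {"uniqid": "A3", "lobbyist": "Poe, Pat"},
]

naive_total = sum(float(r["amount"]) for r in reports)
filtered_total = total_spending(reports, "Example Corp", "1998")
lobbyist_count = count_lobbyists(reports, lobbyists, "Example Corp", "1998")

print(f"unfiltered: {naive_total:,.0f}")
print(f"ind='y' only: {filtered_total:,.0f}")
print(f"lobbyists on use='y' reports: {lobbyist_count}")
```

Summing every row without the filters overstates the total, which is the kind of mismatch raised in the original question.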

dshin

Aug 30, 2018, 1:03:13 PM
to OpenSecrets Open Data
Dan, thank you for the response.
Using only [ind]='y' and [use]='y' on the bulk data set, I was able to match a given company's expenditure total for a given year in the bulk data to the figure on the OpenSecrets website.

I replied to you privately with another question, but I thought I would share it publicly as well.

Is there a way to tell from the bulk data how many issues a given company lobbied on in a given year? Using the same example, how many issues did Merck lobby on in 1998 with its $5,000,000 total lobbying expenditure?

This information does seem to be present on the OpenSecrets website (below), but in an ambiguous way.


There are a total of 12 issues that Merck lobbied on in 1998, where each issue generated from 1 to as many as 10 reports, without any specific issues named. Is it fair to say that the $5,000,000 was spent on 12 issues? Or should I add up the total number of reports to count the issues lobbied on (e.g., 46 reports = 46 issues)?

Thank you again for your time and help.

Best,
D Shin 
 
