Importing .txt files into Stata (or R)


Thomas Wiener

Oct 16, 2021, 10:21:26 PM
to OpenSecrets Open Data
Hello all,

I am trying to import the bulk data .txt files into Stata, and have some limited experience with the software. Is there a recommended method that OpenSecrets/CRP have for importing the bulk data into Stata that handles the pipe characters, and knows to ignore the commas within pipe-enclosed strings?
Would I be better off trying to import the data into R?

I have tried manually editing the .txt files to replace the pipe character with quotation marks or something else that helps Stata identify the right delimiters, but there is too much data for my computer to handle. I was also wondering if anyone had had similar experiences with this.

Thank you for any help!

Thomas


Jakob Stoll

Jan 26, 2022, 3:15:43 AM
to OpenSecrets Open Data
Push. I have the same problem. It would be great to hear whether you, or anyone else, have found a solution.

Best,
Jakob

Rasmus Dam Poulsen (AU)

Mar 3, 2022, 6:14:17 AM
to OpenSecrets Open Data
Hey.

Did any of you find a solution? 

Best regards,
Rasmus

Jakob Stoll

Mar 3, 2022, 7:00:26 AM
to OpenSecrets Open Data
Hi Rasmus,

I suggest using find-and-replace via the Terminal or a third-party program. See, for example, https://stackoverflow.com/questions/6951687/find-and-replace-text-in-a-47gb-large-file

Best,
Jakob

Nick Jenkins

Mar 3, 2022, 12:18:31 PM
to OpenSecrets Open Data
One way to import the data correctly into R is with the readr package. The trick is to set the delimiter to a comma and add the pipe as a quotation character:

library(readr)
read_delim("lob_bills.txt", delim = ",", col_names = FALSE, quote = "|")

This has worked for me with the lobbying datasets. 
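
To see why treating the pipe as the quote character matters, here is a small self-contained sketch. The sample rows, the temporary file, and the choice to read every column as text are all made up for illustration (they are not real OpenSecrets records), and the sketch assumes the layout described in this thread: fields separated by commas, with text fields wrapped in pipes.

```r
library(readr)

# Hypothetical sample rows: comma-delimited, text fields wrapped in pipes.
# The second field of the first row contains a comma inside the pipes.
sample_lines <- c(
  "|A001|,|Smith, John & Co|,|2021|",
  "|A002|,|Acme Corp|,|2020|"
)

# Write the sample to a temporary file so read_delim() has something to read.
sample_path <- tempfile(fileext = ".txt")
writeLines(sample_lines, sample_path)

# quote = "|" tells readr that pipes enclose a field, so the comma inside
# "Smith, John & Co" is kept as part of the value rather than splitting it.
# Reading everything as character is an assumption made here to avoid
# type guessing; adjust col_types as needed for the real files.
sample_data <- read_delim(
  sample_path,
  delim = ",",
  quote = "|",
  col_names = FALSE,
  col_types = cols(.default = col_character())
)

print(sample_data)
```

With col_names = FALSE, readr names the columns X1, X2, and so on, so they would still need to be renamed by hand to match the bulk data documentation.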

Nick

Rasmus Dam Poulsen (AU)

Mar 4, 2022, 4:15:07 AM
to OpenSecrets Open Data
Great, Nick. This worked perfectly. 
Thank you very much!