Importing .txt files into Stata (or R)

Thomas Wiener

Oct 16, 2021, 10:21:26 PM

to OpenSecrets Open Data

Hello all,

I am trying to import the bulk data .txt files into Stata, and I have only limited experience with the software. Is there a method that OpenSecrets/CRP recommends for importing the bulk data into Stata, one that handles the pipe characters and knows to ignore commas inside pipe-quoted strings?

Would I be better off trying to import the data into R?

I have tried manually editing the .txt files to replace the pipe character with quotation marks, or with something else that helps Stata identify the right delimiters, but there is too much data for my computer to handle. I was also wondering whether anyone has had similar experiences with this.

Thank you for any help!

Thomas

Jakob Stoll

Jan 26, 2022, 3:15:43 AM

to OpenSecrets Open Data

Bump. I have the same problem. It would be great to hear whether you or anyone else has found a solution.

Best,

Jakob

Rasmus Dam Poulsen (AU)

Mar 3, 2022, 6:14:17 AM

to OpenSecrets Open Data

Hey.

Did any of you find a solution?

Best regards,

Rasmus

Jakob Stoll

Mar 3, 2022, 7:00:26 AM

to OpenSecrets Open Data

Hi Rasmus,

I suggest using find-and-replace via the Terminal or a third-party program. See, for example, https://stackoverflow.com/questions/6951687/find-and-replace-text-in-a-47gb-large-file
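For files too large to open in an editor, the replacement can also be done as a streaming pass in Python. The following is only a minimal sketch, not an official OpenSecrets method. It assumes the fields are comma-separated and that text fields are wrapped in pipe characters, as described at the top of this thread. The function name convert_pipe_quoted, the default encoding, and the file names in the usage comment are placeholders chosen for illustration.

```python
import csv


def convert_pipe_quoted(src_path, dst_path, encoding="latin-1"):
    """Rewrite a comma-delimited, pipe-quoted file as a standard CSV.

    Rows are read and written one at a time, so memory use stays small
    even for very large files. The default encoding is an assumption;
    change it if your files use a different one.
    Returns the number of rows written.
    """
    rows = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        # Treat the pipe as the quote character so commas inside
        # |...| are kept as part of the field.
        reader = csv.reader(src, delimiter=",", quotechar="|")
        # Write ordinary double-quoted CSV, quoting only where needed.
        writer = csv.writer(dst, delimiter=",", quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows


# Hypothetical usage; the file names are placeholders:
# convert_pipe_quoted("lob_bills.txt", "lob_bills.csv")
```

The output uses ordinary double quotes, so Stata's import delimited (for example with the varnames(nonames) and bindquotes(strict) options) should then be able to read it, though it is worth checking the column count on a small sample first.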

Best,

Jakob

Nick Jenkins

Mar 3, 2022, 12:18:31 PM

to OpenSecrets Open Data

One way to import the data correctly into R is with the readr package. The trick is to set the delimiter to a comma and add the pipe as a quotation character:

library(readr)
read_delim("lob_bills.txt", delim = ",", col_names = FALSE, quote = "|")

This has worked for me with the lobbying datasets.

Nick

Rasmus Dam Poulsen (AU)

Mar 4, 2022, 4:15:07 AM

to OpenSecrets Open Data

Great, Nick. This worked perfectly.
Thank you very much!
