Scraping from Google Docs to Google Sheets


Rebecca Buena

Aug 16, 2023, 11:07:53 AM
to Google Apps Script Community
Hello everyone, 

I was tasked with scraping data from a bunch of Google Docs inside one folder into a Google Sheets spreadsheet. Is there any way I can do that using Apps Script? Google Sheets' built-in import functions do not have an option for a Google Doc. I tried researching code but can't seem to run any of it successfully.

Also, would it be possible to automate this, so that every time a new document is added to the folder, the spreadsheet automatically scrapes its data?

Thanks in advance everyone!
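
For the automation part of the question: as far as I know, Apps Script has no trigger that fires when a file is added to a Drive folder, so a common workaround is a time-driven trigger that polls the folder and remembers which documents it has already handled. The following is a minimal sketch under that assumption, not a definitive implementation. `FOLDER_ID`, `PROCESSED_KEY`, `findNewIds`, `checkForNewDocs`, and `installTrigger` are hypothetical names, and the actual import step is left as a comment.

```javascript
// Hypothetical placeholders: replace FOLDER_ID with your own folder's ID.
var FOLDER_ID = "FOLDER_ID";
var PROCESSED_KEY = "processedDocIds";

// Pure helper: return the IDs in allIds that do not appear in processedIds.
function findNewIds(allIds, processedIds) {
  return allIds.filter(function (id) {
    return processedIds.indexOf(id) === -1;
  });
}

// Meant to run on a schedule: lists the Docs in the folder and
// handles only the ones that have not been seen before.
function checkForNewDocs() {
  var props = PropertiesService.getScriptProperties();
  var processed = JSON.parse(props.getProperty(PROCESSED_KEY) || "[]");

  var files = DriveApp.getFolderById(FOLDER_ID)
    .getFilesByType(MimeType.GOOGLE_DOCS);
  var allIds = [];
  while (files.hasNext()) {
    allIds.push(files.next().getId());
  }

  var newIds = findNewIds(allIds, processed);
  newIds.forEach(function (id) {
    // Import the document with this ID into the sheet here.
  });

  // Remember everything handled so far for the next run.
  props.setProperty(PROCESSED_KEY, JSON.stringify(processed.concat(newIds)));
}

// Run once by hand to schedule checkForNewDocs every 15 minutes.
function installTrigger() {
  ScriptApp.newTrigger("checkForNewDocs")
    .timeBased()
    .everyMinutes(15)
    .create();
}
```

Script properties have size limits, so for a folder with a very large number of documents it may be safer to keep the processed IDs in a spare sheet column instead.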

Rebecca Buena

Aug 16, 2023, 11:38:19 AM
to Google Apps Script Community

```javascript
function scrapeGoogleDocs() {
  // Define the URL of the Google Docs file to be scraped (DOC_ID is a placeholder)
  var docUrl = "https://docs.google.com/document/d/DOC_ID";

  // Get the contents of the Google Docs file
  var doc = DocumentApp.openByUrl(docUrl);
  var body = doc.getBody();
  var paragraphs = body.getParagraphs();

  // Get the Google Sheets file and the active sheet
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();

  // Loop through the paragraphs and import them into Google Sheets
  for (var i = 0; i < paragraphs.length; i++) {
    var paragraph = paragraphs[i].getText();
    sheet.getRange(i + 1, 1).setValue(paragraph);
  }
}
```
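
The function above handles a single document by URL. To cover every Google Doc in one folder, the same idea could be wrapped in a loop over the folder's files. The sketch below assumes one sheet row per non-empty paragraph, with the document name in column A and the text in column B; `FOLDER_ID`, `buildRows`, and `scrapeFolder` are hypothetical names chosen for illustration.

```javascript
// Hypothetical placeholder: replace with the ID of the folder holding the Docs.
var FOLDER_ID = "FOLDER_ID";

// Pure helper: turn [{name, paragraphs}] into [[name, text], ...],
// skipping paragraphs that are empty or whitespace only.
function buildRows(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    doc.paragraphs.forEach(function (text) {
      if (text.trim() !== "") {
        rows.push([doc.name, text]);
      }
    });
  });
  return rows;
}

// Read every Google Doc in the folder and append its paragraphs to the active sheet.
function scrapeFolder() {
  var files = DriveApp.getFolderById(FOLDER_ID)
    .getFilesByType(MimeType.GOOGLE_DOCS);
  var docs = [];
  while (files.hasNext()) {
    var file = files.next();
    var body = DocumentApp.openById(file.getId()).getBody();
    var texts = body.getParagraphs().map(function (p) {
      return p.getText();
    });
    docs.push({ name: file.getName(), paragraphs: texts });
  }

  var rows = buildRows(docs);
  if (rows.length === 0) return; // nothing to write

  // One batched write instead of one setValue call per cell.
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  sheet.getRange(sheet.getLastRow() + 1, 1, rows.length, 2).setValues(rows);
}
```

Writing all rows with a single `setValues` call is usually much faster than calling `setValue` in a loop, which matters once the folder holds more than a handful of documents.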

Phil Bainbridge

Aug 17, 2023, 6:43:55 AM
to Google Apps Script Community

Hi

I did some Apps Script work a couple of years ago around this kind of thing: "Bulk extract text from Google Docs for analysis". All of the code is fully accessible and commented.

Kind regards
Phil

Rebecca Buena

Aug 17, 2023, 10:50:41 AM
to Google Apps Script Community
Hi Phil,

Thank you so much for this! I tried it out and modified a few things, but I'm still not getting the output that I want. If you don't mind me asking, is there any way to scrape the data from the Google Docs and import it into the spreadsheet so that it is laid out horizontally across the sheet? The Google Docs they are using for notes are in table form, so it would be great if each value could automatically land in its respective column.

Thanks!
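
One way to keep table data in its columns is to read the tables directly instead of the paragraphs. The sketch below assumes each row of a Doc table should become one sheet row, with cells mapped left to right; the thread does not show the actual table layout, so this is only a starting point. `DOC_ID`, `padRows`, `readTableRows`, and `importTables` are hypothetical names.

```javascript
// Hypothetical placeholder: replace with the ID of the document to read.
var DOC_ID = "DOC_ID";

// Pure helper: pad every row with "" so all rows have the same length,
// because Range.setValues needs a rectangular 2D array.
function padRows(rows) {
  var width = rows.reduce(function (max, row) {
    return Math.max(max, row.length);
  }, 0);
  return rows.map(function (row) {
    var padded = row.slice();
    while (padded.length < width) padded.push("");
    return padded;
  });
}

// Collect the cell text of every table in a Doc body, one array per table row.
function readTableRows(body) {
  var rows = [];
  body.getTables().forEach(function (table) {
    for (var r = 0; r < table.getNumRows(); r++) {
      var row = table.getRow(r);
      var cells = [];
      for (var c = 0; c < row.getNumCells(); c++) {
        cells.push(row.getCell(c).getText());
      }
      rows.push(cells);
    }
  });
  return rows;
}

// Append all table rows of one document below the existing sheet data.
function importTables() {
  var body = DocumentApp.openById(DOC_ID).getBody();
  var rows = padRows(readTableRows(body));
  if (rows.length === 0 || rows[0].length === 0) return; // no table data

  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  sheet
    .getRange(sheet.getLastRow() + 1, 1, rows.length, rows[0].length)
    .setValues(rows);
}
```

If every document repeats the same header row, that row would be appended once per document, so it may need to be skipped after the first import.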

Phil Bainbridge

Aug 18, 2023, 3:16:00 PM
to Google Apps Script Community

Hi Rebecca

Yes, that is feasible. At the moment the data from the Doc is pushed into an array in a way that puts it over 2 columns. How much work it would be to change that depends on the structure of your Doc, to be honest.

Somebody else may be able to chip in with suggestions; otherwise it would mean looking at what you have and what you want. Given my workload, that would be a bit more involved for me, so I could only look at it with you as freelance work, if you really needed something.
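
As a rough illustration of turning a two-column array into a horizontal layout, the sketch below assumes the two columns hold a label and a value, which the thread does not confirm. Each document then becomes one sheet row, with the labels written once as headers. `pairsToHorizontal` and `appendHorizontal` are hypothetical names and are not taken from the script mentioned earlier.

```javascript
// Pure helper: split [[label, value], ...] into a header row and a value row.
function pairsToHorizontal(pairs) {
  return {
    headers: pairs.map(function (pair) { return pair[0]; }),
    values: pairs.map(function (pair) { return pair[1]; })
  };
}

// Append one document's values as a single row across the sheet,
// writing the labels as a header row first if the sheet is still empty.
function appendHorizontal(pairs) {
  if (pairs.length === 0) return; // nothing to write

  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var result = pairsToHorizontal(pairs);
  if (sheet.getLastRow() === 0) {
    sheet.appendRow(result.headers);
  }
  sheet.appendRow(result.values);
}
```

This only lines up correctly if every document lists the same labels in the same order; documents that differ would need their values matched to the headers by name.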

Kind regards
Phil