Another way to process HTML in GAS server side?


ChiefChippy2 is awesome

Apr 15, 2019, 1:27:13 PM
to Google Apps Script Community
I often have HTML to process in GAS, but I can't always do it on the client side.
On the GAS server side, the only (and therefore best) way I know to process HTML is XmlService, but it often throws an error when parsing the HTML.
A specific example:
1. I have a piece of HTML, e.g.
<h1 class="title">Title</h1>
<span class="text">Lorem Ipsum...</span>
<img src="/img/lorem.png">
<span class="more text">Why Ipsum?</span>
<h1 class="title">Another title</h1>
<span class="text">Not Lorem Ipsum...but Ipsum Lorem</span>
<img src="/img/ipsum.png">
<span class="more text">Why not Lorem?</span>
2. I want to convert this into JSON, something like:

{"body":[{"title":"Title","text":["Lorem Ipsum...","Why Ipsum?"]},{"title":"Another title","text":["Not Lorem Ipsum...but Ipsum Lorem","Why not Lorem?"]}]}

In client-side JavaScript I would do something like this (I know it is not optimized):

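var result = {body: []};
var headings = document.getElementsByTagName("h1");
for (var i = 0; i < headings.length; i++) {
  var entry = {title: headings[i].textContent, text: []};
  // The spans are siblings of the h1, not its descendants, so walk forward to the next h1.
  for (var el = headings[i].nextElementSibling; el && el.tagName !== "H1"; el = el.nextElementSibling)
    if (el.classList.contains("text")) entry.text.push(el.textContent);
  result.body.push(entry);
}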

But I can't use this on the GAS server side, since document only exists on the client side.
If I am right, it is possible with search and split, but that is going to be painfully complicated for me.
Thanks for any thoughts or advice.

Adam Morris

Apr 15, 2019, 9:29:40 PM
Client-side, there is probably a library that will help you parse the
HTML to get what you want. Probably even jQuery.
But yes, don't use regular expressions or string manipulations; you
are trying to parse a tree, so use a solution that handles that.



Romain Vialard

Apr 16, 2019, 4:24:50 AM
to Google Apps Script Community
Here's an example of functions getElementById(), getElementsByClassName() & getElementsByTagName() that work well with the XML Service:

But indeed, if the XmlService returns an error when parsing your HTML, it won't be helpful.
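
One possible way around such errors, as a rough sketch only: the sample HTML in the question is not well-formed XML because its <img> tags are never closed and it has no single root element, and those two things alone are enough to make an XML parser reject it. The helper name toWellFormedXml and the wrapper element name "root" below are made up for illustration. The sketch does not handle other common problems such as HTML-only entities (e.g. &nbsp;), unquoted attributes, or a ">" inside an attribute value.

```javascript
// Hypothetical helper: turn an HTML fragment into text an XML parser can accept.
// Assumes the only problems are unclosed void elements and the missing root element.
function toWellFormedXml(html) {
  // HTML void elements, which never take a closing tag.
  var voidTags = 'area|base|br|col|embed|hr|img|input|link|meta|source|track|wbr';
  var pattern = new RegExp('<(' + voidTags + ')\\b([^>]*?)\\s*/?>', 'gi');
  // Rewrite <img src="x"> (or <img src="x" />) as <img src="x"/>.
  var closed = html.replace(pattern, '<$1$2/>');
  // XML needs exactly one root element; "root" is an arbitrary name.
  return '<root>' + closed + '</root>';
}

// In Apps Script the result could then be handed to the XML Service:
//   var doc = XmlService.parse(toWellFormedXml(html));
//   var children = doc.getRootElement().getChildren();
```

With the sample from the question, the children of the root element would then be the h1, span and img elements in document order, so the grouping by heading can be done with a plain loop over them.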
You could try the XmlService on a substring of your HTML or indeed use regex or search and split if you are not able to use the XmlService at all...
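
For the regex route, here is a minimal sketch of what that could look like for the sample in the question. It assumes flat markup where every <span> after an <h1> belongs to that heading and nothing is nested; the function name htmlToJson is made up for illustration. On nested or irregular HTML this kind of matching breaks quickly, so treat it as a last resort rather than a general parser.

```javascript
// Hypothetical helper: collect each <h1> title and the <span> texts that follow it.
// Assumes flat, non-nested markup like the sample in the question.
function htmlToJson(html) {
  var result = { body: [] };
  // Match either <h1>...</h1> or <span>...</span>, in document order.
  var pattern = /<(h1|span)\b[^>]*>([\s\S]*?)<\/\1>/gi;
  var current = null;
  var match;
  while ((match = pattern.exec(html)) !== null) {
    var tag = match[1].toLowerCase();
    var content = match[2].trim();
    if (tag === 'h1') {
      // A heading starts a new section.
      current = { title: content, text: [] };
      result.body.push(current);
    } else if (current) {
      // A span is attached to the most recent heading.
      current.text.push(content);
    }
  }
  return result;
}

var sample = [
  '<h1 class="title">Title</h1>',
  '<span class="text">Lorem Ipsum...</span>',
  '<img src="/img/lorem.png">',
  '<span class="more text">Why Ipsum?</span>',
  '<h1 class="title">Another title</h1>',
  '<span class="text">Not Lorem Ipsum...but Ipsum Lorem</span>',
  '<img src="/img/ipsum.png">',
  '<span class="more text">Why not Lorem?</span>'
].join('\n');

var json = JSON.stringify(htmlToJson(sample));
```

Here json holds one object per heading, each with its title and an array of the span texts that followed it; the <img> tags are simply skipped because the pattern never matches them.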