URLTextSearcher and ampersands in the URL

John Lockwood

Jul 31, 2023, 1:59:55 PM
to autopkg-discuss
Ugh! In order to create my latest recipe, I need to load a webpage and search for the download URL, which is a long, ugly URL containing, amongst other things, a time-limited ID string to authenticate the download request.

I have managed to create a regex string which matches, but URLDownloader is failing.

This appears to be because the URL contains multiple copies of &amp;, which is the official escaped version of an ampersand.

Here is the snippet of the actual HTML code that contains the URL. Note that by the time you read this, the values will have expired.


My regex is:

<key>re_pattern</key>
<string>(\/\/cdn\.document360\.io\/\S{8}-\S{4}-\S{4}-\S{4}-\S{12}\/Images\/Documentation\/Jamf%20Compliance%20Editor%20v\d+.\d+.\d+.tar.gz\?sv=\d{4}-\d{2}-\d{2}.*spr=https%2Chttp.*st=\d{4}-\d{2}-\d{2}.*se=\d{4}-\d{2}-\d{2}.*srt=.*sp=r)</string>

This, when used with the following in URLDownloader, produces the URL shown below.

<key>Processor</key>
<string>URLDownloader</string>
<key>Arguments</key>
<dict>
<key>url</key>
<string>https:%match%</string>
</dict>

URL = 


As you will see, the original &amp; has been changed to &amp;amp;, and hence the URL is rendered incorrectly and produces a 404 error.
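
For readers following along, here is a minimal Python sketch of the entity problem described above. It is not a confirmed diagnosis of this recipe. The HTML snippet, the host `cdn.example.com` and the query values are made up, since the real snippet did not survive in the post. It only shows that a regex run against raw page source keeps `&amp;` as written, and that decoding HTML entities restores the plain `&` a server would expect.

```python
import html
import re

# Hypothetical stand-in for the page source; the real snippet is not in the post.
page = (
    '<a href="//cdn.example.com/files/App%20v1.2.3.tar.gz'
    '?sv=2022-11-02&amp;st=2023-07-31&amp;sp=r">Download</a>'
)

# A regex applied to raw HTML captures the text exactly as written,
# so the entity &amp; is still present in the match.
match = re.search(r'href="(//cdn\.example\.com/[^"]+)"', page)
raw = match.group(1)

# Decoding HTML entities turns &amp; back into a plain ampersand.
url = "https:" + html.unescape(raw)

print(raw)
print(url)
```

If the captured text is escaped a second time somewhere along the way, `&amp;` becomes `&amp;amp;`, which would match the symptom reported here.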

Help much appreciated!

Anthony Reimer

Jul 31, 2023, 2:15:31 PM
to autopkg...@googlegroups.com
Why are you trying to use such a complicated regular expression? Everything after the ? could be captured by `[^"]+)"`. 
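
A rough Python sketch of this suggestion, assuming the URL sits inside a double-quoted attribute. The HTML, host and file name below are hypothetical. The character class `[^"]+` consumes everything up to, but not including, the closing quote, so the individual query parameters do not need to be spelled out in the pattern.

```python
import re

# Hypothetical HTML; only the shape of the URL mirrors the one in the thread.
page = (
    '<a href="//cdn.example.com/Docs/Tool%20v1.2.3.tar.gz'
    '?sv=2022-11-02&amp;se=2023-08-01&amp;sp=r">Get</a>'
)

# Match the fixed part of the URL, then let [^"]+ take the whole query string.
# The trailing quote sits outside the capture group, so it is not captured.
pattern = re.compile(
    r'(//cdn\.example\.com/Docs/Tool%20v\d+\.\d+\.\d+\.tar\.gz\?[^"]+)"'
)

captured = pattern.search(page).group(1)
print(captured)
```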

Anthony Reimer
Integrated Arts Media Labs
Brief message from my iPhone

John Lockwood

Aug 1, 2023, 4:24:58 AM
to autopkg-discuss
@Anthony Reimer

Thanks for the regex suggestion. It of course worked and did make the pattern simpler; the new regex is below. However, it still does not solve the undesired modification of the &amp; entries in the URL, so I am still getting 404 errors.

<key>re_pattern</key>
<string>(\/\/cdn\.document360\.io\/\S{8}-\S{4}-\S{4}-\S{4}-\S{12}\/Images\/Documentation\/Jamf%20Compliance%20Editor%20v\d+.\d+.\d+.tar.gz\?[^"]+)</string>

Graham Pugh

Aug 1, 2023, 4:37:00 AM
to autopkg...@googlegroups.com
If you want a working recipe for JamfComplianceEditor: https://github.com/autopkg/grahampugh-recipes/tree/main/JamfComplianceEditor

Cheers,
Graham
 


John Lockwood

Aug 1, 2023, 8:18:06 AM
to autopkg-discuss
@Graham Pugh
Many thanks for pointing me at your existing download recipe.

I have modified my Munki recipe to point to your download recipe and got it all working.

I can see from your download recipe that I was on the right track. Along with some steps I had overlooked and not yet added (e.g. unarchiving), I think the main mistake I made was not including a user agent entry for the URLTextSearcher.
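
As a rough illustration of what such a user agent entry could look like, the Python sketch below builds a URLTextSearcher step as a dictionary and prints it as plist XML using the standard `plistlib` module. Everything in it is an assumption to verify against the AutoPkg processor documentation and the recipe linked above: the `request_headers` argument name, the page URL, the pattern and the user agent string are placeholders, not values taken from that recipe.

```python
import plistlib

# Hypothetical processor step; argument names and values are assumptions.
step = {
    "Processor": "URLTextSearcher",
    "Arguments": {
        "url": "https://example.com/downloads",  # placeholder page URL
        "re_pattern": r'(//cdn\.example\.com/[^"]+)"',  # placeholder pattern
        # Assumed argument for sending a browser-like User-Agent header.
        "request_headers": {"user-agent": "Mozilla/5.0 (Macintosh)"},
    },
}

# Serialise to the XML plist form used by .recipe files.
xml_text = plistlib.dumps(step).decode("utf-8")
print(xml_text)
```

Some sites serve different markup, or refuse the request, when no browser-like user agent is sent, which is one way a pattern can match in a browser's page source but not in a recipe run.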

I had been testing my own download recipe first before trying the Munki one.

For anyone interested in a Munki recipe for this, see - https://github.com/jelockwood/recipes/tree/master/Jamf%20Compliance%20Editor