URLTextSearcher and ampersands in the URL

John Lockwood

Jul 31, 2023, 1:59:55 PM

to autopkg-discuss

Ugh! In order to create my latest recipe, I need to load a webpage and search for the download URL, which is a long, ugly URL containing, amongst other things, a time-limited ID string to authenticate the download request.

I have managed to create a regex string which matches but URLDownloader is failing.

This appears to be because the URL contains multiple copies of &amp;, which is the official escaped version of an ampersand.

Here is the snippet of the actual html code that contains the URL - note by the time you read this the values will have expired.

href="//cdn.document360.io/e5d71abd-07b9-46d0-8876-03cc9073df6b/Images/Documentation/Jamf%20Compliance%20Editor%20v1.1.5.tar.gz?sv=2019-07-07&sig=eJq3H6fTd%2BZ2OkpkmtvcVK%2BHcwjmyF3RmG8FD61CCDE%3D&spr=https%2Chttp&st=2023-07-31T16%3A15%3A33Z&se=2023-07-31T16%3A25%3A33Z&srt=o&ss=b&sp=r">Jamf Compliance Editor v1.1.5.tar.gz</a>

My regex is -

<key>re_pattern</key>
<string>(\/\/cdn\.document360\.io\/\S{8}-\S{4}-\S{4}-\S{4}-\S{12}\/Images\/Documentation\/Jamf%20Compliance%20Editor%20v\d+\.\d+\.\d+\.tar\.gz\?sv=\d{4}-\d{2}-\d{2}.*spr=https%2Chttp.*st=\d{4}-\d{2}-\d{2}.*se=\d{4}-\d{2}-\d{2}.*srt=.*sp=r)</string>

When this is used with the following URLDownloader arguments, it produces the URL shown further below.

<key>Processor</key>
<string>URLDownloader</string>
<key>Arguments</key>
<dict>
<key>url</key>
<string>https:%match%</string>
</dict>

URL =

https://cdn.document360.io/e5d71abd-07b9-46d0-8876-03cc9073df6b/Images/Documentation/Jamf%20Compliance%20Editor%20v1.1.5.tar.gz?sv=2019-07-07&amp;sig=uMi62HCZg8XAcM33ADUS1WGZ3hTNqwuwBBaGp7Cyt50%3D&amp;spr=https%2Chttp&amp;st=2023-07-31T17%3A40%3A05Z&amp;se=2023-07-31T17%3A50%3A05Z&amp;srt=o&amp;ss=b&amp;sp=r

As you can see, each original & has been changed to &amp;, so the URL is invalid and produces a 404 error.
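
For illustration only, here is a minimal Python sketch of the entity issue described above. It assumes the `&amp;` sequences come straight from the page's HTML source, where an ampersand inside an attribute is written in escaped form, and that the captured text is passed on unchanged. The host path and the `sig` value are made-up placeholders, and `captured` merely stands in for the `%match%` value; this is not taken from any recipe in the thread.

```python
import html

# Stand-in for the text captured from the raw HTML (placeholder path and values).
captured = "//cdn.document360.io/example/file.tar.gz?sv=2019-07-07&amp;sig=PLACEHOLDER%3D&amp;sp=r"

# Decode HTML entities so that each "&amp;" becomes a plain "&" again.
decoded = html.unescape(captured)

# Prepend the scheme, as the recipe does with "https:%match%".
url = "https:" + decoded
print(url)
```

Whether decoding alone would have been enough to make the download succeed is not something this sketch can show; it only demonstrates how the escaped form maps back to the plain one.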

Help much appreciated!

Anthony Reimer

Jul 31, 2023, 2:15:31 PM

to autopkg...@googlegroups.com

Why are you trying to use such a complicated regular expression? Everything after the ? could be captured by `[^"]+)"`.

Anthony Reimer

Integrated Arts Media Labs

Brief message from my iPhone

John Lockwood

Aug 1, 2023, 4:24:58 AM

to autopkg-discuss

@Anthony Reimer

Thanks for the regex suggestion. It of course worked and did make things simpler; the new regex is below. However, it still does not solve the problem of the undesired modification of the & entries in the URL, and hence I am still getting 404 errors.

<key>re_pattern</key>
<string>(\/\/cdn\.document360\.io\/\S{8}-\S{4}-\S{4}-\S{4}-\S{12}\/Images\/Documentation\/Jamf%20Compliance%20Editor%20v\d+\.\d+\.\d+\.tar\.gz\?[^"]+)</string>
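
As a rough check of how the simplified pattern behaves, the following Python sketch runs a pattern of the same shape against a trimmed stand-in for the page source. The GUID and query values are placeholders rather than real ones, the version dots are escaped in this sketch, and the forward slashes are left unescaped because Python's `re` module does not require escaping them.

```python
import re

# Trimmed stand-in for the page source; the GUID and query values are placeholders.
page = (
    '<a href="//cdn.document360.io/00000000-0000-0000-0000-000000000000'
    '/Images/Documentation/Jamf%20Compliance%20Editor%20v1.1.5.tar.gz'
    '?sv=2019-07-07&amp;sp=r">Jamf Compliance Editor v1.1.5.tar.gz</a>'
)

# Same shape as the simplified pattern: after the "?", capture everything
# up to, but not including, the closing double quote of the href attribute.
pattern = re.compile(
    r'(//cdn\.document360\.io/\S{8}-\S{4}-\S{4}-\S{4}-\S{12}'
    r'/Images/Documentation/Jamf%20Compliance%20Editor%20v\d+\.\d+\.\d+\.tar\.gz\?[^"]+)'
)

match = pattern.search(page)
print(match.group(1))
```

The captured group still contains `&amp;` exactly as it appears in the source text, which is consistent with the behaviour reported above: the pattern only selects text and does not decode it.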

Graham Pugh

Aug 1, 2023, 4:37:00 AM

to autopkg...@googlegroups.com

If you want a working recipe for JamfComplianceEditor: https://github.com/autopkg/grahampugh-recipes/tree/main/JamfComplianceEditor

Cheers,

Graham

Sent from my iPhone

John Lockwood

Aug 1, 2023, 8:18:06 AM

to autopkg-discuss

@Graham Pugh
Many thanks for pointing me at your existing download recipe.

I have modified my Munki recipe to point to your download recipe and got it all working.

I can see from your download recipe that I was on the right track. Apart from some steps I had overlooked and not yet added (e.g. unarchiving), I think the main mistake I made was not including a user agent entry for the URLTextSearcher.
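
To illustrate what a user agent entry amounts to at the HTTP level, here is a small plain-Python sketch that attaches a `User-Agent` header to a request. It is not how an AutoPkg processor is configured (in a recipe the header is set through the processor's arguments; see the linked download recipe for the exact form), and both the URL and the user agent string below are hypothetical values.

```python
import urllib.request

# Hypothetical values for illustration only.
page_url = "https://example.com/downloads"
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"

# Build the request with an explicit User-Agent header.
# Nothing is sent over the network until urlopen() is called on it.
request = urllib.request.Request(page_url, headers={"User-Agent": user_agent})
print(request.header_items())
```

Some sites return different content, or an error, to clients that do not present a browser-like user agent, which may be why the header mattered here.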

I had been testing my own download recipe first before trying the Munki one.

For anyone interested in a Munki recipe for this, see - https://github.com/jelockwood/recipes/tree/master/Jamf%20Compliance%20Editor
