Regex replacement item for entire line?


David Rostenne

Oct 1, 2019, 2:05:05 PM
to bbe...@googlegroups.com
Hi folks,

I have a regex that takes a URL apart and makes me a CSV of the components:

Regex:
http://ftp\.newedinburgh\.ca/wp-content/uploads/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*

and the replacement:
\3 \1, NEN PDF, \1/\2/01, &

which gives, for example:
April 1976, NEN PDF, 1976/04/01, http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/1976_04-April-New-Edinburgh-News_web.pdf
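
For anyone who wants to try this outside BBEdit, here is a minimal Python sketch of the same search and replace. It assumes Python's `re` module, where the whole match is written `\g<0>` rather than BBEdit's `&`; the variable names are only for illustration.

```python
import re

# The pattern from the post, unchanged.
pattern = re.compile(
    r"http://ftp\.newedinburgh\.ca/wp-content/uploads/2019/[019]+/"
    r"([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*"
)

# Same replacement as in the post, except that BBEdit's "&"
# (the entire match) becomes \g<0> in Python.
replacement = r"\3 \1, NEN PDF, \1/\2/01, \g<0>"

url = ("http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/"
       "1976_04-April-New-Edinburgh-News_web.pdf")

result = pattern.sub(replacement, url)
print(result)
```

The printed line should be the same CSV row as the example above.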

I am wondering if there is a way to leave the http…2019 out of the regex and, instead of the & in the replacement (which stands for the entire text matched by the regex), use something else to represent the entire source line.

As you can see, I’ve found a solution, but I am always curious to see if there is a simpler way.

Cheers,

Dave

Kerri Hicks

Oct 1, 2019, 4:51:45 PM
to bbe...@googlegroups.com
If the URL is always exactly the same up to the point of the date that you're capturing now (as it appears it must be for the expression to match), you could create a capture group of the first 51 characters (assuming I counted them right), and then use a backreference to that. I'm not sure that would be better in any way, but it would be a shorter regex.
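
One possible reading of this suggestion, sketched in Python rather than BBEdit: replace the literal prefix with a fixed-width group `(.{51})`, which shifts the other groups up by one. The count of 51 covers the prefix through `/2019/`. Note that `\1` then holds only that prefix, so the whole line still has to come from the full match (`\g<0>` in Python, `&` in BBEdit). This is an assumption about what was meant, not a tested BBEdit recipe.

```python
import re

url = ("http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/"
       "1976_04-April-New-Edinburgh-News_web.pdf")

# The fixed part of the URL, up to and including "/2019/".
prefix = "http://ftp.newedinburgh.ca/wp-content/uploads/2019/"
print(len(prefix))  # length of the fixed prefix

# Group 1 is the fixed-width prefix; year, month and month name
# are now groups 2, 3 and 4.
pattern = re.compile(r"^(.{51})[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*")
replacement = r"\4 \2, NEN PDF, \2/\3/01, \g<0>"

result = pattern.sub(replacement, url)
captured_prefix = pattern.match(url).group(1)
print(result)
```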

--Kerri

--
This is the BBEdit Talk public discussion group. If you have a
feature request or need technical support, please email
"sup...@barebones.com" rather than posting to the group.
Follow @bbedit on Twitter: <https://twitter.com/bbedit>
---
You received this message because you are subscribed to the Google Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbedit+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbedit/545B40B3-48A6-47FA-9161-ACE45EA55B3C%40gmail.com.

David Rostenne

Oct 1, 2019, 5:47:13 PM
to bbe...@googlegroups.com
Good idea.. I will try that!

Cheers,

Dave

ThePorgie

Oct 2, 2019, 7:45:02 AM
to BBEdit Talk
Would something like this be what you're looking for?

.+?/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).* 

This will give you the same result with your replacement string.

\3 \1, NEN PDF, \1/\2/01, & 

The ".+?" is a non-greedy match: it consumes as few characters as possible, stopping as soon as the rest of the expression can match. If anything in the rest of the expression fails to match, no result is found in the string being searched.

Is that what you're looking for?
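
Here is a small Python sketch of that behaviour, assuming Python's `re` module (where `&` is spelled `\g<0>`). The second URL is made up for illustration: it has `/2018/` instead of `/2019/`, so the rest of the expression cannot match and the line comes back unchanged.

```python
import re

# ".+?" lazily consumes the start of the line until "/2019/..." can match.
pattern = re.compile(r".+?/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*")
replacement = r"\3 \1, NEN PDF, \1/\2/01, \g<0>"

good = ("http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/"
        "1976_04-April-New-Edinburgh-News_web.pdf")
# Hypothetical non-matching line: no "/2019/" anywhere in it.
bad = "http://ftp.newedinburgh.ca/wp-content/uploads/2018/09/1976_04-April.pdf"

converted = pattern.sub(replacement, good)
untouched = pattern.sub(replacement, bad)

print(converted)
print(untouched)
```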

David Rostenne

Oct 2, 2019, 8:01:34 AM
to bbe...@googlegroups.com
I was hoping for an alternative to the & in the replacement string that means ‘entire source line’, but this works perfectly.

Thanks!

Cheers,

Dave

ThePorgie

Oct 3, 2019, 6:34:40 PM
to BBEdit Talk
I was looking at this again, and I think I now understand what you're asking for. You don't want to have to write the regex to match the text that comes before the part you want in your delimited result, but you would still like the result placed in front of the whole string. Have I got that right? If so, I don't think you can do that with a regex... or, should I say, I certainly don't know how to pull that one off.

Bruce Van Allen

Oct 3, 2019, 8:16:28 PM
to bbe...@googlegroups.com
On 10/2/19 at 5:01 AM, dros...@gmail.com (David Rostenne) wrote:

>I was hoping for an alternative to the & in the replacement
>string that means ‘entire source line’ but this works perfectly.

Surround the whole pattern with parentheses. That will then be
the capture contained in \1. You will then have to increment
each other capture by 1.

Original:
http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/1976_04-April-New-Edinburgh-News_web.pdf

Pattern:
(.+?/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*\n)

Replacement:
\4 \2, NEN PDF, \2/\3/01, \1

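Result:
April 1976, NEN PDF, 1976/04/01, http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/1976_04-April-New-Edinburgh-News_web.pdf

Note that the above pattern includes the end of line ("\n"); if
you don't want that, move the closing parenthesis to before the "\n".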
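
A quick way to check both variants is this Python sketch, assuming Python's `re` module rather than BBEdit's Grep. With the parenthesis moved before the "\n", the newline is still matched but is no longer part of \1, so it does not appear in the output unless it is added to the replacement.

```python
import re

line = ("http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/"
        "1976_04-April-New-Edinburgh-News_web.pdf")
text = line + "\n"

replacement = r"\4 \2, NEN PDF, \2/\3/01, \1"

# Variant 1: the outer group includes the end of line.
with_eol = re.compile(
    r"(.+?/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*\n)"
)
# Variant 2: closing parenthesis moved to before the "\n".
without_eol = re.compile(
    r"(.+?/2019/[019]+/([0-9]+)[_.]([0-9]+)-([a-zA-Z]+).*)\n"
)

result_with_eol = with_eol.sub(replacement, text)
result_without_eol = without_eol.sub(replacement, text)

print(repr(result_with_eol))     # ends with "\n"
print(repr(result_without_eol))  # newline was matched but not re-emitted
```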

HTH
--

- Bruce

_bruce__van_allen__santa_cruz__ca_

David Rostenne

Oct 5, 2019, 9:38:43 AM
to bbe...@googlegroups.com
Nice solution.. very elegant and simple. Thanks, I’ll use this from now on!

Cheers,

Dave

Roland Küffner

Oct 7, 2019, 3:49:18 AM
to BBEdit Talk
Hi,
Bruce Van Allen has already provided a solution, so just out of curiosity: I was wondering why you were looking for a replacement for the "&" in the replacement string. It does exactly what you want, and it is a single character. It's hard to get simpler or more elegant than that. But maybe I'm missing something.

Roland

David Rostenne

Oct 8, 2019, 8:38:54 PM
to bbe...@googlegroups.com
Hi Roland,

The solutions proposed all work, but they require that the regex pattern match the entire line first. I was being lazy and hoping for a way to grab the entire line without matching it first.
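
For what it's worth, nobody in the thread found such a token for the Find dialog, but in a script the idea is straightforward. Below is a rough Python sketch, assuming the lines are processed one at a time: search for only the date part and build the row from the line itself. The function name `to_csv_row` is made up for illustration.

```python
import re

# Only the part of the line we actually care about: year, month, month name.
date_part = re.compile(r"([0-9]+)[_.]([0-9]+)-([a-zA-Z]+)")

def to_csv_row(line):
    """Return the CSV row for a line, or the line unchanged if nothing matches."""
    m = date_part.search(line)  # search() does not need to match the whole line
    if m is None:
        return line
    year, month, name = m.groups()
    # The entire source line is simply the variable we already have.
    return f"{name} {year}, NEN PDF, {year}/{month}/01, {line}"

url = ("http://ftp.newedinburgh.ca/wp-content/uploads/2019/09/"
       "1976_04-April-New-Edinburgh-News_web.pdf")
row = to_csv_row(url)
print(row)
```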

Cheers,

Dave
