Re: Can you use regular expression to grab the last 5 characters of a word?


Nestor E. Aguilera

Dec 12, 2012, 10:37:45 AM
to textwr...@googlegroups.com

On 12 Dec 2012, at 03:25, JT wrote:

> Hi All,
>
> I'm trying to find an expression to grab the last 16 characters of a word but can't figure it out. The issue is that the number of characters before the last 5 characters is variable.
>
> EG:
>
> abdsadsajkeaw_12345
> bdGdsadsa_jkeaw_abcde
> ass0012_67890

Are you trying to grab the last 16 or the last 5 characters?

If the latter, you could try

(\w*)(\w{5})

and replace with, say, \2--\1 to see the effect.
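
To see the effect outside the editor, here is a minimal sketch in Python. It assumes Python's re module treats \w and back-references the same way TextWrangler's grep does for these inputs; the helper name swap_tail is made up for illustration.

```python
import re

# Greedy (\w*) first takes the whole word, then backtracks just far
# enough to leave exactly five word characters for the second group.
PATTERN = re.compile(r"(\w*)(\w{5})")


def swap_tail(word: str) -> str:
    """Apply the suggested replacement \\2--\\1 to a single word."""
    return PATTERN.sub(r"\2--\1", word)


for word in ["abdsadsajkeaw_12345", "bdGdsadsa_jkeaw_abcde", "ass0012_67890"]:
    print(swap_tail(word))
# 12345--abdsadsajkeaw_
# abcde--bdGdsadsa_jkeaw_
# 67890--ass0012_
```

Note that \w also matches the underscore, which is why the '_' ends up in the first group.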

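If you are trying to grab the last 16, the problem is that a word may have fewer than 16 characters, as in your last example string. In that case perhaps you could try

(\w*?)(\w{5,16})\b

or variants. The first group is lazy and the pattern ends at a word boundary because a greedy (\w*) would leave only the minimum of 5 characters for the second group. See page 143 of TextWrangler's manual.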
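
One thing worth checking with a {5,16} quantifier is how it interacts with the group in front of it. The following is a small Python sketch, not TextWrangler itself: it assumes Python's re module backtracks the same way as TextWrangler's grep here, and the names GREEDY, LAZY and tail are made up for illustration. It compares a greedy first group with a lazy one that is followed by \b.

```python
import re
from typing import Optional

# Greedy first group: it gives back only as much as it must,
# so the second group ends up with the minimum of 5 characters.
GREEDY = re.compile(r"(\w*)(\w{5,16})")

# Lazy first group plus a trailing word boundary: the second group
# gets as many characters as it can, up to 16, ending at the word end.
LAZY = re.compile(r"(\w*?)(\w{5,16})\b")


def tail(pattern: "re.Pattern[str]", word: str) -> Optional[str]:
    """Return what the second group captures, or None if no match."""
    match = pattern.match(word)
    return match.group(2) if match else None


print(tail(GREEDY, "abdsadsajkeaw_12345"))  # 12345
print(tail(LAZY, "abdsadsajkeaw_12345"))    # sadsajkeaw_12345
print(tail(LAZY, "ass0012_67890"))          # ass0012_67890
```

The last line shows the short-word case: with only 13 characters available, the lazy variant captures the whole word.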

> Does bbedit/regex have a back count feature that I haven't found?

I don't know about bbedit.

Best,

Nestor

Thomas Fischer

Dec 12, 2012, 1:10:13 PM
to textwr...@googlegroups.com
Hello,

It depends a little on what you want to do with the hits; "grab" isn't quite clear. A grep search for
(\w{5})\b
would find the last five characters of a word, provided you and TextWrangler agree on the definition of a word (\b matches at a word boundary, and \w includes the underscore).
In your example I would probably look for the '_' instead.

And I don't know what you mean by "back count", but the regular expression engine supports something called "lookbehind"; check the manual for details.
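
Both ideas can be tried in a few lines of Python. This is only a sketch under the assumption that Python's re module handles \b and lookbehind like TextWrangler's grep for these inputs; the second pattern is one possible way to "look for the '_'" and is not taken from the manual, and the function names are made up.

```python
import re
from typing import List

LINE = "abdsadsajkeaw_12345 bdGdsadsa_jkeaw_abcde ass0012_67890"


def tails_by_boundary(text: str) -> List[str]:
    """Five word characters that end at a word boundary."""
    return re.findall(r"(\w{5})\b", text)


def tails_after_underscore(text: str) -> List[str]:
    """Five non-underscore word characters that directly follow '_'
    (checked with a lookbehind) and end at a word boundary."""
    return re.findall(r"(?<=_)[^\W_]{5}\b", text)


print(tails_by_boundary(LINE))       # ['12345', 'abcde', '67890']
print(tails_after_underscore(LINE))  # ['12345', 'abcde', '67890']

# The two differ on a word with no underscore:
print(tails_by_boundary("hello"))       # ['hello']
print(tails_after_underscore("hello"))  # []
```

The lookbehind (?<=_) checks for the underscore without making it part of the match, so no capture group is needed to strip it off.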

Best
Thomas

Steve

Dec 12, 2012, 9:27:28 PM
to textwr...@googlegroups.com
If you want to ignore the preceding characters entirely and capture only the last 5 (or 16, or however many characters you want), use the \b (word boundary) and \B (not a word boundary) assertions together:

(\B[a-f\d]{5}\b)

abdsadsajkeaw_12345
bdGdsadsa_jkeaw_abcde
ass0012_67890

In this example, the \B ensures that the match does not start at a word boundary: the five characters must be directly preceded by another word character (here the '_'), so a word that is ONLY 5 characters long is not captured.
Then it matches exactly {5} characters from the class [a-f\d], that is, the letters a to f or a digit, which you can easily modify to whatever characters you need.
Last, the \b ensures that the fifth character is not followed by another word character: the match ends at a word boundary.

Putting everything within () parentheses ensures that you capture those characters, for use later as a \1 replacement.
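
A quick way to check this behaviour is a short Python sketch. It assumes Python's re module agrees with TextWrangler's grep on \b and \B for these inputs; the helper name last_five is made up for illustration.

```python
import re
from typing import Optional

# \B: the match must not start at a word boundary
# [a-f\d]{5}: exactly five characters, each a letter a-f or a digit
# \b: the match must end at a word boundary
PATTERN = re.compile(r"(\B[a-f\d]{5}\b)")


def last_five(word: str) -> Optional[str]:
    """Return the captured group \\1, or None if the word does not match."""
    match = PATTERN.search(word)
    return match.group(1) if match else None


print(last_five("abdsadsajkeaw_12345"))    # 12345
print(last_five("bdGdsadsa_jkeaw_abcde"))  # abcde
print(last_five("ass0012_67890"))          # 67890
print(last_five("12345"))                  # None (the word is only 5 long)
print(last_five("ass0012_jkeaw"))          # None (j, k, w are outside a-f)
```

The last case is the reason to adjust the character class: with [a-f\d], a tail such as "jkeaw" is not captured.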