Small regex

Kevin Ingwersen

Apr 21, 2013, 10:03:20 AM
to textwr...@googlegroups.com
Hey!

So I have a very long file of information and tasks, and the author formatted it using a CSS-like syntax:

Task name {
Content
} [
Info
]

Sometimes the info block is one line like:
} [ Info ]

What regex can I use to capture the name as \1, the {} block as \2, and the [] block as \3?

Thanks and have a great day!

Steve

Apr 21, 2013, 9:25:29 PM
to textwr...@googlegroups.com, ingwi...@googlemail.com
When you search, enable the Grep option. Parentheses create capture groups, which you reference with \1, \2, and so on.

Both the curly brackets {} and square brackets [] are special to grep/regex, so finding them in your search requires that you escape them with backslash \.

I find that, if Grep is checked, simply selecting the portion of text that you want and pressing Command-E will escape all the special characters for you. The text will look like this in the Find window:

Task name \{\r        Content\r\} \[\r        Info\r\]

As you can see, { becomes \{, [ becomes \[, and newlines become \r.
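For anyone who wants to try the same idea outside the editor, here is a rough Python sketch of escaping a literal selection. It uses Python's re.escape as a stand-in for Command-E, which is my assumption and not how TextWrangler does it internally: the output differs in detail (Python also escapes spaces and does not rewrite newlines as \r). The sample text and the helper name escape_selection are made up for illustration.

```python
import re

# Literal text as it might be selected in the editor (layout from this thread).
SELECTION = "Task name {\n    Content\n} [\n    Info\n]"

def escape_selection(text):
    """Backslash-escape every character that is special in a Python regex."""
    return re.escape(text)

if __name__ == "__main__":
    escaped = escape_selection(SELECTION)
    print(escaped)
    # The escaped pattern matches the original selection literally.
    assert re.fullmatch(escaped, SELECTION) is not None
```

The point is only that { and [ come back as \{ and \[, so the escaped text can be pasted into a pattern and still match itself.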

^(.*)\s*\{\s*(.*?)\s*\}\s*\[\s*(.*?)\s*\]

In this example, ^ at the beginning matches the start of a line. (.*?) does two things: ( and ) make a group (which you reference with \1, \2, or \3 according to its position), and .*? matches "everything else on the line" but stops "as early as possible" (if you leave off the ? after .* it means "go as far as possible"). \s* matches any run of whitespace, including newlines, tabs, and spaces, and also matches when there is none.

So the pattern captures the name in \1, skips any whitespace, finds an opening {, skips more whitespace (if any), then stores everything it can in \2 until it reaches optional whitespace followed by the closing }, and does the same for the [ and ], storing the info in \3. This handles both of your cases, whether the brackets sit on one line or span several. Since . does not match a line break by default, the content inside each pair of brackets still has to fit on one line.
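To check the three groups outside the editor, here is a minimal Python sketch. It is not TextWrangler's grep engine, and it makes two adjustments of my own to the pattern: there is no literal space after \{\s*, and the first group is lazy so the name comes back without a trailing space. The sample text and the function name parse_tasks are hypothetical.

```python
import re

# Pattern from the post, adjusted as described above.
# MULTILINE makes ^ match at the start of every line, not just the start of the text.
PATTERN = re.compile(
    r"^(.*?)\s*\{\s*(.*?)\s*\}\s*\[\s*(.*?)\s*\]",
    re.MULTILINE,
)

SAMPLE = """Task name {
    Content
} [
    Info
]

Other task {
    More content
} [ Short info ]
"""

def parse_tasks(text):
    """Return a (name, body, info) tuple for every task block found in text."""
    return [m.groups() for m in PATTERN.finditer(text)]

if __name__ == "__main__":
    for name, body, info in parse_tasks(SAMPLE):
        print(f"name={name!r} body={body!r} info={info!r}")
```

Both layouts of the info block come out the same way because \s* absorbs the line breaks around the brackets. Content that itself wraps over several lines would not match here, since . stops at a line break.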

You can, of course, move the ( and ) around to suit your needs if you want to capture more or less than what I have in the example.

-Steve

Christopher Stone

Apr 22, 2013, 8:05:32 AM
to textwr...@googlegroups.com
On Apr 21, 2013, at 09:03, Kevin Ingwersen <ingwi...@googlemail.com> wrote:
> So I have a very long file of information and tasks, and the author formatted it using a CSS-like syntax:

______________________________________________________________________

Hey Kevin,

Please note that when asking for help with this sort of thing it's much better to offer as real a data sample as possible. It usually saves time and trouble.

From what you describe your task is possible with regex find/replace, but it would appear to be much easier to use a Text Filter.

As you'll see below this filter just cleans up the formatting of the tasks a bit using sequential regex find/replace.

From there it's really simple to get rid of braces and brackets and change the formatting.

Your TextWrangler text filters folder should be here:

~/Library/Application Support/TextWrangler/Text Filters/

If it's not then create it.

Save the text filter below as a file in that folder.

Apply it from the {MENU-BAR} --> {Text} --> {Apply Text Filter} --> {Your Filter Name}

--
Best Regards,
Chris

#-------------------------------------------------------------------------------------------
# TEXT FILTER
#-------------------------------------------------------------------------------------------

#! /usr/bin/env perl -0777 -n
use v5.12; use strict; use warnings;
#---------------------------------------

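# Put each opening brace at the start of its own line, directly before the content.
s!\s*\{\s*!\n{!gi;
# Pull each closing brace up against the end of the content.
s!\s*\}\s*!}!gi;
# Same treatment for the square brackets of the info block.
s!\s*\[\s*!\n[!gi;
# Whitespace after ] is left alone, so blank lines between tasks survive.
s!\s*\]!]!gi;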

print;

#-------------------------------------------------------------------------------------------
# TEST DATA
#-------------------------------------------------------------------------------------------

Task Name 1 {
Study JFK Speech.
} [
"Now is the time for all good men to come to the aid of their country."
]


Task_Name_2 {
More Contents
} [ Info ]


Task_Name_3 {
An unwrapped study of Lorem Ipsum.
} [

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

]


Task Name 4 {
A compressed study of Lorem Ipsum.
} [

Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit
esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum.

]

#-------------------------------------------------------------------------------------------
# RESULT
#-------------------------------------------------------------------------------------------

Task Name 1
{Study JFK Speech.}
["Now is the time for all good men to come to the aid of their country."]


Task_Name_2
{More Contents}
[Info]


Task_Name_3
{An unwrapped study of Lorem Ipsum.}
[Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.]


Task Name 4
{A compressed study of Lorem Ipsum.}
[Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit
esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum.]

#-------------------------------------------------------------------------------------------
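As a cross-check of the filter above, here is a rough Python sketch that applies the same four substitutions in the same order and then takes the "from there" step of pulling out the name, the {} block, and the [] block. Python is only a stand-in here; the function names normalize and extract, the RECORD pattern, and the shortened sample data are my own assumptions, not part of the original filter.

```python
import re

def normalize(text):
    """Apply the same four substitutions as the Perl filter, in the same order."""
    text = re.sub(r"\s*\{\s*", "\n{", text)
    text = re.sub(r"\s*\}\s*", "}", text)
    text = re.sub(r"\s*\[\s*", "\n[", text)
    text = re.sub(r"\s*\]", "]", text)
    return text

# After normalizing, each task is a name line, then {body}, then [info].
# [^}]* and [^\]]* can cross line breaks, so wrapped text inside a block is kept.
RECORD = re.compile(r"^(.+)\n\{([^}]*)\}\n\[([^\]]*)\]", re.MULTILINE)

def extract(text):
    """Return (name, body, info) tuples from raw, un-normalized task text."""
    return [m.groups() for m in RECORD.finditer(normalize(text))]

SAMPLE = """Task_Name_2 {
More Contents
} [ Info ]


Task Name 4 {
A compressed study.
} [

Lorem ipsum,
sed do.

]
"""

if __name__ == "__main__":
    for name, body, info in extract(SAMPLE):
        print(name, "|", body, "|", info)
```

Cleaning the layout first keeps the extraction pattern simple: it no longer has to guess where whitespace and line breaks may appear, and multi-line info blocks come through intact.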