Help with a GREP task


David Brostoff

Jun 29, 2022, 9:59:15 PM
to BBEdit-Talk
I have a list of numbers in this format:

123 56
789 01

How can I create two separate documents with 123 and 789 in one and 56 and 01 in the other?

David

David Kelly

Jun 29, 2022, 10:43:26 PM
to bbe...@googlegroups.com

On Jun 29, 2022, at 8:59 PM, David Brostoff <dav...@earthlink.net> wrote:

> I have a list of numbers in this format:
>
> 123 56
> 789 01
>
> How can I create two separate documents with 123 and 789 in one and 56 and 01 in the other?

I don’t think that is a good grep task. It cries out for awk.

In terminal.app it would be something like this:

awk '{ print $1 >> "col-1.txt"
print $2 >> "col-2.txt" }' input.txt

Anyway, that is the idea. My awk is rusty.

--
David Kelly N4HHE, dke...@HiWAAY.net
============================================================
Whom computers would destroy, they must first drive mad.

Christopher Waterman

Jun 29, 2022, 10:58:31 PM
to bbe...@googlegroups.com
It is pretty easy to do this with two finds with extract.
Extract opens the matches in a new document.

Find: ^\d{3}
Then hit extract

Find: \b\d{2}\b
Then hit extract again

Regex 1
^ = beginning of line
\d{3} = 3 digits

Regex 2
\b = word boundary (here, the beginning of the "word")
\d{2} = 2 digits
\b = end of "word"
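
For anyone who wants to try the same idea in Terminal rather than in BBEdit's Find window, here is a rough sketch using grep -o, which prints only the matched text. The file names are made up for illustration. [0-9] stands in for \d, since not every grep understands \d, and the second pattern is anchored to the end of the line instead of using \b, which assumes the two-digit number is always the last thing on the line.

```shell
# Hypothetical sample file with the two-column layout from the question.
printf '123 56\n789 01\n' > numbers.txt

# -o prints only the matching part of each line; -E enables {n} repetition.
grep -oE '^[0-9]{3}' numbers.txt > col-1.txt   # three digits at the start of a line
grep -oE '[0-9]{2}$' numbers.txt > col-2.txt   # two digits at the end of a line

cat col-1.txt col-2.txt
```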



— Chris(topher)?



--
This is the BBEdit Talk public discussion group. If you have a feature request or need technical support, please email "sup...@barebones.com" rather than posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>
---
You received this message because you are subscribed to the Google Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbedit+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbedit/D7CFE4CC-DB27-42F8-9A89-6877D6521CAB%40earthlink.net.

David Brostoff

Jun 29, 2022, 11:10:20 PM
to BBEdit-Talk
On Jun 29, 2022, at 7:43 PM, David Kelly <dke...@hiwaay.net> wrote:
>
> In terminal.app it would be something like this:
>
> awk '{ print $1 >> "col-1.txt"
> print $2 >> "col-2.txt" }' input.txt

Thank you -- I haven't used awk before though so I will have to get up to speed. (I asked about GREP only because I have used it a little in the past.)

David

David Brostoff

Jun 29, 2022, 11:18:49 PM
to BBEdit-Talk
On Jun 29, 2022, at 7:58 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> It is pretty easy to do this with two finds with extract.
> Extract opens the matches in a new document.
>
> Find: ^\d{3}
> Then hit extract
>
> Find: \b\d{2}\b
> Then hit extract again

Thank you for the easy-to-follow instructions.

When I click extract, it highlights the matches and opens a new document but it's blank?

Is there another step?

David


David Brostoff

Jun 29, 2022, 11:22:50 PM
to BBEdit-Talk
On Jun 29, 2022, at 7:43 PM, David Kelly <dke...@hiwaay.net> wrote:
>
> In terminal.app it would be something like this:
>
> awk '{ print $1 >> "col-1.txt"
> print $2 >> "col-2.txt" }' input.txt

As I mentioned, I am completely ignorant of awk, so sorry for the basic question, but how do I get Terminal to point to the source document?

David


Christopher Waterman

Jun 29, 2022, 11:49:41 PM
to bbe...@googlegroups.com
It breaks down like this:

awk = The command; it takes two parameters. Param 1: The script. Param 2: A Path to the source file.

'{ print $1 >> "col-1.txt"
print $2 >> "col-2.txt" }' = The script, everything between the single quotes.

This script is printing or saving column 1 ($1) to a file called col-1.txt and column 2 ($2) to col-2.txt.

input.txt = Source file
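
A quick way to see what $1 and $2 refer to is to feed awk a single line and print the fields in swapped order. This is only an illustration; the sample line is taken from the original question.

```shell
# awk splits each input line on whitespace: $1 is the first field, $2 the second.
echo '123 56' | awk '{ print $2, $1 }'
# prints: 56 123
```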

Does that help?

Christopher Waterman

Jun 29, 2022, 11:52:51 PM
to bbe...@googlegroups.com
Well… No, no more steps.

If it highlights it should work. If it doesn’t find a match it should just beep without opening a new document.

I’m flummoxed. 🤷🏼‍♂️
Maybe if I could see the full doc I could figure out what is happening.

— Chris(topher)?

David Brostoff

Jun 30, 2022, 12:07:01 AM
to bbe...@googlegroups.com
On Jun 29, 2022, at 8:49 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> Does that help?

Yes, now I get it -- thank you.

David

David Brostoff

Jun 30, 2022, 12:08:24 AM
to BBEdit-Talk
On Jun 29, 2022, at 8:52 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> Well… No, no more steps.

Is it because I am using BBEdit in free mode?

David

David Brostoff

Jun 30, 2022, 12:19:00 AM
to BBEdit-Talk
On Jun 29, 2022, at 8:52 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> Well… No, no more steps.

Mystery solved:

I somehow had the Regex command highlighted in the Find box. As soon as I clicked elsewhere to dismiss the highlighting, the extracted text appeared in the new document.

Thanks again -- I really appreciate your help.

David


Kaveh

Jun 30, 2022, 5:06:09 AM
to bbe...@googlegroups.com
David, Extract is a great feature. Simple but clever. I use it all the time for quickly analysing text I have scraped for example...



--
Kaveh Bazargan PhD
Director
Accelerating the Communication of Research

John E. Connerat

Jun 30, 2022, 8:57:01 AM
to BBEdit Talk
Although this has been solved in numerous ways, there is one more solution that might work if all the data are formatted with exactly three digits followed by a space and two digits. It's something I use all the time with fixed-width column data, and it's particularly quick, especially if you are dealing with a column in the middle of many columns.

Let me expound upon the example slightly.

123 56
444 28
232 41
413 82
313 43
789 01

With a list like this, you can put your cursor at the top left and click. Do not select the row. Then (on the Mac) hold down the Shift and Option keys and click immediately to the right of the 9 (in 789). Instead of highlighting every character in every row, you highlight just the 18 characters you want for "column 1." You can then copy or cut, and do the same thing with column 2.

For a PC, I don't know the exact combination, but I would think it works similarly with Shift and Alt or something like that.

NOTE: If you have wide lines, soft-wrap text is turned on, and lines are wrapping, this won't work, since it would give unintended results.

I use this all the time. It's a super-fast way to extract a delimited column of text from a bunch of rows without having to write a regex or use awk. It's just a way to invoke copy and paste interstitially!
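
For the same strictly fixed-width data, a command-line counterpart to the rectangular selection is cut -c, which picks characters by position. A small sketch with made-up file names, assuming the first column always occupies characters 1-3 and the second characters 5-6:

```shell
# Hypothetical sample file in the fixed-width layout shown above.
printf '123 56\n444 28\n789 01\n' > numbers.txt

# -c selects character positions on every line.
cut -c1-3 numbers.txt > col-1.txt
cut -c5-6 numbers.txt > col-2.txt

cat col-1.txt col-2.txt
```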

-John

David Brostoff

Jun 30, 2022, 12:54:19 PM
to BBEdit-Talk
On Jun 30, 2022, at 2:05 AM, Kaveh <ka...@rivervalleytechnologies.com> wrote:
>
> David, Extract is a great feature. Simple but clever. I use it all the time for quickly analysing text I have scraped for example...

Yes -- for the past couple of years I have only been using BBEdit in a very limited way and this was the first time I used Extract, but I can see its potential.

David

David Brostoff

Jun 30, 2022, 12:59:15 PM
to BBEdit-Talk
On Jun 30, 2022, at 5:47 AM, John E. Connerat <john.c...@gmail.com> wrote:
>
> Although this has been solved in numerous ways, there is one more solution that might work if all the data are formatted with exactly three digits followed by a space and two digits. It's something I use all the time with fixed-width column data, and it's particularly quick, especially if you are dealing with a column in the middle of many columns.

Thanks for the tip -- I have done this occasionally in the past and had briefly thought about it for this but somehow didn't even try it.

David

David Brostoff

Jul 2, 2022, 10:26:31 PM
to BBEdit-Talk
On Jun 29, 2022, at 8:49 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> It breaks down like this:
>
> awk = The command; it takes two parameters. Param 1: The script. Param 2: A Path to the source file.

Again sorry for the beginner question, but what format should I use for the path if, for example, the file is on my Desktop?

David

David Kelly

Jul 3, 2022, 12:29:56 AM
to bbe...@googlegroups.com
Type the command line and rather than type the input file name just drag the file to the command line. Finder/Terminal will write the file's full path on the command line.

By default Terminal's current directory is your home directory, abbreviated ~ so that is where awk will write the output.

David Brostoff

Jul 3, 2022, 1:52:46 AM
to bbe...@googlegroups.com
On Jul 2, 2022, at 9:29 PM, David Kelly <dke...@hiwaay.net> wrote:
>
> Type the command line and rather than type the input file name just drag the file to the command line. Finder/Terminal will write the file's full path on the command line.

Thanks for the tip, but I must be doing something wrong.

First I copy and paste this command:

awk ‘{ print $1 >> “col-1.txt”
print $2 >> “col-2.txt” }’ input.txt

Then I drag the source file to the command line and press Enter, which produces this error message:

awk: syntax error at source line 1
context is
>>> ? <<<
missing }
awk: bailing out at source line 1

Two text files are produced, but the one named "col-1.txt" is blank and the "col-2.txt" has this line, repeated three times:

}’ input.txt/Users/davidbrostoff/Desktop/Sample2010-2011.txt

David


David Brostoff

Jul 3, 2022, 1:55:50 AM
to bbe...@googlegroups.com
On Jul 2, 2022, at 9:29 PM, David Kelly <dke...@hiwaay.net> wrote:
>
> Type the command line and rather than type the input file name just drag the file to the command line. Finder/Terminal will write the file's full path on the command line.

P.S. When I said the following line is repeated three times, I now realize that's only because I made three attempts:

}’ input.txt/Users/davidbrostoff/Desktop/Sample2010-2011.txt

David


Chris

Jul 3, 2022, 3:07:27 AM
to bbe...@googlegroups.com

> On Jul 2, 2022, at 10:55 PM, David Brostoff <dav...@earthlink.net> wrote:
>
> }’ input.txt/Users/davidbrostoff/Desktop/Sample2010-2011.txt

You need to replace ‘input.txt’ with the file you are dragging in, ‘/Users/davidbrostoff/Desktop/Sample2010-2011.txt’, as ‘input.txt’ was just a stand-in for the file you’re working with.
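
One more thing worth checking: the command as pasted uses typographic quotes (‘ ’ and “ ”), and the shell does not treat those as quoting characters, so that is a likely cause of the syntax error as well. A sketch of the command with straight quotes only, on one line, using sample.txt as a hypothetical stand-in for the real file:

```shell
# Hypothetical sample data standing in for the real source file.
printf '123 56\n789 01\n' > sample.txt

# Start clean, because >> appends to files that already exist.
rm -f col-1.txt col-2.txt

# Straight quotes only; the file name replaces input.txt rather than following it.
awk '{ print $1 >> "col-1.txt"; print $2 >> "col-2.txt" }' sample.txt

cat col-1.txt col-2.txt
```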

--Chris(topher)?

David Kelly

Jul 3, 2022, 10:27:02 AM
to bbe...@googlegroups.com
Sorry about previously not testing it myself and trying to lead you down a problematic path of bundling the awk script on a split command line. Some things work in bash that don't work in csh or zsh.

This works no matter what shell:

Create an awk script file. Let's call it "script.awk". It looks like this:

{
print $1 >> "col-1.txt"
print $2 >> "col-2.txt"
}

If memory serves, the leading tabs may not be necessary, but the above is tested. The tab before the first { is where one puts the line-match pattern (a regular expression, as in grep); in awk, a blank pattern matches all lines. For instance, you could make the script split only lines that contain numeric digits.
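
As an illustration of that pattern slot, here is a sketch that splits only lines containing at least one digit. The input file and its header line are invented for the example, and the script is given inline on one line rather than in script.awk.

```shell
# Hypothetical input: a header line without digits, then two data lines.
printf 'first second\n123 56\n789 01\n' > numbers.txt
rm -f col-1.txt col-2.txt

# The /[0-9]/ pattern in front of the braces limits the action to lines containing a digit.
awk '/[0-9]/ { print $1 >> "col-1.txt"; print $2 >> "col-2.txt" }' numbers.txt

cat col-1.txt col-2.txt
```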

Then type "awk -f script.awk " with a trailing space then drag your file to the command line.

Might want to delete or rename previous col-[12].txt because each invocation will add new contents to existing files. If you run the script twice you will get the 2nd run appended to the first.
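
An alternative to deleting the old files by hand, offered as a sketch rather than a replacement for the script above: inside awk, a single > empties the file the first time it is opened during a run and then keeps adding lines for the rest of that run, so repeated runs do not pile up. The file names are illustrative.

```shell
# Hypothetical sample input.
printf '123 56\n789 01\n' > numbers.txt

# Run the same command twice; with > each run starts the output files afresh.
awk '{ print $1 > "col-1.txt"; print $2 > "col-2.txt" }' numbers.txt
awk '{ print $1 > "col-1.txt"; print $2 > "col-2.txt" }' numbers.txt

wc -l < col-1.txt   # still 2 lines, not 4
```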

David Brostoff

Jul 3, 2022, 2:17:09 PM
to BBEdit-Talk
I had tried that but it still didn't work -- thanks anyway.

David

David Brostoff

Jul 3, 2022, 2:43:13 PM
to BBEdit-Talk
On Jul 3, 2022, at 7:26 AM, David Kelly <dke...@hiwaay.net> wrote:
>
> Create an awk script file. Lets call it "script.awk" that looks like this:
>
> {
> print $1 >> "col-1.txt"
> print $2 >> "col-2.txt"
> }

Is creating an awk script file different from entering the above script in Terminal?

If not, do I add "awk" before the leading curly bracket?

Thank you,

David

Rod Buchanan

Jul 3, 2022, 3:14:31 PM
to 'Dmitry Markman' via BBEdit Talk
On Jun 29, 2022, at 10:22 PM, David Brostoff <dav...@earthlink.net> wrote:


> On Jun 29, 2022, at 7:43 PM, David Kelly <dke...@hiwaay.net> wrote:
>>
>> In terminal.app it would be something like this:
>>
>> awk '{ print $1 >> "col-1.txt"
>> print $2 >> "col-2.txt" }' input.txt
>
> As I mentioned, I am completely ignorant of awk, so sorry for the basic question, but how do I get Terminal to point to the source document?

(Sorry for being late to the party, just back from a road trip).

Another command-line option would be to use cut. Assuming the fields are separated by a space:

$ cut -d ' ' -f1 source_file.txt > output_file_1.txt
$ cut -d ' ' -f2 source_file.txt > output_file_1.txt

Where:

-d ' '             Tells cut the fields are separated by a space
-f1                Specifies the field, in this case field 1
source_file.txt    The name of the file containing the data
output_file_1.txt  The name of the file you want the output placed in

If the fields are separated by TAB, place the cursor between the '' (make sure there is no space) and type Ctrl-V, then TAB.
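
Since TAB is cut's default delimiter, another option for tab-separated data is simply to leave -d out. A minimal sketch with hypothetical file names:

```shell
# Hypothetical tab-separated sample (printf turns \t into a real TAB).
printf '123\t56\n789\t01\n' > tabbed.txt

# With no -d option, cut splits fields on TAB.
cut -f1 tabbed.txt > output_file_1.txt
cut -f2 tabbed.txt > output_file_2.txt

cat output_file_1.txt output_file_2.txt
```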

--
Rod

David Brostoff

Jul 3, 2022, 3:35:24 PM
to BBEdit-Talk
On Jul 3, 2022, at 10:07 AM, Rod Buchanan <li...@sofstats.com> wrote:
>
> Another command-line option would be to use cut. Assuming the fields are separated by a space:
>
> $ cut -d ' ' -f1 source_file.txt > output_file_1.txt
> $ cut -d ' ' -f2 source_file.txt > output_file_1.txt
>
> Where:
>
> -d ' '             Tells cut the fields are separated by a space
> -f1                Specifies the field, in this case field 1
> source_file.txt    The name of the file containing the data
> output_file_1.txt  The name of the file you want the output placed in

Thank you for this interesting tip.

I now know how to specify the output path by dragging the output file to the command line when the item it is replacing is the last one in the command, but how do I do it for an item in the middle of the command?

Also, when I entered the above text in Terminal to try it out, I got the error message "zsh: command not found: $".

David

Christopher Waterman

Jul 3, 2022, 3:54:22 PM
to bbe...@googlegroups.com

On Jul 2, 2022, at 7:26 PM, David Brostoff <dav...@earthlink.net> wrote:

Again sorry for the beginner question, but what format should I use for the path if, for example, the file is on my Desktop?

It seems like you are missing some fundamentals when dealing with the command line and paths and such.

So this might be, and probably is, an over-explanation. But here you go, and maybe I’ll touch on something that you are missing.

It would be a URL or a POSIX-style path with forward slashes.

So the path to my desktop: /Users/chris/Desktop

Your command line shell (likely zsh on the Mac) will use the ~ as a shortcut to your home directory.

Again to my desktop: ~/Desktop

This command makes col-1.txt & col-2.txt on my desktop. It is using numbers.txt as the source file.
awk '{print $1 >> "/Users/chris/Desktop/col-1.txt"; print $2 >> "/Users/chris/Desktop/col-2.txt" }' ~/Desktop/numbers.txt

Your shell sees this as a command and two parameters. The command is awk. Param-1 is a string of text, the script that awk interprets. Param-2 is a path. Note that the paths inside the script are not using the ~ "/Users/chris/Desktop/col-1.txt". Only paths read by the shell use this shortcut.

If file names have spaces you have to escape them with a \ 
So: ~/Desktop/my\ numbers
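
Quoting the name is another way to handle spaces. A small sketch using a scratch folder (the folder and file names are made up); note that ~ is not expanded inside quotes, so quote only the part after it or use $HOME instead.

```shell
# Work in a throwaway folder so nothing on the Desktop is touched.
mkdir -p scratch
printf '123 56\n789 01\n' > 'scratch/my numbers.txt'

# Both commands read the same file: backslash-escaped space vs. quoted name.
awk '{ print $1 }' scratch/my\ numbers.txt
awk '{ print $1 }' 'scratch/my numbers.txt'
```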

Here is the deal though. I generally do this work in the same directory. So I might make a folder named ‘columns’ on my desktop.

I would then move my source file to that directory and navigate there in the command line. Using the change directory command, like so.

cd ~/Desktop/columns

Now I don’t have to worry about long mistake prone paths. This command will do the job.

awk '{print $1 >> "col-1.txt"; print $2 >> "col-2.txt" }' numbers.txt


P.S. If you are trying to learn awk, I think David K’s suggestion of breaking it out into its own file helps a lot. I struggled some until I started doing that regularly.

— Chris(topher)?




David Kelly

Jul 3, 2022, 4:26:58 PM
to bbe...@googlegroups.com
No, create a file exactly as shown above. 4 lines. Use a tab or a space or many spaces, it doesn't matter.

The difference is by putting the script (awk commands) in a file we don't have to figure out how to escape the newline between the two actions. That kind of thing varies depending on what shell you are using in Terminal.

Also putting the script in a file lets you create very complex scripts.

macOS Monterey seems to come with every popular Unix shell: sh, csh, tcsh, zsh, and bash. zsh seems to be the default now.

As I originally stated, you now invoke the script file with
"awk -f script.awk " with a trailing space then drag your input file to the command line to finish.

-f tells awk to get its commands from the specified file rather than the command line.
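
To make the script-file route concrete, here is a sketch that creates script.awk from the command line instead of in an editor and then runs it. numbers.txt is a hypothetical input file; in practice you would drag your own file onto the command line as described above.

```shell
# Hypothetical sample input; start clean because the script appends with >>.
printf '123 56\n789 01\n' > numbers.txt
rm -f col-1.txt col-2.txt

# Write the four-line script to script.awk.
# The quoted 'EOF' keeps the shell from touching $1 and $2.
cat > script.awk <<'EOF'
{
    print $1 >> "col-1.txt"
    print $2 >> "col-2.txt"
}
EOF

# -f tells awk to read its program from script.awk; the input file comes last.
awk -f script.awk numbers.txt

cat col-1.txt col-2.txt
```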

David Brostoff

Jul 3, 2022, 4:55:37 PM
to bbe...@googlegroups.com
On Jul 3, 2022, at 12:54 PM, Christopher Waterman <ch...@rustydogink.com> wrote:
>
> It seems like you are missing some fundamentals when dealing with the command line and paths and such.

Yes, missing almost all the fundamentals (except for knowing how to create a POSIX style path), so thank you for the very clear explanation, which worked great.

By the way, these days I don't have use for this kind of thing very often -- or else I would take the time to learn it from scratch instead of filling in someone else's generously provided template -- but it's extremely interesting to see its potential.

David


David Brostoff

Jul 3, 2022, 6:40:27 PM
to BBEdit-Talk
On Jul 3, 2022, at 1:26 PM, David Kelly <dke...@hiwaay.net> wrote:
>
> As I originally stated, you now invoke the script file with
> "awk -f script.awk " with a trailing space then drag your input file to the command line to finish.
>
> -f tells awk to get its commands from the specified file rather than the command line.

Thanks for the detailed explanation and your patience -- this worked great and I really learned a lot.

This is the most helpful email list I have ever been on.

David

Rod Buchanan

Jul 3, 2022, 7:43:56 PM
to 'Dmitry Markman' via BBEdit Talk

One correction ... the second command should be:

$ cut -d ' ' -f2 source_file.txt > output_file_2.txt

As I sent it the second command will overwrite the file created by the first command.

Apologies for missing that.
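
Putting the correction together, the pair of commands would look like this, shown without the leading $ prompt. source_file.txt is created here with sample data only so the sketch can be run as-is.

```shell
# Hypothetical sample data in the space-separated layout from the question.
printf '123 56\n789 01\n' > source_file.txt

# Field 1 goes to the first output file, field 2 to the second.
cut -d ' ' -f1 source_file.txt > output_file_1.txt
cut -d ' ' -f2 source_file.txt > output_file_2.txt

cat output_file_1.txt output_file_2.txt
```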

--
Rod

Rod Buchanan

Jul 3, 2022, 7:43:58 PM
to 'Dmitry Markman' via BBEdit Talk

Don't enter the '$'. I was showing the shell prompt (in my case, the bash shell). I probably should've left that out.

--
Rod


David Brostoff

Jul 3, 2022, 10:26:46 PM
to BBEdit-Talk
On Jul 3, 2022, at 3:17 PM, Rod Buchanan <li...@sofstats.com> wrote:
>
> Don't enter the '$'. I was showing the shell prompt (in my case, the bash shell). I probably should've left that out.

No problem of course -- now it works great. (In case it matters, I am using Monterey, so the default is zsh.)

Thank you very much -- again I have learned a lot.

David