Multiple greps for extracting portions of multiple files


Kim Mosley

Jul 3, 2021, 7:44:31 PM
to BBEdit Talk

I have two greps that are working for extracting portions of multiple files. Can I combine the greps so the extraction pulls two phrases from the files and combines them as one extraction?

jj

Jul 4, 2021, 4:12:33 AM
to BBEdit Talk
Hi Kim,

If you could provide some sample text, grep patterns and expected results, it would be easier to answer your question.

Meanwhile this pattern may help:

Find: (?s)(?P<HEAD>first_pattern|second_pattern).*?(?P<TAIL>first_pattern|second_pattern)
Replace: \P<HEAD>, \P<TAIL>


Sample.txt
--
first_pattern bla bla bla second_pattern

second_pattern foo
bar first_pattern

first_pattern foo bar fizz first_pattern
--

Extracted result:
--
first_pattern, second_pattern
second_pattern, first_pattern
first_pattern, first_pattern
--
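
For anyone who wants to try the idea outside BBEdit, here is a minimal Python sketch that runs the same pattern against the sample above. Python's `re` module is only a stand-in for BBEdit's grep engine, and `extract_pairs` is a hypothetical helper name. The replacement `\P<HEAD>, \P<TAIL>` is imitated by formatting the two named groups.

```python
import re

# Same pattern as above; (?s) lets "." match across line breaks.
PATTERN = re.compile(
    r"(?s)(?P<HEAD>first_pattern|second_pattern).*?"
    r"(?P<TAIL>first_pattern|second_pattern)"
)

SAMPLE = """first_pattern bla bla bla second_pattern

second_pattern foo
bar first_pattern

first_pattern foo bar fizz first_pattern
"""


def extract_pairs(text):
    """Return one 'HEAD, TAIL' string per match (hypothetical helper)."""
    return [f"{m.group('HEAD')}, {m.group('TAIL')}" for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for line in extract_pairs(SAMPLE):
        print(line)
```

The lazy `.*?` is what keeps each match short: it stops at the nearest following phrase instead of running on to the last one in the file.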

HTH

Jean Jourdain

Kim Mosley

Jul 4, 2021, 3:17:43 PM
to bbe...@googlegroups.com
I think I understand but I'm not sure how to translate that to my problem.


Here is a sample of the text:

  <div class="field field--name-field-collection-artist field--type-entity-reference field--label-inline">
  <div class="field--label">Artist</div>
        <div class="field--item"><a href="../../collection-artist/alfred-leslie.html" hreflang="en">Alfred Leslie</a></div>
      </div>

  <div class="field field--name-field-artist-dob field--type-string field--label-inline">
  <div class="field--label">Date of Birth</div>
        <div class="field--item">(b. 1927)</div>
      </div>

  <div class="field field--name-field-item-date field--type-string field--label-inline">
  <div class="field--label">Date</div>
        <div class="field--item">1976</div>
      </div>


The three grep patterns are:

(?<=Artist</div>
        <div class="field--item">)(.*)(?=</a></div>)

(?<=Date of Birth</div>
        <div class="field--item">)(.*)(?=</div>)

(?<=Date</div>
        <div class="field--item">)(.*)(?=</div>)

Each pattern works well on its own (the text in blue), but I'd like the results returned together, in sequence, with a tilde separating them:

result A ~ result B ~ result C 


jj

Jul 4, 2021, 3:52:20 PM
to BBEdit Talk
Try this pattern:

(?s)>Artist</div>\s*<div class="field--item"><a[^>]+>(.*?)</a></div>.+?>Date of Birth</div>\s*<div class="field--item">(.*?)</div>.+?>Date</div>\s*<div class="field--item">(.*?)</div>

with this replacement:

\1 ~ \2 ~ \3
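
To check the combined pattern outside BBEdit, the following Python sketch applies it to the sample HTML from the earlier message. It assumes Python's `re` module behaves like BBEdit's grep for this pattern. `SAMPLE_HTML` and `extract_records` are hypothetical names, and `Match.expand` stands in for the replacement field.

```python
import re

# The combined pattern from above, split across lines only for readability.
RECORD = re.compile(
    r'(?s)>Artist</div>\s*<div class="field--item"><a[^>]+>(.*?)</a></div>'
    r'.+?>Date of Birth</div>\s*<div class="field--item">(.*?)</div>'
    r'.+?>Date</div>\s*<div class="field--item">(.*?)</div>'
)

SAMPLE_HTML = '''
  <div class="field field--name-field-collection-artist field--type-entity-reference field--label-inline">
  <div class="field--label">Artist</div>
        <div class="field--item"><a href="../../collection-artist/alfred-leslie.html" hreflang="en">Alfred Leslie</a></div>
      </div>

  <div class="field field--name-field-artist-dob field--type-string field--label-inline">
  <div class="field--label">Date of Birth</div>
        <div class="field--item">(b. 1927)</div>
      </div>

  <div class="field field--name-field-item-date field--type-string field--label-inline">
  <div class="field--label">Date</div>
        <div class="field--item">1976</div>
      </div>
'''


def extract_records(text):
    """Return 'artist ~ date of birth ~ date' for each match (hypothetical helper)."""
    return [m.expand(r"\1 ~ \2 ~ \3") for m in RECORD.finditer(text)]


if __name__ == "__main__":
    for record in extract_records(SAMPLE_HTML):
        print(record)
```

The three separate lookbehind patterns become three capture groups in one pattern, joined by lazy `.+?` gaps, so each match carries all three values and the replacement can order them as needed.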