Regex for dealing with commas inside of quotes


Russ Pixels

Oct 11, 2019, 10:08:26 PM
to BBEdit Talk
I use a Text Factory with grep to process downloaded bank statements, batch-converting them from CSV to tab-delimited .txt with a simple find and replace on commas. Commas inside the quoted comments field break that conversion. Can someone supply a regex that converts only the commas between the quotes to some other character, or to a space? It would run first in the factory, and then I can convert the remaining commas to tabs. Thanks.

Example of the issue:

28 MAR 2018,27 MAR 2018,"Comments for credited amount Ref: TH 1,2,3,4",,1025.00,2453.85

ThePorgie

Oct 12, 2019, 9:02:31 AM
to BBEdit Talk
I'm assuming the comments can vary in length, and might not always be numbers either?
If this were my problem to tackle, I think it would be easier to import the data into Excel (or your spreadsheet software of choice) and do a Find & Replace on just the one column. I'm coming up empty for a grep solution if the data varies the way I think it would.

GP

Oct 13, 2019, 7:15:35 AM
to BBEdit Talk
You don't need to do it in two steps. The following pattern captures each field in its own group and leaves the separating commas outside the groups; the quoted string is captured whole, embedded commas included:

^(\d{2}\s[A-Z]{3}\s\d{4}),(\d{2}\s[A-Z]{3}\s\d{4}),("[^"]*"),([^,]*),(\d{1,}\.\d{2}),(\d{1,}\.\d{2})$

Then do the substitution with the captured groups separated by tab characters:

$1\t$2\t$3\t$4\t$5\t$6

This assumes each entry is formatted like your example.
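
For anyone who wants to try the pattern outside BBEdit, here is a minimal Python sketch of the same find and replace. The pattern is the one above, unchanged; the function name `csv_line_to_tsv` is made up for illustration, and Python writes the group references as \1 to \6 rather than $1 to $6.

```python
import re

# The pattern from this thread, unchanged: two dates, a quoted comment,
# one unquoted (possibly empty) field, and two amounts.
PATTERN = re.compile(
    r'^(\d{2}\s[A-Z]{3}\s\d{4}),(\d{2}\s[A-Z]{3}\s\d{4}),'
    r'("[^"]*"),([^,]*),(\d{1,}\.\d{2}),(\d{1,}\.\d{2})$'
)

# Python's re module uses \1..\6 for group references; \t becomes a tab.
REPLACEMENT = r'\1\t\2\t\3\t\4\t\5\t\6'


def csv_line_to_tsv(line: str) -> str:
    """Return the tab-separated form of one statement line.

    Lines that do not match the pattern come back unchanged.
    """
    return PATTERN.sub(REPLACEMENT, line)


sample = ('28 MAR 2018,27 MAR 2018,'
          '"Comments for credited amount Ref: TH 1,2,3,4",,1025.00,2453.85')
converted = csv_line_to_tsv(sample)
print(converted.split('\t'))
```

Splitting the result on tabs should give six fields, with the quoted comment intact and the fourth field empty.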

The ("[^"]*") pattern is what captures the quoted string with embedded commas.
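
The same quoted-string pattern could also drive the two-step approach from the original question: first swap out only the commas inside quotes, then turn the remaining commas into tabs. Below is a rough Python sketch of that idea, not a BBEdit feature. The function name `mask_quoted_commas` and the `;` stand-in character are assumptions chosen for illustration, and it assumes the quotes on each line are balanced.

```python
import re

# Matches one double-quoted run, embedded commas included.
QUOTED = re.compile(r'"[^"]*"')


def mask_quoted_commas(line: str, stand_in: str = ' ') -> str:
    """Replace commas only inside double-quoted runs with stand_in."""
    return QUOTED.sub(lambda m: m.group(0).replace(',', stand_in), line)


sample = ('28 MAR 2018,27 MAR 2018,'
          '"Comments for credited amount Ref: TH 1,2,3,4",,1025.00,2453.85')

# Step 1: protect the commas inside the quoted comment.
masked = mask_quoted_commas(sample, ';')
# Step 2: every comma that is left is a field separator.
tabbed = masked.replace(',', '\t')
print(tabbed)
```

The callback is what keeps the replacement confined to the quoted text; a plain find and replace on commas cannot tell the two kinds of comma apart.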

The ([^,]*) pattern handles your example's empty field, and it also matches a non-empty field in that position as long as the field doesn't have an embedded comma.

You may have to tweak the date and dollar-amount patterns if your statements vary in format. As written, for example, the amount patterns don't allow a minus sign or a thousands separator.
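
If the formats vary too much to pin down with one pattern, a different technique is to let a CSV parser find the fields instead of a regex. The sketch below uses Python's standard csv module, which is not part of BBEdit or of the grep approach above; the function name `csv_text_to_tsv` is hypothetical. Note that it drops the surrounding quotes from the comment field, whereas the grep substitution keeps them.

```python
import csv
import io


def csv_text_to_tsv(text: str) -> str:
    """Convert comma-separated text to tab-separated text.

    The csv reader handles quoted fields, so embedded commas survive.
    """
    reader = csv.reader(io.StringIO(text, newline=''))
    out = io.StringIO(newline='')
    writer = csv.writer(out, delimiter='\t', lineterminator='\n')
    writer.writerows(reader)
    return out.getvalue()


sample = ('28 MAR 2018,27 MAR 2018,'
          '"Comments for credited amount Ref: TH 1,2,3,4",,1025.00,2453.85\n')
result = csv_text_to_tsv(sample)
print(result, end='')
```

This makes no assumption about how many columns there are or what the dates and amounts look like, only that the file is ordinary quoted CSV.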

Russ Pixels

Oct 13, 2019, 9:18:12 PM
to BBEdit Talk
Wow, thanks for that. It's a bit more complex than I expected, but it does the trick, and I appreciate the time you spent constructing it. It's also a good example for me of how to build a complex regex, as I'm often left scratching my head (mainly because I only use regex now and again). I can scavenge bits of it in future, as I can see common patterns in there that I need as well.

ThePorgie

Oct 14, 2019, 2:23:24 PM
to BBEdit Talk
("[^"]*")

Finding little snippets like the one above is why I like to come look through these grep threads. They show me how I'm thinking about the problem all wrong. Brilliant!
Thanks!