Using Regex to parse file


Arielle Albon

Dec 26, 2014, 1:42:56 PM
to mdcsvi...@googlegroups.com
Hi, 

I have a file from my bank that works sort of OK with the field chooser. However, the description field is parsed as a single field because it is a composite value in quotation marks. That is what I would expect, but I would like to break the value out into its component parts.

Are there some instructions on how to use the Regex mode? I would like to make this parser work (and then contribute it for the other Credit Suisse CH DirectNet customers). I have attached a slightly modified CSV export: the format is the same, but I have culled transactions.

Thanks

Arielle
example.csv

stashu.pub

Dec 28, 2014, 12:34:09 AM
to mdcsvi...@googlegroups.com
Hello Arielle/Mike,

Well, I goofed. The regex part does save and restore the config,
but then it does not actually use it. It looks like it is using my
hardcoded regex list all the time (left over from when I was testing it).
So it works for me :-) and I did not notice.
I have to fix that and will let you know when it's ready.

By the way, you will have to play with regex when I get it out there in version 18.
I looked at that for a while. Regex is not simple. The way I am coding it,
each regex is matched left to right across the line, so you cannot really
pick out pieces here and there. If a regex matches 10 characters, those first
10 characters are simply removed before the next regex is applied.

Now, normally in CSV format this line:
24.12.2014,"desc, memo", 123.09

would parse into 3 elements, because the double quotes " " hide any commas inside them.
For example, in a company name like "Maine Lobster, Inc." you do not want the inner comma to matter.
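
The quoting rule described above can be checked with a short sketch. It uses Python's standard `csv` module purely as an illustration (mdcsvimporter itself is written in Java); the variable names are made up for this example.

```python
import csv
import io

line = '24.12.2014,"desc, memo", 123.09'

# A naive split treats every comma as a separator, including the quoted one.
naive = line.split(",")

# csv.reader honours the double quotes, so the inner comma stays in the field.
quoted = next(csv.reader(io.StringIO(line)))

print(naive)   # ['24.12.2014', '"desc', ' memo"', ' 123.09']
print(quoted)  # ['24.12.2014', 'desc, memo', ' 123.09']
```

The quote-aware parse gives the 3 elements mentioned above, while the naive split gives 4 and leaves the quote characters in place.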

So my guess is you will use my example regex to parse:  ([^,]*([,]|\Z)).*
and play with that and try to make it parse into 6 pieces like this:

24.12.2014,"desc, memo", 123.09

24.12.2014    date
"             ignore
desc          desc or whatever
memo          memo or whatever
"             ignore
123.09        - Payment -

([^,]*([,]|\Z)).*
([^"]*(["]|\Z)).*
([^,]*([,]|\Z)).*
([^,]*([,]|\Z)).*
([^"]*(["]|\Z)).*
([^,]*([,]|\Z)).*

Something like that. I did not test it yet.
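
Since the list above is untested, here is a small sketch for trying such lists out. It is not the importer's code: it uses Python's `re` as a stand-in for Java regex, and it assumes that each pattern is matched at the start of the remaining text, that group 1 is kept as the piece, and that everything up to the end of group 1 is then removed. The helper name `split_left_to_right` is hypothetical.

```python
import re

def split_left_to_right(line, patterns):
    """Match each pattern at the start of what is left, keep group 1,
    then drop everything up to the end of group 1 (an assumption)."""
    pieces = []
    rest = line
    for pattern in patterns:
        m = re.match(pattern, rest)
        if m is None:
            pieces.append(None)  # pattern did not match; nothing is consumed
            continue
        pieces.append(m.group(1))
        rest = rest[m.end(1):]
    return pieces

COMMA = r'([^,]*([,]|\Z)).*'   # up to and including the next comma
QUOTE = r'([^"]*(["]|\Z)).*'   # up to and including the next double quote

line = '24.12.2014,"desc, memo", 123.09'

# The six patterns in the order posted above.
as_posted = split_left_to_right(line, [COMMA, QUOTE, COMMA, COMMA, QUOTE, COMMA])

# The same patterns with the 4th and 5th swapped.
swapped = split_left_to_right(line, [COMMA, QUOTE, COMMA, QUOTE, COMMA, COMMA])

print(as_posted)  # ['24.12.2014,', '"', 'desc,', ' memo",', ' 123.09', '']
print(swapped)    # ['24.12.2014,', '"', 'desc,', ' memo"', ',', ' 123.09']
```

Under these assumptions the posted order runs out of text after five pieces, while the swapped order gives six pieces in the intended roles (date, ignore, desc, memo, ignore, payment). The pieces still carry their delimiters; whether the importer strips them is not stated in this thread.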
I'll have to let you know when I fix it.

Thanks,
Stan


My guess is you will have to know how many
--
You received this message because you are subscribed to the Google Groups "mdcsvimporter" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mdcsvimporte...@googlegroups.com.
To post to this group, send email to mdcsvi...@googlegroups.com.
Visit this group at http://groups.google.com/group/mdcsvimporter.
For more options, visit https://groups.google.com/d/optout.

Arielle Albon

Dec 28, 2014, 4:48:16 AM
to mdcsvi...@googlegroups.com
Hi Stan,

Arielle, please. Mike is the old name and now dead for usage purposes; only Google's stupid policy of not allowing Gmail account name changes makes me still keep it.

OK, it sounds like I'll make a Java importer like the others you have there, so I can break the fields out as I like. I assume you'll accept contributions? It would be nice if you could define fields as named groups and then break those out into the Moneydance transaction. That is well beyond my Java though, lol.

Is it Ant to build? Are there any other prerequisites? I can see the jars in the repo.

Thanks

Arielle


stashu.pub

Dec 28, 2014, 10:09:33 AM
to mdcsvi...@googlegroups.com

Hi Arielle,

I don't think you need to write a new importer. In fact, I took them all out and only use the CustomReader for everything. With that, the user defines their own reader at use time.

So, if I get the [ ] use regex option to work I think you won't need to program. You'll just have to figure out how to use regex.

P.S.  I don't know what you meant by "define fields into named groups..."
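
For readers wondering the same thing: "named groups" most likely refers to regex named capture groups, where each captured piece gets a field name instead of a position. A minimal sketch follows, in Python for illustration only (Java spells the same construct `(?<name>...)`). The group names and the single combined pattern are assumptions made for this example, not anything mdcsvimporter defines; the row is one booking line from the attached export.

```python
import re

# One pattern for a whole booking row; every group name is illustrative.
ROW = re.compile(
    r'(?P<date>[^,]*),'
    r'"(?P<desc>[^,"]*),(?P<memo>[^"]*)",'
    r'(?P<debit>[^,]*),'
    r'(?P<credit>[^,]*),'
    r'(?P<value_date>[^,]*),'
    r'(?P<balance>[^,]*)\Z'
)

row = ('24.12.2014,"Standing order                  ,'
       'CABLECOM                                       ",117.00,,24.12.2014,')

m = ROW.match(row)
# groupdict() maps each group name to its captured text; strip the padding.
fields = {name: value.strip() for name, value in m.groupdict().items()}

print(fields["desc"])   # Standing order
print(fields["memo"])   # CABLECOM
print(fields["debit"])  # 117.00
```

With names attached, mapping a piece to a transaction field becomes a lookup by name rather than a matter of counting positions.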

Thanks

Stan



stashu.pub

Jan 1, 2015, 12:55:26 AM
to mdcsvi...@googlegroups.com
Hi Arielle,

It was storing and retrieving the config definition, but then it was not using the user's regex strings to parse with; it was using hardcoded test strings. I fixed it to use the config strings like it should.
I put out a new mdcsvimporter-beta-18.zip. Please give that a try and let me know if it works for you.

Work with strings similar to these:

([^,]*([,]|\Z)).*
"([^,]*([,]|\Z)).*
([^,]*([,]|\Z)).*
([^"]*)(["]|\Z).*
([^,]*([,]|\Z)).*
([^,]*([,]|\Z)).*
([^,]*([,]|\Z)).*





for this CSV file:

Exported from Direct Net on 26.12.2014 / 10:19 CET,,,,
Bookings,,,,
Account Details ,,,,
Account,"1234567-01 Private account Bonviva Silver Joe Public, Switzerland ",Balance,CHF 300.00,,
IBAN,CH73 0000 0000 0000 0000 0,BIC / SWIFT,CRESCHZ999R,,
Booking Entries from 22.12.2014 - 26.12.2014,,,
Booking Date,Text,Debit,Credit,Value Date,Balance
24.12.2014,"Bonviva Silver package fee      ,                                               ",45.00,,31.12.2014,300.00
24.12.2014,"Standing order                  ,CABLECOM                                       ",117.00,,24.12.2014,
22.12.2013,"Balance of closing entries      ,as shown separately                            ",,,31.12.2013,462.00
Total of Column,,162.00,0.00,,
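
As a way to check what the pieces of this file should look like once the Text column is broken out, here is a standalone sketch. It does not use the importer's regex mode at all: it relies on Python's `csv` module and a date test on the first column, and the function name `parse_bookings` and the output field names are hypothetical.

```python
import csv
import io
import re

# Part of the export above: one preamble line, the header, two bookings.
SAMPLE = '''Booking Entries from 22.12.2014 - 26.12.2014,,,
Booking Date,Text,Debit,Credit,Value Date,Balance
24.12.2014,"Bonviva Silver package fee      ,                                               ",45.00,,31.12.2014,300.00
24.12.2014,"Standing order                  ,CABLECOM                                       ",117.00,,24.12.2014,
'''

# Booking rows are recognised by a dd.mm.yyyy date in the first column.
DATE = re.compile(r"\d{2}\.\d{2}\.\d{4}\Z")

def parse_bookings(text):
    bookings = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 6 or not DATE.match(row[0]):
            continue  # preamble, header, or totals line
        date, text_field, debit, credit, value_date, balance = row[:6]
        # The quoted Text column holds "description,memo" padded with spaces.
        desc, _, memo = text_field.partition(",")
        bookings.append({
            "date": date,
            "desc": desc.strip(),
            "memo": memo.strip(),
            "debit": debit,
            "credit": credit,
            "value_date": value_date,
            "balance": balance,
        })
    return bookings

bookings = parse_bookings(SAMPLE)
for b in bookings:
    print(b["date"], "|", b["desc"], "|", b["memo"], "|", b["debit"])
```

The first booking has an empty memo and the second has `CABLECOM`, which gives a reference result to compare against when tuning the regex strings.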


Thanks,
Stan
