I have two SWIFT messages in a file. I have read the entire file into
a StringBuffer. Now, using java.util.regex, I am able to retrieve SWIFT
message blocks 1 ( starts with {1: , ends with } ) and 2 ( starts with
{2: and ends with } ).
However, I am unable to grab block 4 ( starts with {4: and ends with
the first occurrence of -} ). My regex pattern is \\{4:.*-\\} .
However, this picks up the message until the last occurrence of -}. I
am not sure how to restrict the regex to stop looking beyond the first
occurrence of -} . Can you assist please?
Thank you,
Arun
{1:F01AAAABB99BSMK3513951576}
{2:O9400934081223BBBBAA33XXXX03592332770812230834N}{4:
:20:0112230000000894
:25:GSAKW827958933CAD
:28C:255/1
:60F:C011223CAD32,55
:62F:C011223CAD32,55
-}{5:
{CHK:794BB7656E00}}
{1:F01AAAABB99BSMK3513951576}
{2:O9400934081223BBBBAA33XXXX03592332770812230834N}{4:
:20:0112230000000890
:25:SAKG800030155USD
:28C:255/1
:60F:C011223USD175768,92
:61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
:62F:C011021USD175879,84
-}{5:
{CHK:0F4E5614DD28}}
> I have two SWIFT messages in a file. I have read the entire file into
> a StringBuffer. Now using java.util.regex, I am able to retrieve SWIFT
> message blocks 1 ( start with {1: , end with } ) and 2 ( start with
> {2: and end with } )
>
> However, I am unable to grab block 4 ( starts with {4: and ends with
> the first occurrence of -} ). My regex pattern is \\{4:.*-\\} .
> However, this picks up the message until the last occurrence of -}. I
> am not sure how to restrict the regex to stop looking beyond the first
> occurrence of -} . Can you assist please?
You might try a reluctant quantifier: \\{4:.*?-\\} (untested).
Does a {4: block include line terminators?
<http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html>
--
John B. Matthews
trashgod at gmail dot com
http://home.roadrunner.com/~jbmatthews/
Looks like a case for the reluctant quantifier
<http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html>
<http://java.sun.com/docs/books/tutorial/essential/regex/quant.html>
\\{4:.*?-\\}
if I read the docs correctly.
--
Lew
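To illustrate the difference between the greedy and the reluctant form suggested above, here is a minimal sketch. It assumes the pattern is compiled with Pattern.DOTALL (the flags actually used were not posted) so that '.' can cross the line terminators inside block 4. The class name Block4Demo, the helper findAll and the shortened sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Block4Demo {
    // Hypothetical helper: collect every match of regex in text.
    // DOTALL is an assumption, so that '.' also matches line terminators.
    static List<String> findAll(String regex, CharSequence text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the two messages in the original post.
        String text = "{1:A}{2:B}{4:\n:20:FIRST\n-}{5:\n{CHK:1}}\n"
                    + "{1:A}{2:B}{4:\n:20:SECOND\n-}{5:\n{CHK:2}}\n";
        // Greedy: .* runs to the last "-}", so both messages land in one match.
        System.out.println(findAll("\\{4:.*-\\}", text).size());
        // Reluctant: .*? stops at the first "-}", one match per message.
        for (String block : findAll("\\{4:.*?-\\}", text)) {
            System.out.println(block);
        }
    }
}
```

The only difference between the two calls is the '?' after '.*'; everything else, including the DOTALL flag, stays the same.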
John,
Thank you. It worked. I am reading up on the reluctant quantifiers
now. And yes, a {4: block has line terminators.
With your assistance I was able to grab each of the messages
separately.
In the above example, I have a multiline field (:61: followed by
text, followed by a CRLF/line terminator and a next line of text,
followed by :62F:).
:61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
:62F:C011021USD175879,84
Here the line following the line containing :61: is optional, as in
:61:0112201223CD110,92NDIVNONREF//08 IL053309
:62F:C011021USD175879,84
or the third line could be another one starting with :61:, as in
:61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
:61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
I wrote something like
((:61:)(\\d{6})([\\d]{4})([CD]?[A-Z]?)(\\d*[,]?\\d*)([\\w\\S]{4})(.*&&[^:]))
It did not work. :(
Where could I be wrong?
Thank you very much.
Arun
> On Dec 28, 12:22 am, "John B. Matthews" <nos...@nospam.com> wrote:
> > In article
> > <417b4c4a-6b86-42aa-a99f-e1ce887b8...@o40g2000yqb.googlegroups.com>,
> >
> > Arun <set...@gmail.com> wrote:
> > > I have two SWIFT messages in a file. I have read the entire file into
> > > a StringBuffer. Now using java.util.regex, I am able to retrieve SWIFT
> > > message blocks 1 ( start with {1: , end with } ) and 2 ( start with
> > > {2: and end with } )
> >
> > > However, I am unable to grab block 4 ( starts with {4: and ends with
> > > the first occurrence of -} ). My regex pattern is \\{4:.*-\\} .
> > > However, this picks up the message until the last occurrence of -}. I
> > > am not sure how to restrict the regex to stop looking beyond the first
> > > occurrence of -} . Can you assist please?
> >
> > You might try a reluctant quantifier: \\{4:.*?-\\} (untested).
> > Does a {4: block include line terminators?
> >
> > <http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html>
[...]
> Thank you. It worked. I am reading up on the reluctant quantifiers
> now. And yes, a {4: block has line terminators.
>
> With your assistance I was able to grab each of the messages
> separately.
>
> In the above example, I have a multiline message (:61: followed by
> text, followed by a crlf/line terminator and a next line of text
> followed by :62F:.
>
[...]
> I wrote something like
> ((:61:)(\\d{6})([\\d]{4})([CD]?[A-Z]?)(\\d*[,]?\\d*)([\\w\\S]{4})(.*&&
> [^:]))
>
> It did not work. :(
>
> Where could I be wrong?
Sorry, I don't understand SWIFT message syntax well enough to comment.
IIUC, a pre-XML SWIFT parser is non-trivial. You might Google for an
existing solution.
John,
Simply put, in the lines below,
:61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
:62F:C011021USD175879,84
I need to grab lines 1 and 2 together into a separate buffer. The rule is:
start from :61: and read until I encounter the next : .
Thank you very much
Arun
What about line 3?
I think I understand what you were saying, but Usenet wraps lines, so it's
tricky to refer to line numbers that might not match what people are reading.
--
Lew
Hi Lew,
In my example
LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
LINE 2 -> /GB/2542049/SHS/312,
LINE 3 -> :62F:C011021USD175879,84
Here LINE 2 can be any text, basically a (.*) .
LINE 3 could be another line starting with a :
My requirement is: if the line starts with :61: , match all characters
until you see the next ":" (and not specifically :62F: as in the above
example), because LINE 1 is repetitive and LINE 2 may or may not occur
after LINE 1.
Did I understand your question correctly? And did I give a correct
response? Please let me know.
Thank you
Arun
Lew,
I am enclosing each line in parentheses ().
(:61:0112201223CD110,92NDIVNONREF//08 IL053309 )
(/GB/2542049/SHS/312,)
(:62F:C011021USD175879,84)
Thank you
Arun
[...]
> In my example
>
> LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
> LINE 2 -> /GB/2542049/SHS/312,
> LINE 3 -> :62F:C011021USD175879,84
>
> Here LINE 2 can be any text , basically a (.*) .
>
> LINE 3 could be another line starting with a :
>
> My requirement is: if the line starts with :61: , match all characters
> until you see the next ":" (and not specifically :62F: as in the above
> example), because LINE 1 is repetitive and LINE 2 may or may not occur
> after LINE 1.
[...]
Do you mean like this:
<sscce>
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Splitting {

    public static void main(String[] args) {
        String s = ""
            + ":60F:C011223USD175768,92\n"
            + ":61:0112201223CD110,92NDIVNONREF//08 IL053309\n"
            + "/GB/2542049/SHS/312,\n"
            + ":62F:C011021USD175879,84\n";
        // A field is ":tag:" plus everything up to, but not including,
        // the next ':'; [^:]+ also consumes line terminators.
        Pattern p = Pattern.compile(
            "(:.*?:.[^:]+)", Pattern.DOTALL);
        Matcher m = p.matcher(s);
        int i = 1;
        while (m.find()) {
            System.out.println("(" + i++ + ") " + m.group());
        }
    }
}
</sscce>
<console>
(1) :60F:C011223USD175768,92
(2) :61:0112201223CD110,92NDIVNONREF//08 IL053309
/GB/2542049/SHS/312,
(3) :62F:C011021USD175879,84
</console>
See also:
<http://java.sun.com/docs/books/tutorial/essential/regex/>
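One caveat with [^:]+ is that it stops at any colon, including one inside a field's text. A possible variant, sketched below, only treats a ':' at the start of a line as the beginning of a new tag. It assumes tags always begin a line, which holds for the sample in this thread but is not checked against the SWIFT specification; the names FieldSplitter and splitFields are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldSplitter {
    // "^:" at a line start (MULTILINE) opens a field; the reluctant (.*?)
    // then runs across line terminators (DOTALL) up to the next line-start
    // ':' or the end of input.
    private static final Pattern FIELD = Pattern.compile(
        "^:(\\w+):(.*?)(?=^:|\\z)", Pattern.MULTILINE | Pattern.DOTALL);

    // Returns {tag, value} pairs in order of appearance; a list is used
    // because a tag such as 61 may repeat.
    static List<String[]> splitFields(CharSequence block) {
        List<String[]> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(block);
        while (m.find()) {
            // trim() drops the trailing line terminator, not the inner one.
            fields.add(new String[] { m.group(1), m.group(2).trim() });
        }
        return fields;
    }

    public static void main(String[] args) {
        String s = ""
            + ":60F:C011223USD175768,92\n"
            + ":61:0112201223CD110,92NDIVNONREF//08 IL053309\n"
            + "/GB/2542049/SHS/312,\n"
            + ":62F:C011021USD175879,84\n";
        for (String[] field : splitFields(s)) {
            System.out.println(field[0] + " -> " + field[1]);
        }
    }
}
```

Capturing the tag and the value separately also makes it easy to pick out only the :61: fields afterwards.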
John,
Yes. It worked. Thank you so much. Your regex is generic and it worked
for all tags.
Thanks much. I appreciate that.
Arun
John,
In the below example
LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
LINE 2 -> /GB/2542049/SHS/312,
LINE 3 -> :62F:C011021USD175879,84
I tried to split lines 1 and 2 into logical groups (for clarity
I have separated each token with parentheses)
:61:(011220)(1223)(CD)(110,92)(NDIV)(NONREF//08 IL053309)
(/GB/2542049/SHS/312,)
using the following regex pattern
:61:(\\d{6})(\\d{4})([CD]?[A-Z]?)(\\d*[\\,]?\\d*)(\\w{4})(.*?\\n)(.*?[^:]+)
however, I am not able to grab the second line using matcher.group(i)
where i is the group number.
What is wrong in )(.*?[^:]+) ?
Thank you
Arun
[...]
> In the below example
>
> LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
> LINE 2 -> /GB/2542049/SHS/312,
> LINE 3 -> :62F:C011021USD175879,84
>
> I tried to split line 1 and 2 into logical groups ( for clarity
> purpose I had separated each token with braces )
>
> :61:(011220)(1223)(CD)(110,92)(NDIV)(NONREF//08 IL053309)
> (/GB/2542049/SHS/312,)
>
> using the following regex pattern
> :61:(\\d{6})(\\d{4})([CD]?[A-Z]?)(\\d*[\\,]?\\d*)(\\w{4})(.*?\\n)(.*?
> [^:]+)
>
> however, I am not able to grab the second line using matcher.group(i)
> where i is the group number.
>
> What is wrong in )(.*?[^:]+) ?
>[...]
> :61:(\\d{6})(\\d{4})([CD]?[A-Z]?)(\\d*[\\,]?\\d*)(\\w{4})(.*?\\n)(.*?
> [^:]+)
I don't understand. Perhaps you could modify the <http://sscce.org/> I
provided above to clarify the problem. The following tutorial shows how
to catch syntax errors using the methods of PatternSyntaxException:
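For reference, catching a pattern syntax error might look like the following minimal sketch. It uses the getDescription(), getIndex() and getPattern() methods of PatternSyntaxException; the class name SyntaxCheck and the helper describeError are made up for illustration, and the deliberately broken pattern is not one posted in this thread.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheck {
    // Hypothetical helper: returns null if the regex compiles,
    // otherwise a short description of the syntax error.
    static String describeError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                + " in " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        // Deliberately broken: the group is never closed.
        System.out.println(describeError(":61:(\\d{6}"));
        // Well-formed: prints null.
        System.out.println(describeError(":61:(\\d{6})"));
    }
}
```

Note that a pattern can compile cleanly and still not match what was intended, so this only rules out syntax errors.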
I think I did not explain my requirement.
I have 3 lines
LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
LINE 2 -> /GB/2542049/SHS/312,
LINE 3 -> :62F:C011021USD175879,84
And I grab lines 1 & 2 using the pattern "(:61:.*?.[^:]+)" and copy them
to a StringBuffer.
Now, with the matcher.group(int) method, I need to group the
sequence so that I can get the 2nd line.
matcher1.group(1) should return
:61:0112201223CD110,92NDIVNONREF//08 IL053309
(along with the \n) and matcher1.group(2) should return
/GB/2542049/SHS/312,
This regex is harassing me!!!
Thank you
Arun
> On Dec 29, 8:29 pm, "John B. Matthews" <nos...@nospam.com> wrote:
[...]
> > <http://java.sun.com/docs/books/tutorial/essential/regex/>
What syntax errors did this approach discover?
[Please trim sigs.]
> I think I did not explain my requirement.
> I have 3 lines
>
> LINE 1 -> :61:0112201223CD110,92NDIVNONREF//08 IL053309
> LINE 2 -> /GB/2542049/SHS/312,
> LINE 3 -> :62F:C011021USD175879,84
>
> And I grab line 1 & 2 using pattern "(:61:.*?.[^:]+)" and copy it to
> a StringBuffer. Now, with matcher.group(int arg) function, i need to
> group the sequence so that i can get the 2nd line.
>
> matcher1.group(1) should return :61:0112201223CD110,92NDIVNONREF//08
> IL053309 ( along with the \n ) and matcher1.group(2) should return /GB/
> 2542049/SHS/312,
[...]
You could try matching the \n:
Pattern p = Pattern.compile("(^.*\n)(.*\n)", Pattern.DOTALL);
Matcher m = p.matcher(s);
if (m.matches()) ...
Again, an <http://sscce.org/> would make discussion easier.
[Please trim sigs.]
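Along the lines of matching the line terminator explicitly, here is a minimal sketch that returns the two lines of a :61: field as separate groups. Groups 1 to 5 are copied from the pattern posted in the thread and are not checked against the SWIFT specification. The sketch assumes the optional second line is any line that does not start with ':', and it spells the terminator as \r?\n because '.' without DOTALL matches neither \r nor \n. The class name Field61 is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Field61 {
    // Groups 1-5 are taken as posted in the thread; group 6 is the rest of
    // the first line, group 7 an optional second line not starting with ':'.
    static final Pattern P = Pattern.compile(
        ":61:(\\d{6})(\\d{4})([CD]?[A-Z]?)(\\d*,?\\d*)(\\w{4})([^\\r\\n]*)"
        + "(?:\\r?\\n(?!:)([^\\r\\n]*))?");

    public static void main(String[] args) {
        String s = ":61:0112201223CD110,92NDIVNONREF//08 IL053309\r\n"
                 + "/GB/2542049/SHS/312,\r\n"
                 + ":62F:C011021USD175879,84\r\n";
        Matcher m = P.matcher(s);
        if (m.find()) {
            // group(7) is null when the second line is absent.
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println("group " + i + ": " + m.group(i));
            }
        }
    }
}
```

Because the second line sits inside an optional non-capturing group, the same pattern covers a :61: line that is followed directly by another tag.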
>However, I am unable to grab block 4 ( starts with {4: and ends with
>the first occurrence of -} ). My regex pattern is \\{4:.*-\\} .
>However, this picks up the message until the last occurrence of -}. I
>am not sure how to restrict the regex to stop looking beyond the first
>occurrence of -} . Can you assist please?
Just a general comment. Regex does not handle delimiter nesting of
variable depth. I did not follow the details of your message, but got
the general impression that might be the problem.
If you have such nesting you need a parser. You can roll one yourself
as a finite state automaton, using an enum to track the various
states and a method State next( char ) to figure out which state to go
to next depending on the next char.
http://mindprod.com/jgloss/finitestate.html
For tougher parsing you need a parser generator. See
http://mindprod.com/jgloss/parser.html
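A minimal sketch of the enum-based automaton described above, used here to cut a message into its top-level blocks. It assumes blocks nest at most one level deep, as in {5:{CHK:...}} from the posted sample, so three states are enough; deeper or variable nesting would need a counter or a real parser. The names BlockScanner and topLevelBlocks are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BlockScanner {
    // Three states suffice under the one-level nesting assumption.
    enum State {
        OUTSIDE {
            @Override State next(char c) { return c == '{' ? BLOCK : OUTSIDE; }
        },
        BLOCK {
            @Override State next(char c) {
                if (c == '{') return NESTED;
                if (c == '}') return OUTSIDE;
                return BLOCK;
            }
        },
        NESTED {
            @Override State next(char c) { return c == '}' ? BLOCK : NESTED; }
        };

        abstract State next(char c);
    }

    static List<String> topLevelBlocks(CharSequence text) {
        List<String> blocks = new ArrayList<>();
        State state = State.OUTSIDE;
        int start = -1;
        for (int i = 0; i < text.length(); i++) {
            State next = state.next(text.charAt(i));
            if (state == State.OUTSIDE && next == State.BLOCK) {
                start = i;  // a top-level '{' opens a block
            } else if (state == State.BLOCK && next == State.OUTSIDE) {
                // the matching top-level '}' closes it
                blocks.add(text.subSequence(start, i + 1).toString());
            }
            state = next;
        }
        return blocks;
    }

    public static void main(String[] args) {
        String msg = "{1:F01AAAABB99BSMK3513951576}\n"
            + "{2:O9400934081223BBBBAA33XXXX03592332770812230834N}{4:\n"
            + ":20:0112230000000894\n"
            + "-}{5:\n"
            + "{CHK:794BB7656E00}}\n";
        for (String block : topLevelBlocks(msg)) {
            System.out.println("[" + block + "]");
        }
    }
}
```

Unlike the regex approach, this keeps block 5 together with its nested {CHK:...} part, at the cost of a little more code.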
--
Roedy Green Canadian Mind Products
http://mindprod.com