Help with regex split on text in a file


Kris Ring

Jul 25, 2013, 5:06:05 PM
to re...@googlegroups.com
Hi all, I am setting up some classes for a test I'm running, and I need help with a regex split on a file whose contents I am trying to keep from being exposed in my code.

So, here goes:

I have a file that contains the following format:
DEV      dfathsnbfag+fsfdbvdsafwrehtgdbfagre/fgsdadewtreytg
CERT    dfasdfadf23453yhrew456ehgfnbdfge65ieyjtwujsjyi464
PERF    sh4u364746uerthdmhfyo67r586rutsjdko7+tjdmnsh/ea
PROD   sdnbry46857ukfghfdjrtue65i+reywurjsgnh/aq35uwetsjn

I have created a configuration and an enum in Java to determine which regex gets run, and what I need returned is just the alphanumeric token for the config that I chose.

For instance, when my config specifies 'DEV', I want the token on the DEV line to be returned (so that it can be passed to another method). Right now I am using Java's String.split, which takes a regex, and a switch selects which regex is used; I just need it to return the token.
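
For reference, the split-based lookup described above could be sketched as follows. This is only an illustration under stated assumptions: the enum `Env`, the class `SplitLookup`, and the method `tokenFor` are hypothetical names, and the file text is assumed to be already loaded into a String with one key/token pair per line.

```java
final class SplitLookup {
    // Hypothetical enum mirroring the four keys in the sample file.
    enum Env { DEV, CERT, PERF, PROD }

    // Returns the token on the line whose first field equals env, or null if none is found.
    static String tokenFor(String fileText, Env env) {
        // Split the text into lines first, then each line into whitespace-separated fields.
        for (String line : fileText.split("\\r?\\n")) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length == 2 && fields[0].equals(env.name())) {
                return fields[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample data in the same layout as the file.
        String text = "DEV      aaa+bbb/ccc\nCERT    ddd123\n";
        System.out.println(tokenFor(text, Env.CERT));
    }
}
```

Comparing the first field with equals() avoids needing a different regex per key; the only regex left is the whitespace separator.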

Thank you to anyone interested in pursuing a solution to this with me!

iiz

Jul 29, 2013, 4:23:15 AM
to re...@googlegroups.com
 
Dear Kris,
 
You can pass DEV, CERT, PERF or PROD as a parameter to the function (let's call it p1),
 
then use a lookbehind with the parameter as input to your regex, like so, to return the proper string:
regex=(?<=p1)\s+\S+
so if your string is s,
s.matches('(?<=p1)\s+\S+') should get you your results. If the parameter is not substituted into the pattern, first build the pattern string by concatenating '(?<=' with p1 and ')\s+\S+'.
 
One caveat: this will also return the blanks between DEV and the token, so you need to trim or filter those.
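
A minimal sketch of the concatenation and trimming steps, with some assumptions: the class `LookbehindDemo` and the method `tokenAfter` are hypothetical names, the sample data is shortened, and the sketch uses Matcher.find() because matches() only reports whether the entire string matches. Pattern.quote is an addition that keeps p1 literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindDemo {
    // Hypothetical helper: returns the token that follows key p1, or null if p1 is absent.
    static String tokenAfter(String s, String p1) {
        // Build the pattern by concatenation, as described above.
        Pattern pattern = Pattern.compile("(?<=" + Pattern.quote(p1) + ")\\s+\\S+");
        Matcher matcher = pattern.matcher(s);
        // The match still contains the blanks after the key, so trim them off.
        return matcher.find() ? matcher.group().trim() : null;
    }

    public static void main(String[] args) {
        String s = "DEV      abc+def/ghi\nCERT    jkl123";
        System.out.println(tokenAfter(s, "DEV"));
        System.out.println(tokenAfter(s, "CERT"));
    }
}
```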

Ring, Kris

Jul 29, 2013, 11:59:26 AM
to re...@googlegroups.com
When you say to parse the items to a parameter, are you referring to a Pattern.compile using the regex you specified? 

Right now, my code looks like the following:
File file = new File();
String input = file.readTextFile();
Pattern pattern = Pattern.compile("(?<=p1)\\s+\\S+");
Matcher matcher = pattern.matcher(input);

Right now, this does the same as what I was getting before, in that the entire text of the file is put into the matcher variable. I'm not quite sure yet what the code would look like for your second suggestion: would I use the matcher with just '(?<=' and then concatenate that with 'p1)\s+\S+'?

Thank you for your help!



iiz

Jul 29, 2013, 6:19:47 PM
to re...@googlegroups.com

Dear Kris,

 

I've put a little time into it for you. I am not a Java expert, but I have a little experience with it; I did a course once :)

I've used Google to see how matchers and patterns work:

http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html

http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html

It would seem you would like a class that takes the string and a parameter (for instance DEV) and then returns the second field. First of all, you have got some things wrong here: matches() will not give you the matched text, it only tests whether the string matches. You can get the match with the group() method, with an optional integer giving the number of the capturing group. So if

".*("+p1+")\\s+(\\S+).*" is the pattern, then 2 is the number of the group you want to return, (\\S+). That being said, a lookbehind is not needed, since we can just capture the group as shown.
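
To illustrate the group numbering, here is a small sketch on a single line of text (so no DOTALL is involved yet). The class `GroupDemo`, the method `secondGroup`, and the shortened sample line are made up for the example; p1 is concatenated as-is, so it is assumed to contain no regex metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Hypothetical helper: returns capturing group 2 (the token), or null if the line does not match.
    static String secondGroup(String line, String p1) {
        Matcher m = Pattern.compile(".*(" + p1 + ")\\s+(\\S+).*").matcher(line);
        // matches() must succeed before group() may be called.
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "PERF    sh4u+tjd/ea";
        Matcher m = Pattern.compile(".*(PERF)\\s+(\\S+).*").matcher(line);
        if (m.matches()) {
            System.out.println(m.group(0)); // group 0 is the entire match
            System.out.println(m.group(1)); // group 1 is the key
            System.out.println(m.group(2)); // group 2 is the token
        }
    }
}
```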

One more caveat: You are matching multiline text.

Google tells us:

http://stackoverflow.com/questions/3651725/match-multiline-text-using-regular-expression

So set the DOTALL flag on the pattern when compiling it, to make the . match linefeeds.
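
The effect of the flag can be checked with a short sketch. The class `DotallDemo` and the method `matchesWholeText` are hypothetical names and the two-line sample text is made up; the pattern is the same shape as the one discussed above.

```java
import java.util.regex.Pattern;

final class DotallDemo {
    // Hypothetical helper: does the whole text match the pattern under the given flags?
    static boolean matchesWholeText(String text, String p1, int flags) {
        return Pattern.compile(".*" + p1 + "\\s+(\\S+).*", flags).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "DEV   aaa\nPROD   bbb";
        // Without DOTALL the leading .* cannot cross the linefeed to reach PROD.
        System.out.println(matchesWholeText(text, "PROD", 0));
        // With DOTALL the . also matches linefeeds, so the whole text can match.
        System.out.println(matchesWholeText(text, "PROD", Pattern.DOTALL));
    }
}
```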

So then I would get to the following code.

[CODE]
// The 'main' method must be in a class named 'Rextester' (required by rextester.com).

// Import the Matcher and Pattern classes.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Class returnval holds one method, getit. It expects the input text and a key (p1).
class returnval
{
    public String getit(String txtin, String p1)
    {
        // Create the pattern using parameter p1. The parentheses capture the part to return (\\S+).
        // Set DOTALL so that . will also match linefeeds.
        Pattern pattern = Pattern.compile(".*" + p1 + "\\s+(\\S+).*", Pattern.DOTALL);
        // The pattern reads: anything, followed by p1, followed by one or more whitespace
        // characters, followed by one or more non-whitespace characters (captured), followed by anything.

        // Construct a Matcher object from the pattern, using the text as input.
        Matcher matcher = pattern.matcher(txtin);
        if (matcher.matches()) {
            // If the whole text matches, return the first captured group.
            return matcher.group(1);
        }
        else
        {
            // No match: return a placeholder value.
            return "foobar";
        }
    }
}

// Main class
class Rextester
{
    public static void main(String args[])
    {
        // Filled statically for now; should be replaced with file input.
        String txtfile = "DEV      dfathsnbfag+fsfdbvdsafwrehtgdbfagre/fgsdadewtreytg\n" +
                         "CERT    dfasdfadf23453yhrew456ehgfnbdfge65ieyjtwujsjyi464\n" +
                         "PERF    sh4u364746uerthdmhfyo67r586rutsjdko7+tjdmnsh/ea\n" +
                         "PROD   sdnbry46857ukfghfdjrtue65i+reywurjsgnh/aq35uwetsjn";
        // Filled statically for now; can be any string. Maybe you should test for valid strings first.
        String p1 = "PROD";
        // Create an object of the returnval class.
        returnval returnval = new returnval();
        // Call the getit method using the input text and the parameter.
        System.out.println(returnval.getit(txtfile, p1));
    }
}
[/CODE]

A quick test on

http://rextester.com/runcode

p1="PROD" returns

sdnbry46857ukfghfdjrtue65i+reywurjsgnh/aq35uwetsjn

p1="DEV" returns

dfathsnbfag+fsfdbvdsafwrehtgdbfagre/fgsdadewtreytg

 

That seems to work.
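
One possible tightening, offered only as a sketch and not as part of the tested code: anchor the key to the start of a line with MULTILINE and use find(), so that a key such as ERT cannot match inside CERT. The class `TokenFinder` and its method `find` are hypothetical names, returning null on no match is an arbitrary choice, and the sample data is shortened.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenFinder {
    // Hypothetical variant of getit: returns the token for key, or null if no line starts with key.
    static String find(String text, String key) {
        // MULTILINE lets ^ match at the start of every line, so no leading ".*" or DOTALL is needed.
        // Pattern.quote treats key literally; [ \t]+ keeps the match on a single line.
        Pattern pattern = Pattern.compile("^" + Pattern.quote(key) + "[ \\t]+(\\S+)", Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "DEV      aaa+bbb\nCERT    ccc123\nPROD   ddd/eee";
        System.out.println(find(text, "PROD"));
        System.out.println(find(text, "ERT")); // not a key at the start of any line
    }
}
```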

 

 

 

 

 

 

 

 

 

 

 

On Monday, July 29, 2013 at 17:59:26 UTC+2, Kris Ring wrote:

Ring, Kris

Jul 29, 2013, 7:08:17 PM
to re...@googlegroups.com
This will make for some great reading tonight, thank you VERY much for the time you have spent on helping me here! 