From: Bots A
Sent: March 26, 2015 5:22:03am PDT
To: cascadi...@googlegroups.com
Subject: How to skip regex mismatches and make the job continue without failing?
Hello,

As shown below, I'm parsing my input data with the following regex, but the job fails when the field for the 5th group is null. All I want is to skip those records and let the job continue. Can someone help?
// Declare the field names used to parse out of the loan performance file
Fields loanPerfFields = new Fields( "LoanID", "month","day","year","name","intRate","upBalance");
// Define the regular expression used to parse the input file
String loanPerfRegex = "([A-z,0-9].*)\\|(\\d{2})\\/(\\d{2})\\/(\\d{4})\\|([A-z,0-9]+)\\|([0-9\\.]+)\\|([0-9\\.]+).*";
// Declare the groups from the above regex. Each group will be given a field name from 'loanPerfFields'
int[] allGroups = {1, 2, 3, 4, 5 ,6 ,7};
// Create the parser
RegexParser parser = new RegexParser( loanPerfFields, loanPerfRegex, allGroups );
// Create the main import pipe element, and with the input argument named "line"
Pipe processPipe = new Each( "processPipe", new Fields( "line" ), parser, Fields.RESULTS );
// Creating unique tuples of LoanID + Month combination
// Pipe uniquePipe = new Unique( processPipe, new Fields( "LoandID","month") );
Also, as you may have noticed in the last line of the code, I'm trying to use the "Unique" pipe to remove duplicate lines that may exist in the input, but I'm still getting errors when I try to get it working. Any advice on that would be appreciated too.

Thanks.
A.
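P.S. The mismatch can be reproduced outside Cascading. Below is a minimal sketch using plain java.util.regex with the same pattern. The class name LoanLineFilter, the helper methods, and the two sample lines are all made up for illustration and are not from the real job or input file. It shows that a line with an empty 5th field does not match at all, and that checking the match first lets the loop skip such lines instead of failing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone demo; not part of the Cascading flow.
public class LoanLineFilter {
    // Same pattern as loanPerfRegex in the post above.
    public static final Pattern LOAN_PERF = Pattern.compile(
        "([A-z,0-9].*)\\|(\\d{2})\\/(\\d{2})\\/(\\d{4})\\|([A-z,0-9]+)\\|([0-9\\.]+)\\|([0-9\\.]+).*");

    // Returns the seven captured groups, or null when the whole line does not match.
    public static String[] parseOrNull(String line) {
        Matcher m = LOAN_PERF.matcher(line);
        if (!m.matches()) {
            return null; // mismatch: caller decides to skip
        }
        String[] fields = new String[7];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }

    // Parses every line, silently dropping the ones that do not match.
    public static List<String[]> parseAll(List<String> lines) {
        List<String[]> parsed = new ArrayList<>();
        for (String line : lines) {
            String[] fields = parseOrNull(line);
            if (fields != null) {
                parsed.add(fields);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Made-up sample records; the second one has an empty 5th field (name).
        List<String> lines = Arrays.asList(
            "L001|03/26/2015|Smith|4.25|100000.00",
            "L002|03/26/2015||4.25|100000.00");
        for (String[] fields : parseAll(lines)) {
            System.out.println(String.join(",", fields));
        }
    }
}
```

I assume something similar could be done inside the flow by filtering the "line" field (for example with Cascading's RegexFilter) ahead of the parser, but I have not tried that yet.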