private static final Pattern java_proc =
Pattern.compile("(java)");
does not work, because parentheses are treated as groupings.
Using "\" to designate the parentheses as literal characters does
not work --- not sure why:
private static final Pattern java_proc = Pattern.compile("\(java\)");
I searched for and read a related post here, but it did not
help. Either I'm having a different problem than that poster had, or I
just don't understand the post.
What am I doing wrong? Thanks, Alan
private static final Pattern java_proc = Pattern.compile("\\\\.+\\Process\\(java\\)\\");
The error says:
java.lang.ExceptionInInitializerError
Caused by: java.util.regex.PatternSyntaxException: Unknown character
property name {r} near index 6
\\.+\Process\(java\)\
^
This does not make sense to me.
I'm trying to match text of the form (example):
\\GOLLY\Process(java)\% Processor Time
Thanks, Alan
That is what the regex engine is seeing. Don't forget that '\' is also a
metacharacter in regexes, so the engine reads "\Pr" as the start of a
character property class, hence the complaint about property name {r}.
To match a '\' in a regex you have to write '\\\\' in the Java source,
which causes the regex to see '\\', which is what it uses to match a
single '\'. So the regex you're probably trying to compile is:
"\\\\{2}.+\\\\Process\\(java\\)\\\\" (The {2} is so that you don't have
to type in 8 backslashes)
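To make the two layers of escaping concrete, here is a minimal sketch. The class name EscapeLayers and the helper found() are made up for illustration; only the pattern string comes from the post above. It prints the string the regex engine actually receives, then tries it on the sample text.

```java
import java.util.regex.Pattern;

public class EscapeLayers {
    // Layer 1 (javac): "\\\\" in source becomes the two characters \\ in the String.
    // Layer 2 (regex): \\ in the String matches one literal backslash.
    // A bare "\(" in source is rejected by javac as an illegal escape character,
    // which is why the first attempt in this thread did not even compile.
    static final Pattern JAVA_PROC =
            Pattern.compile("\\\\{2}.+\\\\Process\\(java\\)\\\\");

    // Hypothetical helper: true if the pattern occurs anywhere in text.
    static boolean found(String text) {
        return JAVA_PROC.matcher(text).find();
    }

    public static void main(String[] args) {
        // What the regex engine sees after javac has removed one layer.
        System.out.println(JAVA_PROC.pattern());

        // The sample text \\GOLLY\Process(java)\% Processor Time as a Java literal.
        String sample = "\\\\GOLLY\\Process(java)\\% Processor Time";
        System.out.println(sample);
        System.out.println(found(sample));
    }
}
```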
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
I have one last remaining problem. The full data I'm working with,
in CSV format, looks like this:
"(PDH-CSV 4.0) (Eastern Daylight Time)(240)","\\GOLLY\Memory\%
Committed Bytes In Use","\\GOLLY\Process(java)\% Processor Time"
I want to match on
\\GOLLY\Process(java)\
so I can replace it.
The regular expression
\\\\{2}.+\\\\Process\\(java\\).
matches, but it matches too much of it:
\\GOLLY\Memory\% Committed Bytes In Use","\\GOLLY\Process(java)\
How can I get it to only match the part I want?
Thanks again, Alan
In that case, you probably want this regex:
\\\\{2}[^\\\\]+\\\\Process\\(java\\)
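A small sketch of why the negated class helps, run against a line like the CSV sample above. The class name NarrowMatch, the method relabel() and the replacement text are hypothetical, and I have assumed you also want the trailing backslash in the match, so the pattern ends with one more \\\\ than the regex just given.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NarrowMatch {
    // [^\\]+ (written [^\\\\]+ in Java source) cannot cross a backslash,
    // so a match that starts at the earlier \\GOLLY\Memory entry fails at
    // "\Memory" instead of running on to the later "\Process(java)".
    static final Pattern JAVA_PROC =
            Pattern.compile("\\\\{2}[^\\\\]+\\\\Process\\(java\\)\\\\");

    // Hypothetical helper: replace every match with the given literal text.
    static String relabel(String csvLine, String replacement) {
        // quoteReplacement keeps any '\' or '$' in the replacement literal.
        return JAVA_PROC.matcher(csvLine)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String line = "\"(PDH-CSV 4.0) (Eastern Daylight Time)(240)\","
                + "\"\\\\GOLLY\\Memory\\% Committed Bytes In Use\","
                + "\"\\\\GOLLY\\Process(java)\\% Processor Time\"";
        // Only the Process(java) prefix is replaced; the Memory column is untouched.
        System.out.println(relabel(line, "java "));
    }
}
```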
Double the backslashes in your pattern: "\\(java\\)"
AHS
> private static final Pattern java_proc = Pattern.compile("\(java\)");
It gets complicated because you have both Java and regex escape
quoting.
See http://mindprod.com/jgloss/regex.html#QUOTING
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
FWIW, you could avoid a little of the backslash escape mess
by using single-char character classes, e.g.:
Pattern.compile("[\\]{2}[^\\]+[\\]Process[(]java[)]") ;
// ...outside of a Java string that'd be [\]{2}[^\]+[\]Process[(]java[)]
You also might get rid of some of those backslashes by substituting
another character, then using replace() on the string before compiling it.
final static String PATTERN = "``{2}.+``Process`(java`)";
String myRegex = PATTERN.replace("`", "\\" );
System.out.println( myRegex );
Result:
\\{2}.+\\Process\(java\)
It just makes things more readable. Using `, or %, or # in a string,
then replacing that character with backslashes before compiling it as a
regex, can save your eyes.
Incidentally, I wonder if Sun could be convinced to add this themselves.
Maybe add a new operator/keyword altogether. Say # introduces new
keywords or operators, and is followed by the keyword or operator. That
would let Sun add new keywords or operators without breaking any
existing code. So #s might give us new string constants. Let's say '
then means a string like a Unix shell string, where escaping is ignored.
String regex = #s'\\{2}.+\\Process\(java\)';
Would give that literal string, without the need to escape the
backslashes. Easier for regex at least. Other types of flags besides '
could be introduced too. `,$,@,%,= might do the same thing, just use a
different character as a string terminator, in case you want a ' to be
part of the string. """ might introduce a "here-is" operator. Etc.
Just thinking out loud....
>You also might get rid of some of those backslashes by substituting
>another character, then using replace() on the string before compiling it.
Other ideas:
1. Use Quoter to insert \ quoting, both for regex and Java strings.
see http://mindprod.com/applet/quoter.html
2. implement one or more of my regex student projects
http://mindprod.com/project/regexutility.html
http://mindprod.com/project/regexcomposer.html
http://mindprod.com/project/regexdebugger.html
http://mindprod.com/project/regexproofreader.html
3. use \Q ... \E
> 3. use \Q ... \E
OK, that's cool. It only works with regex, but it's darn handy for
them. Thanks!
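For anyone finding this later, a rough sketch of the \Q ... \E idea. The class and method names (QuoteDemo, hasParens, strip) are invented for the example. Pattern.quote is the standard library method that wraps a string in \Q ... \E for you. The Java-level doubling of backslashes is still needed, since this only removes the regex layer.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Everything between \Q and \E is taken literally, so ( and ) need no
    // regex escaping. In Java source the markers are written \\Q and \\E.
    static final Pattern PARENS = Pattern.compile("Process\\Q(java)\\E");

    // Pattern.quote wraps its argument in \Q...\E. Only the middle part,
    // "one or more non-backslash characters", is real regex syntax here.
    static final Pattern JAVA_PROC = Pattern.compile(
            Pattern.quote("\\\\")
            + "[^\\\\]+"
            + Pattern.quote("\\Process(java)\\"));

    // Hypothetical helper: does the text contain the literal Process(java)?
    static boolean hasParens(String s) {
        return PARENS.matcher(s).find();
    }

    // Hypothetical helper: remove the \\HOST\Process(java)\ prefix.
    static String strip(String s) {
        return JAVA_PROC.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String sample = "\\\\GOLLY\\Process(java)\\% Processor Time";
        System.out.println(hasParens(sample));
        System.out.println(strip(sample));
    }
}
```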
The statement
Pattern JAVA_PROC = Pattern.compile("[\\]{2}[^\\]+[\\]Process[(]java[)]");
compiles but throws an exception at run time:
run:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Unclosed character class near index 30
[\]{2}[^\]+[\]Process[(]java[)]
^
All: Thank you for your suggestions.
>[\]{2}[^\]+[\]Process[(]java[)]
> ^
() both need escapes. If that is a Java literal, you also need to
escape \ both for Java and for regex.
see http://mindprod.com/jgloss/regex.html#QUOTING
You still have to escape the backslashes here, since each single
backslash is currently escaping the ']' that was meant to close its
character class.
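A quick sketch comparing the two spellings. The names CharClassDemo, BROKEN, FIXED, compiles() and matches() are made up for the example. FIXED is my reading of what the character-class version needs: four backslashes in the Java source for each literal backslash inside the brackets.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CharClassDemo {
    // Regex seen: [\]{2}[^\]+[\]Process[(]java[)]
    // Each \] is an escaped ']', so the classes never close.
    static final String BROKEN = "[\\]{2}[^\\]+[\\]Process[(]java[)]";

    // Regex seen: [\\]{2}[^\\]+[\\]Process[(]java[)]
    // Each \\ is an escaped backslash, and the following ']' closes the class.
    static final String FIXED = "[\\\\]{2}[^\\\\]+[\\\\]Process[(]java[)]";

    // True if the regex string is accepted by Pattern.compile.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // True if the FIXED pattern occurs anywhere in text.
    static boolean matches(String text) {
        return Pattern.compile(FIXED).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(compiles(BROKEN));
        System.out.println(compiles(FIXED));
        System.out.println(matches("\\\\GOLLY\\Process(java)\\% Processor Time"));
    }
}
```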
> I'm trying to use regex to match/replace a word in parentheses.
>The regular expression
An aside, you can't use a regex to tell if ( ) are nested and
balanced correctly to arbitrary depth.
For that you need a parser.
See http://mindprod.com/jgloss/parser.html
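For plain ( ) balance checking you do not need a full parser; a depth counter is enough. A minimal sketch follows, with the class and method names invented for illustration. It only counts parentheses and does not understand quoting or other bracket types.

```java
public class ParenCheck {
    // Returns true if every ')' closes an earlier '(' and none are left open.
    static boolean balanced(String s) {
        int depth = 0;
        for (char c : s.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' appeared before its '('
                }
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        System.out.println(balanced("Process(java)"));
        System.out.println(balanced("(a(b)c"));
    }
}
```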