it says:
> The string literal "\(hello\)" is illegal and leads to a compile-time error; in order to match the string (hello) the string literal "\\(hello\\)" must be used.
Now, if I read my input strings from a file I have to convert my
strings in order to match them against a pattern. How do I do that?
Is there a predefined method to do it? Like quotemeta in perl?
Thanks!
Markus
There are several concepts you need to get straight here. One is "what is
the character-content of a String in memory?" which I will call "String A"
for short, and "What string do I have to type in my Java source code to get
String A into memory?" which I will call "String B".
So if you type a String B like "\\(hello\\)" then String A will be
"\(hello\)".
If you type a String B like "\t\\t\t", then String A will be four
characters long: a tab, a backslash, the letter t, and another tab. Printed,
it looks something like " \t ".
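A minimal sketch of the String B versus String A distinction, for anyone who wants to check it. The class and method names here are made up for illustration; the literals are the two from the examples above.

```java
public class LiteralVsMemory {
    // The literal typed in source is "String B"; the value returned is "String A".
    static String helloPattern() {
        return "\\(hello\\)";   // in memory: \(hello\)  (9 characters)
    }

    static String tabExample() {
        return "\t\\t\t";       // in memory: TAB, backslash, 't', TAB (4 characters)
    }

    public static void main(String[] args) {
        System.out.println(helloPattern());            // prints \(hello\)
        System.out.println(helloPattern().length());   // prints 9
        System.out.println(tabExample().length());     // prints 4
        // Used as a regex, String A matches the text (hello) including the parentheses.
        System.out.println("(hello)".matches(helloPattern())); // prints true
    }
}
```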
Now, let me define a new string called String C, as follows: "What does
my file have to contain so that when I read in that string, String A gets
loaded into memory?"
It turns out that String C and String A are exactly the same. If you
want " \t " to appear in memory, then your file should contain
" \t ". If you want "\(hello\)" to appear in memory, then your
file should contain "\(hello\)".
- Oliver
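The claim that String C and String A are the same can be checked with a rough sketch like the one below. It assumes a temporary file and UTF-8 text; the class name PatternFromFile and its helper methods are hypothetical. The only doubled backslashes are in the literal used to create the test file, because that literal lives in source code. The string read back is passed to the regex engine with no conversion at all.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

public class PatternFromFile {
    // Returns the first line of the file exactly as stored; no unescaping happens.
    static String readFirstLine(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8).get(0);
    }

    // Writes \(hello\) to a temp file, reads it back, and uses it as a pattern.
    static boolean fileContentsMatchParenthesizedHello() {
        try {
            Path file = Files.createTempFile("pattern", ".txt");
            try {
                // After this call the file holds the 9 characters \(hello\)
                Files.write(file, "\\(hello\\)".getBytes(StandardCharsets.UTF_8));
                String fromFile = readFirstLine(file);
                return Pattern.matches(fromFile, "(hello)");
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fileContentsMatchParenthesizedHello()); // prints true
    }
}
```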
I think it should be made absolutely clear that it is the Java compiler
that turns '\\' into '\'. Thus only string constants in code that will
be compiled, i.e. in source code, need to have the overabundance of
'\\'. Everywhere else (external files, memory images, etc), what you
see is what you get.
A SCID could hide this \ quoting goofiness by displaying and editing
strings in two colours, one for literal chars, and one for
representations of unprintable characters. Unicode has special glyphs
for the control chars you could use. Ditto for regex. We have the
hardware. We act as if we had only TTYs to code on.
It would make proofreading 100 times easier. 40% of the difficulty of
regexes comes from the double layer of quoting.
See
http://mindprod.com/projects/regexcomposer.html
http://mindprod.com/projects/regexproofreader.html
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
As others have pointed out, pattern strings obtained by means other than
string literals are not bound by the constraints of string literals
(though they may have their own constraints). Another thing to
consider, though, is how to use a string -- from whatever source -- as a
literal pattern, handling erstwhile metacharacters as normal characters.
It isn't clear to me whether that's what you want, but if it is then
you should look into Pattern.quote().
--
John Bollinger
jobo...@indiana.edu
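For the literal-pattern case, a small sketch of Pattern.quote() might look like this. It plays roughly the role quotemeta plays in Perl. The class QuoteDemo and the method matchesLiterally are illustrative names only, and the hard-coded string stands in for a line read from a file.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Treats 'literal' as plain text rather than as a regex,
    // so metacharacters such as ( and ) lose their special meaning.
    static boolean matchesLiterally(String literal, String candidate) {
        Pattern p = Pattern.compile(Pattern.quote(literal));
        return p.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String fromFile = "(hello)"; // stand-in for input read from a file

        // Unquoted, the parentheses form a group, so the pattern matches hello only.
        System.out.println(Pattern.matches(fromFile, "(hello)")); // prints false
        System.out.println(Pattern.matches(fromFile, "hello"));   // prints true

        // Quoted, the parentheses are ordinary characters.
        System.out.println(matchesLiterally(fromFile, "(hello)")); // prints true
        System.out.println(matchesLiterally(fromFile, "hello"));   // prints false
    }
}
```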