illegal path bug after URI changes?

Jim Hargrave

Apr 8, 2013, 4:49:52 PM
to okapi...@googlegroups.com
50450 C01 v.2.9 CR.xml

Our unit tests (private code) are now failing. I believe this may be caused by the changes Aaron made to fix the spaces-in-path issue. This could be a Unix-only issue, in case you can't reproduce it on Windows.

java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 122:

Jim

Chase Tingley

Apr 8, 2013, 5:08:44 PM
to okapi...@googlegroups.com
Can you share the URI in question?


--
You received this message because you are subscribed to the Google Groups "okapi-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to okapi-devel...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jim Hargrave

Apr 8, 2013, 5:11:30 PM
to okapi...@googlegroups.com, Chase Tingley
The filename below is what broke the code. I renamed the file and the test now passes. It is probably the extra "."s.

Any path with this filename should reproduce the problem:  C:\foo\50450 C01 v.2.9 CR.xml

J

Aaron Madlon-Kay

Apr 9, 2013, 2:46:47 AM
to okapi...@googlegroups.com, Chase Tingley
Hi Jim. Sorry about that!

Can you tell me what test or module it is?

I did test my changes on Windows and OS X, but I wonder if we simply don't have any test files that have extra "."s in them.

-Aaron

Yves Savourel

Apr 9, 2013, 7:31:16 AM
to okapi...@googlegroups.com
Hi Aaron, Jim, all,

This occurs in a private test I think.
But the case is when a document is something like: "C:\foo\50450 C01 v.2.9 CR.xml"
I've tried extracting a test file with such a name using both the XML and the XML Stream filters, and all seems fine.

Jim: We need to know more: what is done to the file (filter, steps).
It may also be an issue with the test helper methods.

-ys

Jim Hargrave

Apr 9, 2013, 9:11:49 AM
to okapi...@googlegroups.com, Yves Savourel
It could be a Unix-only issue. I'll see if I can create a new test case
to reproduce it. Renaming the file fixed the problem in my case.

Jim

Jim Hargrave

Apr 9, 2013, 11:25:18 AM
to okapi...@googlegroups.com, Yves Savourel
Here is a unit test that triggers the problem on my machine:

@Test
public void testToURLtoURI() throws MalformedURLException {
    Util.URLtoURI(new URL("file:/home/jimh/50450 C01 v.2.9 CR.xml"));
}


java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 21: file:/home/jimh/50450 C01 v.2.9 CR.xml
    at net.sf.okapi.common.Util.URLtoURI(Util.java:880)
    at net.sf.okapi.common.UtilTest.testToURLtoURI(UtilTest.java:317)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.net.URISyntaxException: Illegal character in path at index 21: file:/home/jimh/50450 C01 v.2.9 CR.xml
    at java.net.URI$Parser.fail(URI.java:2810)
    at java.net.URI$Parser.checkChars(URI.java:2983)
    at java.net.URI$Parser.parseHierarchical(URI.java:3067)
    at java.net.URI$Parser.parse(URI.java:3015)
    at java.net.URI.<init>(URI.java:577)
    at java.net.URL.toURI(URL.java:918)
    at net.sf.okapi.common.Util.URLtoURI(Util.java:878)
    ... 26 more

Chase Tingley

Apr 9, 2013, 12:32:42 PM
to okapi...@googlegroups.com, Yves Savourel
That test does fail for me as well (OSX).  Note that index 21 is the first whitespace character, not the period.
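To check which character the parser rejects without going through Okapi at all, a minimal sketch using only java.net may help. The class and method names here (UrlToUriRepro, failureIndex) are made up for illustration; the sketch simply reports the index carried by the URISyntaxException.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper class, not part of Okapi.
class UrlToUriRepro {

    // Returns the index reported by URISyntaxException when URL.toURI() fails,
    // or -1 if the conversion succeeds.
    static int failureIndex(String spec) {
        try {
            new URL(spec).toURI();
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        } catch (MalformedURLException e) {
            // The string could not even be parsed as a URL.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String spec = "file:/home/jimh/50450 C01 v.2.9 CR.xml";
        int index = failureIndex(spec);
        if (index >= 0) {
            // Show the offending character in quotes so whitespace is visible.
            System.out.println("index " + index + " -> '" + spec.charAt(index) + "'");
        } else {
            System.out.println("conversion succeeded");
        }
    }
}
```

Counting by hand gives the same answer: "file:/home/jimh/" occupies indexes 0 to 15, "50450" occupies 16 to 20, so index 21 is the space before "C01".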

Yves Savourel

Apr 9, 2013, 12:36:37 PM
to okapi...@googlegroups.com

Fails for me too (Win-7 32bit)

Chase Tingley

Apr 9, 2013, 12:39:48 PM
to okapi...@googlegroups.com
Looking at the javadoc, this seems a little nasty:
The URL class does not itself encode or decode any URL components according to the escaping mechanism defined in RFC2396. It is the responsibility of the caller to encode any fields, which need to be escaped prior to calling URL, and also to decode any escaped fields, that are returned from URL.

This makes it sound like Jim's unit test fails by design of java.net.URL. If that's the case, then the Util.URLtoURI() method that Aaron added (at my suggestion, iirc, to clean up some other code he'd written) shouldn't be used unless the caller can also take responsibility for the URL content itself being escaped.
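As a rough illustration of what "the caller is responsible for escaping" means in practice, the sketch below passes an already-escaped string to java.net.URL. The class and method names are hypothetical. It relies on two documented behaviors: URL.getPath() returns the path exactly as given, while URI.getPath() returns the decoded path.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical demo class, not part of Okapi.
class EscapedUrlDemo {

    // Converts an already-escaped URL string to a URI, wrapping checked exceptions.
    static URI toUri(String escapedSpec) {
        try {
            return new URL(escapedSpec).toURI();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the path exactly as java.net.URL stores it (no decoding is applied).
    static String rawUrlPath(String spec) {
        try {
            return new URL(spec).getPath();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "file:/home/jimh/50450%20C01%20v.2.9%20CR.xml";
        System.out.println(rawUrlPath(escaped));        // still contains %20
        System.out.println(toUri(escaped).getPath());   // decoded, contains spaces
    }
}
```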

ct

Aaron Madlon-Kay

Apr 9, 2013, 10:05:34 PM
to okapi...@googlegroups.com
Hi all.

Chase is on the money.

One of the things I learned while cleaning up for the spaces-in-path issue is that it's very difficult to correctly create a URL or URI from a string that represents a path, because of encoding issues. I would recommend never doing that; instead, if what you have is actually a file path, do:

new File(myString).toURI()

or if you really need a URL (because File.toURL() is deprecated due to unsafe character handling):

new File(myString).toURI().toURL()

When using the Okapi Util methods, you should do:

Util.toURI(myString)

This first tries to directly make a URI, on the off chance you did handle encoding correctly. If that fails, it will fall back to going through File as above.
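The "try a URI first, then fall back to File" behavior described above could look roughly like the sketch below. This is not Okapi's actual Util.toURI implementation; the class PathToUri and the method toUriLenient are hypothetical names, and the details of the real method may differ.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch of the described fallback, not Okapi's code.
class PathToUri {

    static URI toUriLenient(String pathOrUri) {
        try {
            // Succeeds only if the string is already a syntactically valid URI.
            return new URI(pathOrUri);
        } catch (URISyntaxException e) {
            // Otherwise treat the string as a file path; File.toURI() escapes
            // characters such as spaces.
            return new File(pathOrUri).toURI();
        }
    }

    public static void main(String[] args) {
        // A path with spaces is not a valid URI, so the File fallback is used.
        System.out.println(toUriLenient("/home/jimh/50450 C01 v.2.9 CR.xml"));
        // A correctly escaped URI string is accepted directly.
        System.out.println(toUriLenient("file:/home/jimh/50450%20C01%20v.2.9%20CR.xml"));
    }
}
```

One thing to keep in mind with this design: a plain path without any illegal characters parses as a relative URI with no scheme, so callers that need a file: URI in every case may prefer to go through File unconditionally.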

According to my understanding, in practice (if you're doing it right) the only time you will have a URL without having had a URI first is when getting a resource from a class or ClassLoader via getResource(). URLtoURI() really only exists (per Chase's suggestion) to avoid try/catching the URISyntaxException in this case, which is not very common (only 8 references in all of Okapi).
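For the getResource() case, a wrapper of the kind described might look like the sketch below. It is only a guess at the shape suggested by the stack trace earlier in the thread (a RuntimeException wrapping the URISyntaxException); ResourceUris and urlToUri are hypothetical names, not the Okapi source.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical stand-in for a helper like Util.URLtoURI.
class ResourceUris {

    // Converts a URL to a URI, rethrowing the checked exception as unchecked
    // so callers do not need their own try/catch.
    static URI urlToUri(URL url) {
        try {
            return url.toURI();
        } catch (URISyntaxException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Look up this class's own class file as a resource.
        URL resource = ResourceUris.class.getResource("ResourceUris.class");
        if (resource != null) {
            System.out.println(urlToUri(resource));
        } else {
            System.out.println("resource not found");
        }
    }
}
```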

-Aaron

Aaron Madlon-Kay

Apr 9, 2013, 10:10:07 PM
to okapi...@googlegroups.com
I should add that if you really want to create a URL or URI from a string (as I note that Jim's string really is trying to be a URL, not a path), you should pass your string through URLEncoder.encode(myString, "utf-8") before constructing the URL or URI.

-Aaron
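One caveat that may be worth checking before relying on URLEncoder here: it is documented as producing application/x-www-form-urlencoded output, so it turns spaces into "+" and also escapes ":" and "/". Applied to a whole URL string, that changes the separators as well as the filename. The sketch below compares it with the multi-argument java.net.URI constructor, which quotes illegal characters in the path component. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical comparison class, not part of Okapi.
class EncodingComparison {

    // Form-encodes the whole string, including ':' and '/'.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds a file: URI from an absolute path; the constructor quotes
    // characters that are illegal in a URI path, such as spaces.
    static URI fileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("file:/home/jimh/50450 C01 v.2.9 CR.xml"));
        System.out.println(fileUri("/home/jimh/50450 C01 v.2.9 CR.xml"));
    }
}
```

If URLEncoder is used at all, applying it to individual path segments and then replacing "+" with "%20" is one possible workaround, though going through File.toURI() or the URI constructor avoids the issue.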