As a Norwegian I have to take a look at this; the Norwegian å-Å, æ-Æ and ø-Ø characters are often used in names.
Benny
From: rav...@googlegroups.com [mailto:rav...@googlegroups.com] On Behalf Of Ayende Rahien
Sent: 20. mai 2010 22:16
To: rav...@googlegroups.com
Subject: Re: [RavenDB] [Client API] special characters encoding/decoding in queries
This has proven to be surprisingly difficult to solve.
It looks like HttpListenerRequest.QueryString is the bad ****** here. I replaced it with System.Web.HttpUtility.ParseQueryString(Uri.UnescapeDataString(request.Url.Query)); in the HttpListenerRequestAdapter, and the query now looks correct on the server side. But the test still fails.
It looks like I have a fix for this now. I’ll send a pull request soon.
It seems like the server screws things up in the QueryString; I made a temporary fix for this. I have to look into what happens behind the scenes at the web server.
We have a little problem with separators: DocumentStoreServerTests.Can_get_correct_averages_from_map_reduce_index fails in my environment because the server responds with age: 26,5, and this value is interpreted as 265 by the Newtonsoft.Json library.
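To illustrate the separator problem, here is a minimal Python sketch (not RavenDB or Newtonsoft.Json code). It shows that a number written with a decimal comma is no longer a valid JSON number. The helpers `format_decimal` and `age_document` are made up for this illustration, and Python's strict parser rejects the value rather than reading it as 265.

```python
import json


def format_decimal(value: float, decimal_separator: str = ".") -> str:
    """Format a float, optionally swapping in a culture-specific separator."""
    # repr() always uses "." regardless of the operating system locale.
    return repr(value).replace(".", decimal_separator)


def age_document(age: float, decimal_separator: str = ".") -> str:
    """Build a tiny JSON document by hand (hypothetical helper)."""
    return '{"age": ' + format_decimal(age, decimal_separator) + "}"


invariant = age_document(26.5)       # {"age": 26.5}
norwegian = age_document(26.5, ",")  # {"age": 26,5}

# The invariant form parses back to the original number.
print(json.loads(invariant)["age"])

# The decimal-comma form is not valid JSON; a strict parser refuses it.
try:
    json.loads(norwegian)
except json.JSONDecodeError as exc:
    print("rejected:", exc.msg)
```

The usual remedy on the writing side is to format numbers with a fixed, culture-independent separator, so the output does not depend on the machine's regional settings.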
It looks like it works when adding documents from the UI and querying the index.
I tried using curl to add information:
curl -X PUT http://localhost:8080/docs/bob -d "{ Name: 'Bob', HomeState: 'Småland', ObjectType: 'User' }"
curl -X PUT http://localhost:8080/docs/sarah -d "{ Name: 'Sarah', HomeState: 'Illinois', ObjectType: 'User' }"
curl -X PUT http://localhost:8080/docs/paul -d "{ Name: 'Paul', HomeState: 'Småland', ObjectType: 'User' }"
curl -X PUT http://localhost:8080/docs/mary -d "{ Name: 'Mary', HomeState: 'Småland', ObjectType: 'User' }"
Creating the index:
curl -X PUT http://localhost:8080/indexes/usersByHomeState -d "{ Map:'from doc in docs\r\nwhere doc.ObjectType==\"User\"\r\nselect new { doc.HomeState }' }"
Querying the index:
curl -X GET http://localhost:8080/indexes/usersByHomeState?query=HomeState:Småland
This gives us the result:
{
"Results": [],
"IsStale": false,
"TotalResults": 0
}
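For comparison, a small Python sketch of what the query looks like when the client percent-encodes it as UTF-8 before sending. It only builds the URL; whether the server decodes it the same way is a separate question. The URL is the one from the curl command above.

```python
from urllib.parse import quote, unquote

query = "HomeState:Småland"

# quote() encodes to UTF-8 by default; safe="" also escapes ":" and "/".
encoded = quote(query, safe="")
url = "http://localhost:8080/indexes/usersByHomeState?query=" + encoded

print(encoded)  # HomeState%3ASm%C3%A5land
print(url)

# Decoding with the same charset returns the original text.
assert unquote(encoded) == query
```

Note that å becomes the two bytes C3 A5 in UTF-8, so it shows up as two percent escapes.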
Looking at the document from the Raven UI

The Edit Document dialog for Bob shows that Småland was saved with the wrong character:

I changed it to å in the UI, as it should be, and it looks fine both in the list and in the Edit Document dialog:

I don’t like that the list shows us \u00e5, but I can live with that for now, as it is shown correctly in the Edit Document dialog.
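As a side note on the \u00e5: in JSON that escape and the literal å are two spellings of the same string. A short Python sketch, independent of how the Raven UI renders it:

```python
import json

doc = {"HomeState": "Småland"}

# By default json.dumps escapes non-ASCII characters as \uXXXX.
escaped = json.dumps(doc)
# With ensure_ascii=False the character is written as-is.
literal = json.dumps(doc, ensure_ascii=False)

print(escaped)  # {"HomeState": "Sm\u00e5land"}
print(literal)  # {"HomeState": "Småland"}

# Both forms parse back to the same document.
assert json.loads(escaped) == json.loads(literal) == doc
```

So the escaped form is a display matter, not data corruption.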

When I now run the query from curl I get what I want:
curl -X GET http://localhost:8080/indexes/usersByHomeState?query=HomeState%3ASmåland
{
"Results": [
{
"Name": "Bob",
"HomeState": "Småland",
"ObjectType": "User",
"@metadata": {
"Content-Type": "application/x-www-form-urlencoded",
"Last-Modified": "Fri, 21 May 2010 09:46:03 GMT",
"@id": "bob",
"@etag": "37d6aef2-001e-9f31-11df-64bdaace59f8"
}
}
],
"IsStale": false,
"TotalResults": 1
}
Voila, almost perfect! So it looks like I have to take a round with the put responder to fix that.
From: rav...@googlegroups.com [mailto:rav...@googlegroups.com] On Behalf Of Ayende Rahien
Sent: 21. mai 2010 10:12
To: rav...@googlegroups.com
Subject: Re: [RavenDB] [Client API] special characters encoding/decoding in queries
That one I know how to fix, no worries here.
2010/5/21 Asbjørn Ulsberg <asbj...@gmail.com>
The separator issue sounds like an i18n problem. Not sure what culture options you have in Newtonsoft.Json.
-Asbjørn
On Fri, 21 May 2010 02:40:20 +0200, Jan Benny Thomas <jan.t...@lyse.net> wrote:
It looks like I have a fix for this now. I’ll send a pull request soon.
It seems like the server screws things up in the QueryString; I made a
temporary fix for this. I have to look into what happens behind the scenes at
the web server.
We have a little problem with separators:
DocumentStoreServerTests.Can_get_correct_averages_from_map_reduce_index
fails in my environment because the server responds with age: 26,5, and this
value is interpreted as 265 by the Newtonsoft.Json library.
From: rav...@googlegroups.com [mailto:rav...@googlegroups.com] On Behalf Of Ayende Rahien
Sent: 20. mai 2010 22:37
To: rav...@googlegroups.com
Subject: Re: [RavenDB] [Client API] special characters encoding/decoding in
queries
We might need to test this on IIS as well, this is running using the
embedded web server.
On Thu, May 20, 2010 at 11:36 PM, Ayende Rahien <aye...@ayende.com> wrote:
Run the test, you'll be able to see what is going on.
The received string is located in Index.Query(IndexQuery query), in the
query.Query property.
On Thu, May 20, 2010 at 11:31 PM, Jan Benny Thomas <jan.t...@lyse.net> wrote:
As a Norwegian I have to take a look at this; the Norwegian å-Å, æ-Æ
and ø-Ø characters are often used in names.
Benny
From: rav...@googlegroups.com [mailto:rav...@googlegroups.com] On Behalf Of Ayende Rahien
Sent: 20. mai 2010 22:16
To:
Subject: Re: [RavenDB] [Client API] special characters encoding/decoding in queries
This has proven to be surprisingly difficult to solve.
The failing test can be found here:
DocumentStoreServerTests.Can_query_using_special_characters
The problem is that I don't understand how the query string is being parsed.
More to the point, everything that I tried (Uri.EscapeDataString,
Uri.EscapeUriString, HttpUtility.UrlEncode) and with all the encoding that I
tried, it doesn't work.
Any idea how to generate the working string from .NET?
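One common way for this kind of failure to appear is that the two sides disagree on the charset behind the percent escapes. Whether that is what happens in the failing test is an assumption; the Python sketch below only shows the effect of such a mismatch.

```python
from urllib.parse import quote, unquote

value = "Småland"

# The same text percent-encoded from two different byte encodings.
as_utf8 = quote(value, encoding="utf-8")      # Sm%C3%A5land
as_latin1 = quote(value, encoding="latin-1")  # Sm%E5land

# Decoding UTF-8 escapes as Latin-1 yields two wrong characters.
mojibake = unquote(as_utf8, encoding="latin-1")

# Decoding Latin-1 escapes as UTF-8 hits an invalid byte sequence;
# unquote() replaces it with U+FFFD by default.
replaced = unquote(as_latin1, encoding="utf-8")

print(as_utf8, as_latin1)
print(mojibake)
print(replaced)

# Only a matching encode/decode pair round-trips.
assert unquote(as_utf8, encoding="utf-8") == value
assert unquote(as_latin1, encoding="latin-1") == value
```

If the query arrives at Index.Query with characters like these, comparing the raw request bytes against both encodings is a quick way to see which side is off.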
On Thu, May 20, 2010 at 8:26 PM, Ayende Rahien <aye...@ayende.com> wrote:
Hm, I'll add a test case for that, I am doing something bad in the encoding,
that I already knew..
On Thu, May 20, 2010 at 8:23 PM, styx31 <thomas...@gmail.com> wrote:
It’s still an issue when using curl.
It looks like it is a curl issue. Raven's implementation defaults the encoding to UTF-8, but curl sends the data as UTF-7.
If we demand the content to be encoded as UTF-8 we should be safe.
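A rough Python sketch of the "demand UTF-8" idea: decode the request body with the charset named in Content-Type, fall back to UTF-8 when none is given, and fail loudly on bytes that do not fit. The helpers `charset_of` and `decode_body` are hypothetical names, not part of Raven.

```python
from email.message import Message


def charset_of(content_type: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def decode_body(raw: bytes, content_type: str) -> str:
    """Decode a request body strictly; invalid bytes raise UnicodeDecodeError."""
    return raw.decode(charset_of(content_type), errors="strict")


# A UTF-8 body with no declared charset decodes under the UTF-8 default.
print(decode_body("Småland".encode("utf-8"), "application/x-www-form-urlencoded"))

# A body in another encoding without a charset is rejected, not silently mangled.
try:
    decode_body("Småland".encode("latin-1"), "application/x-www-form-urlencoded")
except UnicodeDecodeError:
    print("rejected: body is not valid UTF-8")
```

Rejecting the request is stricter than guessing, but it keeps wrong characters from being stored in the document.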
I could only see application/x-www-form-urlencoded when we use curl, so we have to do a UrlDecode on the input stream. Using UTF-7 seems to work as well.
From: rav...@googlegroups.com [mailto:rav...@googlegroups.com] On Behalf Of Ayende Rahien
Sent: 22. mai 2010 01:02
It looks like the HttpListener doesn’t automatically do the URL decoding on our behalf, as it should.
It seems that the HttpListener uses the raw stream instead of the unescaped data.
The UnescapeRequestUrl property indicates whether HttpListener uses the raw unescaped URI instead of the converted URI, where any percent-encoded values are converted and other normalization steps are taken. The default value is true. It looks like setting this to false changes things.
It worked and gave the correct result for the curl PUT operation, but it made everything else I did fail…
All good now… almost. Everything looks in line with my recent attempts.
The remaining case is:
curl -X GET http://localhost:8080/indexes/Raven/DocumentsByEntityName?query=&start=0&pageSize=10&cutOff=2010-05-22T11:47:21.9098944+02:00
What happens here is that the + sign gets stripped in the QueryString handling.
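The + behaviour can be reproduced outside Raven. In form-style query string parsing a bare + is read as a space, so the time zone offset in cutOff only survives if the client sends it as %2B. A small Python sketch using the cutOff value from the command above:

```python
from urllib.parse import parse_qs, urlencode

cut_off = "2010-05-22T11:47:21.9098944+02:00"

# Sent unescaped, as in the curl command: the parser turns "+" into a space.
raw_query = "cutOff=" + cut_off
print(parse_qs(raw_query)["cutOff"][0])

# urlencode() escapes "+" as %2B (and ":" as %3A), so the value round-trips.
safe_query = urlencode({"cutOff": cut_off})
print(safe_query)

assert parse_qs(safe_query)["cutOff"][0] == cut_off
```

With curl, the equivalent is to percent-encode the value before putting it in the URL, since curl sends the URL as given.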
Yes, been there, done that! The test works, but the curl attempt gives us an error. If we can live with that, there is no problem.
Funnily enough, the + turning into a space is a long-running browser compat feature, as space encoded as + hasn’t been in any spec for a long time. It’s not even an accepted character, so its presence itself is a bug. :)
I have read about that. I thought that curl would URL-encode it, but it didn’t.
So it is not a problem.