lexical error: invalid character inside string


Thomas Bock

Jan 29, 2013, 12:04:56 PM
to us...@couchdb.apache.org

Dear List,

I cannot figure out what's wrong here. First, this:

 curl -X PUT --data '{"a": "\r"}' --header "Content-Type: application/json" http://localhost:5984/r_test/ab7a
 
works. However, if I use RCurl to send
{\n \"a\": \"\r\" \n}
              ^
I get

[error] [<0.8201.0>] attempted upload of invalid JSON (set log_level to debug to log it)
[debug] [<0.8201.0>] Invalid JSON: {{error,
                                     {9,
                                      "lexical error: invalid character inside string.\n"}},
                                    <<"{\n \"a\": \"\r\" \n}">>}

but with {\n \"a\": \"\\r\" \n}
                      ^^
the same way (RCurl) works fine. In Futon I can enter "\r" as well.
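
The 9 in the Erlang error tuple above is consistent with the zero-based offset of the raw carriage return in the request body. That reading is an inference from the log, not something the thread confirms. A minimal Python sketch, using a hypothetical helper name, that finds the first unescaped control character inside a JSON string literal:

```python
from typing import Optional


def first_control_char_in_string(body: str) -> Optional[int]:
    """Return the offset of the first raw control character (< 0x20)
    found inside a JSON string literal, or None if there is none.

    Hypothetical helper for illustration; it is not part of CouchDB or RCurl.
    """
    in_string = False
    escaped = False
    for i, ch in enumerate(body):
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is consumed
            elif ch == "\\":
                escaped = True       # start of an escape sequence
            elif ch == '"':
                in_string = False    # closing quote
            elif ord(ch) < 0x20:
                return i             # raw control character: invalid JSON
        elif ch == '"':
            in_string = True         # opening quote
    return None


# Body with a real carriage return (what RCurl sent in the failing case)
print(first_control_char_in_string('{\n "a": "\r" \n}'))    # 9
# Body with backslash + r (the working case)
print(first_control_char_in_string('{\n "a": "\\r" \n}'))   # None
```

The newlines outside the string are plain JSON whitespace and are not reported; only the carriage return between the quotes is.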

Maybe someone has some hints (general ones are welcome too)?

Regards
Thomas



Here is some more info:

\\r is working:
## > PUT /r_test/e74e4233746b1ef5605751489d010d54 HTTP/1.1
## Host: localhost:5984
## Accept: */*
## Content-Type: application/json
## Content-Length: 16
##
## * upload completely sent off: 16 out of 16 bytes
## < HTTP/1.1 201 Created
## < Server: CouchDB/1.2.0 (Erlang OTP/R15B02)
## < Location: http://localhost:5984/r_test/e74e4233746b1ef5605751489d010d54
## < ETag: "1-5efc54426509641a0a109a756b96eaa7"
## < Date: Tue, 29 Jan 2013 16:19:18 GMT
## < Content-Type: text/plain; charset=utf-8
## < Content-Length: 95
## < Cache-Control: must-revalidate
## <
## * Connection #0 to host localhost left intact
## >

\r is not working:
## > PUT /r_test/e74e4233746b1ef5605751489d010e23 HTTP/1.1
## Host: localhost:5984
## Accept: */*
## Content-Type: application/json
## Content-Length: 14
##
## * upload completely sent off: 14 out of 14 bytes
## < HTTP/1.1 400 Bad Request
## < Server: CouchDB/1.2.0 (Erlang OTP/R15B02)
## < Date: Tue, 29 Jan 2013 16:20:54 GMT
## < Content-Type: text/plain; charset=utf-8
## < Content-Length: 48
## < Cache-Control: must-revalidate
## <
## * Connection #0 to host localhost left intact
## Error in cdbAddDoc(ccc) (from #37) :  bad_request
## >

Jens Alfke

Jan 29, 2013, 12:46:57 PM
to us...@couchdb.apache.org

On Jan 29, 2013, at 9:04 AM, Thomas Bock <thste...@web.de> wrote:

> curl -X PUT --data '{"a": "\r"}' --header "Content-Type: application/json" http://localhost:5984/r_test/ab7a

I think JSON does not allow literal returns/newlines (or other control characters) inside strings; they have to be backslash-escaped. It’s hard to follow exactly what’s going on through all the levels of quoting and unquoting that happen in the shell and in the Erlang output, but I think you haven’t escaped that \r enough — it probably needs to be "\\\r".
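
The rule about control characters can be checked with any strict JSON parser. A minimal sketch using Python's standard `json` module (not the parser CouchDB uses, so this only illustrates the JSON rule itself); the `parses` helper is a hypothetical name:

```python
import json

raw_cr = '{"a": "\r"}'       # Python turns \r into a real carriage return (0x0D)
escaped_cr = '{"a": "\\r"}'  # backslash followed by r: the JSON escape sequence


def parses(body: str) -> bool:
    """Report whether the body is accepted by a strict JSON parser."""
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False


print(parses(raw_cr))       # False: raw control character inside a string
print(parses(escaped_cr))   # True
print(json.loads(escaped_cr)["a"] == "\r")  # True: decodes to a carriage return
```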

FYI, as a side note, I’ve found the ‘httpie’[1] utility a lifesaver when talking to CouchDB (and other REST/JSON APIs) from the command line. It’s like a souped-up curl with a much clearer syntax for setting query parameters, and the ability to easily specify a JSON body. For example, I can run your same command as:

http PUT :5984/r_test/ab7a a='\r'

—Jens

[1]: https://github.com/jkbr/httpie

Thomas Bock

Jan 30, 2013, 2:47:14 AM
to us...@couchdb.apache.org

Thank you for the answer!

> >  curl -X PUT --data '{"a": "\r"}' --header "Content-Type: application/json"
> http://localhost:5984/r_test/ab7a
>
> I think JSON does not allow literal returns/newlines (or other
> control characters) inside strings; they have to be backslash-

 But the curl line works, and in Futon I can also type in a simple \r

> escaped. It’s hard to follow exactly what’s going on through all the
> levels of quoting and unquoting that happen in the shell and in the
> Erlang output, but I think you haven’t escaped that \r enough — it
> probably needs to be "\\\r".

 This ("\\\r") doesn't work; "\\r" works. Since \r is the terminator
character a gauge expects for a measurement value query (it doesn't
accept any other), I need a reliable way to send it.

Paul Davis

Jan 30, 2013, 3:17:40 AM
to us...@couchdb.apache.org
As you specifically showed in your original email sending the two
bytes 0x5C72 (\r) from curl works correctly. Your R environment
appears to be doing various types of unescaping before sending the
data along to CouchDB which is what's causing you issues.

From what you've shown, the three bytes 0x5C5C72 (\\r) work while the
two bytes 0x5C72 (\r) and the four bytes 0x5C5C5C72 (\\\r) do not.
Most likely what is happening here is that R is unescaping your input.
Thus the three-byte example 0x5C5C72 (\\r) is unescaped internally to
the two-byte 0x5C72 (\r) sequence, whereas 0x5C72 (\r) and 0x5C5C5C72
(\\\r) are unescaped to 0x0D (carriage return) and 0x5C0D (\ and a
carriage return) respectively.
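
The byte sequences above can be reproduced with a short sketch. It uses Python's `unicode_escape` codec as a stand-in for one round of string-literal unescaping; that R behaves the same way here is an assumption taken from the explanation above, and `unescape_once` is a hypothetical helper name.

```python
def unescape_once(text: str) -> bytes:
    """Apply one round of backslash unescaping and return the resulting bytes.

    Stand-in for what a string-literal parser does; not R's actual code.
    """
    return text.encode("ascii").decode("unicode_escape").encode("latin-1")


# Raw strings: what is typed, character by character
typed = [r"\r", r"\\r", r"\\\r"]

for t in typed:
    print(t.encode("ascii").hex(), "->", unescape_once(t).hex())
# 5c72 -> 0d
# 5c5c72 -> 5c72
# 5c5c5c72 -> 5c0d
```

Only the middle case leaves the two bytes 0x5C72 on the wire, which is the valid JSON escape for a carriage return.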

Basically, you need to work on your R un|escaping semantics to make
sure you're sending the right byte sequences to CouchDB. The easiest
way to accomplish this is to use a proper JSON library instead of
building JSON objects by hand.
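
As an illustration of that advice, here is a minimal sketch with Python's standard `json` module (the thread itself uses R, so this only shows the principle): a conforming serializer emits the two bytes 0x5C72 for a carriage return, and the document survives a round trip.

```python
import json

doc = {"a": "\r"}            # value is a real carriage return (0x0D)

body = json.dumps(doc)       # serializer escapes the control character
print(body)                  # {"a": "\r"}  (backslash + r, no raw 0x0D)
print("\r" in body)          # False: no raw carriage return in the payload
print(json.loads(body) == doc)  # True: round trip restores the carriage return
```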

HTH

Thomas Bock

Jan 30, 2013, 6:01:39 AM
to us...@couchdb.apache.org

Thank you Paul!

I found a solution that I'm not happy with
(see the last part of the README at https://github.com/wactbprot/R4CouchDB).

The chain was this:

1. put \r somehow into the db
2. get it into an R list with RJSONIO (results in e.g. a="\r")
3. do stuff
4. convert the list to JSON with RJSONIO (results in e.g. \"a\":\"\r\")
5. send it back to the db with RCurl
6. receive an error

Between steps 4 and 5 I now replace \r with \\r, resulting in a \r in
the database document.
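
A rough Python sketch of that replacement step, for readers who want to see its effect on the bytes. It is not the R4CouchDB code; `escape_raw_cr` is a hypothetical name, and the sketch assumes carriage returns occur only inside string values.

```python
import json


def escape_raw_cr(body: str) -> str:
    """Replace each raw carriage return with the two-character escape \\r.

    Assumption: carriage returns appear only inside JSON string values.
    A CR used as whitespace between tokens (e.g. CRLF line endings)
    would be turned into invalid JSON by this replacement.
    """
    return body.replace("\r", "\\r")


broken = '{"a":"\r"}'        # serializer output containing a raw 0x0D
fixed = escape_raw_cr(broken)

print(fixed.encode("ascii").hex())      # contains 5c72 instead of 0d
print(json.loads(fixed) == {"a": "\r"})  # True
```

Other raw control characters (tab, newline inside a value) would need the same treatment, which is one reason an escaping-aware serializer is the more reliable fix.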

...

> As you specifically showed in your original email sending the two
> bytes 0x5C72 (\r) from curl works correctly. Your R environment
> appears to be doing various types of unescaping before sending the
> data along to CouchDB which is what's causing you issues.
>
> From what you've shown, the three bytes 0x5C5C72 (\\r) work while the
> two bytes 0x5C72 (\r) and the four bytes 0x5C5C5C72 (\\\r) do not.
> Most likely what is happening here is that R is unescaping your input.
> Thus the three-byte example 0x5C5C72 (\\r) is unescaped internally to
> the two-byte 0x5C72 (\r) sequence, whereas 0x5C72 (\r) and 0x5C5C5C72
> (\\\r) are unescaped to 0x0D (carriage return) and 0x5C0D (\ and a
> carriage return) respectively.
>
> Basically, you need to work on your R un|escaping semantics to make
> sure you're sending the right byte sequences to CouchDB. The easiest
> way to accomplish this is to use a proper JSON library instead of
> building JSON objects by hand.

I wrote an email to Duncan, the author of RJSONIO;

if anyone is interested in this issue I can | it ...

>
> HTH

it does

regards

Thomas