Is there any JSON parser that can parse UTF-8 JSON text?

Alex

Jul 14, 2010, 9:43:22 AM
to Google Web Toolkit
I tried Gson, but it gives me an error.
This is the code I used:

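import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

public class Test {
    private static final String charEncoding = "UTF-8";
    private static final String fileName = "c:\\test.txt";

    public static void main(String args[]) {
        try {
            File file = new File(fileName);
            if (file.canRead()) {
                // Decode the file's bytes as UTF-8 and hand the characters to Gson.
                FileInputStream inStream = new FileInputStream(file);
                InputStreamReader reader = new InputStreamReader(inStream, charEncoding);
                JsonParser parser = new JsonParser();
                JsonElement jsonA = parser.parse(new BufferedReader(reader));
                System.out.println(jsonA.isJsonArray());
                System.out.println(jsonA.toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}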

The jar from json.org doesn't work either.

So is there any JSON parser that can parse UTF-8 JSON text (Japanese
and Chinese characters in particular)?

Jaroslav Záruba

Jul 14, 2010, 9:56:00 AM
to google-we...@googlegroups.com
Have you tried these?
for server-side (GAE) com.google.appengine.repackaged.org.json.JSONObject.JSONObject(String arg0)
for client com.google.gwt.json.client.JSONParser

I haven't tried Japanese or Chinese characters though, only text like this: "Příliš žluťoučký kuň úpěl ďábělské ódy"


--
You received this message because you are subscribed to the Google Groups "Google Web Toolkit" group.
To post to this group, send email to google-we...@googlegroups.com.
To unsubscribe from this group, send email to google-web-tool...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-web-toolkit?hl=en.


lineman78

Jul 14, 2010, 2:39:40 PM
to Google Web Toolkit
On the client side, I use overlay types via the JavaScript JSON.parse
method. It is available in newer browsers; for older browsers you need
to include json2.js (eval is also an alternative). For the server
side, I suggest using the Jersey project with a JSON context provider,
as shown here:

http://blogs.sun.com/enterprisetechtips/entry/configuring_json_for_restful_web

Alex

Jul 15, 2010, 2:59:04 AM
to Google Web Toolkit
Can I still use
com.google.appengine.repackaged.org.json.JSONObject.JSONObject(String
arg0) if I don't deploy to GAE? Where can I get the jar file?

Alex

Jul 15, 2010, 3:05:45 AM
to Google Web Toolkit
For the client side, I have always used
com.google.gwt.json.client.JSONParser without any problem.

My problem is on the server side.

I'm not trying to convert the JSON string into a Java class. I just
need to send the string representation of a JSON object, extracted
from a JSON array, to the client side.

So I have a txt file that contains a JSON array, and the array holds
many JSON objects. I just need to randomly pick any one of the JSON
objects in that array and send it to the client app.

Is the Jersey project able to do that?

lineman78

Jul 15, 2010, 12:49:00 PM
to Google Web Toolkit
Jersey is a project for implementing REST services, but it would
probably be overkill for your situation. You could simply write a
servlet with a doGet method if you are handling any marshaling
yourself. Jersey lets you expose methods at specific URLs, and any
marshaling/unmarshaling is handled under the covers by components
called providers. It is intended to be used with JAXB: you write an
XSD, generate POJOs from it with JAXB, and Jersey serializes them to
XML. Custom providers can also serialize to JSON, or you can write
custom message body writers for your own protocol. For the simple
case you are talking about, I would just write a class that extends
HttpServlet, overrides the doGet method, and writes to the response
directly.

http://www.javafaq.nu/java-example-code-1022.html

Alex

Jul 16, 2010, 4:59:17 AM
to Google Web Toolkit
Yup, I'm using the doGet method, but I need a parser to extract the
JSON object out of the JSON array (a string). Something like:
JSONObject jsonObj=Parser.parse(some json array string).get(some
integer);
String resData=jsonObj.toString();

I just need a parser that can parse JSON that contains UTF-8 text.

Thomas Broyer

Jul 16, 2010, 6:25:16 AM
to Google Web Toolkit
If the parser accepts a String (or a Reader), then character encoding
doesn't come into play, so there's no "UTF-8 stuff". The "UTF-8
stuff" only matters when reading bytes into characters.

That being said, I don't have a solution to your problem (which you
haven't explicitly stated: you have an error, or a problem, but which
one? What's the actual result? What's the expected result?); I can
only say that the JSON parser probably isn't wrong.
Also, make sure you send the result back to the client in UTF-8, i.e.
on your HttpServletResponse, call setContentType("application/json;
charset=UTF-8") before calling getWriter().
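
To illustrate the point about bytes versus characters, here is a minimal sketch that uses only the JDK. The class name DecodeDemo and its decode helper are made up for this illustration and do not come from the thread. It only shows that a charset matters at the boundary between bytes and characters, not once the text is already a String or a Reader.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
final class DecodeDemo {

    /** Turns raw bytes into characters using the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+5FC3 is the Chinese character that appears later in the thread.
        String original = "\u5FC3";

        // Encoding: this one character becomes three bytes in UTF-8.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);

        // Decoding with the matching charset restores the character.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original));

        // Decoding with the wrong charset yields one character per byte.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length());
    }
}
```

A parser that receives the decoded String never sees the bytes at all, which is why the choice of charset has to be made where the file or the HTTP response is read or written.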

Jaroslav Záruba

Jul 16, 2010, 6:43:40 AM
to google-we...@googlegroups.com
...and my favorite one (because I've done it a couple of times): don't write() character data, print() it.

Thomas Broyer

Jul 16, 2010, 8:50:11 AM
to Google Web Toolkit
Can you elaborate?

The Sun JDK's PrintWriter#print() ultimately calls #write() (which in
turn delegates to write() of the wrapped Writer). The only difference
I can see is that print(null) outputs "null" whereas write(null)
throws a NullPointerException.

Jaroslav Záruba

Jul 16, 2010, 9:08:32 AM
to google-we...@googlegroups.com
Oh, sorry, I meant printing to the OutputStream instead of writing.

Alex

Jul 16, 2010, 10:42:36 AM
to Google Web Toolkit
Ermm, this is the situation.
I have a txt file that contains:

[
    {
        "cmdType":"G",
        "row":0,
        "col":2,
        "ans":"心",
        "cmd":"C",
        "qNum":19
    },
    {
        "cmdType":"G",
        "row":1,
        "col":1,
        "ans":"心",
        "cmd":"C"
    }
]

I need to parse that, get one of the JSON objects inside that JSON
array, and send that JSON object as a string to the client. The code
that I posted in the first post only works when there are no Chinese
characters in the file.

This is the exception I get when the file has Chinese characters in it:
Exception in thread "main" com.google.gson.JsonParseException: Failed parsing JSON source: java.io.BufferedReader@530daa to Json
    at com.google.gson.JsonParser.parse(JsonParser.java:57)
    at xwp.server.Test.main(Test.java:23)
Caused by: com.google.gson.TokenMgrError: Lexical error at line 1, column 1. Encountered: "\ufeff" (65279), after : ""
    at com.google.gson.JsonParserJavaccTokenManager.getNextToken(JsonParserJavaccTokenManager.java:1193)
    at com.google.gson.JsonParserJavacc.jj_ntk(JsonParserJavacc.java:635)
    at com.google.gson.JsonParserJavacc.parse(JsonParserJavacc.java:10)
    at com.google.gson.JsonParser.parse(JsonParser.java:54)
    ... 1 more

I would like to know how to get this
{
    "cmdType":"G",
    "row":1,
    "col":1,
    "ans":"心",
    "cmd":"C"
}

out of the JSON array and send it as a string to the client.
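
For the extraction asked about here, a JSON library is the natural tool. With Gson, for example, the parsed JsonElement can be viewed as an array through getAsJsonArray() and indexed with get(int). Where only the raw text of one element is needed, the following library-free sketch may be enough. The class JsonArraySplitter and its methods are hypothetical names invented for this illustration. It assumes the input is already a well-formed JSON array held in a String, and it does not validate anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the original thread and not a validating parser.
final class JsonArraySplitter {

    /** Returns the raw text of each top-level element of a well-formed JSON array. */
    static List<String> splitTopLevel(String json) {
        List<String> elements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting level of [] and {}
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                current.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '[' || c == '{') {
                depth++;
                if (depth == 1) {
                    continue; // opening bracket of the outer array
                }
            } else if (c == ']' || c == '}') {
                depth--;
                if (depth == 0) {
                    flush(current, elements); // closing bracket of the outer array
                    continue;
                }
            } else if (c == ',' && depth == 1) {
                flush(current, elements); // separator between top-level elements
                continue;
            }
            if (depth >= 1) {
                current.append(c);
            }
        }
        return elements;
    }

    private static void flush(StringBuilder current, List<String> elements) {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            elements.add(text);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String json = "[{\"row\":0,\"ans\":\"\u5FC3\"},{\"row\":1,\"ans\":\"\u5FC3\"}]";
        List<String> parts = splitTopLevel(json);
        // Pick one element at random, as described earlier in the thread.
        System.out.println(parts.get(new Random().nextInt(parts.size())));
    }
}
```

Because the sketch works on characters, it treats Chinese or Japanese text like any other character data. Anything before the opening bracket, such as a stray U+FEFF, is ignored rather than reported.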

Thomas Broyer

Jul 16, 2010, 11:38:46 AM
to Google Web Toolkit
Which means that either:
- You're giving the BOM to Gson as the very first character, which is
a mistake (you should skip it if there's one in your file; this is
something that unfortunately Java won't do for you [1,2]), or
- You have a zero-width no-break space [3] in your JSON (but because
the error points to line 1, column 1, I'd rather say it's the BOM).

[1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058
[2] http://stackoverflow.com/questions/1835430/byte-order-mark-screws-up-file-reading-in-java
[3] http://www.fileformat.info/info/unicode/char/feff/index.htm
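
One way to skip the BOM, sketched with JDK classes only: wrap the Reader in a PushbackReader, read one character, and push it back unless it is U+FEFF. The class name BomSkipper and its methods are hypothetical names chosen for this illustration. The sketch assumes the bytes are decoded as UTF-8, as in the code from the first post.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original thread.
final class BomSkipper {
    private static final int BOM = 0xFEFF;

    /** Returns a reader positioned after a single leading U+FEFF, if one is present. */
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader pushback = new PushbackReader(in, 1);
        int first = pushback.read();
        if (first != -1 && first != BOM) {
            pushback.unread(first); // not a BOM, so put the character back
        }
        return pushback;
    }

    /** Reads everything that is left in the reader into a String. */
    static String readAll(Reader in) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = in.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulates a UTF-8 file that starts with a BOM, like the one in the stack trace.
        byte[] bytes = "\uFEFF[{\"ans\":\"\u5FC3\"}]".getBytes(StandardCharsets.UTF_8);
        Reader decoded = new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        Reader withoutBom = skipBom(decoded);
        // The reader now starts at the opening bracket and could be handed to a JSON parser.
        System.out.println(readAll(withoutBom));
    }
}
```

In the code from the first post, the Reader returned by skipBom would presumably take the place of the BufferedReader passed to parser.parse(...).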

Alex

Jul 17, 2010, 12:06:24 AM
to Google Web Toolkit