How to create a utility method that serializes the request payload and deserializes the response payload

Cường Nguyễn Mạnh

Oct 8, 2024, 11:17:31 AM
to GWT Users
I have read a lot of documentation and many tutorials, but I still cannot get this to work.

My task is to serialize the request body and deserialize the response body without starting a server or a UI. I act only as the client.

All I have is a set of .jar files.

d.png

Please, I would appreciate any help.

Colin Alworth

Oct 8, 2024, 11:29:50 AM
to GWT Users
In theory what you're describing is possible with just the RemoteService interface, the corresponding <hashname>.gwt.rpc policy file, and each of the classes referenced by that policy file - the com.google.gwt.user.client.rpc.impl.AbstractSerializationStream type hierarchy should have the pieces you would need to achieve this.

In practice, though, some JSNI is required (or at least the embedded Rhino library, to deal with some of the specifics), and there are no built-in utilities to make this easy. Can you elaborate on what you're trying to achieve? There might be a simpler way to populate some simple requests and read some responses without either a server or a client process running.

There are also a few non-Java implementations out there that could help (again, for simple payloads).

I forked GWT-RPC some years ago and rewrote it without reflection or generators - one of my other main goals was to allow arbitrary other clients, and to make the wire format fully symmetrical. It has endpoints for Android/JVM clients, websocket clients, and webworker clients. It cannot do 100% of what the original GWT-RPC can do, but it is in use in production and has been for some years, targeting this specific subset of functionality. I'm not sure if changing your RPC implementation is possible, but if it is, consider https://github.com/Vertispan/gwt-rpc/. There are also closed-source C++ and C# clients - obviously I cannot share that code, but I mention it to make the point that this fork is intended to be much friendlier to alternative implementations.

Cường Nguyễn Mạnh

Oct 8, 2024, 1:03:30 PM
to google-we...@googlegroups.com
Thank you for the detailed response.

I want to crawl data from a public website. I opened devtools and saw that it uses GWT RPC.

This is the request body I saw:

7|0|10|https://a.b.c.d/e|5C6CDB13D0FD25B266F3C36FA7FF6ED9|a1.a2.a3.DataService|getCourseMembers|java.lang.Long/4227064769|java.lang.String/2004016611|java.util.List|20204524|java.util.Arrays$ArrayList/2507071751|20241|1|2|3|4|3|5|6|7|5|TXbrzIAAA|8|9|1|6|10|

As you can see, the syntax is not a problem: I understand it roughly, and I know the method is getCourseMembers. I want to build a function that returns the body above, like:
                  public static String getBodyEncoded(String methodName, ... String methodBody ...) or something similar, which returns the body above to send to the server.
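
A minimal Python sketch of such a builder follows. It assumes, from reading the body above rather than from anything the site confirms, that the request is laid out as version|flags|string count|strings...|payload values...| with a trailing '|'. The name build_request_body and its parameters are hypothetical, and the sketch refuses strings that themselves contain '|' instead of trying to escape them.

```python
def build_request_body(strings, payload, version=7, flags=0):
    """Join a string table and payload tokens into a '|'-delimited request body.

    Hypothetical helper: the layout is inferred from the captured request,
    not taken from any GWT API.
    """
    for s in strings:
        if "|" in s:
            # Escaping is deliberately left out of this sketch.
            raise ValueError("strings containing '|' are not handled here")
    tokens = [version, flags, len(strings), *strings, *payload]
    return "|".join(str(t) for t in tokens) + "|"


# String table and payload copied from the captured request above.
strings = [
    "https://a.b.c.d/e",
    "5C6CDB13D0FD25B266F3C36FA7FF6ED9",
    "a1.a2.a3.DataService",
    "getCourseMembers",
    "java.lang.Long/4227064769",
    "java.lang.String/2004016611",
    "java.util.List",
    "20204524",
    "java.util.Arrays$ArrayList/2507071751",
    "20241",
]
payload = [1, 2, 3, 4, 3, 5, 6, 7, 5, "TXbrzIAAA", 8, 9, 1, 6, 10]

body = build_request_body(strings, payload)
print(body)
```

To change a parameter, replace the corresponding entry in strings (for example "20204524") and rebuild the body. What each value means on the server side is something to confirm against what the UI shows.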

I also want to understand the last part of the request syntax:
                 1|2|3|4|1|5|6|7|7|8|7|9|7|10|7|11|7|12|7|13|7|14|

Next is the response body, which is the real problem. The response is very long, so I put it in the attached files.

I saw a JSON array with more than 2000 elements, and I cannot tell what they are. The only thing I understand is the 2042nd element, which contains an unordered list. Maybe some of the elements before it contain data about the order.

I want to build a method to extract/deserialize this response.

I am a newbie. If my question can be answered, could you please guide me in more detail?
Java is preferred, but other languages are acceptable; I can still deploy them.

response.png
response2.png

Colin Alworth

Oct 8, 2024, 1:40:08 PM
to GWT Users
I'd suggest reading the stream reader/writer subtypes of AbstractSerializationStream to understand what all of the values are for - in short, a gwt-rpc response is a payload and a string table, and the payload's elements will reference the string table. You cannot know for certain what the structure is without seeing the original Java types being serialized, but often you can make good guesses.

I'd also suggest reading stackoverflow posts and the like showing how to deserialize other payloads just from context - here's a post that breaks down a payload to understand its contents: https://stackoverflow.com/questions/35047102/serializing-rpc-gwt/35047887#35047887


In short though, your response value is _probably_ a List of CourseMember types - knowing that class will help you. I can't easily guess more, though. As the above doc says, the JSON array is read backwards, so the important details would be right before and after the string array - you have 1,7,2,1[...strings...] in the second image. From that I can say:

1: if this were zero, it would be null; since it is a positive number, read the (value - 1) entry from the string table, which is ArrayList, so: read a value of type ArrayList from the stream
7: the ArrayList has 7 items
2: first item in the ArrayList - as above, if this were 0, it would be null; since it is positive, read the (value - 1) entry from the string table and decode that type, so: read a CourseMember object from the payload
1: this is _probably_ the number 1 in the first field of the first CourseMember.
...
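
The reading steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a general decoder: the sample response, the example.CourseMember type and its two fields (an int id and a String name) are all made up, and the sketch assumes the text after //OK parses as plain JSON, which may not hold for every real response.

```python
import json


class ResponseReader:
    """Reads a GWT-RPC style response backwards, starting just before the string table."""

    def __init__(self, body):
        if not body.startswith("//OK"):
            raise ValueError("not a successful response")
        data = json.loads(body[len("//OK"):])
        # Assumed tail of the array: [..., [string table], flags, version]
        self.version = data[-1]
        self.flags = data[-2]
        self.strings = data[-3]
        self.values = data[:-3]
        self.pos = len(self.values)

    def read_raw(self):
        if self.pos == 0:
            raise ValueError("no more values to read")
        self.pos -= 1
        return self.values[self.pos]

    def read_string(self):
        # 0 means null; a positive n refers to string table entry n - 1.
        index = self.read_raw()
        return None if index == 0 else self.strings[index - 1]


def read_member(reader):
    """Hypothetical CourseMember layout: type name, int id, String name."""
    type_name = reader.read_string()
    if type_name is None:
        return None
    return {"type": type_name, "id": reader.read_raw(), "name": reader.read_string()}


def read_list(reader, read_item):
    type_name = reader.read_string()  # e.g. the ArrayList entry
    if type_name is None:
        return None
    size = reader.read_raw()
    return [read_item(reader) for _ in range(size)]


# Made-up response: a list of two members, stored in reverse order of writing.
SAMPLE = ('//OK[4,11,2,3,10,2,2,1,'
          '["java.util.ArrayList","example.CourseMember","Alice","Bob"],0,7]')

members = read_list(ResponseReader(SAMPLE), read_member)
print(members)
```

For a real payload, read_member has to be rewritten field by field to match the actual class, which is where comparing guesses against the data shown in the UI helps.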

A parser continuing in this way, with knowledge of the structure of these types, could be written to decode this payload. I don't know of an off-the-shelf tool that will do it for you in a truly automated way, but I could consult to write one, or guide your project in implementing one by hand.

Since you're scraping anyway, consider just scraping the results of the rendered page? This will likely take substantially more CPU time, but ridiculously less developer time to implement.

Craig Mitchell

Oct 9, 2024, 12:13:42 AM
to GWT Users
I second what Colin says: "Since you're scraping anyway, consider just scraping the results of the rendered page? This will likely take substantially more CPU time, but ridiculously less developer time to implement."

GWT RPC is not an API.  It will constantly change as the website updates.

I'd recommend either using a proper API (if one exists), or an off-the-shelf scraper tool.

Cường Nguyễn Mạnh

Oct 9, 2024, 9:50:59 PM
to GWT Users
Thanks to Craig and Colin,

I spent all day studying the Stack Overflow post and the documentation Colin provided. I now know some (not all) of the rules of the request payload, and I can use them to replace parameters. But deserializing the response is hard, and your answer did not cover it; my goal is still to parse the response received from the server.

I need to crawl 10000 users via GWT RPC. It is a one-time crawl for my service, so performance is not important.

Again, I know some of the data from the server (response2.png, which I attached before). It is JSON that starts with //OK, followed by an array with 2045 elements (0-2044). Elements 0-2041 are confusing to me. Element 2042 is an array list whose contents appear jumbled; maybe the preceding data (elements 0-2041) contains its order.

If you need the exact response payload, please reply and I will make it public.
On Wednesday, October 9, 2024 at 11:13:42 UTC+7, Craig Mitchell wrote:

Colin Alworth

Oct 9, 2024, 10:28:40 PM
to GWT Users
To reiterate, if performance isn't important, you will likely be far far better off writing a quick screen scraper (such as with selenium, etc). Even if it has to load the page, read 10 rows at a time, click next, wait 3s to load, repeat, it will probably take longer to write than to run.

If you have a specific question about part of the full payload, you should share the details you've worked out so far and the part of the payload you have the question about, but it isn't going to make sense to share the full contents on the mailing list and have us give you a strategy (as it will probably just be what has already been discussed). You may not have all of the details, but you have a lot more than we do, and you can very likely use that context to work out what the format is - consider comparing your guess of what the data is with what you see in the UI.

The order of the data is what the linked post/document suggests - read the payload backwards starting just before the string array. Nothing is jumbled, the order you read values (in reverse) is the order that the data was written, starting from the list that holds the objects. The request order isn't quite the same (the string table is part of the |-delimited array, with a count before the first string, then the payload values in order, rather than reverse), but it is close enough that once you work out the structure of any given type in either request or response, it will be the same for both.
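
As a rough illustration of the request layout just described, here is a small Python sketch that splits a request body into its header, string table, and payload. The function name split_request is hypothetical, the sample body is the one posted earlier in this thread, and the sketch assumes no string in the table contains a '|' of its own.

```python
def split_request(body):
    """Split a '|'-delimited GWT-RPC request into header, string table, and payload."""
    tokens = body.split("|")
    if tokens and tokens[-1] == "":
        tokens.pop()  # the body ends with a trailing '|'
    version, flags, count = int(tokens[0]), int(tokens[1]), int(tokens[2])
    strings = tokens[3:3 + count]   # the count comes right before the first string
    payload = tokens[3 + count:]    # payload values, in forward order
    return {"version": version, "flags": flags, "strings": strings, "payload": payload}


def resolve(strings, token):
    """Look up a 1-based string table reference; 0 stands for null."""
    index = int(token)
    return None if index == 0 else strings[index - 1]


BODY = ("7|0|10|https://a.b.c.d/e|5C6CDB13D0FD25B266F3C36FA7FF6ED9|"
        "a1.a2.a3.DataService|getCourseMembers|java.lang.Long/4227064769|"
        "java.lang.String/2004016611|java.util.List|20204524|"
        "java.util.Arrays$ArrayList/2507071751|20241|"
        "1|2|3|4|3|5|6|7|5|TXbrzIAAA|8|9|1|6|10|")

request = split_request(BODY)
strings, payload = request["strings"], request["payload"]

# In this sample the third and fourth payload values point at the
# service interface and the method name.
service = resolve(strings, payload[2])
method = resolve(strings, payload[3])
print(service, method)
```

Not every payload token is a string table reference (TXbrzIAAA above is a raw value), so which tokens to resolve depends on the types being read at that position.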

If this is public data, also consider contacting the organization that makes it available; it's always possible they might be able to share more directly.

Finally, depending on the particulars of this, it might be possible to contract with me directly to help write this for you, if that makes sense. If so, please email me at co...@vertispan.com to discuss more.