JSON 2.0 batch


basti

Nov 12, 2010, 7:14:00 AM
to JSON-RPC
Hi,

I saw the batch in the newly proposed specification, and I'm wondering:
what is the benefit? It seems the server is not required to do
anything differently from the old way of just sending the requests one
by one. The only difference I see is that the responses are sent only
after all requests have been executed. The specification does not
require any ordering, nor is there any "batch id". I don't see a
benefit, since the client still has to find the correct response for
every request in the batch. There is also no bandwidth improvement,
since every request still has to contain the version number. What is
the point?
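For reference, a batch is just a JSON array of request objects, and the response is an array of response objects that the client has to match up by "id". A minimal sketch in Python (the method names here are invented, not from any real service):

```python
import json

# A hypothetical batch: two calls plus a notification (no "id" member).
batch = [
    {"jsonrpc": "2.0", "method": "sum", "params": [1, 2], "id": 1},
    {"jsonrpc": "2.0", "method": "get_time", "id": 2},
    {"jsonrpc": "2.0", "method": "log", "params": ["hello"]},  # notification
]
wire = json.dumps(batch)  # a single JSON text on the wire

# The server may answer out of order; the client correlates on "id".
responses = json.loads(
    '[{"jsonrpc": "2.0", "result": "12:00", "id": 2},'
    ' {"jsonrpc": "2.0", "result": 3, "id": 1}]'
)
by_id = {r["id"]: r for r in responses}
assert by_id[1]["result"] == 3 and by_id[2]["result"] == "12:00"
```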

Erik

Nov 12, 2010, 11:05:37 AM
to JSON-RPC
In batch mode, you might be eliminating the setup/knock down costs for
each connection depending on the server and client environments. That
is probably not that big a savings.

The server may also process the requests concurrently. If the client
is restricted to sending them serially, this is a big help.

Code may also look cleaner. Setup all of the requests, execute, and
process return.
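That "set up all of the requests, execute, and process the returns" pattern might look like this in a client library (a rough sketch; `BatchClient` and the `transport` callable are made-up names, not any real API):

```python
import itertools
import json

class BatchClient:
    """Hypothetical client: queue calls, send once, read back as a block.
    `transport` is assumed to be any callable that takes a JSON string
    and returns one (it could wrap HTTP, a socket, etc.)."""

    def __init__(self, transport):
        self._transport = transport
        self._ids = itertools.count(1)
        self._pending = []

    def call(self, method, *params):
        """Queue one request; return its id for looking up the response."""
        rid = next(self._ids)
        self._pending.append({"jsonrpc": "2.0", "method": method,
                              "params": list(params), "id": rid})
        return rid

    def execute(self):
        """Send every queued request as one batch; key responses by id."""
        raw = self._transport(json.dumps(self._pending))
        self._pending = []
        return {r["id"]: r for r in json.loads(raw)}
```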

Batch doesn't hurt you if the server doesn't do concurrent processing
and helps you when it does.

The only time you can't do batch is with dependent calls. Anybody want
to think about request chaining in batch? Anybody involved in the
numerous debates over batch will be cowering under their desks about
now. :)

Read the history of batch in previous discussions. It has been a
contentious issue. My opinion is that the concurrency on the server
side is its biggest pro. It doesn't hurt the spec to include it. It
doesn't make it any harder to implement the server side or client side
in libraries since you can just loop over each request and treat it as
a single request server side.

Robert Goldman

Nov 12, 2010, 11:31:46 AM
to json...@googlegroups.com
On 11/12/10 10:05 AM, Erik wrote:
> In batch mode, you might be eliminating the setup/knock down costs for
> each connection depending on the server and client environments. That
> is probably not that big a savings.
>
> The server may also process the requests concurrently. If the client
> is restricted to sending them serially, this is a big help.
>
> Code may also look cleaner. Setup all of the requests, execute, and
> process return.
>
> Batch doesn't hurt you if the server doesn't do concurrent processing
> and helps you when it does.
>
> The only time you can't do batch is with dependent calls. Anybody want
> to think about request chaining in batch? Anybody involved in the
> numerous debates over batch will be cowering under their desks about
> now. :)
>
> Read the history of batch in previous discussions. It has been a
> contentious issue. My opinion is that the concurrency on the server
> side is its biggest pro. It doesn't hurt the spec to include it. It
> doesn't make it any harder to implement the server side or client side
> in libraries since you can just loop over each request and treat it as
> a single request server side.

No, that's not entirely accurate. It /may/ be harder to implement
server-side because the server /must/ return an array of responses. You
cannot simply squirt back responses onesy-twosy. That is, the server
/cannot/ treat each component as a single request.

That means at the very least there's additional code, and my impression
is that there's additional hair as well, especially where abnormal
conditions are concerned. That's why I don't favor making this
mandatory. I have a socket-based json-rpc implementation, mostly for
use locally (as an inter-language RPC, rather than for use in web apps).
Batch mode is a nuisance for me, and no payoff, so I probably will keep
my library private and not bother complying with this aspect of the
standard....

cheers,
r

basti

Nov 12, 2010, 3:00:18 PM
to JSON-RPC
On 12 Nov., 17:05, Erik <nedwi...@gmail.com> wrote:
> In batch mode, you might be eliminating the setup/knock down costs for
> each connection depending on the server and client environments. That
> is probably not that big a savings.

You mean when wrapped inside HTTP. Makes sense.

> The server may also process the requests concurrently. If the client
> is restricted to sending them serially, this is a big help.

In socket mode the only difference for the client is that it has to
put commas between the requests and brackets around them. I see no
reason why a client would suddenly be able to send requests serially
when the structure doesn't even change. The server can execute them
concurrently anyway (why not?)

> Code may also look cleaner. Setup all of the requests, execute, and
> process return.

See above; the only difference, from what I understood, is the
delimiter (or the lack of it).

> Batch doesn't hurt you if the server doesn't do concurrent processing
> and helps you when it does.

Right, it doesn't hurt, but it doesn't help either. OK, with HTTP it
might. But that's it.

> The only time you can't do batch is with dependent calls. Anybody want
> to think about request chaining in batch? Anybody involved in the
> numerous debates over batch will be cowering under their desks about
> now. :)

I already saw the long discussions, but I didn't see anything about
the benefit of the current solution over the old spec.

> It
> doesn't make it any harder to implement the server side or client side
> in libraries since you can just loop over each request and treat it as
> a single request server side.

I don't know if it makes the server implementation "harder", but it
certainly adds code, since this is added functionality it has to
provide. I mean, the server still has to support non-batched requests
as well.

Matt (MPCM)

Nov 12, 2010, 5:32:19 PM
to JSON-RPC
On Nov 12, 7:14 am, basti <b.pran...@googlemail.com> wrote:
> what is the benefit? [...]

One point to clarify, batch *is* part of the 2.0 specification. There
is no "proposed" specification anymore.

That aside, doing many single requests vs one (or more) batch requests
is more of a statement of intent by the client. Sometimes the client
does not know that it could have sent a batch request instead, it all
depends on the event model(s) the client is using. Other benefits are
more ethereal, and depend heavily on the server implementation, but
they do exist.

Somewhat like adding the version field, the goal was more clarity than
bandwidth/line optimization.

The benefit of batch is that the client can send many requests,
without incurring the transport related overhead on a per request
object basis. Think about sending 1k requests vs 1 request with a 1k
payload. It makes a difference depending on your transport and the
transport server's need for resources.

It also works well for transports where you do not want to hold the
transport open for long periods of time, or can not hold it open, or
where the transports are inherently client initiated and terminated.

It also detaches the number of request objects that could be processed
from the transport connections.

I think in general the quickest way to implement batch on the server
side is to make the default processing mode be batch. So if you have a
single request come in, put it in as the only item in the array, but
record the fact that you did that. So at the end you don't do any
batch related errors/responses. Implementing batch should be trivial
unless you designed a very procedural json-rpc server, IMO. There are
lots of ways to handle it though.
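That "batch by default" idea might be sketched like this (a minimal Python sketch under stated assumptions: `dispatch` is an assumed function mapping one request dict to one response dict):

```python
import json

def handle_payload(raw, dispatch):
    """Always process internally as a batch; remember the wrapping so a
    single request is unwrapped again at the end."""
    payload = json.loads(raw)
    was_single = isinstance(payload, dict)        # record the fact
    requests = [payload] if was_single else payload
    # Only requests carrying an "id" get a response; notifications don't.
    responses = [dispatch(r) for r in requests if "id" in r]
    if not responses:
        return None                               # nothing to send back
    return json.dumps(responses[0] if was_single else responses)
```

A single request then goes through exactly the same loop as a batch; only the final wrapping differs.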

--
Matt (MPCM)

basti

Nov 13, 2010, 4:48:31 AM
to JSON-RPC
> One point to clarify, batch *is* part of the 2.0 specification. There
> is no "proposed" specification anymore.

Sorry, I saw that in the meantime; the Wikipedia entry seems to be out
of date.

> That aside, doing many single requests vs one (or more) batch requests
> is more of a statement of intent by the client.

What intent would that be?

> Somewhat like adding the version field, the goal was more clarity than
> bandwidth/line optimization.

I don't see the additional clarity, even for HTTP. One JSON request
per HTTP request is as clear as it can get IMO.

> The benefit of batch is that the client can send many requests,
> without incurring the transport related overhead on a per request
> object basis. Think about sending 1k requests vs 1 request with a 1k
> payload. It makes a difference depending on your transport and the
> transport server's need for resources.

OK, so it is a transport-dependent optimization. I see the resource
benefit when used over HTTP.

> It also works well for transports where you do not want to hold the
> transport open for long periods of time, or can not hold it open, or
> where the transports are inherently client initiated and terminated.

So what happens when you cannot hold the transport open for a long
enough period of time? If anything, batch increases the time between
request and response. Am I missing anything?

> I think in general the quickest way to implement batch on the server
> side is to make the default processing mode be batch. So if you have a
> single request come in, put it in as the only item in the array, but
> record the fact that you did that. So at the end you don't do any
> batch related errors/responses.

Agree.

> Implementing batch should be trivial
> unless you designed a very procedural json-rpc server, IMO. There are
> lots of ways to handle it though.

I just wanted to point out that if someone implements a socket only
server, then he can (and should be allowed to) leave out the batch,
because it has no benefit at all. Otherwise I can't figure out the
"Does distributed computing have to be any harder than this? I don't
think so" aspect.

Matt (MPCM)

Nov 13, 2010, 11:06:02 AM
to JSON-RPC
On Nov 13, 4:48 am, basti <b.pran...@googlemail.com> wrote:
> > That aside, doing many single requests vs one (or more) batch requests
> > is more of a statement of intent by the client.
>
> What intent would that be?

The intent is that the client can determine and handle request/
response objects with block scoping. The difference is subtle, but it
gives more control to the client in terms of handling. It is a pattern
that people have constantly tried to put into a special
rpc.multicall()-like function; by having it as part of the spec,
people can avoid constantly trying to create a call which basically
does the same thing.

> > The benefit of batch is that the client can send many requests,
> > without incurring the transport related overhead on a per request
> > object basis. Think about sending 1k requests vs 1 request with a 1k
> > payload. It makes a difference depending on your transport and the
> > transport server's need for resources.
>
> Ok, so it is a transport dependend optimziation. I see the resource
> benefit when used over HTTP.

It can result in a transport optimization, but really it is about the
client being able to send and receive in blocks.

> > It also works well for transports where you do not want to hold the
> > transport open for long periods of time, or can not hold it open, or
> > where the transports are inherently client initiated and terminated.
>
> So what happens when you cannot hold the transport open for a long
> enough period of time? If anything, batch increases the time between
> request and response. Am I missing anything?

I try not to link the response generation time to the transport
lifetime, at least conceptually. json-rpc could be used over
intermittent, and possibly multiple and varying, transports. json-rpc
by pigeon should work, as an example. Or up on one transport and down
on another... so up by pigeon, but back by fox.

It is not about processing time, but the nature of the request
object(s) and/or the transport(s). If the request is going to take 2
days to finish, then it needs to do so... if either the transport or
the nature of the request has a certain amount of latency included,
then batch really makes a lot of sense if you know that ahead of time
as a client.

> > Implementing batch should be trivial
> > unless you designed a very procedural json-rpc server, IMO. There are
> > lots of ways to handle it though.
>
> I just wanted to point out that if someone implements a socket only
> server, then he can (and should be allowed to) leave out the batch,
> because it has no benefit at all. Otherwise I can't figure out the
> "Does distributed computing have to be any harder than this? I don't
> think so" aspect.

I disagree, at least with the desire for an exception. Batch is not
hard to implement, and it is part of 2.0. If people want to write 2.0
implementations without batch, that is up to them, but they should
note that their implementation does not fully comply with the spec. No
one says it has to... same with the transport support... the spec is
the spec; implementation is another matter and not something we can or
should try to enforce.

--
Matt (MPCM)

Rasjid Wilcox

Nov 14, 2010, 5:11:12 AM
to json...@googlegroups.com
On 13 November 2010 03:31, Robert Goldman <rpgo...@gmail.com> wrote:
<snip>

> It /may/ be harder to implement
> server-side because the server /must/ return an array of responses.  You
> cannot simply squirt back responses onesy-twosy.  That is, the server
> /cannot/ treat each component as a single request.
>
> That means at the very least there's additional code, and my impression
> is that there's additional hair as well, especially where abnormal
> conditions are concerned.  That's why I don't favor making this
> mandatory.  I have a socket-based json-rpc implementation, mostly for
> use locally (as an inter-language RPC, rather than for use in web apps).
>  Batch mode is a nuisance for me, and no payoff, so I probably will keep
> my library private and not bother complying with this aspect of the
> standard....

Hi Robert.

I still have it as one of my goals to write up some notes on
socket-based implementations and the issues around compatibility. For
that I need as many real socket-based implementations as possible. So
while I sympathise with your view that it makes certain
implementations harder (my implementation required some significant
refactoring to deal with batch), I hope you can release your library
with the caveat that it does not support batch, rather than not
release it at all.

Cheers,

Rasjid.

Rasjid Wilcox

Nov 14, 2010, 5:39:50 AM
to json...@googlegroups.com
On 13 November 2010 09:32, Matt (MPCM) <ma...@mpcm.com> wrote:
> I think in general the quickest way to implement batch on the server
> side is to make the default processing mode be batch. So if you have a
> single request come in, put it in as the only item in the array, but
> record the fact that you did that. So at the end you don't do any
> batch related errors/responses. Implementing batch should be trivial
> unless you designed a very procedural json-rpc server, IMO. There are
> lots of ways to handle it though.

I agree that if one designed one's library from the outset to handle
batch requests, it is all pretty straightforward, since (as you note)
a single request is just a batch request of length one, with the
brackets removed. But for socket based servers (plain tcp etc), it
shifts the processing from:

(a) parse request -> process request -> send response

to

(b) parse request (or batch) -> process request(s) -> collate
responses -> send response

For transports like HTTP, you need to link the response to the
specific request anyway, so there is almost no additional work. But
for socket-based servers it can mean the addition of an extra layer of
complexity which is otherwise avoided.

So while I agree it is indeed trivial if one starts one's design with
batch in mind, it can be a bit of a pain to retrofit into an (a) style
design.

The above being said, the 2.0 spec is now final, and I think that is a
good thing.

Cheers,

Rasjid.

Vladimir Dzhuvinov

Nov 17, 2010, 1:55:58 AM
to JSON-RPC
In my Java JSON-RPC 2.0 implementation I chose not to implement
batching. Not that it's that hard to code or that it doesn't have
merits, but for the reason that it sometimes confuses people. I guess
posts about batching in this list will never stop :]

I've got several JSON-RPC 2.0 web services in production, most
notably Json2Ldap - a service for accessing LDAP directories via JSON-
RPC. My observation of applications coded against it (and other JSON-
RPC services we use here) is that there are no use cases for batching.
Yes, I have to admit that. The logic of many remote calls is such that
they often return something, and the client needs to store, process or
make a decision based on the response result/error before proceeding
with the next call. Or sometimes the result of one call becomes a
parameter in the next.

For example, many Json2Ldap clients go through an initial ldap.connect
-> ldap.bind (authenticate) sequence of calls. The first call,
ldap.connect, returns a connection identifier (CID) that needs to be
stored by the client (for subsequent requests) and is passed as a
parameter to ldap.bind. Batching obviously cannot be applied here. So
what I did is change the API and create an overloaded ldap.connect
that also performs optional authentication.
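The dependency looks roughly like this (a sketch only: the `call` helper and the parameter names are assumptions, not the actual Json2Ldap API):

```python
def login(call, host, user, password):
    """Two dependent JSON-RPC calls: the CID returned by ldap.connect
    is a parameter of ldap.bind, so the two cannot share one batch."""
    cid = call("ldap.connect", {"host": host})["CID"]
    call("ldap.bind", {"CID": cid, "user": user, "password": password})
    return cid
```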

If I come across a real batching use case I'll gladly report about it
here.


I've mentioned it elsewhere, but I'll repeat it here again: To me the
most effective features of 2.0 (over the original JSON-RPC) are the
addition of named parameters and standardising the response error
object format.


Vladimir

--
Vladimir Dzhuvinov :: software.dzhuvinov.com

Antony Sequeira

Dec 21, 2010, 4:08:38 AM
to json...@googlegroups.com
The main reason for batching is performance.

Say I have a screen that needs to be rendered (a web page or a Flex
UI or whatever).
The initial display might need data from a set of APIs.
Once the initial display is done, I might need to update only some data.

There are three ways to do that:
1. Call a number of APIs for each part and then render the page
(rendering incrementally or as a whole is irrelevant for the server).
Call the individual APIs for the individual parts to be refreshed later.
2. Have a composite API for the whole data and provide separate APIs
for the parts.
3. Provide only separate APIs for the parts and have a batching
mechanism. The client decides what compositions it wants to exercise.

Options 1 and 3 unlink the server design/dev from the client(s) product dept :)
Option 1 is generally slower when you have a sufficiently complex
screen (in terms of the number of distinct parts and the ways they are
combined/organized by the various clients).

Option 3 also allows better request-level caching (sorry, I don't have
the patience to explain).
Option 3 also allows for transactions containing an arbitrary set of API calls.


Hence the need for batching.

It should be obvious that option 1 is slower than option 3. If you
question whether the slowness is sufficient to warrant the batching
complexity, that's a valid question. If you are disagreeing that 1 is
slower than 3, then I don't have anything to say.

I have run into homegrown protocols that did not have batching and had
to create it. I have also run into standard protocols that don't
provide batching (such as MS WCF) where I had to go for option 2.

If you are still not convinced, ask yourselves why pretty much all
database servers provide for stored procedures.
I have run into that too :) I developed a whole app calling straight
SQL, found it to be dog slow, and had to write stored procs. Suddenly
the app was usable. So what's the difference between stored procs
called from the client and individual SQL queries called from the
client? The answer: batching (yeah, maybe compilation too :)

A batch mechanism is not needed from a computational power point of
view. It's needed for performance.

-Antony
