How does batch processing occur?

Tomás Rossetti

Apr 16, 2024, 5:03:00 PM
to OpenTripPlanner Users
Hi everyone,

I'd like to know more about how OTP handles parallel requests.

If I understood the documentation correctly, OTP supports multithreading within a single request but does not explicitly offer a way to parallelize across requests. The opentripplanner package for R allows parallel requests, but judging from its source code, it simply sends many requests at the same time. This is faster than sending them sequentially.

How exactly does OTP process requests sent in parallel? Does the OTP instance wait for a few seconds to batch process requests? Does it save previous results to make future requests faster? Or is there something else going on?

I want to understand this to be sure that I'm using all its potential in terms of multicore support.

Thanks!

Andrew Byrd

Apr 19, 2024, 1:54:59 AM
to opentrippl...@googlegroups.com
Hello,

Concurrency and parallelism are complex topics. Unfortunately, people who've worked on this part of OTP may find it hard to provide you an answer, because "how does it process in parallel" is a very broad question. Depending on what you mean, this could be the topic of entire books.

Essentially, OTP is an HTTP server (web server) that serves structured descriptions of routes instead of human-readable pages of text. It's relying on the various approaches to handling concurrent requests that are built into the HTTP server library it employs.

In the same way you and I and thousands of other people might be asking a single web server for newspaper articles at the same time, we can all ask OTP for a route at the same time. While providing newspaper articles might mean returning the same static page of text over and over (which is generally "IO bound"), returning a route relies mostly on computation (it's "CPU bound"), so different mechanisms will be used to handle the two cases. But in general, the HTTP server library, Java runtime, and the operating system all contribute to ensuring multiple such tasks run smoothly at the same time. In large deployments such as the one in Norway, there will also be many separate OTP instances with load balancers spreading the requests out evenly across the instances.

As a simple summary, OTP can compute one response on each CPU core in your machine (or cluster of machines), handling one HTTP request on each core. It is capable of handling even more simultaneous requests, but once you're hitting it with more requests than cores, at some point the average throughput and response time will get worse, not better.
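On the client side, the summary above suggests capping the number of requests in flight at roughly the number of cores available to OTP. The following is a minimal sketch of that idea, not OTP client code: `run_bounded`, `FakeServer`, and the figure of 4 cores are all assumptions made for illustration, and `FakeServer.fetch` stands in for whatever function actually sends an HTTP request to your OTP instance.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(queries, fetch, max_in_flight):
    """Call fetch(query) for every query, with at most max_in_flight calls
    running at once. Results come back in the same order as the queries."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(fetch, queries))


class FakeServer:
    """Hypothetical stand-in for an OTP instance: it sleeps briefly and
    records the peak number of simultaneous calls it has seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def fetch(self, query):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # pretend to compute a route
        with self._lock:
            self._active -= 1
        return f"itinerary for {query}"


server = FakeServer()
queries = [f"trip-{i}" for i in range(20)]
# Assumption: the OTP machine has 4 cores, so keep 4 requests in flight.
results = run_bounded(queries, server.fetch, max_in_flight=4)
```

Threads are enough on the client because it only waits on the network; the CPU-bound work happens on the server. The right value for `max_in_flight` is best found by measuring throughput at a few settings.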

To respond to some specific points:

I'm not sure what you mean by "wait a few seconds to batch". You seem to have some intuition here about how doing multiple things at once would be faster than doing them independently, but it's hard to guess what you're referring to. In the event that you're issuing multiple requests with the same origin point (same geographic start point), there are optimizations that could make this thousands or millions of times faster. OTP2 is focused on public transit passenger information, so it doesn't make any particular effort to apply these analytics-oriented optimizations. Other projects (such as Conveyal R5) are designed entirely for that use case.
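To see why a shared origin helps, consider a toy example: a single run of Dijkstra's algorithm from one origin yields travel times to every reachable destination, so many origin-destination pairs with the same origin cost one search instead of many. This is only an illustration on a small static graph with made-up node names and travel times; it is not how OTP or R5 is implemented, and real transit routing also depends on departure time.

```python
import heapq


def shortest_times(graph, origin):
    """Dijkstra's algorithm: travel time from origin to every reachable node.
    graph maps node -> list of (neighbor, minutes)."""
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        time_so_far, node = heapq.heappop(queue)
        if time_so_far > best.get(node, float("inf")):
            continue  # stale queue entry, a better time was already found
        for neighbor, minutes in graph.get(node, []):
            candidate = time_so_far + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


# Hypothetical four-stop network; edge weights are minutes.
graph = {
    "A": [("B", 5), ("C", 12)],
    "B": [("C", 4), ("D", 10)],
    "C": [("D", 3)],
    "D": [],
}
# One search from A answers the A->B, A->C and A->D queries together.
times_from_a = shortest_times(graph, "A")
```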

I think that by "save previous results to make future requests faster" you may mean what is usually called caching. I believe OTP will cache some information internally that is specific to the day or time you're searching, but full responses for specific requests are not cached, because many people are not expected to request the exact same route repeatedly. Each request is usually somewhat different.
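If your own workload does repeat identical requests, caching full responses is something the caller can do. A minimal sketch, assuming a hypothetical `plan_route` function that stands in for the real request to OTP (it is not part of any OTP API):

```python
from functools import lru_cache

calls = 0  # counts how often the expensive function really runs


def plan_route(origin, destination, departure):
    """Hypothetical stand-in for an expensive routing request."""
    global calls
    calls += 1
    return f"{origin} -> {destination} at {departure}"


@lru_cache(maxsize=1024)
def cached_plan(origin, destination, departure):
    # Identical arguments are answered from memory after the first call.
    return plan_route(origin, destination, departure)


first = cached_plan("A", "B", "08:00")
again = cached_plan("A", "B", "08:00")  # same arguments: cache hit
other = cached_plan("A", "B", "08:05")  # different departure: computed anew
```

A cache like this only pays off when requests match exactly, which is the reason given above for why it is of little use in passenger-facing deployments.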

This mailing list is used mostly for announcements. If you have further questions about OTP, please consider joining the Gitter chat room. This is where most discussion occurs these days: https://gitter.im/opentripplanner/OpenTripPlanner

Regards,
Andrew