How to perform parallel fan-in and fan-out efficiently to/from external batch web service?

964 views
Skip to first unread message

Constantine Vasil

unread,
Oct 7, 2013, 12:28:11 PM10/7/13
to golan...@googlegroups.com
How to do that efficiently in Golang?

A third party web service B. is working in batch with up to 100 Records.
Each Record is uniquely identified with RecordID.

  1. A websocket Golang App A. is listening at https://MyApp.com/ws's MyApp handler.
  2. Multiple parallel User requests with one Record are sent to MyApp.com. Each request should generate separate Go process MyAppRequest.
  3. Each record from 2. is uniquely identified with RecordID which is actually the UserID associated with each request.
  4. All Requests to MyApp collected within one second in 1. must be combined together and sent in a batch to B. in Go routine MyAppCallWebService.
  5. The batch response from B. in MyAppCallWebService must be split by RecordID and sent back to each requestor's MyAppRequest in 2. 

=================================================================

A. Websocket Golang App:
=================================
Request from User:0000010:
Go process MyAppRequest associated with 0000010
=================================
{
"RequestRecords": [
{
"RecordID": "0000010",
"RequestData": "M4V2T2"
}
],
"TotalRecords": 1"
}

=================================
Response to User:0000010:
Go process MyAppRequest associated with 0000010
=================================
{
"ResponseRecords": [
{
"RecordID": "0000010",
"ResponseData": "processed M4V2T2"
}
],
"TotalRecords": 1"
}


=================================
Request from User:0000020:
Go process MyAppRequest associated with 0000020
=================================
{
"ResponseRecords": [
{
"RecordID": "0000020",
"RequestData": "M4V2T3"
}
],
"TotalRecords": 1"
}

=================================
Response to User:0000020:
Go process MyAppRequest associated with 0000020
=================================
{
"RequestRecords": [
{
"RecordID": "0000020",
"ResponseData": "processed M4V2T3"
}
],
"TotalRecords": 1"
}
B.  Third party web service:
=================================
Request:
Go process MyAppCallWebService
=================================
{
"RequestRecords": [
{
"RecordID": "0000010",
"RequestData": "M4V2T2"
}, {
"RecordID": "0000020",
"RequestData": "M4V2T3"
}
],
"TotalRecords": "2"
}

The web service performs processing of up to 100 RequestRecords in batch
and returns results in batch of ResponseRecords.

=================================
Response:
=================================
{
"ResponseRecords": [
{
"RecordID": "0000010",
"ResponseData": "processed M4V2T2"
}, {
"RecordID": "0000020",
"ResponseData": "processed M4V2T3"
}
],
"TotalRecords": "2"
}
=================================



Kevin Gillette

unread,
Oct 7, 2013, 5:51:47 PM10/7/13
to golan...@googlegroups.com
http://play.golang.org/p/Ew-4DYftK6

In that snippet, there are two alternatives for the fan-in: FanIn and FanInTick, each with their own benefits. Don't wait for the output to finish, as that could be a very long time -- you should be able to see what's going on within the first 10 or so "requests".

Constantine Vasil

unread,
Oct 7, 2013, 6:41:50 PM10/7/13
to golan...@googlegroups.com
Kevin

Thank you. I will look at it.

--Constantine

Constantine Vasil

unread,
Oct 7, 2013, 7:26:35 PM10/7/13
to golan...@googlegroups.com
The code is excellent. 

Need to figure out how to do this:

=======
In FanIn
=======
Right now it checks for the 1 s expiration and runs HandleResults
case <-t:
HandleResults(s)
=======

Needs one more check to see if number of results received with <-ch
do not exceed 100, if they do then execute  HandleResults(s) 
before expiration and reset the timer



On Monday, October 7, 2013 2:51:47 PM UTC-7, Kevin Gillette wrote:

Paddy Foran

unread,
Oct 7, 2013, 9:02:57 PM10/7/13
to Constantine Vasil, golang-nuts
Something like this should work: http://play.golang.org/p/zl_6yzX_Kr

The idea is that you're just checking the length of the slice of results, and if it's past your threshold, you're prematurely triggering the same functionality that would occur if the window was closed by the timers.


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Kevin Gillette

unread,
Oct 7, 2013, 10:37:49 PM10/7/13
to golan...@googlegroups.com, Constantine Vasil
Indeed. Here's a version taking the max result set size as a parameter to FanIn, and with duplicate code consolidated: <http://play.golang.org/p/4NeDp8NRAF>.

Go, by design, makes these kinds of application-level scheduling problems very simple to write. I've done systems like this with both low/high size bounds and low/high timeouts without being much more code or complexity than what you already see. Doing something like this in a non-concurrent language isn't nearly as painless.

Constantine Vasil

unread,
Oct 8, 2013, 11:56:15 AM10/8/13
to golan...@googlegroups.com, Constantine Vasil
When FanOut generates the separate Go routines when "max" OR timeout is reached HandleResults(s) 
will perform external web service call. 

Here with Go routines it is simulated different users calling our app.

During this time the Go routines must wait for results. When up
to "max" results come they must be separated and returned back to waiting Go routines so they to return
results to each user.

Need to find efficient way to do this. I don't know if it does not got too complex to be robust in practice. 

--Constantine

Kevin Gillette

unread,
Oct 8, 2013, 3:53:56 PM10/8/13
to golan...@googlegroups.com, Constantine Vasil
You're saying that in addition to the behavior in the playground snippets, the "FanOut" goroutines (representing incoming client web requests) must be informed when, and with details of which batch their request ended up in, rather than simply completing the request early (not waiting for the batch to finish spooling)?

Constantine Vasil

unread,
Oct 8, 2013, 4:17:59 PM10/8/13
to golan...@googlegroups.com, Constantine Vasil
Basically, yes.

1) if one FanOut Go routine represent one Request by one user
and
2) FanIn collects in 100/timeout basis individual users' requests
and sends them in batch the web service. 
then
3) every user should get back the Response to his Request from the batch
web service.

Kevin Gillette

unread,
Oct 8, 2013, 4:20:48 PM10/8/13
to golan...@googlegroups.com, Constantine Vasil
That's not difficult. I'll post a followup snippet after lunch (which I just started).

Kevin Gillette

unread,
Oct 8, 2013, 6:58:24 PM10/8/13
to golan...@googlegroups.com, Constantine Vasil
http://play.golang.org/p/57V3Z_Ebbu

The main changes actually involved renaming all instances of "result" to "Req" (request) for clarity. In a situation where the client does not need any server feedback besides acknowledgement of initial receipt (which is what the early snippets assumed), you'd want to move all request-specific (parallel-safe) processing out of the client request handler so that the client connection can be closed/reused as quickly as possible. However, since your revised goal already has clients waiting for batched results, it's better to push as much independent pre-processing as possible into the client handler (or into their own intermediate goroutines). If you did do preprocessing, then the current snippet assumes that requests are batched based on when they finish their independent preprocess steps (which minimizes wait); if you need grouping to be based on when the requests were initially received, that is still doable, but takes a little bit more work, which I'll leave as an exercise for the reader.

Constantine Vasil

unread,
Oct 8, 2013, 7:29:37 PM10/8/13
to golan...@googlegroups.com, Constantine Vasil
Revisiting goals is a fact of life.

Here is my revised code:

1) I changed HandleResults to run it its own Go routine: go HandleResults. That way FanOut
generated routing will keep coming and the HandleResults processing is multiplexed.

2) After each FanOut Go routine there is a waiting for chResponse from HandleResults.
Each user can recognize its record by virtue of: UserID|RecordID

3) FanIn is combining UserID|RecordID in RecordID so HandleResults to be able to spilt them to each user.

4) HandleResults is waiting for web results to split them back and sent to each waiting Go routine

Seems to run correctly. 

New revisited goal - right now FanOut sends only one record as separate Go routine to FanIn - 
some users can sent multiple records.

Anyway after this reworks the code seems working correctly.

Thanks,
--Constantine
Reply all
Reply to author
Forward
0 new messages