Pull
- When a client connects, I don't care if it's a new client or an old one, the client requests a subscription to the stream of events after X.
- I create a stream of all events from X up to the current tail of events and publish it into the clients subscription channel. (Greg EventStore example uses AtomPub for this matter).
I haven't follow Greg's tutorial all way yet, so I don't know if AtomPub does some sort of long polling here, or if the Http connection closes once I finished publishing that particular stream of events and then the client is supposed to pull again from that point on, potentially getting an empty stream if no new events have arrived.Would it be stupid to consider implementing this feature using things like HTTP Long Polling, Web Sockets or HTTP Streams?
"The UDP broadcast hint only reduces the wait time to approximately
almost 0, it does not change the semantics or ordering or high water
mark mechanism."
Guess you are using some kind of UDP that provides ordering
assurances? I have yet to see that one. You can build ordering
assurances on top of UDP (message sequence numbers as example) but
they are certainly not 0 cost.
ES already supports a similar push model. It does it over TCP though
instead of UDP as UDP is often not routeable. Especially in the cloud.
I'm trying to understand the need of the UDP solution that you used to wake up a waiting subscriber.
I think it might be related to a couple of questions I had:What can I do in the publishing side to decide when to publish more events into the subscription stream.
If you are doing DDD+CQRS+ES, the traditional answer is that at the end of an aggregate processing a command, it should emit a batch of zero..N events into the event system of record. However there are also many useful cases for CQRS+ES without much DDD. In which case you emit the events after whatever thing has happened has happened.
Obviously when a client requets that it wants events from X, I first read the events from the event store up to X, but eventually I would like to continue publishing real time events in that stream. Is there a clever way to do the transition? How do I join the real time stream?I
In the solution we implemented, and I think it is similar for others, there is no real explicit transition here. It is more or less completely seamless because a subscriber is simply running in a loop consuming more events. there is no separate real-time versus old stream. There is only a single log structured list of events.
By the way, someone else who worked with me wrote about a bunch of this in much detail. There are five blog posts in the link below, I think you might enjoy several of them very much.
https://blog.oasisdigital.com/category/cqrs/
Also, I suppose I will need a way to buffer how many events to send in order not to oveload the client (somewhat like reactive streams backpressure mechanism, or pagination in the AtomPub example from Greg).
Actually I think with both would I have done and what Greg is done, something like pagination does the job. Subscribers pull (and poll) data, it is not sent to them. So there is no buffer overload concern. Nor any back pressure. Subscribers just process the data as fast as they are able.
Above all if the client is making a temporal query and it does not want to reach the end of the stream of events, I assume I will also need a limit, which is not set is infinite.
Yes, right, since the subscriber is simply consuming chunks of data linearly from a very long list, it’s easy to tell it to stop at item N.
The loops you wrote look somewhat complicated. The actual implementation we have done has some complexity certainly. But it is a different kind of complexity, it is around getting the error handling and batching right and efficient. But in the implementation we have, there is no concept of transition between catching up and running current data. I think that is also the case if you write a subscriber using Greg’s ES.
Kyle, I am understanding that your wait/notify UDP mechanism is to force an anticipated wakeup of the waiting thread. Am I on right the track here?
Yes, that is correct. That’s all it does. Subscribers that are asleep waiting for the next polling interval, get woken up so they go grab the newly published events immediately. If they happen to be listening. If they happen to be running. But you could certainly stop one and started later and it would catch up from wherever it happened to be. Only the subscriber keeps track of a high watermark.
Is this a valid solution, are there other valid ways, possible better ways, to do it?
On January 27, 2017 at 11:44:57 AM, Greg Young (gregor...@gmail.com) wrote:
A common optimization is to also support pushed results that are live.
To do this maintain a buffer of size b (circular buffer) as you read
based on pages check if that event is in buffer. If so then switch to
live subscription. This logic is implemented in CatchUpSubscription in
ES.
Do you wake disinterested subscribers? or are you using the information sent via UDP to specify which subscribers should wake?