blocking until a sorted set is fully populated

Ted Naleid

Feb 10, 2011, 11:57:15 PM

to redi...@googlegroups.com

I have a background thread that populates a sorted set. The sorted set can't be used by clients until it's fully populated with all values. To prevent a client from using the sorted set before it's completed, I'm thinking about doing something like this:

The producer will ZADD its items to the sorted set in a transaction. At the end of the transaction it will RPUSH a value onto a different key to signal that the sorted set is ready to use:

MULTI
ZADD user:1:product-price 100 1
ZADD user:1:product-price 320 2
...
ZADD user:1:product-price 40 3
RPUSH user:1:product-price:completed 1
EXEC

Clients will first try to get the sorted set:

ZRANGE user:1:product-price 0 -1

If the sorted set doesn't exist, they'll do a blocking pop/push for the marker key that lets us know when it's ready:

BRPOPLPUSH user:1:product-price:completed user:1:product-price:completed 1

I'm using BRPOPLPUSH to push the completed indicator back onto the same list for any other clients that might also be waiting for the sorted set to be ready.

Since the sorted set is created in a transaction, it shouldn't exist until the completed indicator also exists.
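
For illustration, the client side of this could be sketched in Python roughly as follows. This is only a sketch: it assumes a redis-py style client object exposing `zrange` and `brpoplpush`, and the function name `fetch_when_ready` and its `max_attempts` parameter are made up for the example. On newer Redis versions (6.2 and later) BRPOPLPUSH is deprecated in favor of BLMOVE, so the blocking call may need adjusting there.

```python
def fetch_when_ready(client, key, wait_seconds=1, max_attempts=30):
    """Return the members of `key`, waiting until the producer marks it complete.

    `client` is assumed to be a redis-py style object (hypothetical here);
    only its zrange() and brpoplpush() methods are used.
    """
    marker = key + ":completed"
    for _ in range(max_attempts):
        # First try to read the sorted set directly.
        members = client.zrange(key, 0, -1)
        if members:
            return members
        # Not there yet: block for up to wait_seconds on the marker list.
        # Source and destination are the same list, so the marker is put
        # back for any other clients that are also waiting.
        if client.brpoplpush(marker, marker, timeout=wait_seconds) is not None:
            # The marker exists, so the set is complete (possibly empty).
            return client.zrange(key, 0, -1)
    raise TimeoutError(f"{key} was not ready after {max_attempts} attempts")
```

Re-reading the set after the marker shows up also covers the case where the finished sorted set legitimately has no members.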

Is this how most people are using Redis to wait and get notified when an operation is complete? Is there some other easier way of doing it that I'm not aware of?

Thanks,

Ted

Derek Williams

Feb 11, 2011, 10:01:47 AM

to redi...@googlegroups.com

On Thu, Feb 10, 2011 at 9:57 PM, Ted Naleid <con...@naleid.com> wrote:
> Is this how most people are using Redis to wait and get notified when an
> operation is complete? Is there some other easier way of doing it that I'm
> not aware of?

If the entire sorted set is populated inside a MULTI/EXEC block, then
another client will never see a partially populated sorted set, so
blocking in this manner is not needed.

If you were not doing the entire sorted set in a MULTI/EXEC though, I
would use the same technique that you are using.

The only reason I can see to use both methods is if it is possible for
the sorted set to end up with no records at all. The separate
completion indicator would let clients know that they don't have to
wait for data that isn't coming.

--
Derek

Derek Williams

Feb 11, 2011, 10:04:55 AM

to redi...@googlegroups.com

On Thu, Feb 10, 2011 at 9:57 PM, Ted Naleid <con...@naleid.com> wrote:

> Is this how most people are using Redis to wait and get notified when an
> operation is complete? Is there some other easier way of doing it that I'm
> not aware of?

Whoops, I think I misunderstood part of that. If you need the clients
to actually block until the data is ready, then yes, the method you are
using should work fine.

--
Derek

Josiah Carlson

Feb 11, 2011, 10:31:52 AM

to redi...@googlegroups.com

You could do this without multi/exec, but still use the list blocking pop operation (or you could use publish/subscribe). Add your items to a zset named using a temporary key, and when you are done, use RENAME to rename the key. It prevents other clients from accessing the zset until it is full, doesn't stall any other commands that might be operating on other data, and will have less overhead because Redis won't need to buffer all of the operations before executing them.
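
As a rough sketch of the temporary-key approach in Python (an illustration under assumptions, not a tested recipe): the code below assumes a redis-py style client with `zadd(name, mapping)`, `rename` and `rpush`. The function name `build_sorted_set`, the `:building:` key suffix and the `:completed` marker name are placeholders chosen for the example.

```python
import uuid


def build_sorted_set(client, key, scores):
    """Fill a sorted set under a temporary name, then expose it with RENAME.

    `client` is assumed to be a redis-py style object (hypothetical here).
    `scores` maps member -> score.
    """
    # A unique temporary key, so readers never see a half-built set.
    temp_key = f"{key}:building:{uuid.uuid4().hex}"
    for member, score in scores.items():
        client.zadd(temp_key, {member: score})
    if scores:
        # RENAME swaps the finished set into place in a single command.
        # It is skipped for an empty set because the temporary key would
        # not exist and RENAME fails on a missing source key.
        client.rename(temp_key, key)
    # Push the marker only after the rename, so a client woken by the
    # marker always finds the completed set.
    client.rpush(key + ":completed", 1)
```

The order matters here: RENAME first, marker second. A client that finds no sorted set and then blocks on the marker list cannot be woken before the data is visible.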

Regards,

- Josiah


Ted Naleid

Feb 11, 2011, 12:01:37 PM

to redi...@googlegroups.com

On Friday, February 11, 2011 at 9:31 AM, Josiah Carlson wrote:

> You could do this without multi/exec, but still use the list blocking pop operation (or you could use publish/subscribe). Add your items to a zset named using a temporary key, and when you are done, use RENAME to rename the key. It prevents other clients from accessing the zset until it is full, doesn't stall any other commands that might be operating on other data, and will have less overhead because Redis won't need to buffer all of the operations before executing them.

Thanks, I had thought about doing a rename like that but wasn't sure of the performance differences between using a transaction vs the additional rename. I'm betting that you're right that the cost of buffering the transaction and blocking the rest of the clients outweighs a simple key rename.

You mention using publish/subscribe instead of a list blocking pop. I was trying to think that through and couldn't come up with any easy way to avoid race conditions using pub/sub without always having to subscribe.

If the client checks whether the sorted set exists, finds that it doesn't, and then subscribes to the "complete" notification topic, there's a race condition between checking for the sorted set and subscribing to the topic. The producer could finish the sorted set and publish the "complete" message on the topic before the client is subscribed.

The only way around the race condition that I can see is for all clients to subscribe to the "complete" topic every time before trying to retrieve the sorted set, then unsubscribe once the sorted set is retrieved.
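
To make the subscribe-first idea concrete, here is a minimal Python sketch. It assumes a redis-py style client whose `pubsub()` object offers `subscribe`, `get_message(timeout=...)`, `unsubscribe` and `close`; the function name `fetch_via_pubsub` and the `:complete` channel name are invented for the example. It waits for the subscribe confirmation before looking for the sorted set, so a "complete" message published after the check cannot be missed.

```python
import time


def fetch_via_pubsub(client, key, timeout=30.0):
    """Subscribe first, then check for the sorted set, then wait if needed.

    `client` is assumed to be a redis-py style object (hypothetical here).
    """
    channel = key + ":complete"
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            message = pubsub.get_message(timeout=1.0)
            if message is None:
                continue
            if message["type"] == "subscribe":
                # The subscription is now active, so it is safe to check:
                # any later "complete" message will reach this client.
                members = client.zrange(key, 0, -1)
                if members:
                    return members
            elif message["type"] == "message":
                # The producer announced that the set is ready.
                return client.zrange(key, 0, -1)
        raise TimeoutError(f"no completion message for {key}")
    finally:
        # Always unsubscribe, whether the set was found or not.
        pubsub.unsubscribe(channel)
        pubsub.close()
```

The cost is the one described above: every read pays for a subscribe and an unsubscribe, even when the sorted set already exists.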

Is there another way to solve this using pub/sub without the potential for a race condition where the client might miss a notification? Can you optionally subscribe to a topic in a transaction if a key doesn't exist yet?

-Ted

Josiah Carlson

Feb 11, 2011, 1:49:03 PM

to redi...@googlegroups.com

On Fri, Feb 11, 2011 at 9:01 AM, Ted Naleid <con...@naleid.com> wrote:

> On Friday, February 11, 2011 at 9:31 AM, Josiah Carlson wrote:
>
>> You could do this without multi/exec, but still use the list blocking pop operation (or you could use publish/subscribe). Add your items to a zset named using a temporary key, and when you are done, use RENAME to rename the key. It prevents other clients from accessing the zset until it is full, doesn't stall any other commands that might be operating on other data, and will have less overhead because Redis won't need to buffer all of the operations before executing them.
>
> Thanks, I had thought about doing a rename like that but wasn't sure of the performance differences between using a transaction vs the additional rename. I'm betting that you're right that the cost of buffering the transaction and blocking the rest of the clients outweighs a simple key rename.

It definitely does. Whenever I can get away with it, I do try to pipeline requests without MULTI/EXEC (many of the clients support collecting requests together for a small number of round trips), and that lets me get both throughput and low latency.
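
A small Python sketch of what pipelining without MULTI/EXEC might look like. It assumes a redis-py style client, where `pipeline(transaction=False)` batches commands without wrapping them in a transaction; the helper name `zadd_pipelined` and the `batch_size` default are invented for the example.

```python
def zadd_pipelined(client, key, scores, batch_size=1000):
    """Send ZADDs in batches over a non-transactional pipeline.

    `client` is assumed to be a redis-py style object (hypothetical here).
    `scores` maps member -> score.
    """
    # transaction=False: commands are batched into few round trips but
    # are not wrapped in MULTI/EXEC, so other clients can interleave.
    pipe = client.pipeline(transaction=False)
    pending = 0
    for member, score in scores.items():
        pipe.zadd(key, {member: score})
        pending += 1
        if pending >= batch_size:
            pipe.execute()  # flush this batch in one round trip
            pending = 0
    if pending:
        pipe.execute()  # flush whatever is left over
```

Because the batches are not atomic, other clients can see a partially filled key. That is why this would be pointed at the temporary key and followed by the RENAME, rather than writing to the final key directly.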

> You mention using publish/subscribe instead of a list blocking pop. I was trying to think that through and couldn't come up with any easy way to avoid race conditions using pub/sub without always having to subscribe.
>
> If the client checks to see if the sorted set exists and finds that it doesn't and then subscribes to the "complete" notification topic to be notified if it exists, there's a race condition between trying to get the sorted set and subscribing to the topic. The producer could finish the sorted set and publish the "complete" message on the topic before the client is subscribed.
>
> The only way around the race condition that I can see is if all clients subscribe to the "complete" topic every time before trying to retrieve the sorted set, then unsubscribing once the sorted set is retrieved.
>
> Is there another way to solve this using pub/sub without the potential for a race condition where the client might miss a notification? Can you optionally subscribe to a topic in a transaction if a key doesn't exist yet?

You are right about the race: a client can end up subscribing after the producer has already announced that the set is ready. Publish/subscribe would work if your clients were always waiting for those messages to do processing. If you are specifically spawning the clients to look for the data, then using a list is the cleaner of the two.

- Josiah
