Server replies suck, new features, and users using the latest source code

Salvatore Sanfilippo

Mar 23, 2009, 7:13:59 AM
to redi...@googlegroups.com
Hello guys!

this email is about three things. One is about a protocol change
needed, please don't be hungry with me ;) That was my fault but it's
time to fix it, and some client-libraries hacking is required. So
let's start with the first issue:

SERVER REPLIES SUCK

Yeah. Ludovico Mangocavallo saw this error at an early stage and warned
me, but I was blinded by my search for simplicity: I had already
developed a protocol draft and implementation, and it turned out to be
cool but far too complex, actually a full serialization syntax capable
of doing what JSON or YAML can do. So I was in a stage of repulsion for
everything organic, but Ludo noticed the smell of bad design in the
Redis protocol, a good quality for a programmer.

Basically we have two problems:
- Error reporting is reply-type dependent. This is a huge problem: for
example, what about returning a 'database is out of memory' error to
every kind of query when we are low on memory? Currently we need to
parse the command and check what kind of reply it returns in order to
send the right format of error message: -7 (for instance) if it's LLEN,
-ERR something for status code replies, -<len> ... for bulk replies.
It's a mess.

- If there are N replies to read at once, one needs to store the reply
types beforehand; it is not possible to just code "read four replies".
This is useful in a lot of conditions, especially for this new command
I want to implement. It's trivial to implement after the protocol
change; now it's a bit of a nightmare:

PREPARE SET x 10
PREPARE EXPIRE x 1000
COMMIT

and all the prepared commands will be processed at once after the
COMMIT command, or discarded after the ROLLBACK command (or if the
connection closes before the COMMIT). So the user can be sure that
both, or neither, are executed. It can't happen that your connection
drops after the SET and the EXPIRE is not set, so that the key lives
forever on the server. Now, after a COMMIT you have to read all the
replies. With the current protocol this is a nightmare for clients:
they must remember the reply type of every PREPAREd command, in
order to call the right functions after COMMIT.

So what's missing is a prefix in every reply that identifies the
reply type.

PROPOSAL OF CHANGE

1) Error messages are unified, and start with the "-" byte. "-ERR
foobar" or "-this is an error" are all errors. The error message is
what follows "-".
2) Return code replies and single line replies are unified in the
form of "+<string>". SET used to return a return code message and will
continue to return "+OK", but probably everybody will discard the
message "OK" since it's not important (the only thing that matters is
that SET didn't report an error), while RANDOMKEY will reply with
"+arandomkey" and so on. TYPE will reply with "+list" or "+set" ...
3) Bulk replies will be exactly like they are in the current protocol,
but prefixed by the byte "$"
4) Multi bulk replies will be exactly like before too, but prefixed by
the byte "*"
5) Integer replies will be exactly like now, just a number, but
prefixed by the ":" byte. For example LLEN may return ":334\r\n" and
so forth. You may wonder why integer replies are not just single line
replies (point number 2 of this proposal). In theory the protocol is
the same, but the information that an integer is stored will allow
client libraries to return the right type back. For example a
Ruby client will call the .to_i method on the string before
returning it.
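
A minimal sketch (not part of the original proposal) of how a client
library could dispatch on these one-byte prefixes, in Python. The
read_line()/read_exact() socket helpers are hypothetical, and the multi
bulk element layout (a length line followed by the payload) is an
assumption based on the examples later in this thread:

# Sketch only: read_line() returns one line without the trailing CRLF,
# read_exact(n) reads exactly n bytes. Both helpers are assumed.

class RedisError(Exception):
    pass

def read_bulk(read_line, read_exact, length):
    if length < 0:
        return None                  # assumed convention for a missing value
    data = read_exact(length)
    read_exact(2)                    # discard the trailing CRLF
    return data

def read_reply(read_line, read_exact):
    line = read_line()
    prefix, rest = line[0], line[1:]
    if prefix == '-':                # error reply, whatever the command was
        raise RedisError(rest)
    if prefix == '+':                # status / single line reply
        return rest
    if prefix == ':':                # integer reply, converted before returning
        return int(rest)
    if prefix == '$':                # bulk reply
        return read_bulk(read_line, read_exact, int(rest))
    if prefix == '*':                # multi bulk reply: <count> elements
        return [read_bulk(read_line, read_exact, int(read_line()))
                for _ in range(int(rest))]
    raise RedisError("unknown reply prefix: %r" % prefix)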

Now what is needed is for the client libraries to be in sync with
the new protocol when the new version of Redis implementing it is
on Git. How to handle this? I think I'll implement these features on my
local computer, send a tar.gz link here on the mailing list, and wait
until all the developers of client libraries have synced with the
changes. The only way I can help with this is to take care of the PHP or
Ruby lib changes if needed; I don't know Python and Erlang well enough.

NEW FEATURES

I want to implement the following features in the next few days. I want to
give details about these features so that you can send "design
warnings" or the like.

EXPIRE

That's easy: EXPRE mykey milliseconds. Yes, here we can have
millisecond granularity at no cost! Some details on how it works:
a) Keys will be, from the point of view of the user, deleted after the timeout.
b) It's possible to set a new EXPIRE in an already expiring key. The
special value '0' means don't expire. So EXPIRE foo 1000, followed by
EXPIRE foo 0 will undo the expire.
c) Expires are saved on the DB dump. You can stop and restart the
server and the expires info will still be associated with the keys.
And the time the server was down is counted in order to expire a key.
d) Unlike memcached, expires are not lazy. They are lazy in a sense,
that is, if I perform a GET against a key whose timeout has been
reached, it will be deleted from server memory at that exact moment.
But Redis will also check a few keys every second even if they are not
requested at all (a minimal sketch of this sampling idea follows this
list). This is absolutely required since Redis is a DB and not a cache.
What if you set keys with timeouts and then NEVER access those keys?
They would stay in the DB forever if the expire were only lazy.
e) If there are no expires set, the server will be as fast as it is
today. When using expires, GETs and other operations will be a bit slower.
f) The expire is automatically removed from a key if the key is SET to a
new value, or if the key's value type changes.
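
A minimal sketch (not from the original email) of the expiry behaviour
point (d) describes: a lazy check when a key is accessed, plus a periodic
pass over a few random expiring keys. The data layout, names, and sampling
rate are assumptions for illustration, not how Redis actually implements it:

import random
import time

store = {}       # key -> value (hypothetical in-memory layout)
expires = {}     # key -> absolute deadline in milliseconds

def now_ms():
    return int(time.time() * 1000)

def expire_if_needed(key):
    # Lazy path: drop the key at access time if its deadline has passed.
    deadline = expires.get(key)
    if deadline is not None and now_ms() > deadline:
        store.pop(key, None)
        expires.pop(key, None)
        return True
    return False

def active_expire_cycle(sample_size=10):
    # Active path: called every second, tests a few random expiring keys
    # even if nobody ever reads them again.
    if not expires:
        return
    keys = random.sample(list(expires), min(sample_size, len(expires)))
    for key in keys:
        expire_if_needed(key)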

TRANSACTIONS

This one is pretty simple too. Every PREPARE COMMAND ARG ARG ARG...
will go into the client's prepare buffer.
ROLLBACK will reset the buffer. COMMIT will instead execute all the
commands one after the other: it's guaranteed that all the commands
will be executed, and that they will be executed as a whole atomic
operation. COMMIT will, of course, free the prepared commands buffer.
COMMIT will reply with all the replies from the single prepared
commands, one after the other.
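
A sketch (not from the original email) of how a client might wrap the
proposed PREPARE/COMMIT pair, reusing the hypothetical send_command and
read_reply helpers from the parser sketch above. Whether each PREPARE is
acknowledged immediately is an assumption; the commands themselves are
only a proposal here, not a shipped API:

def set_with_expire(send_command, read_reply, key, value, ttl):
    send_command("PREPARE", "SET", key, value)
    read_reply()                        # assumed "+OK" style acknowledgement
    send_command("PREPARE", "EXPIRE", key, str(ttl))
    read_reply()
    send_command("COMMIT")
    # With the typed reply prefixes, the client just reads two replies back
    # without remembering which prepared command produced which reply type.
    return [read_reply(), read_reply()]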

LOCKS

That's locking with key-granularity:

LOCK key timeout-in-seconds
UNLOCK key
TRYLOCK key timeout-in-seconds

LOCK and TRYLOCK are the same locking operation, but while LOCK blocks
until the resource is available, TRYLOCK returns immediately with 1 or 0
(1 if the locking succeeded, 0 if the resource is busy). UNLOCK unlocks,
of course. If the client does not UNLOCK the key within the specified
amount of seconds, the server will force the unlock and close the
connection. LOCK can't be used with PREPARE:
PREPARE LOCK mykey 5 is invalid and will return an error.
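
A sketch (again not from the original email) of using the proposed
LOCK/UNLOCK pair as a client-side critical section, with the same
hypothetical send_command/read_reply helpers; names and semantics are
assumptions based only on the description above:

from contextlib import contextmanager

@contextmanager
def redis_lock(send_command, read_reply, token, timeout_secs=5):
    send_command("LOCK", token, str(timeout_secs))   # blocks until available
    read_reply()
    try:
        yield
    finally:
        send_command("UNLOCK", token)
        read_reply()

# Usage: everything inside the "with" block is serialized against other
# clients locking the same token.
# with redis_lock(send_command, read_reply, "uid1000-writes"):
#     ...issue the commands that must stay in step...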

That's all about new features.

PEOPLE USING Redis-git

Ok, now I'm going to implement all these important changes in the next
few days. The server will be pretty unstable, but I noticed that a lot of
people are using the SVN/Git version. Should I instead create an
unstable branch? Any hint is appreciated.

Thanks for your patience!
Salvatore

Salvatore 'antirez' Sanfilippo
http://antirez.com

Organizations which design systems are constrained to produce designs
which are copies of the communication structures of these
organizations.

Conway's Law

Pedro Melo

Mar 23, 2009, 8:21:16 AM
to redi...@googlegroups.com
Hi,

On Mar 23, 2009, at 11:13 AM, Salvatore Sanfilippo wrote:

> this email is about three things. One is about a protocol change
> needed, please don't be hungry with me ;)

I also do hope that cannibalism is not an option here. :)


> Now what is needed is that the client libraries will be in sync with
> the new protocol when the new version of Redis implementing it will be
> on Git. How to handle this? I think I'll implement this features in my
> local computer, send a tar.gz link here in the mailing list, and wait
> until all the developers of client libraries will sync with the
> changes. They only way I can help about this is to take care of PHP or
> Ruby libs changes if needed, I don't know Python and Erlang enough.

Why not create a new git branch for the new protocol?

Easy to accept contributions.


> PEOPLE USING Redis-git
>
> ok now I'm going to implement all this important changes in the next
> days. The server will be pretty unstable, but I noticed that a lot of
> people are using the SVN/Git version. What should I do instead, to
> create an unstable branch? Any hint is appreciated.

Use a topic branch for unstable development. It's easy and cheap. And
you keep master available for interim releases (high-priority bug
fixes and whatnot).

Just

git checkout -b expire-feature

and hack away. You can have a topic per feature (that's my personal
preference) or a single unstable branch, your call.

You should keep master clean and as stable as possible, and only merge
or rebase topic branches into master when they are stable enough.

Best regards,
--
Pedro Melo
Blog: http://www.simplicidade.org/notes/
XMPP ID: me...@simplicidade.org
Use XMPP!


Salvatore Sanfilippo

Mar 23, 2009, 8:31:11 AM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 1:21 PM, Pedro Melo <me...@simplicidade.org> wrote:

>> needed, please don't be hungry with me ;)
>
> I also do hope that cannibalism is not an option here. :)

Haha :) sorry, I meant angry of course.


> Why not create a new git branch for the new protocol?
>
> Easy to accept contributions.

Ok, I'll try to figure out how to do this. That's different from a topic
branch, I guess. I currently know only very basic git stuff, like 'git
add', 'git push', 'git commit -a', and so on. It is not clear to me
how I can switch from one branch to another once I branch, or whether a
new branch is just another directory and another object inside the
repository, and how to merge with the master branch later.

> Use a topic branch for unstable development. It easy and cheap. And
> you keep master available for interim releases (high-priority bug
> fixes and whatnot).
>
> Just
>
>    git checkout -b expire-feature
>
> and hack away. You can have a topic per feature (thats my personal
> preference) or a single unstable branch, your call.

Ok, I'll try this, thanks

> You should keep master clean and as stable as possible, and only merge
> or rebase topic branches into master when they are stable enough.

What if I need to fix master while I'm hacking after 'git checkout
-b'? What's the command to return back to master, and then to return
again to my feature branch? Thank you very much.

Cheers,
Salvatore

> Best regards,
> --
> Pedro Melo
> Blog: http://www.simplicidade.org/notes/
> XMPP ID: me...@simplicidade.org
> Use XMPP!

themcgruff

Mar 23, 2009, 8:33:18 AM
to Redis DB
Hey Salvatore,

Not sure what I think about all of the other changes (if implemented
exactly as described). However, you can minimize the interruption by
using a (git) branch and pushing it up to GitHub. Then when
everything is ready to go, that branch just gets merged back into
master. (Alternatively you could start tagging releases and have
people point to those tags. Probably also a good idea.)

Since you are new to git / Github: http://github.com/guides/push-a-branch-to-github.

--Taylor

Salvatore Sanfilippo

Mar 23, 2009, 8:37:47 AM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 1:33 PM, themcgruff <themc...@gmail.com> wrote:
>
> Hey Salvatore,

Hola Taylor,

> Not sure what I think about all of the other changes (if implemented

Please feel free to bash every part of the new commands if you don't
like the semantics or the interface.

> exactly as described).  However, you minimize the interruption by
> using a (git) branch and pushing it up to Github.  Then when
> everything is ready to go, that branch just gets merged back in to
> master.  (Alternatively you could start tagging releases and having
> people point to those tags.  Probably also a good idea.)
>
> Since you are new to git / Github: http://github.com/guides/push-a-branch-to-github.

Ok thank you very much, this was really needed.

Ciao,
Salvatore

--
Salvatore 'antirez' Sanfilippo

Salvatore Sanfilippo

Mar 23, 2009, 8:50:46 AM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 1:31 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:

> What if I need to fix master while I'm hacking after 'git checkout
> -b'? What's the command to return back to master, and then to return
> again to my feature branch? Thank you very much.

Ok this one is saving me:
http://www-cs-students.stanford.edu/~blynn/gitmagic/ch04.html

Thanks :)
Salvatore

Pedro Melo

Mar 23, 2009, 8:56:29 AM
to redi...@googlegroups.com
Hi,

On Mar 23, 2009, at 12:31 PM, Salvatore Sanfilippo wrote:

>
> On Mon, Mar 23, 2009 at 1:21 PM, Pedro Melo <me...@simplicidade.org>
> wrote:
>> Why not create a new git branch for the new protocol?
>>
>> Easy to accept contributions.
>
> Ok I'll try to figure how to do this. That's different from a topic
> branch I guess.

No, a branch is a branch. All the same thing. With git, a "branch" is
just a name pointing to a specific commit.

Just

git checkout -b protocol-ng

and hack. When you are ready to merge that to the stable release there
are several ways to do it.

Personally I usually do a test integration branch:

git checkout -b intg master
git merge protocol-ng

This allows me to test the (at the time) current master with the new
protocol. If all goes well, you can do the real thing:

git checkout master
git merge protocol-ng

The intg branch is just a staging place, to test the merge. You can
delete it with

git branch -D intg

(the -D forces the delete. -d is a safer option but won't let you
delete if the intg branch is not merged into the current branch).


> I currently know only very basic git stuff, like 'git
> add', 'git push', 'git commit -a', and so on. It is not clear to me
> how I can switch from a branch to another one once I branch,

Assuming that your working directory is clean (that is, git status shows
no modified files), you can switch branches with

git checkout branch_name

You can use

git branch -a

to list all branches.


> or of a
> new branch is just another directory and another object inside the
> repository and how to merge with the master branch later.

See merge above. Branches are not implemented as directories.


>> You should keep master clean and as stable as possible, and only
>> merge
>> or rebase topic branches into master when they are stable enough.
>
> What if I need to fix master while I'm hacking after 'git checkout
> -b'? What's the command to return back to master, and then to return
> again to my feature branch? Thank you very much.

Simple, but there are two scenarios at the start:

1) you are in "branch_X", with a clean workdir (no modified files), and
want to fix a bug in master;
2) you are in "branch_X", with a dirty workdir (modified files), and
want to fix a bug in master;

The first one is simple. You just switch to master (git checkout
master), do your bug fixing, commit, push, whatever. When you are
done, you can go back to your branch with git checkout branch_X.

If you are in 2), then you need to stash away your modified files. The
easiest way to do this (I'm assuming a recent, > 1.6.x, release; I
strongly recommend using at least 1.6, I'm using 1.6.2.1 right now) is
to use the stash. Just run:

git stash

This will save your modified files and give you a clean workdir (new
untracked files are not stashed though, but they shouldn't be a problem).
Then you follow the process above, as in 1). Afterwards, when you are back
on branch_X, run:
branch_X, run:

git stash pop

This will pop the latest stash entry and apply it.

Keep on working...

I'm new to Redis, just found out about it last week, and I'm still
learning, but I've been using git for a couple of years now. If you get
into trouble, feel free to ask me for help. Direct contacts are in the
signature.

Salvatore Sanfilippo

Mar 23, 2009, 9:00:04 AM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 1:56 PM, Pedro Melo <me...@simplicidade.org> wrote:

> Simple, but there are two scenarios at the start:

Pedro, thank you very much! Your hints are invaluable. I'm sorry for
not reading/learning more about Git, but for now my main goal is to
continue hacking on Redis, and your help is much appreciated.

Cheers,
Salvatore

András Bártházi

Mar 23, 2009, 9:32:18 AM
to redi...@googlegroups.com
Hi,

> That's easy: EXPRE mykey milliseconds. Yes, here we can have

Just two ideas:
What about an expire type that resets the time every time you _read_ a
key? Actually you can do this by setting the expire time again and
again on read, so maybe that'll be enough.

What about setting an exact time for expiring? This can be worked
around as well, by calculating the difference between the current
time and the future time, and setting the value based on this.

Is it a typo, or why is it not "EXPIRE mykey milliseconds"?

Bye,
Andras

Valentino Volonghi

Mar 23, 2009, 12:20:20 PM
to redi...@googlegroups.com

On Mar 23, 2009, at 4:13 AM, Salvatore Sanfilippo wrote:

> PROPOSAL OF CHANGE
>
> 1) Error messages are unified, and start with the "-" byte. "-ERR
> foobar" or '-this is an error" are all errors. The error message is
> what follows "-".

Why do you want to keep the ambiguity of -1 being both a value
that can be returned and an error message? Why not simply
have a line (a unit that redis already uses) without any length
prefix (useless for line based communication) that starts with
-ERR, where you can do anything that you want? It's by far the
simplest thing you can do, and actually I wouldn't need to change
anything in my client.

> 2) Return code replies, and single line replies, are unified in the
> form of "+<string>". SET used to return a return code message and will
> continue to return "+OK", but probably everybody will discard the
> mssage "OK" since it's not important (the only thing that matters if
> that SET didn't reported an error), while RANDOMKEY will reply with
> "+arandomkey" and so on. TYPE will reply with "+list" or "+set" ...
> 3) Bulk replies will be exactly like they are in the current protocol,
> but prefixed by the byte "$"
> 4) Multi bulk replies will be exactly like before too, but prefixed by
> the byte "*"

Ok, so I may not need to know which parser to use before the answer
arrives; I need to review the code first. This would allow me to avoid
the parser stack for pipelining. I'm actually ok with this.

> 5) Integer replies will be exactly like now just a number, but
> prefixed by the ":" byte. For example LLEN may return ":334\r\n" and
> so forth. You may wonder why integer replies are not just single line
> replies (point number 2 of this proposal). in theory the protocol is
> the same, but the information that there is an integer stored will
> allow library clients to return the right type back. For example a
> Ruby client will call the .to_i method against the string before
> returning it.

I don't see how this was hard before. LLEN would return an int anyway.
In most of the other places redis wouldn't know if something is an int
or a string anyway.

> Now what is needed is that the client libraries will be in sync with
> the new protocol when the new version of Redis implementing it will be
> on Git. How to handle this? I think I'll implement this features in my
> local computer, send a tar.gz link here in the mailing list, and wait
> until all the developers of client libraries will sync with the
> changes. They only way I can help about this is to take care of PHP or
> Ruby libs changes if needed, I don't know Python and Erlang enough.

Just release the new version of redis on git and wait a day or two for
erldis to be synced.

> LOCKS
>
> That's locking with key-granularity:
>
> LOCK key timeout-in-seconds
> UNLOCK key
> TRYLOCK key timeout-in-seconds
>
> LOCK and TRYLOCK are the same locking operation, but while LOCK is
> blocking until the resource is available, TRYLOCK will return 1 or 0
> (1 if the locking succeeded, 0 if the resource is busy) and return
> immediately. UNLOCK unlocks, of course. If the client does not UNLOCK
> the key in the amount of seconds specified the server will force the
> unlock and close the connection. LOCK can't be used with PREPARE.
> PREPARE LOCK mykey 5 is invalid and will return an error.

LOCKs are redundant when you have transactions. Forget about them
and go with the transactions alone.

> That's all about new features.
>
> PEOPLE USING Redis-git
>
> ok now I'm going to implement all this important changes in the next
> days. The server will be pretty unstable, but I noticed that a lot of
> people are using the SVN/Git version. What should I do instead, to
> create an unstable branch? Any hint is appreciated.

The people you see on github are just following development.

--
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com


Salvatore Sanfilippo

Mar 23, 2009, 12:34:25 PM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 5:20 PM, Valentino Volonghi <dial...@gmail.com> wrote:

Hello Valentino,

>> 1) Error messages are unified, and start with the "-" byte. "-ERR
>> foobar" or '-this is an error" are all errors. The error message is
>> what follows "-".
>
> Why do you want to keep ambiguity of -1 being both a value
> that can be returned and -1 as error message? Why not simply

There is no longer any ambiguity, since every integer reply starts with ':'.
Basically every kind of reply has a one-byte prefix, that is, a byte
indicating the type of reply that will follow. Errors happen to start
with the byte '-'. A "-1" reply can only be an error, since an
integer reply with value -1 will be ":-1".

>> 5) Integer replies will be exactly like now just a number, but
>> prefixed by the ":" byte. For example LLEN may return ":334\r\n" and
>> so forth. You may wonder why integer replies are not just single line
>> replies (point number 2 of this proposal). in theory the protocol is
>> the same, but the information that there is an integer stored will
>> allow library clients to return the right type back. For example a
>> Ruby client will call the .to_i method against the string before
>> returning it.
>
> I don't see how this was hard before. LLEN would anyway return
> an int. For most of the other places instead redis wouldn't know if
> something is an int or a string anyway.

Sure, but now you don't even need to know the type returned.
Every request will be implemented this way:

(In pseudo-ruby-alike-code)

def mycommand
  send "MYCOMMAND\r\n"
  get_reply
end

get_reply will know what to return just by reading from the server,
without the need to know which command issued the request.
Also, get_reply knows that if it gets any kind of reply starting with
"-" it's an error message, and the client must raise an error and
quit. Examples:

LLEN foobar
:100

LLEN thisIsNotAListKey
-Operating against key not holding a List value

We can even have commands that reply one type or another one, in a
polymorphic way.

GET foo
$3
bar

GET mylist
*2
3
foo
3
bar

Indeed the first thing I'll do is allow this. Possibly even removing
commands like SMEMBERS.
Also LRANGE mylist 0 -1 is now just GET mylist.

This change is so important it reflects not only in the protocol, but
in the client implementation.

For instance the Ruby client overloads the [] operator (speaking in C++
terms; in Ruby of course it's just syntactic sugar and things are
unified). Now it will be possible to use r['mylist'] to get the list.
That's cool I think.

> Just release the new version of redis on git and wait a day or two for
> erldis to be synced.

Great, thank you

>> LOCKS

> LOCKs are redundant when you have transactions. Forget about them
> and go with the transactions alone.

Not really:

Redis.lock('mykey')
if Redis.llength('mykey') > 10
  # ...
end
Redis.unlock('mykey')

Transactions are unconditional.
Not only this, but Redis can be used as a distributed locking system
with LOCK! There will be people ignoring all the rest, the DB part, and
using Redis just as a synchronization server. This seems like a good
idea.

> The people you see on github are just following development.

Yeah, for now yes, but with SVN I noticed many people used to check out
the SVN source instead of getting the tarball. Anyway, even with people
just following development, I want to provide everybody with decent
quality in master: it will not be rock solid but should not be broken :)

Thanks for your comments,
Salvatore

>
> --
> Valentino Volonghi aka Dialtone
> Now running MacOS X 10.5
> Home Page: http://www.twisted.it
> http://www.adroll.com

Valentino Volonghi

Mar 23, 2009, 1:17:15 PM
to redi...@googlegroups.com

On Mar 23, 2009, at 9:34 AM, Salvatore Sanfilippo wrote:

> There is no longer ambiguity, since every integer reply starts with
> ':'.
> Basically every kind of reply has a one-byte prefix, that is, a byte
> indicating the type of reply that will follow. The errors happen to
> start with byte '-'. a "-1" reply can only be an error, since an
> integer reply with value -1 will be ":-1"

I still don't see it... Why is it so hard to have a single unified way
of representing errors? Why are you so opposed to having every error
message start with -ERR: ?

>> LOCKs are redundant when you have transactions. Forget about them
>> and go with the transactions alone.
>
> Not really:
>
> Redis.lock('mykey')
> if (Redis.llength('mykey') > 10) {
> ....
> }
> Redis.unlock('mykey')
>
> Transactions are unconditional.

Transactions can be implemented through the use of locks, but for the
application's sake they are pretty equivalent, and actually transactions
give you more room to remove locks in the future (to use MVCC for
example).

Also, for the kind of job that redis needs to do, eventual consistency
is a lot better than actually locking all the keys you need and then
acting on them.

If it helps your case, transactions can be thought of as multiple locks
taken at the same time; you can still have them expire with a rollback
after a certain amount of milliseconds/seconds.

> Not only this, but Redis can be used as a distributed locking system
> with LOCK! There will be people ignoring all the rest, the Db part, to
> use Redis just as a synchronization server. This seems like a good
> idea.

LOCKing is never a good idea; it might happen that sometimes it's the
only option, but that doesn't make it right. Distributed locking is just
shockingly hard. Locks in these cases can be removed by making better
use of the flow of information in your system.

>> The people you see on github are just following development.
>
> Yeah for now yes, but with the SVN I noticed many people just used to
> checkout the SVN source instead to get the tarball. Anyway even with
> people following the development, I want to provide everybody with a
> decent quality with the master, it will not be rock solid but should
> not be broken :)

That's because the project is still early in development, so new features
and fixes are very welcome. Until there's a "stable" non-beta release
this will always be the case, I believe.


Salvatore Sanfilippo

Mar 23, 2009, 1:37:55 PM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 6:17 PM, Valentino Volonghi <dial...@gmail.com> wrote:
>
> On Mar 23, 2009, at 9:34 AM, Salvatore Sanfilippo wrote:
>
>> There is no longer ambiguity, since every integer reply starts with ':'.
>> Basically every kind of reply has a one-byte prefix, that is, a byte
>> indicating the type of reply that will follow. The errors happen to
>> start with byte '-'. a "-1" reply can only be an error, since an
>> integer reply with value -1 will be ":-1"
>
> I still don't see it... Why is it so hard to have a single unified way of
> representing errors? Why are you so opposed to have every error
> message start with -ERR: ?

Ok, sorry, I think I was not clear enough. The change involves exactly
this! There is now a unified way of representing errors, that is, a
message starting with "-".

This will be the only way to report errors:

LLEN notalist
-Wrong type of object

LINDEX foo 29344234234
-Out of range

And so on. If the first byte is "-" what follows is an error.
If the first byte is a ":" what follows is an integer. "$" is a bulk,
"*" is a multi bulk, and so on.

> Transactions can be implemented through the use of locks, but for
> the application's sake they are pretty equivalent and actually transactions
> give you more room to remove locks in the future (to use
> MVCC for example).

I think they fix different stuff.

Transactions in Redis address the problem that you may want, even in the
case of network errors, that N commands are all sent, or none of them.

So that if you send:

SET x foo
EXPIRE x 1000

It will never happen that your connection hangs after the SET and you
have a key without the EXPIRE set. Another example is if you have
lists composed of two-element pairs: you may want to push two
elements, or nothing. And so on.

How do locks handle this? The guarantee that transactions are also
atomically executed one after the other is just a plus; the problem
they are mainly trying to address is another one.

> Also for the kind of job that redis needs to do eventual consistency is
> a lot better than actually locking all the keys you need and then act on
> them.

Locking is not about consistency in the use case of Redis, I think;
it is more about letting the user create atomic operations that
are missing, and too specific to appear as commands.

> If it helps your case, transactions can be thought of multiple locks at
> the same time, you can still have them expire with a rollback after a
> certain amount of milliseconds/seconds.

Implementing rollbacks is hard: we need to know how to undo every
operation. How do you undo a push against a list after X milliseconds?
Maybe others already pushed to the same list; if you pop, you are
popping other things.

This is why I think the model of transactions used to guarantee
serialization and actual execution of commands (not rollbacks: the
ROLLBACK command in Redis just deletes the prepared commands, it's not
an UNDO of the operations), together with locking used to guarantee
serialization of commands, is the best we can hope to have.

We can serialize things. We can also make sure groups of related
commands are executed or not. But we don't need the ability to UNDO
operations, that's very hard, complex and possibly slow.

> LOCKing is never a good idea, it might happen that sometimes it's the
> only one but it doesn't make it right. distributed locking is just
> shockingly
> hard. Locks in these cases can be removed by making better use of the
> flow of information in your system.

Right, I think that Redis primitives let you write a lot of lock-free
code.
This is The Right Thing to do, when it's possible. Sometimes it's just
not possible. What's wrong with locking with key granularity when you
need it? I know that I can implement locking with key granularity in
O(1) in Redis. So we'll have a fast, simple to use locking system
that does not slow down multiple clients like it happens in relational
DBs, since locking a given key has no effect at all on other keys, and
locking is an in-memory business.

I think it's really worth it for complex problems. For simpler ones
it's better to invent a lock-free way to solve them. But Redis doesn't
want to force a given design. It will be up to the programmer to
design the best possible Redis usage to model the given problem.


> That's because the project is till early in development so new features
> and fixes are much welcome. Until there's a "stable" non beta release
> this will always be the case I believe.

Yeah probably I just need to care less about this until we have some
kind of stable release. Once there is a Redis-stable of course I'll
take care of fixing critical bugs in the stable branch while hacking
in the new one.

Cheers,
Salvatore

>
> --
> Valentino Volonghi aka Dialtone
> Now running MacOS X 10.5
> Home Page: http://www.twisted.it
> http://www.adroll.com

Valentino Volonghi

Mar 23, 2009, 2:43:25 PM
to redi...@googlegroups.com

On Mar 23, 2009, at 10:37 AM, Salvatore Sanfilippo wrote:

> I think they fix different stuff.
>
> Transactions in Redis fix the problem that you may want, even in case
> of network errors, that N commands are sent all, or none of them.
>
> So that if you send:
>
> SET x foo
> EXPIRE x 1000
>
> It will never happen that your connection hungs after "SET" and you
> have a key without the EXPIRE set. Another example is if you have
> lists composed of two element paris. You may want to push two
> elements, or nothing. And so on.
>
> How locks handle this? The guarantee that Transictions are also
> atomically executed one after the other is just a plus, but the
> problem they are trying to address is another one mainly.

Not really, transactions serve the purpose of executing a set of
instructions atomically, durably, isolately (is this a word?), and
consistently across a given set of data (table, timestamp-table-key
bucket, rows, columns, keys, etc). Locks don't guarantee anything
except for occasional deadlocks and race conditions.

Also transactions don't need to be executed one after the other,
that's exactly
why MVCC and eventual consistency exist. Most of the transactions
actually
don't work on the same datasets (or overlapping datasets) and this
means that
most of the time pessimistic locking (that is what you want to
implement)
imposes a very high overhead by locking without a real need to.

Sending all operations or none has exactly the same effect, in all
reasonable cases, as executing all or none of them, since I'm actually
interested in the execution of such commands and not their simple
delivery.

What I'm trying to explain here is that locks are a __low__ (very very
low) level primitive for concurrency, and as with every low level API
they are better taken care of in the lowest levels of your system and
not in application code (can you imagine an RDBMS where locking was
always explicitly done at the application level instead of being dealt
with only when the application actually needed this particular feature?)

Let's be clear:

If I can't set EXPIRE x 10000 after SET x then I have a problem.

> Locking are not about consistency in the use case of Redis I think,
> they are more about letting the user to create atomic operations that
> are missing, and too specific for appear as commands.

And why would you want atomic operations if not for consistency?

>> If it helps your case, transactions can be thought of multiple
>> locks at
>> the same time, you can still have them expire with a rollback after a
>> certain amount of milliseconds/seconds.
>
> To implement rollbacks is hard, we need to know how to undo every
> operation.
> How to undo after X milliseconds a Push against a List? Maybe other
> pushed
> already in the same list, if you pop you are popping other things.

If you plan to implement transactions then you must also provide a
rollback mechanism. I think the actual problem here is that you are
trying to make redis too similar to an RDBMS, and SORT with all the
extra options is already way too similar to a:
SELECT ... FROM pattern, pattern2, pattern3 WHERE condition ORDER BY...

as much as I think that it probably makes sense to actually use this
syntax instead.

At the very least anyway, if you plan to not have rollback then don't
call this
feature 'transactions' because that's not what they are, they are some
form
of guaranteed delivery of the content. The keyword for the command might
as well be: DELIVER.

> This is why I think the model of transactions used to guaranteed
> serialization
> and actual executions of commands (and not rollbacks, the ROLLBACK
> command in Redis just deletes the prepared commands, it's not an UNDO
> for the operations), and locking used to guarantee serialization of
> commands, is the best we can hope to have.

Nope, locking by itself will only be useful to make users of redis
more prone
to making stupid mistakes and bad design decisions.

> We can serialize things. We can also make sure groups of related
> commands are executed or not. But we don't need the ability to UNDO
> operations, that's very hard, complex and possibly slow.

How exactly do you plan to implement a transaction when the first N
operations work but N+1 fails? You just leave the database in a broken
state? This is equivalent to corruption because data in it is not well
formed,
you might as well get rid of it and save space.

>> LOCKing is never a good idea, it might happen that sometimes it's the
>> only one but it doesn't make it right. distributed locking is just
>> shockingly
>> hard. Locks in these cases can be removed by making better use of the
>> flow of information in your system.
>
> Right, I think that Redis primitives make you able to write a lot of
> locking free code.

Yes, and that's why I really like those primitives and why I chose to
write the erlang client in the first place.

> This is The Right Thing to do, when it's possible. Sometimes it's just
> not possible. What's wrong with locking with key-granularity when you
> need it? I know that I can implement locking with key granularity in
> O(1) in Redis. So we'll have a fast, simple to use locking system,
> that does not slow down multiple clients like it happens in relational
> DB since locking a given key has no effects at all on other keys, and
> Locking is an in-memory business.

If N clients access the same key they will all wait until the first one
has finished its business; how can locking not slow things down? Also,
this is a simple read-write lock: what about those clients that are only
interested in reading? Are you going to lock them too? What happens
when, in the future, you add multiple listeners on the socket /
multiple sockets?

> I think it's really worth it for complex problems. For simpler ones
> it's better to invent a locking free way to solve it. But Redis don't
> want to force a given design. It will be up to the programmer to
> design the best Redis usage possible to model the given problem.

The problem is not about not forcing a design; it's about not making bad
designs easy and misuse common. If "you" (as in a redis user that wants
locking or similar complex primitives) still want to go down a 'bad'
path then feel free to, but let's just not have everyone think that
it's among the possible choices.


Salvatore Sanfilippo

Mar 23, 2009, 3:36:24 PM
to redi...@googlegroups.com
On Mon, Mar 23, 2009 at 7:43 PM, Valentino Volonghi <dial...@gmail.com> wrote:

> Not really, transactions serve the purpose of executing a set of
> instructions
> atomically, durably, isolately (? is this s word?), consistently across a
> given
> set of data (table, timestamp-table-key bucket, rows, columns, keys, etc).
> Locks don't guarantee anything except for occasional deadlocks and race
> conditions.

Ok, sorry, this is probably a misunderstanding due to wrong usage of
words. I don't mean transactions in the commonly accepted sense, but
just the feature I described.
"Delivery" to use your word. With the ability to undo the delivery queue using a
command (I called it ROLLBACK but it's better to find a new name I guess).

> Sending all operations or none has exactly the same effects, by all
> reasonable cases, of executing all or none of them since I'm actually
> interested in the execution of such commands and not their simple
> delivery.

Yep, DELIVERY just wants to make sure that either all the commands or
none get executed. What's the alternative proposal to get this effect
without involving a rollback? A rollback is more or less impossible to
implement in Redis without an incredible rise in complexity.

> What I'm trying to explain here is that locks are a __low__ (very very low)
> level primitive for concurrency and as every low level API they are better
> when taken care of in the lowest levels of your system and not in
> application
> code (can you imagine an RDBMS where locking was always explicitly done
> at application level instead of being dealt with only when the application
> actually needed this particolar feature?)

I think that there are low level locks and higher level ones.

For example in Redis it does not make sense to lock a read; locking is
only a serialization primitive.

I mean, assume you have N threads incrementing a 64 bit value on a 32
bit processor.
You don't have an assembler instruction that increments this value, so
you need to implement write locks and read locks to increment or even
just read that value.

This is not the case with Redis. The granularity is the Redis
instruction. You need to lock only when there is a need to serialize
a number of operations like check this value, set this, increment if
this is less than, and so on.

Basically this is how LOCK works in Redis:

LOCK key timeout, right? But "key" is actually a token. If you want to
lock keys "a" and "b" you can write:

LOCK a,b 5

It's just a string for redis. There is nothing low-level involved. It
is just a simple primitive you can use in order to serialize different
clients doing complex operations: a higher level operation that is not
about keys at all, but string tokens.

> Let's be clear:
>
> If I can't set EXPIRE x 10000 after SET x then I have a problem.

I don't get it. What if your connection just drops?

>> Locking are not about consistency in the use case of Redis I think,
>> they are more about letting the user to create atomic operations that
>> are missing, and too specific for appear as commands.
>
> And why would you want atomic operations if not for consistency?

Ok, again a 'wording' problem I guess.

You want locking to serialize, in order to avoid race conditions. You
may want to call this consistency or not. Basically locking is not
about problems with dropped connections and so on, but about clients
issuing multiple commands against the same key and needing the
guarantee that this key, or keys, will not change in the meantime.

For example:

LOCK uid1000-writes
LAPPEND uid1000:message foo
LAPPEND uid1000:message_time 192929343
UNLOCK uid1000-writes

Assuming you have two lists and want to be sure that element N of a
list is related to element N of the other list.

> If you plan to implement transactions then you must also provide a rollback
> mechanism. I think the actual problem here is that you are trying to make
> redis
> too similar to an rdbms, and the sort with all the extra options is already

Again, probably transactions was a bad way to say it. I want to
implement a "guaranteed delivery" primitive.

> way
> too similar to a:
> SELECT ... FROM pattern, pattern2, pattern3 WHERE condition ORDER BY...
>
> as much as I think that it probably makes sense to actually use this syntax
> instead.

I don't like SORT either, but I think that it is simply needed. Too
important to be ignored.
People will end up trying to sort stuff they have in Redis. Without SORT
it's a "get all this data, then sort it client-side" business.
Basically you can't use Redis at all in those scenarios. SORT may be an
ugly hack for a key-value DB, but the power you get from it will allow
Redis to be used for a lot of different things.

An example? If you want a clone of Hacker News, let's take a TRIMmed
queue of the last 1000 news items.
Have a client updating the news_weights according to scores and
news age. Use SORT to generate the front page. Sorting a 1000-element
list against weights takes... less than 1 millisecond on a slow box.

Try to do this without SORT. It's ugly, but very powerful.
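
A rough sketch (not in the original email) of that Hacker News idea,
using a hypothetical r.execute(*args) helper that sends a raw command and
returns the parsed reply. The command and key names are illustrative; in
particular the exact SORT options (BY / LIMIT / DESC) are assumed from the
"extra options" mentioned earlier in the thread:

def post_story(r, story_id):
    r.execute("LPUSH", "news.ids", story_id)     # newest first
    r.execute("LTRIM", "news.ids", "0", "999")   # keep a trimmed queue of 1000

def update_weight(r, story_id, weight):
    # A background client recomputes weights from score and age.
    r.execute("SET", "news.weight_%s" % story_id, str(weight))

def front_page(r, count=30):
    return r.execute("SORT", "news.ids",
                     "BY", "news.weight_*",
                     "LIMIT", "0", str(count),
                     "DESC")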

> At the very least anyway, if you plan to not have rollback then don't call
> this
> feature 'transactions' because that's not what they are, they are some form
> of guaranteed delivery of the content. The keyword for the command might
> as well be: DELIVER.

100% agree on this.

Now that it's called DELIVER, and does just a single thing, that is,
guarantee the delivery, I wonder if you think it's a bad feature or
not, and how to deal with the SET/EXPIRE problem.


> How exactly do you plan to implement a transaction when the first N
> operations work but N+1 fails? You just leave the database in a broken
> state? This is equivalent to corruption because data in it is not well
> formed,
> you might as well get rid of it and save space.

You get all the replies back. It's possible to check what happened.
This is one reason the protocol changes are needed. After DELIVERY you
get an array of stuff back from the client library. Btw, DELIVERY is
mostly useful for commands that will not fail, like SET+EXPIRE.

> Yes, and that's why I really like those primitives and why I chose to write
> the erlang client in first place.

LOCK will just give you the ability to model the missing ones.
Redis is already locking for you, in a way you can't see, to implement
every one of these primitives.
LOCK is just an extension of this. It's not a low level key-locking
primitive, but just a serialization primitive.

> If N clients access the same key they will all wait until the first one has
> finished its business, how can locking not slow things down? Also this
> is a simple read-write lock, what about those clients that are only
> interested
> in reading? Are you going to lock on them too? What happens when in the
> future you'll add multiple listeners on the socket/multiple sockets?

Ok, locking is not really about keys. It's about tokens.

Basically, if you access without using the LOCK command, Redis will not
care and will give you back everything even if there are clients locking
that key. Also, LOCK and UNLOCK are very fast. So what will happen is
that most of the time, even with a lot of clients, there will be no
difference. A client will only wait if it tries to lock *while* another
one holds the same token. In a web application this will happen only if
you have a zillion pageviews per second, or if the Redis user is dumb
and uses LOCK / BLABLA / UNLOCK where BLABLA is a very very slow
operation.

One client will only wait if some other client issued LOCK "string",
it also issues LOCK "string", and the first client hasn't called
UNLOCK "string" yet because it's busy.

Valentino, I want to stress one thing that you already said: most of the
time your design should NOT involve locking. But what if you need it?
Maybe to lock a resource that is not about Redis at all, but external,
and needs some form of serialization. If guys use locking in a bad
way... who cares? It's their business; we can give tools, but Redis is
not Java ;)
Sensible people will not use locking most of the time, but only when
it's really needed and useful.

> The problem is not to not force a design, it's to not make bad designs
> easy and misuse common. If "you" (as in redis user than wants locking
> or similar complex primitives) still want to go down a 'bad' path
> then feel free to, but let's just not have everyone think that it's among
> the possible choices.

Ok, basically this is what I want to do: since EXPIRE will be enough
to hack against, I'll not add the other features for now but start
from this one. Then I'll try to work more and more on real application
development with Redis.

I think that after a few weeks of writing real code against it, both
mine and other users', it will be clear whether there is a need.
Basically, if we get requests on the ML of the form "I require this new
atomic primitive", and these requests are often about primitives that
are not general, we'll know we are going to need locking (if it's not
just a matter of wrong design decisions that led this guy to end up
needing the new primitive).

Otherwise maybe the primitives we already have are enough, even
without locking, to do all kinds of nice things.

Another signal that we need locking is if people start to use RENAME as
a locking primitive a lot, since it is absolutely possible to do in
theory, but it's slower, does not provide blocking (it's only like
having TRYLOCK and UNLOCK), and it's more difficult to use.

Thanks for your email,
Salvatore

András Bártházi

Mar 23, 2009, 3:57:35 PM
to redi...@googlegroups.com
Hi,

> "Delivery" to use your word. With the ability to undo the delivery queue using a
> command (I called it ROLLBACK but it's better to find a new name I guess).

Actually there's no need for a real rollback in Redis. If the server
fails during an atomic "delivery", then it will simply fail to write the
data to disk the next time. As I see it, Redis doesn't support other
types of failure at the moment (it doesn't handle server failure either).

> LOCK a,b 5

If the separator is a space for multiple values in MGET, it may be a
bad idea to use commas here. Or, in other words, it would be better
to use commas for MGET as well.

Bye,
Andras

Salvatore Sanfilippo

Mar 23, 2009, 4:03:30 PM
to redi...@googlegroups.com
2009/3/23 András Bártházi <barthaz...@gmail.com>:
>
> Hi,
>
>> "Delivery" to use your word. With the ability to undo the delivery queue using a
>> command (I called it ROLLBACK but it's better to find a new name I guess).
>
> Actually there's no need for a real rollback in Redis. If the server
> fails during an atomic "delivery", then it will fail to write the data
> next time to the disk. As I see Redis doesn't support other type of
> failing at the moment (even not supporting server failure as well).

Well, needed or not, it's also almost impossible. How do you roll back a
PUSH operation reliably without a long lock on the specified key?

>> LOCK a,b 5
>
> If the separator is a space for multiple values at MGET, it may be a
> bad idea to use commas here. Or with other words, it would be better
> to use commas for MGET as well.

LOCK as designed takes "strings"; it's just a token.
You can write:

LOCK my,token----for-key-a
DOSOMETHINGWITH a
UNLOCK my,token----for-key-a

The next client will block while this one is doing DOSOMETHINGWITH
only if it uses the same "token" again.
It's not really key related. It is a general, higher level locking primitive.

Cheers,
Salvatore

>
> Bye,
>  Andras