Redis for Windows is now stronger
From: Dušan D. Majkić <dmaj...@gmail.com>
Date: Mon, 30 Apr 2012 11:53:45 +0200
Local: Mon, Apr 30 2012 5:53 am
Subject: Redis for Windows is now stronger
I was told it was never meant to be.
I was told that it will get me a lonesome, laughable spot.

Yet, I took that kid under my repo, showed him a trick
or two, like how to live without a fork. Made him simple,
constant and present.

I had the honor of communicating with Salvatore, Claudio,
and also with other people who contributed code or
pointed out glitches.

Now the time has come to let the kid move on.

The new home for Redis on Windows is fully open,
developed in the open; contributions are welcome. There are
more capable hands on deck. There is a fork
alternative, perhaps even better than fork itself.
It will be tested more and deployed.

Redis on Windows is now stronger.

So, rethink your project. Get involved. It is here:

https://github.com/MSOpenTech/redis

Thank you.


 
From: Salvatore Sanfilippo <anti...@gmail.com>
Date: Mon, 30 Apr 2012 12:04:07 +0200
Local: Mon, Apr 30 2012 6:04 am
Subject: Re: Redis for Windows is now stronger
Hi Dušan,

thank you for your email. I wonder if you have a solid understanding
of the current win32 user-space COW patch. I would love to discuss it
here in order to understand how it works, whether we can use this system
in Redis, and so forth.

At first glance (I did a diff) it looks a lot like ideas that have come
up here from time to time, not as a replacement for fork(), but as an
alternative persistence method that was not point-in-time, just
coherent at the key level.

So a few questions:

1) Is the current win32 implementation point-in-time as a whole, like
normal redis RDB files?
2) What happens when a big key is copied, does the main thread stop?

Thanks,
Salvatore


--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele


 
From: Dušan D. Majkić <dmaj...@gmail.com>
Date: Mon, 30 Apr 2012 13:11:56 +0200
Local: Mon, Apr 30 2012 7:11 am
Subject: Re: Redis for Windows is now stronger

> I would love to discuss it here in order to understand how
> it works, if we can use this system in Redis, and so forth.

I'm also a lurker there.

I believe that Claudio will share more as soon as it satisfies
his team's expectations.

> At first glance (I did a diff) it looks a lot like ideas that have come
> up here from time to time, not as a replacement for fork(), but as an
> alternative persistence method that was not point-in-time, just
> coherent at the key level.

There are two branches with different implementations.
There is the "bksavecow" branch which is an approach for
a fork() replacement at application level.

> 1) Is the current win32 implementation point-in-time as a whole, like
> normal redis RDB files?

Yes. The old value is preserved by copying the key/value pair to a list
(if background saving is active and the value changes).

> 2) What happens when a big key is copied, does the main thread stop?

This happens in the background saving thread, under a critical section:

  newval->ptr = zrealloc(newval->ptr, bytes);
  memcpy(newval->ptr, val->ptr, bytes);

It takes as much time as the hardware/OS needs to allocate memory
and copy the old value.

It should block the main thread only when the main thread wants to
enter that critical section. On the other hand, the memory copy should
be fast.

This does look interesting.
I also would love to see more info from Claudio.

BTW One can git-clone, open in VS and build.
That makes it extremely easy to play with Redis.

Regards.
Dusan Majkic


 
From: Salvatore Sanfilippo <anti...@gmail.com>
Date: Mon, 30 Apr 2012 16:35:21 +0200
Local: Mon, Apr 30 2012 10:35 am
Subject: Re: Redis for Windows is now stronger
On Mon, Apr 30, 2012 at 1:11 PM, Dušan D. Majkić <dmaj...@gmail.com> wrote:

>> 1) Is the current win32 implementation point-in-time as a whole, like
>> normal redis RDB files?

> Yes. The old value is preserved by copying the key/value pair to a list
> (if background saving is active and the value changes).

Ok, so if the saving thread is active, and a key is modified, it gets copied.
Similarly if a key is deleted the old value is preserved, so that it
can be used to persist.

>> 2) What happens when a big key is copied, does the main thread stop?

> This happens in the background saving thread, under a critical section:

>  newval->ptr = zrealloc(newval->ptr, bytes);
>  memcpy(newval->ptr, val->ptr, bytes);

Not sure how a Redis value, like a list, can be copied using memcpy(),
I'm probably missing something.
Maybe the win32 port uses different data structures that can be copied
in this way?

> BTW One can git-clone, open in VS and build.
> That makes it extremely easy to play with Redis.

I don't have Windows or VS, but I think that just looking closer at
the source code should make the implementation clear.
The problem is finding the time, but I hope to do it in the coming weeks.

Cheers,
Salvatore

From: Dušan D. Majkić <dmaj...@gmail.com>
Date: Mon, 30 Apr 2012 17:01:48 +0200
Local: Mon, Apr 30 2012 11:01 am
Subject: Re: Redis for Windows is now stronger

> Not sure how a Redis value, like a list, can be copied using memcpy(),
> I'm probably missing something.
> Maybe the win32 port uses different data structures that can be copied
> in this way?

There is a call to cowEnsureWriteCopy() on every key change/delete.
That function performs the COW inside a critical section. Here it is:

https://github.com/MSOpenTech/redis/blob/bksavecow/src/win32_cow.c#L467

Complex types are converted to a simple array, as here:

https://github.com/MSOpenTech/redis/blob/bksavecow/src/win32_cow.c#L235

It looks like only the key and value pointers are copied, without a
ref count increase, and deletion of the original key is deferred.


 
From: Claudio Caldato <ccald...@hotmail.com>
Date: Mon, 30 Apr 2012 19:27:10 +0000
Local: Mon, Apr 30 2012 3:27 pm
Subject: RE: Redis for Windows is now stronger

I’ll jump into this discussion with some details that may help explain how the code works.

There is an overview of how the code works in the win32_cow.c file (included below).

The code defers deletion of objects during the save – if the ref count is going to be 0, the reference is added to a list. After save is done, it decrements the ref count of all objects on the list.
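
A minimal sketch of the deferred-deletion idea described above, not the code from win32_cow.c. The names `obj`, `obj_decr`, `save_begin` and `save_end` are hypothetical, and the sketch assumes an object is released at most once after its count reaches 1:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ref-counted object; not the real Redis robj. */
typedef struct obj {
    int refcount;
    struct obj *next_deferred;  /* link in the deferred-release list */
} obj;

static int saving = 0;          /* nonzero while a background save runs */
static obj *deferred = NULL;    /* objects whose final release is postponed */
static int freed = 0;           /* counts frees, for demonstration only */

obj *obj_new(void) {
    obj *o = malloc(sizeof(*o));
    assert(o != NULL);
    o->refcount = 1;
    o->next_deferred = NULL;
    return o;
}

/* Drop one reference. If this would free the object while a save is
 * running, park it on the deferred list instead of freeing it. */
void obj_decr(obj *o) {
    assert(o->refcount > 0);
    if (o->refcount == 1 && saving) {
        o->next_deferred = deferred;
        deferred = o;
        return;                 /* refcount stays at 1 until the save ends */
    }
    if (--o->refcount == 0) {
        free(o);
        freed++;
    }
}

void save_begin(void) { saving = 1; }

/* After the save, apply the postponed decrements. */
void save_end(void) {
    saving = 0;
    while (deferred != NULL) {
        obj *o = deferred;
        deferred = o->next_deferred;
        o->next_deferred = NULL;
        obj_decr(o);
    }
}
```

An object released between save_begin() and save_end() stays readable by the saving thread and is only freed once save_end() runs. A real implementation would also need locking around the list, which this single-threaded sketch leaves out.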

Some objects are modified in place rather than deleted, typically collection objects such as sets, lists, and hash tables. For these objects the code makes a read-only version before they are modified. The read-only version is a more efficient form of storage, as it doesn’t need to support modifications. The save code uses the read-only version if it exists; otherwise it uses the normal collection.

The code uses special iterators for the saving code.

    - If the collection being iterated is not modified, it acts as a normal iterator.

    - If a read-only copy exists, it iterates over the read-only copy.

    - If a read-only copy is created while the save is in the middle of iterating over the original collection, the iterator is modified to switch to the read-only copy. Locks are used to ensure this is done safely.

As a result, if no updates are done during the save, there is very little extra overhead (no copy).

If some updates are done, there is some copying required, but only for the collection being modified.

If the same collection is modified n times during a save, only the first modification results in a copy.

The special read-only encoding of the collection is used to reduce the cost of allocating, copying and then freeing the copy. A collection with thousands of entries would normally require thousands of allocations and frees. With the read-only encoding, it requires one. It also uses less memory.
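
The copy-on-first-write behavior and the single-allocation read-only form can be illustrated roughly as follows. This is a simplified sketch on a linked list of integers, not the actual win32_cow.c encoding; `list`, `ensure_snapshot`, `list_push` and `saver_sum` are hypothetical names, and locking and iterator migration are left out:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node { int value; struct node *next; } node;

/* Hypothetical collection: a linked list plus an optional read-only
 * snapshot stored as one contiguous array. */
typedef struct {
    node *head;
    size_t len;
    int *snapshot;        /* NULL until the first write during a save */
    size_t snapshot_len;
    int copies_made;      /* demonstration counter */
} list;

static int saving = 0;    /* nonzero while a background save runs */

/* Build the flat snapshot once: a single allocation, however long the
 * list is. Later writes during the same save find it already present. */
static void ensure_snapshot(list *l) {
    if (!saving || l->snapshot != NULL) return;
    l->snapshot = malloc((l->len ? l->len : 1) * sizeof(int));
    assert(l->snapshot != NULL);
    size_t i = 0;
    for (node *n = l->head; n != NULL; n = n->next) l->snapshot[i++] = n->value;
    l->snapshot_len = l->len;
    l->copies_made++;
}

/* Every write path calls ensure_snapshot() before touching the list. */
void list_push(list *l, int value) {
    ensure_snapshot(l);
    node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = l->head;
    l->head = n;
    l->len++;
}

/* What the saver reads: the snapshot if one exists, else the live list. */
long saver_sum(const list *l) {
    long sum = 0;
    if (l->snapshot != NULL) {
        for (size_t i = 0; i < l->snapshot_len; i++) sum += l->snapshot[i];
    } else {
        for (const node *n = l->head; n != NULL; n = n->next) sum += n->value;
    }
    return sum;
}
```

With no writes during a save, no copy is made. With n writes, only the first one pays for the copy, and the saver keeps seeing the contents as they were when the save started. Freeing the snapshot after the save is not shown.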

This is the best we came up with to replicate COW-like behavior without drastically changing the Redis code, which would make any future integration very difficult.

Salvatore, Dusan, All: Suggestions and other ideas on how we can make it better are welcome.

Claudio

/************************************************************************
 * This module implements copy on write to support
 * saving on a background thread in Windows.
 *
 * Collection objects (dictionaries, lists, sets, zsets)
 *  are copied to a read-only form if a command to modify the
 *  collection is started. This is triggered via lookupKeyWrite().
 *
 * Objects which are modified in place - ziplist, zipset, etc.
 *  are copied before being modified.
 * Strings are normally copied before being modified.
 *
 * In addition deletion of objects is deferred until the save is completed.
 *  This is done by modifying the dictionary delete function, and also
 *  by modifying the decrRefCount function.
 *
 * To allow conversion of collections while the save is iterating on them
 *  special iterators are used. These iterators can be migrated
 *  from their normal mode to iterating over a read-only collection.
 *  Locking is used so that iterator can be used from 2 threads.
 *  For migration to work properly, only one save at a time may run.
 *   (this restriction was already imposed in the Redis code)
 *
 ************************************************************************/

From: Salvatore Sanfilippo <anti...@gmail.com>
Date: Mon, 30 Apr 2012 23:03:34 +0200
Local: Mon, Apr 30 2012 5:03 pm
Subject: Re: Redis for Windows is now stronger
Claudio, thank you very much!

This is an excellent explanation, a few remarks and questions follow.

1) I think that the idea of making the read-only efficient copy is
great, very cool!
2) I also like the iterator trick, this really is a good abstraction
to split complexity into different pieces.
3) If I understand correctly, the read-only copy is performed in the
main thread blocking it, but this only matters for big objects.
4) Based on what I read you'll have little trouble rebasing your
changes to 2.6, that's good.

So "3" is perfectly reasonable for a usage of Redis as an object
store (many small hashes) and for many other users.
I'm more concerned about this:

+            /* make clone with modified cow destructors for db dict */
+            server.cowSaveDbExt[db->id].dictArray = copyReadonly_dictobj(..

This seems to block while copying the whole dictionary. With many keys
this can be a problem, causing a large amount of latency.

I have a few ideas for an alternative implementation of user-space COW,
no free lunch... but I'll try to think a bit more about these ideas in
the next few days and will try to submit them to you, in the hope that
they can help, or at least can be excluded as non-promising.

Btw there are three main ways I can see:

1) To relax requirements, for instance save with just key-consistency,
no point-in-time. This means that transactions will not be guaranteed
in the dump, but otherwise may make sense.
2) To use a separate thread that takes a duplicated copy of the whole
dataset, possibly optimized in some way to take less space for the not
actively used keys. So when you have to persist this other thread
saves the RDB, and the first serves queries accumulating the changes.
It's like doing it already with a master and a slave, but optimized
and automatic.
3) What I think is the gooood thing: segmented AOF with off-line
rewriting. WE ARE INTERESTED IN THIS.

What is "3"?

It's easy: imagine we segment the AOF into aof.1, aof.2, aof.3, aof.N, so
that every file is at most K megabytes.
That's trivial... still an append-only business, we just need to
switch files when the limit is reached.
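
As a rough sketch of the "switch files when the limit is reached" rule, under the assumption that records are never split across segments. `aof_writer`, `aof_place` and `aof_segment_name` are hypothetical names, not existing Redis functions, and no file I/O is performed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical writer state for a segmented append-only log. */
typedef struct {
    unsigned segment;   /* index N of the active file aof.N */
    size_t used;        /* bytes already appended to the active segment */
    size_t limit;       /* maximum bytes per segment (the "K" above) */
} aof_writer;

/* Account for a record of `len` bytes and return the index of the
 * segment it belongs to. If the record does not fit in the active
 * segment, roll over to a fresh one first. */
unsigned aof_place(aof_writer *w, size_t len) {
    assert(len <= w->limit);    /* a record must fit in one segment */
    if (w->used + len > w->limit) {
        w->segment++;
        w->used = 0;
    }
    w->used += len;
    return w->segment;
}

/* Format the file name of a segment into buf, e.g. "aof.3".
 * Returns what snprintf returns. */
int aof_segment_name(unsigned segment, char *buf, size_t buflen) {
    return snprintf(buf, buflen, "aof.%u", segment);
}
```

Once aof_place() moves on to a new index, every lower-numbered segment is closed for appends and becomes a candidate for offline analysis.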

Then we can analyze all the AOF files but the currently active one,
rewriting them offline.

For instance if we find "DEL foo" after "INCR foo", "INCR foo", we
know in the rewritten version we can remove all those three commands.
Of course we can make sure that DELs are issued in many interesting cases.
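
The rewriting rule in the example above could be sketched like this. It is a simplified illustration that assumes every command touches exactly one key; `cmd`, `deleted_later` and `aof_compact` are hypothetical names. Dropping the DEL itself is only safe when the key is known not to exist before the rewritten range, so that choice is left to the caller through `drop_dels`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed AOF entry: command name and the single key it touches. */
typedef struct { const char *name; const char *key; } cmd;

/* Returns 1 if a DEL of `key` appears after index i. */
static int deleted_later(const cmd *cmds, size_t n, size_t i, const char *key) {
    for (size_t j = i + 1; j < n; j++)
        if (strcmp(cmds[j].name, "DEL") == 0 && strcmp(cmds[j].key, key) == 0)
            return 1;
    return 0;
}

/* Compact cmds[0..n) in place and return the new length.
 * Any command on a key that is deleted later is dropped. The final DEL
 * for a key is dropped too when drop_dels is nonzero. */
size_t aof_compact(cmd *cmds, size_t n, int drop_dels) {
    assert(cmds != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        int is_del = strcmp(cmds[i].name, "DEL") == 0;
        if (deleted_later(cmds, n, i, cmds[i].key)) continue;  /* superseded */
        if (is_del && drop_dels) continue;
        cmds[out++] = cmds[i];
    }
    return out;
}
```

For the sequence INCR foo, INCR foo, DEL foo with drop_dels set, nothing about foo survives. The quadratic scan keeps the sketch short; a real rewriter would more likely walk the segments backwards with a set of deleted keys.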

For instance: SREM makes a set empty? Emit a DEL instead.
Trace all the keys deleted by scripts and emit DELs, and so forth.

It's still not clear if this can really work well. Imagine for
instance a Redis instance used only to store a single never deleted
sorted key (very real use case).

But still... would be very good for many use cases.

Cheers,
Salvatore

From: Claudio Caldato <ccald...@hotmail.com>
Date: Fri, 1 Jun 2012 11:25:13 -0700
Local: Fri, Jun 1 2012 2:25 pm
Subject: RE: Redis for Windows is now stronger

Hi Salvatore,

Quick update on our progress so far. We have been looking into the latency issue:

> +            /* make clone with modified cow destructors for db dict */
> +            server.cowSaveDbExt[db->id].dictArray = copyReadonly_dictobj(..

We haven't found a good solution yet but we are still working on it.

Less interesting for you, but we are ready to release the code that will allow Redis to run as a Windows Service.

The next step for us is to start looking into your proposal of using segmented AOF with off-line rewriting. If I understand what you are proposing, we can use fixed-size AOF files that are optimized and rewritten offline. It is not clear to me whether you are proposing an AOF-only solution or whether we should also consider using the AOF files (all minus the latest one) to write/update the RDB file.

Anyway, it would help if you could give us more details on how you envision this model working. The goal for me is to prototype this feature and check how it compares with the current solution we have on Windows.

Thanks a lot
Claudio
