Binary Key/Values storage length, using Redis 2.2.12 and Jedis 2 with Scala 2.9.1

Ali

Oct 5, 2011, 3:49:09 AM
to Jedis
Hi All,
I am using the hash data structure to store key/value pairs. Keys are
always integers and values are always doubles.
In my first setup I wrote the values to Redis in their string
representation. To save space, I thought of converting the values into
their binary representations (4 bytes for an Int and 8 bytes for a Double).
When I ran this test, the storage space increased!

Using the CLI, I checked the keys and values:

a key looks like "\x00\x00\xc2\xff"
a value looks like "@\xe8_\xe0\x00\x00\x00\x00"

I suspect Jedis may convert byte arrays to strings before
storing the values (see the SafeEncoder class in Jedis).
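
As a quick check on that suspicion, the two samples printed by the CLI can be decoded by hand. The sketch below is only an illustration: the class and method names are made up, and it assumes the CLI prints printable bytes as ASCII characters ('@' is 0x40, '_' is 0x5f) and everything else as \xNN escapes.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Jedis: decodes the raw bytes shown by the CLI.
class CliBytesDecode {
    // ByteBuffer reads big-endian by default, matching putInt/putDouble.
    static int decodeKey(byte[] raw) {
        return ByteBuffer.wrap(raw).getInt();
    }

    static double decodeValue(byte[] raw) {
        return ByteBuffer.wrap(raw).getDouble();
    }

    public static void main(String[] args) {
        // "\x00\x00\xc2\xff" as printed by the CLI
        byte[] key = {0x00, 0x00, (byte) 0xc2, (byte) 0xff};
        // "@\xe8_\xe0\x00\x00\x00\x00": '@' is 0x40 and '_' is 0x5f
        byte[] value = {0x40, (byte) 0xe8, 0x5f, (byte) 0xe0, 0, 0, 0, 0};

        System.out.println(key.length + " bytes -> " + decodeKey(key));       // 4 bytes -> 49919
        System.out.println(value.length + " bytes -> " + decodeValue(value)); // 8 bytes -> 49919.0
    }
}
```

If that reading of the CLI output is right, the stored payloads are exactly 4 and 8 bytes long, which would suggest the byte arrays reach Redis unchanged rather than as text.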

Could you please let me know what my options are for
storing binary key/value pairs so as to save storage in Redis?

Thanks,
-A

Ingvar Bogdahn

Oct 5, 2011, 6:57:59 AM
to jedis...@googlegroups.com
Hi Ali,

I'm not sure you really save space because, AFAIK, Redis sometimes
interprets strings as numbers (e.g. when doing incrBy). But chances
are good that you are right.

For each command there are two variants: one taking String and one taking
byte[]. If you feed Jedis byte[] data, you use the byte[]
variant, and in that case SafeEncoder is not used.
It all depends on how you convert int/double to/from byte[].
You can test this independently of Redis.
Just encode your int/double to byte[] and print out the respective sizes, e.g.:
int key = 5; double value = 6;
byte[] keyBA = Converter.toBA(key);
System.out.println("Converter encoded key length: " + keyBA.length);
byte[] valueBA = ...

Then create the byte[] representation using SafeEncoder:
byte[] keySE = SafeEncoder.encode(String.valueOf(key));
System.out.println("SafeEncoder encoded key length: " + keySE.length);
byte[] valueSE = ...

You should normally find that SafeEncoder's byte[] are longer, but you
will probably find that there is something wrong with your way of
encoding ints/doubles to byte[].

If not, I'm not sure....
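
A self-contained version of that size comparison might look like the sketch below. It deliberately avoids Jedis so it runs on its own: the UTF-8 bytes of the decimal string stand in for SafeEncoder.encode(String.valueOf(x)), which is an assumption about what SafeEncoder does, and the class and method names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of fixed-width binary vs. decimal-text encodings.
class EncodingSizes {
    // Fixed-width big-endian encodings: always 4 bytes for int, 8 for double.
    static byte[] binary(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    static byte[] binary(double v) {
        return ByteBuffer.allocate(8).putDouble(v).array();
    }

    // Decimal text as UTF-8; assumed stand-in for SafeEncoder.encode(String.valueOf(v)).
    static byte[] text(int v) {
        return String.valueOf(v).getBytes(StandardCharsets.UTF_8);
    }

    static byte[] text(double v) {
        return String.valueOf(v).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int key = 49919;
        double value = 49919.0;
        System.out.println("binary key length:   " + binary(key).length);   // 4
        System.out.println("text key length:     " + text(key).length);     // 5 ("49919")
        System.out.println("binary value length: " + binary(value).length); // 8
        System.out.println("text value length:   " + text(value).length);   // 7 ("49919.0")
    }
}
```

Note that the text form is not always the longer one: short numbers such as 5 or 49919.0 take fewer bytes as decimal text than as fixed-width binary, so the outcome depends on the actual data.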

Try with this:

public static int byteArrayToInt(byte[] b)
{
    int value = 0;
    // Big-endian: b[0] is the most significant byte.
    for (int i = 0; i < 4; i++) {
        value = (value << 8) | (b[i] & 0xFF);
    }
    return value;
}

public static byte[] intToByteArray(int a)
{
    byte[] ret = new byte[4];
    ret[3] = (byte) (a & 0xFF);
    ret[2] = (byte) ((a >> 8) & 0xFF);
    ret[1] = (byte) ((a >> 16) & 0xFF);
    ret[0] = (byte) ((a >> 24) & 0xFF);
    return ret;
}
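
To see whether these two helpers differ from a ByteBuffer-based conversion, a small round-trip check can be run without Redis. This is only a sketch: the wrapper class name and the sample values are arbitrary choices for the example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical check: compares the manual conversion with ByteBuffer's default (big-endian) one.
class IntCodecCheck {
    static int byteArrayToInt(byte[] b) {
        int value = 0;
        for (int i = 0; i < 4; i++) {
            value = (value << 8) | (b[i] & 0xFF);
        }
        return value;
    }

    static byte[] intToByteArray(int a) {
        byte[] ret = new byte[4];
        ret[3] = (byte) (a & 0xFF);
        ret[2] = (byte) ((a >> 8) & 0xFF);
        ret[1] = (byte) ((a >> 16) & 0xFF);
        ret[0] = (byte) ((a >> 24) & 0xFF);
        return ret;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 49919, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int v : samples) {
            byte[] manual = intToByteArray(v);
            byte[] viaBuffer = ByteBuffer.allocate(4).putInt(v).array();
            boolean sameBytes = Arrays.equals(manual, viaBuffer);
            boolean roundTrip = byteArrayToInt(manual) == v;
            System.out.println(v + ": sameBytes=" + sameBytes + ", roundTrip=" + roundTrip);
        }
    }
}
```

Both approaches write the most significant byte first, so they should produce identical 4-byte arrays for every sample.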


hope that helps,
Ingvar

( and please report back what you find, that would be interesting for all)


Ali

Oct 5, 2011, 7:38:59 AM
to Jedis
Hi Ingvar,
Actually, I checked the conversion. Here is how I do it:

import java.nio.ByteBuffer

def fromInt(secIdx: Int) =
  ByteBuffer.allocate(4).putInt(secIdx).array()

def fromDouble(value: Double) =
  ByteBuffer.allocate(8).putDouble(value).array()

def toInt(in: Array[Byte]) = ByteBuffer.wrap(in).getInt

def toDouble(in: Array[Byte]) = ByteBuffer.wrap(in).getDouble

I also thought the problem might come from the way Scala calls the Jedis API,
so I proxied the call through a Java method like the one below:

public static void insert(Jedis j, byte[] key, Map<byte[], byte[]> values) {
    j.hmset(key, values);
}

but the results are exactly the same.

Cheers,


Ali

Oct 5, 2011, 7:53:25 PM
to Jedis
Hi again,

Here is more information about the issue.

The following Java code inserts a million integer pairs into Redis.

import java.nio.ByteBuffer;
import redis.clients.jedis.Jedis;

public class JedisInsertion {
    // 4-byte big-endian encoding of an int
    public static byte[] fromInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        Jedis j = new Jedis("localhost");
        for (int i = 0; i < 1000 * 1000; i++) {
            j.set(fromInt(i), fromInt(i));
        }
    }
}

Here is the Redis INFO output:
...
used_memory:89319664
arch_bits:64
...
89319664 bytes implies ~89 bytes per key/value pair. I was expecting
something around 12 MB in total instead.

I also compiled Redis in 32-bit mode (still running the test on a
64-bit machine). The results for the 32-bit version of Redis:

used_memory: 68831664 => ~68.8 bytes per key/value pair.

Both results are several times higher than what I was expecting.
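
The arithmetic behind those per-pair figures can be spelled out as below. This is a rough sketch with made-up names; it assumes all of used_memory is attributable to the one million pairs, ignoring whatever an empty Redis instance uses on its own.

```java
// Hypothetical helper for the per-pair memory arithmetic.
class MemoryPerPair {
    static double bytesPerPair(long usedMemory, long pairs) {
        return (double) usedMemory / pairs;
    }

    public static void main(String[] args) {
        long pairs = 1000L * 1000L;
        long rawPayload = 4 + 4; // 4-byte key plus 4-byte value

        double perPair64 = bytesPerPair(89319664L, pairs); // 64-bit build
        double perPair32 = bytesPerPair(68831664L, pairs); // 32-bit build

        System.out.println("64-bit: " + perPair64 + " bytes per pair, "
                + (perPair64 - rawPayload) + " of it beyond the raw payload");
        System.out.println("32-bit: " + perPair32 + " bytes per pair, "
                + (perPair32 - rawPayload) + " of it beyond the raw payload");
    }
}
```

Under that assumption, the raw 8 bytes of key and value account for only a small share of the memory used per pair in both builds.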

I appreciate your feedback :)

Cheers,

Ingvar Bogdahn

Oct 6, 2011, 2:03:30 AM
to jedis...@googlegroups.com
Hi Ali,

I speculate that used_memory is not taken up purely by actual data, but
also by some Redis housekeeping. I also have an 80 MB Redis database with
much less data. I suspect that the field used_memory_human might be
the value to look at.
There is also the dbSize method in Jedis, which returns the number of keys.
It would be interesting to know: what are the values when you use
SafeEncoder / Strings instead of bytes?

Ingvar



grdmitro

Oct 6, 2011, 4:42:47 AM
to Jedis
Ali,
I just encountered the same problem with integers.
Apparently Redis recognises numeric types and stores them efficiently,
so no optimisation is needed on the client side:
http://groups.google.com/group/redis-db/browse_thread/thread/19bab57ffdab0cd0