On Thursday 14 Feb 2013 08:00:11
adr...@gmail.com wrote:
> Hello,
>
> as I'm trying to make a db_hanoidb backend from db_toke I have some
> questions about keys, values and the interfaces to them.
>
> In some stores (like for example Redis and hanoidb) keys are arbitrary byte
> arrays. This type is Erlang's binary (e.g. <<"binary">>). Value types vary.
>
> Trying scalaris through api_tx:req_list(Tlog, List) I've been able to use
> any erlang term as value, not only binaries (with binaries as keys).
> In scalaris doc main.pdf §4.1.1 "Supported types" I see:
> Keys are always strings. In order to avoid problems with different
> encodings on different systems, we suggest to only use ASCII characters.
> For values we distinguish between native, composite and custom types.
Please note that this doc describes the client data types, i.e. the types used
in client APIs like api_tx.
-> Internally, at the level of a DB implementation, things are a bit different
(ref. db_beh.hrl):
* a DB-key is defined as ?RT:key(), created by ?RT:hash_key/1 - you cannot
make assumptions on the exact type as this depends on the RT implementation!
* a DB-value is atom() | boolean() | number() | binary() - basically an
rdht_tx:encoded_value() as created by rdht_tx:encode_value/1
-> you should not make any assumptions about the value type though (if possible)
since the encoding might change in the future
> Q1: Are keys arbitrary arrays of ASCII characters? Or are they Erlang
> lists of 4 or 8 bytes with values ranging from 0 to 127?
Client keys (client_key() type) are defined as string() - without enforcing
any restrictions on the range of the characters (see scalaris.hrl).
> Q2: As far as arbitrary byte arrays are (easily?) universally ordered, can
> we use them as native keys to have an ordered map without having to think
> about what those bytes encode and leave this to application makers?
> Wouldn't that broaden the use of scalaris?
Client keys are hashed by the hash function provided by the routing table,
which can either spread the items over the key space (and thus among different
nodes) or enforce a certain order.
I don't quite understand how changing the client keys would broaden the use...
it is just an identifier for a value.
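
To illustrate the two behaviours a routing table's hash function can have, here is a minimal sketch. The module and both functions are hypothetical stand-ins for ?RT:hash_key/1, not the code of any actual Scalaris routing table, and the sketch assumes client keys contain only characters in the range 0..255 (erlang:md5/1 expects iodata).

```erlang
-module(key_hash_sketch).
-export([hash_key/1, order_preserving_key/1]).

%% Hypothetical stand-in for ?RT:hash_key/1: maps a client key (a string())
%% to a 128-bit integer so that keys are spread over the whole key space.
-spec hash_key(string()) -> non_neg_integer().
hash_key(ClientKey) ->
    %% erlang:md5/1 returns a 16-byte binary; read it as one unsigned integer.
    <<N:128>> = erlang:md5(ClientKey),
    N.

%% Hypothetical order-preserving alternative: the client key is used as-is,
%% so the Erlang term order of client keys carries over to the DB keys.
-spec order_preserving_key(string()) -> string().
order_preserving_key(ClientKey) ->
    ClientKey.
```

With the first variant neighbouring client keys usually end up far apart in the key space; with the second they stay adjacent. Either way the DB only ever sees the result, which is why a DB backend should not assume a particular key type.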
> Q3: I find it very cool that scalaris has rich value types. I have
> successfully natively stored tuples (very useful for Erlang apps) even if
> it is not explicitly written in the manual and "The use of them is
> discouraged".
"discouraged" in terms of: "not supported by all APIs"
> I like very much the ability to store integers and floats,
> and the "add to number" API. Why not harden non-Erlang APIs to do their
> conversions well and let Erlang apps store useful Erlang terms? or except
> precise cases like integer+add_to_number() let every value be an arbitrary
> byte array and its API be app_to_binary() and binary_to_app()?
Any API can always read any value, but it needs to know which type it expects
the value to have. If the stored value has a different type, an appropriate
error will be thrown.
If you only use e.g. the Erlang API, you can store whatever Erlang allows you
to - but don't expect types other than those described in the user/dev guide
to work in the other APIs.
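
As a rough illustration of "needs to know the type it is expecting", here is a sketch of a type-checked operation in the spirit of "add to number". The module name, the function and the {fail, not_a_number} result are assumptions made for this example only; the sketch reports the mismatch as a tagged tuple rather than throwing.

```erlang
-module(typed_read_sketch).
-export([add_to_number/2]).

%% Hypothetical helper: adds Delta to a stored value, but only if the
%% stored value really has the expected type (a number).
-spec add_to_number(term(), number()) -> {ok, number()} | {fail, not_a_number}.
add_to_number(Stored, Delta) when is_number(Stored), is_number(Delta) ->
    {ok, Stored + Delta};
add_to_number(_Stored, _Delta) ->
    %% Any other stored term (tuple, binary, atom, ...) is rejected.
    {fail, not_a_number}.
```

A tuple stored through the Erlang API can still be read back by Erlang code, but an operation like this one, or an API in another language, has no sensible way to treat it as a number.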
> Q4: About implementing an alternative backend. As I understand
> api_tx:req_list(Tlog, List) is the erlang client interface for apps. Types
> are there ruled by §4.4.1. My candidate backend storage stores only
> binaries (as far as I understand) hence a need of term_to_binary to put in
> store and binary_to_term after retrieving from store. But what are keys and
> values at the db_store.erl level just before the backend API? I mean since
> api_tx:req_list(Tlog, List) there seems to be some walk in modules and
> types and I feel lost in a maze ;-)
See above - but yes, you need term_to_binary and binary_to_term, just like
db_toke, if you depend on binaries at the DB level.
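
A minimal sketch of that conversion layer, assuming a backend that only accepts binaries. The backend is simulated with a plain Erlang map (OTP 17 or later); the module and function names are made up for this example and are neither the db_toke nor the hanoidb API.

```erlang
-module(binary_backend_sketch).
-export([new/0, store/3, fetch/2]).

%% The "backend" is a map whose keys and values are binaries only,
%% standing in for a store that cannot hold arbitrary Erlang terms.
-spec new() -> map().
new() ->
    #{}.

%% Serialise key and value with term_to_binary/1 before writing.
-spec store(map(), term(), term()) -> map().
store(Store, Key, Value) ->
    maps:put(term_to_binary(Key), term_to_binary(Value), Store).

%% Serialise the key for the lookup, then decode the stored binary.
-spec fetch(map(), term()) -> {ok, term()} | not_found.
fetch(Store, Key) ->
    case maps:find(term_to_binary(Key), Store) of
        {ok, Bin} -> {ok, binary_to_term(Bin)};
        error -> not_found
    end.
```

Note that the byte-wise order of term_to_binary/1 output does not in general match the Erlang term order of the original keys, so a backend that sorts by raw bytes will not automatically iterate keys in term order.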
> Q5: Are there some key space management hints/tips? feedback from real
> cases/apps?
what do you mean by "key space management hints/tips"?
Nico