Hi Miguel,
The database layout basically depends on how your indices shall look like. If you want to have separate indices for order-id, customer-id and amount then you will need to create a database for each index. If you would like to have compound indices then that's also possible - i have outlined suggestions for such a compound index below.
In Pseudo-SQL your ORDERS table would look like this:
CREATE TABLE ORDERS (
uint64 order_id, -- primary key, indexed, auto-increment
uint64 customer_id,
double amount,
varchar other_info
);
and if i understood correctly you want to run queries like:
SELECT * FROM Orders WHERE customer_id = 12 AND amount > 100.0;
** The ORDERS table **
The Orders-database can be set up for auto-incrementing keys (use the flag HAM_RECORD_NUMBER when calling ham_env_create_db), and all other fields will be stored in the record.
** Separate Indices **
This is a simple solution and should be "fast enough" for most use cases.
database 1 (Orders): as outlined above
database 2 (Customers): could be created similar to Orders, with an auto-incremented id
database
3: n:m-mapping for Customers to Orders; see the sample env2.c for a
simple Customer/Orders database
(
https://github.com/cruppstahl/hamsterdb/blob/master/samples/env2.c)
.
For
running a query (... WHERE customer_id = 12 and amount > 100.0) you
need a nested loop over the customer-to-orders database (#4); then
retrieve the Orders structure and filter out all those orders with an
amount > 100. This is simple to implement and does not require a
separate index for "amounts"; for most use cases it is fast enough. If
not then a compound index (see below) should be faster. There are other
solutions, but they would be difficult to implement (i.e. you would have
to write code to perform optimized JOINs, like a DBMS would do).
** The Compound Index **
If both fields (customer_id, amount) should have a compound index, then you could store both fields in a single key and use a custom compare function for the sort order. The keys would be stored in a structure like the following:
struct CompoundIndex {
uint64_t customer_id;
double amount;
};
When creating the database, set the following parameters:
HAM_PARAM_KEY_TYPE: HAM_TYPE_CUSTOM
HAM_PARAM_KEY_SIZE: sizeof(CompoundIndex)
HAM_PARAM_RECORD_SIZE: 8 (will contain the uint64-id of the order)
Immediately after creating (and opening) the database, call ham_db_set_compare_func to install a custom compare function which sorts first by customer_id, and second by amount.
For lookups use cursors (ham_cursor_find) in combination with HAM_FIND_GEQ_MATCH or HAM_FIND_LEQ_MATCH; these flags will position the cursor on the exact key, or on the next ("Greater or Equal") or previous ("Less or Equal") key if there's no exact match. Then use ham_cursor_move to iterate over all keys till you reach the first key with a different customer id.
I recommend to always set HAM_PARAM_RECORD_SIZE, especially if the records are so small. Databases with a fixed length record of 8 bytes will be fast because the record is stored directly in the Btree leaf and not in a separate "blob".
I hope that helped - if not then don't hesitate to ask. I can also point you to code which implements some of these things.
Best regards
Christoph