hi, i am working on....


Richard Boettner

Apr 8, 2026, 7:28:21 PM (6 days ago) Apr 8
to Pick and MultiValue Databases
I am writing a multi-value DB in C. Is anyone interested in answering my questions as they come up? I have a basic layout already, but it's not able to create a fully hashed DB yet. I'm not interested in SQL.
How do I determine when to grow the scheme data file to accommodate more data, and by how much? Also, was the original Pick DB held in memory or on disk?

thx, richard

Jay LaBonte

Apr 8, 2026, 8:14:52 PM (6 days ago) Apr 8
to mvd...@googlegroups.com
Which MV database are you using? UniData, UniVerse, D3, jBASE, OpenQM, etc.?



Brian Speirs

Apr 9, 2026, 1:59:01 AM (5 days ago) Apr 9
to Pick and MultiValue Databases
Hi Richard,
My question would be ... why are you doing this when there are already well-established open-source MV products to work on?
Search this forum for ScarletDME and StringDB (or SDB).
ScarletDME is a fork of the GPL OpenQM product (updated to 64-bit); and StringDB is in turn a fork of ScarletDME.
Both projects need contributions - and that seems a better thing to do than re-invent the MV database (wheel).
Cheers,

Brian

Will Johnson

Apr 9, 2026, 6:33:04 AM (5 days ago) Apr 9
to Pick and MultiValue Databases
The original Pick database was held on disk with paged memory.  Pages were swapped in and out of memory when needed, but pages that had been changed were also written back to disk as soon as possible to avoid losing data in a power failure.

The user was allowed to set the initial size of the database, but each group had a header to tell it where the next frame in that group resided.  So if the last frame in a group needed more data added than it had room for, a new linked frame was allocated anywhere else on the disk, and that frame number was put into the header of the last frame to tell the system where to look for this "overflow" frame.
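
The linked-frame idea described above can be sketched in C along these lines. This is only an illustration: an in-memory array stands in for the disk, and the frame size, the field names (`next`, `used`), the helper names, and the trivial allocator are all assumptions, not the real Pick on-disk layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_DATA 16   /* payload bytes per frame (assumed; tiny for the demo) */
#define MAX_FRAMES 64   /* size of the simulated disk (assumed) */
#define NO_FRAME   0u   /* 0 means "no next frame"; frame 0 is reserved */

typedef struct {
    uint32_t next;              /* frame number of the overflow frame, or NO_FRAME */
    uint32_t used;              /* payload bytes in use */
    uint8_t  data[FRAME_DATA];
} Frame;

static Frame    disk[MAX_FRAMES];  /* stands in for the disk file */
static uint32_t next_free = 1;     /* trivial allocator: hand out frames in order */

/* Allocate an empty frame somewhere on the "disk"; NO_FRAME if none left. */
uint32_t alloc_frame(void) {
    if (next_free >= MAX_FRAMES) return NO_FRAME;
    uint32_t f = next_free++;
    memset(&disk[f], 0, sizeof disk[f]);
    return f;
}

/* Append len bytes to the group whose first frame is head, linking new
   overflow frames as needed. Returns 0 on success, -1 if the disk is full. */
int group_append(uint32_t head, const void *buf, size_t len) {
    assert(head != NO_FRAME && head < MAX_FRAMES);
    const uint8_t *p = buf;
    uint32_t f = head;
    while (disk[f].next != NO_FRAME) f = disk[f].next;   /* walk to the last frame */
    while (len > 0) {
        if (disk[f].used == FRAME_DATA) {                /* full: link an overflow frame */
            uint32_t nf = alloc_frame();
            if (nf == NO_FRAME) return -1;
            disk[f].next = nf;                           /* record the link in the header */
            f = nf;
        }
        size_t room = FRAME_DATA - disk[f].used;
        size_t n = len < room ? len : room;
        memcpy(disk[f].data + disk[f].used, p, n);
        disk[f].used += (uint32_t)n;
        p += n;
        len -= n;
    }
    return 0;
}

/* Count the frames in a group by following the links. */
unsigned group_frames(uint32_t head) {
    unsigned n = 0;
    for (uint32_t f = head; f != NO_FRAME; f = disk[f].next) n++;
    return n;
}
```

With 16-byte frames, appending 10 bytes and then 26 more to a fresh group leaves a chain of three frames. A real implementation would read and write frames from a file and would need a proper free-space allocator.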

Wol

Apr 9, 2026, 3:08:41 PM (5 days ago) Apr 9
to mvd...@googlegroups.com
In answer to Brian, I guess you're doing it "because it's there"? There
are other attempts at open source MV databases out there - I started
MaVerick, someone else started Winter - I'm sure there are more.
Although I would concur with Brian somewhat - we want to coalesce on one
database.

Being forks of each other, there's no reason why SDB and ScarletDME
couldn't merge back, it's been a friendly fork and I don't think either
side is averse to the other nicking good ideas - I certainly want to get
SDB's python bootstrap compiler into ScarletDME.

But if you are doing it "just for fun", as far as UV (and probably most
others) goes, you should split a group and grow a file when it reaches a
nominal 80% full, and shrink it when it's nominally 50% full. Nearly
all (if not all) implementations now almost exclusively use linear or
dynamic hashing - they're two different variations on a theme. A file
will grow or shrink one bucket at a time - you feed the key into some
hash algorithm like MD5 or whatever, and then you do mod(power-of-two).

So let's say your file is currently 13 buckets big (0-12). You try
mod(16) and if that gives you a valid bucket, you use it. Otherwise you
try mod(8) which definitely gives you a valid bucket. This has the
extremely useful property that, if you want to add another bucket
(bucket 13, the 14th), all the records that will go into it are
currently in bucket 5 (13 mod 8). Or if you want to drop the last
bucket (bucket 12), its records all go into bucket 4 (12 mod 8). So the
cost of growing or shrinking the file is not proportional to the size
of the file, as in typical hash algorithms, but to the file delta.
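
A minimal C sketch of the "try the bigger power of two, then fall back to half" rule, with buckets numbered from 0. FNV-1a is used here only as a simple stand-in for "some hash algorithm", and the function names are made up for illustration; none of this is taken from UV or any other product.

```c
#include <assert.h>
#include <stdint.h>

/* FNV-1a: a stand-in hash (an assumption; any well-distributed hash works). */
uint32_t hash_key(const char *key) {
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

/* Smallest power of two that is >= n. */
uint32_t pow2_at_least(uint32_t n) {
    assert(n >= 1 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

/* Map a hash to a bucket in a file with buckets 0..nbuckets-1.
   Try the larger modulus first; if that bucket does not exist yet,
   the smaller modulus always gives one that does. */
uint32_t bucket_for(uint32_t h, uint32_t nbuckets) {
    uint32_t hi = pow2_at_least(nbuckets);     /* 16 when there are 13 buckets */
    uint32_t b = h % hi;
    if (b >= nbuckets) b = h % (hi / 2);       /* 8 when there are 13 buckets */
    return b;
}

/* The one existing bucket whose records must be rehashed when bucket
   number new_bucket is added. It is also where the records go when
   that bucket is dropped again. */
uint32_t split_source(uint32_t new_bucket) {
    assert(new_bucket >= 1);
    uint32_t hi = pow2_at_least(new_bucket + 1);
    return new_bucket % (hi / 2);
}
```

For a 13-bucket file, `split_source(13)` is 5: growing to 14 buckets only touches the records in bucket 5, and every other key keeps its bucket. Likewise `split_source(12)` is 4, which is where the contents of bucket 12 land if the file shrinks to 12 buckets.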

Cheers,
Wol

Jim Idle

Apr 10, 2026, 12:49:37 PM (4 days ago) Apr 10
to mvd...@googlegroups.com
Given the questions you are asking, I think you will have a difficult time doing this in any way that performs well. But if you are creating a standard hashing file, you need to detect that you will overflow the bucket and have links to chain the pages together. You need an allocation pool, locking, sync, etc. and you should use memory mapping. 

You would be better off using an existing open source data system that can model key/value or is in fact key/value.

Sorry to discourage you, but this is by no means a simple task. 

Jim



Richard Boettner

Apr 10, 2026, 1:09:52 PM (4 days ago) Apr 10
to Pick and MultiValue Databases
I'm trying to follow the Pick DB. The book I have, 'Exploring the Pick Operating System', is not detailed. I wrote a crude DB in BASIC years ago on a Data General, back when line numbers were required. When I came across the Pick DB idea recently, I picked up my programming again, this time in C. At first I thought maybe Forth, but C is just as fast and a bit easier to understand.
richard.

pick = MV database 


Richard Boettner

Apr 10, 2026, 1:13:35 PM (4 days ago) Apr 10
to Pick and MultiValue Databases
Was the Pick DB held in memory or on disk?

I found a YouTube video and learned that each entry was written to disk; nothing was held in memory because there wasn't enough. Thx, everyone. Richard



bdeck...@gmail.com

Apr 10, 2026, 4:05:21 PM (4 days ago) Apr 10
to mvd...@googlegroups.com

Your fault for making it look easy Jim 😊

-BD

Brian Speirs

Apr 10, 2026, 7:51:02 PM (4 days ago) Apr 10
to Pick and MultiValue Databases
Hi Richard,
In case you aren't aware of who is replying to you, Jim Idle was the architect of the jBASE database, so he DOES know what he is talking about.
There have also been a number of other replies that conflate different characteristics of MV databases - for example, the bit about memory being mapped to disk. That is true, but it relates to user workspace rather than to how the data files are maintained. As far as that goes, some of the major points have been covered, but here are some of the things that you need to consider:

File creation: A file consists of a header block (frame/group) plus 1 to n data frames. You will need to define what goes into the header block. You probably also need to create the overflow file at this point.
Adding an item: Hash the ID to derive the block where the item is to be saved. Read the block, add the item, and save the block. Save to the overflow area as required (and maintain links to the overflow area). Add new groups to the file as required.
Updating an item: Similar to "Adding an item" but you will need to find the item within the block and decide what to do when the updated item is a different length than the existing item - you could need to re-write the whole block and overflow chain.
Deleting an item: Similar to "Updating an item". Remove groups from the file as required.
Clearing the file: I think most systems just mark the file as empty in the header block.
Adding/removing groups from the file: You need to maintain a "data load factor". When the data load factor exceeds a defined level, then you need to split a group. When the data load factor drops below a certain level, you need to consolidate a group back into a preceding group. Both of those actions will also impact on the overflow area.
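
The "data load factor" decision in the last point could be sketched in C roughly as follows. The 80% and 50% thresholds are the nominal figures quoted earlier in this thread; the header fields, the names, and the extra check on the post-merge load are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SPLIT_LOAD_PCT 80   /* split a group when load exceeds this */
#define MERGE_LOAD_PCT 50   /* merge a group when load falls below this */

typedef enum { RESIZE_NONE, RESIZE_SPLIT, RESIZE_MERGE } ResizeAction;

/* Hypothetical subset of what a file header block might track. */
typedef struct {
    uint64_t groups;        /* current number of primary groups */
    uint64_t min_groups;    /* never shrink below this */
    uint64_t group_bytes;   /* bytes per primary group */
    uint64_t data_bytes;    /* bytes of stored items, overflow included */
} FileHeader;

/* Decide what to do after an add, update, or delete has changed data_bytes.
   Load factor = data_bytes / (groups * group_bytes), compared in integer
   arithmetic to avoid floating point. */
ResizeAction resize_action(const FileHeader *h) {
    assert(h->groups >= 1 && h->group_bytes > 0);
    uint64_t capacity = h->groups * h->group_bytes;
    if (h->data_bytes * 100 > capacity * SPLIT_LOAD_PCT)
        return RESIZE_SPLIT;
    if (h->groups > h->min_groups &&
        h->data_bytes * 100 < capacity * MERGE_LOAD_PCT) {
        /* Only merge if the smaller file would not need to split again at once. */
        uint64_t smaller = (h->groups - 1) * h->group_bytes;
        if (h->data_bytes * 100 <= smaller * SPLIT_LOAD_PCT)
            return RESIZE_MERGE;
    }
    return RESIZE_NONE;
}
```

The gap between the two thresholds keeps a file that hovers around one size from splitting and merging on every write. The caller would run this check after each write and then split or merge exactly one group.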

None of the above touches on locking - but you will need to consider that if you are writing a multi-user system.

Have fun.

Brian

Kevin King

Apr 13, 2026, 5:48:31 PM (15 hours ago) Apr 13
to mvd...@googlegroups.com
Rather than creating a solution based completely on MV - which has been done numerous times, as these fine folks have pointed out - a fun exercise is to build a similar hashed data file structure but using JSON as the record encoding rather than high order delimiters.  It's far more flexible, you can find JSON parsers anywhere (and writing one isn't exactly rocket science) and it's still a helluva lot of fun.
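
To make the comparison concrete, here is a rough C sketch that turns a delimiter-encoded record into a JSON array of arrays (one inner array per attribute, one string per value). It assumes the conventional attribute mark (char 254) and value mark (char 253). The function name is made up, subvalue marks are passed through untouched, and only quotes and backslashes are escaped, so control characters and non-UTF-8 bytes would need more work in real use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AM '\xFE'   /* attribute mark (char 254) */
#define VM '\xFD'   /* value mark (char 253) */

/* Append string s to out at *pos; returns 0 if it would not fit. */
static int put_str(char *out, size_t cap, size_t *pos, const char *s) {
    size_t n = strlen(s);
    if (*pos + n >= cap) return 0;      /* keep one byte for the terminator */
    memcpy(out + *pos, s, n);
    *pos += n;
    return 1;
}

/* Convert a NUL-terminated MV record to JSON text in out.
   Returns the JSON length, or -1 if out is too small. */
int mv_to_json(const char *rec, char *out, size_t cap) {
    assert(rec != NULL && out != NULL);
    size_t pos = 0;
    int ok = put_str(out, cap, &pos, "[[\"");
    for (const char *p = rec; ok && *p; p++) {
        if (*p == AM) {
            ok = put_str(out, cap, &pos, "\"],[\"");   /* close attribute, open next */
        } else if (*p == VM) {
            ok = put_str(out, cap, &pos, "\",\"");     /* close value, open next */
        } else {
            /* Quote and backslash need a leading backslash in JSON. */
            char esc[3] = { '\\', *p, '\0' };
            int needs_escape = (*p == '"' || *p == '\\');
            ok = put_str(out, cap, &pos, needs_escape ? esc : esc + 1);
        }
    }
    ok = ok && put_str(out, cap, &pos, "\"]]");
    if (!ok) return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

A record holding `SMITH`, an attribute mark, `red`, a value mark, and `blue` comes out as `[["SMITH"],["red","blue"]]`. The JSON form is larger on disk, but any standard parser can read it.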
