RDBMS For Versioned Backend


Jenil Shah

unread,
Oct 30, 2020, 6:19:54 AM
to projectnessie
Hello Nessie Team,

You are providing support for DynamoDB, JGit, and in-memory as versioned backends. Are you envisioning providing RDBMS (PostgreSQL) support for the versioned backend in the future?

Thanks and regards,
Jenil Shah

Jacques Nadeau

unread,
Nov 2, 2020, 12:26:14 PM
to projectnessie, Jenil Shah
I just realized that my response and Jenil's somehow lost the Google Group. Adding back here. Will respond to the message shortly.

--
Jacques Nadeau
CTO and Co-Founder, Dremio


On Sun, Nov 1, 2020 at 10:14 PM jenil shah <jenil....@gmail.com> wrote:
Hello Jacques,

First of all, thanks for the immediate response.

At a high level, our system manages a large amount of data (ingestion, enrichment, tiered storage, indexing, aggregation), and its "table-like" schema is managed via Drill's Iceberg metastore. There are "long-running" data processing pipelines/jobs on top of this which require some kind of "transactional" view of the data until the WHOLE pipeline/job is done, as otherwise we are required to "stage the data for a pipeline" and "commit" once we are fully completed (these "commits" were done using HDFS directory movements).

As you can understand, this processing gets complex, with a lot of reference data/tables to look up and longer pipelines, while the business wants "near real-time" visibility. The natural way to go about it is to have versioned tables, where a commit represents point-in-time data/reference data, and to let the processing job refer to everything by commit id. We also want user-level control over versioned snapshots of data. That's where Nessie's branching model seems a natural fit for us.

At a high level:
  • The "main" branch is the "system" branch, where the business users' visible global data set is committed as and when ready.
  • "Human users" get a branch forked when they start their "analytics" (which can run for hours to months), for consistent ML without "staging/copying into a personal space".

We need guidance that can better align our efforts with Nessie's upcoming API/SPI changes:
  1. Reliable versioned storage for Nessie (other than DynamoDB; we can use Postgres here): We are okay with depending on PG-specific features to start with, and we can contribute, but we need some guidance on the final SPI we should implement. We have done some analysis on how Nessie stores data in DynamoDB and found that there are five main structures used to create a Git-like branching structure (see the schema sketch after this list):
    • Ref (where branch, hash is stored): This structure contains the commit history reference from the L1 and the current hash, which refers to the key of an L2.
    • L1: Here all commit operations are stored (adding, removing, or mutating a key). The L1 also stores the corresponding L2 key and the parent L1 pointer list.
    • L2: Points to the corresponding L3s.
    • L3: Stores the current state of each key and a pointer to its respective value.
    • Value: Stores the value information itself.
  2. Fast versioned storage (a commit under 100 ms, for <512 KB of metadata per commit, on 100 branches concurrently): We believe Nessie can do this at its core; this is just our internal SLA.
  3. Keeping Iceberg interface compliance: Frankly, we are still in the process of finalising how the metastore plugin in Drill can be made version-aware (a commit id in the JDBC session, set by the driver?). (Yes, this is more of a Drill question, but I am fully aware whom I am talking to here.)
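
To make item 1 concrete, here is a rough Java/JDBC sketch of how those five structures might map onto Postgres tables. All table names, column names, and types below are our own guesses for illustration, not anything defined by Nessie:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class NessiePgSchemaSketch {
  public static void main(String[] args) throws Exception {
    // Connection details are placeholders; adjust for your environment.
    try (Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost:5432/nessie", "nessie", "nessie");
         Statement st = conn.createStatement()) {
      // Ref: branch/tag name -> current hash, plus an embedded commit log.
      st.execute("CREATE TABLE IF NOT EXISTS ref ("
          + "name TEXT PRIMARY KEY, "
          + "hash BYTEA NOT NULL, "
          + "commits BYTEA[] NOT NULL DEFAULT '{}')");
      // L1: one row per commit; parent pointers and the tree of L2 ids.
      st.execute("CREATE TABLE IF NOT EXISTS l1 ("
          + "id BYTEA PRIMARY KEY, "
          + "metadata BYTEA, "
          + "parents BYTEA[] NOT NULL, "
          + "children BYTEA[] NOT NULL)");
      // L2: interior node pointing at L3 ids.
      st.execute("CREATE TABLE IF NOT EXISTS l2 ("
          + "id BYTEA PRIMARY KEY, "
          + "children BYTEA[] NOT NULL)");
      // L3: leaf node mapping table keys to value ids.
      st.execute("CREATE TABLE IF NOT EXISTS l3 ("
          + "id BYTEA PRIMARY KEY, "
          + "key_list JSONB NOT NULL)");
      // Value: the serialized content itself.
      st.execute("CREATE TABLE IF NOT EXISTS value ("
          + "id BYTEA PRIMARY KEY, "
          + "payload BYTEA NOT NULL)");
    }
  }
}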

Thanks and regards,
Jenil Shah

On Fri, Oct 30, 2020 at 11:37 PM Jacques Nadeau <jac...@dremio.com> wrote:
I've definitely been thinking a lot about it. One of the challenges is finding the right way to minimize roundtrips so as to achieve commit rates similar to Dynamo's. It's easier with an RDBMS like Postgres, which has first-class support for arrays, than with systems like MySQL which don't.

Can you share more about your preferences and environment to understand different options that might work?

Thanks
Jacques
 

--
Jacques Nadeau
CTO and Co-Founder, Dremio


--
Jenil Shah

Jacques Nadeau

unread,
Nov 3, 2020, 12:28:16 AM
to projectnessie, Jenil Shah
Hey Jenil,

Yes, it seems like you've been able to decompose the core concepts in the tree-based VersionStore fairly well. You can see a diagram of the highlights at [1].

Postgres is probably a good choice to use instead of DynamoDB. (FYI, at Dremio, we'll probably also contribute a Mongo impl at some point.) For Postgres, you would write a new implementation of the Store interface[2]. You'll need to start by figuring out how to map Entity[3] patterns to a Postgres schema. To keep things simple, to start I'd put each ValueType[4] in a different table. For the IdMap objects, I'd flatten them into separate named columns in Postgres (e.g. id0, id1, ... id150). I think that will likely be more performant, easier to express updates and conditions in SQL, and simpler to debug.
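
To illustrate the flattening idea, a wide table like that is easy to generate instead of using an array column. The 151-slot width below just matches the id0..id150 example and may not match the real IdMap size:

import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FlattenedIdMapDdl {
  public static void main(String[] args) {
    // Flatten an IdMap of 151 slots into named columns id0..id150.
    String idColumns = IntStream.rangeClosed(0, 150)
        .mapToObj(i -> "id" + i + " BYTEA")
        .collect(Collectors.joining(", "));
    System.out.println(
        "CREATE TABLE l2 (id BYTEA PRIMARY KEY, " + idColumns + ")");
    // With named columns, a conditional update of one slot is plain SQL:
    //   UPDATE l2 SET id42 = ? WHERE id = ? AND id42 = ?
  }
}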

Once you have the basic object mapping, you'll need to figure out how to implement the key complexities in the Update and Condition expressions. These are primarily array appending and purging, related to the Branch object. I'm also actively looking at ways to make those interfaces simpler and more straightforward to implement.
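
To sketch what those two operations could look like using Postgres array functions, here they are against the hypothetical ref table from earlier in the thread (the table and column names are invented, not Nessie's):

import java.sql.Connection;
import java.sql.PreparedStatement;

public class BranchLogOps {
  // Append a commit to the branch log only if the head is unchanged
  // (optimistic concurrency; 0 updated rows means we lost the race).
  static boolean appendCommit(Connection conn, String branch,
      byte[] expectedHead, byte[] newHead) throws Exception {
    String sql = "UPDATE ref SET hash = ?, commits = array_append(commits, ?) "
        + "WHERE name = ? AND hash = ?";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      ps.setBytes(1, newHead);
      ps.setBytes(2, newHead);
      ps.setString(3, branch);
      ps.setBytes(4, expectedHead);
      return ps.executeUpdate() == 1;
    }
  }

  // Purge the log down to the most recent 20 entries using array slicing.
  static void purgeLog(Connection conn, String branch) throws Exception {
    String sql = "UPDATE ref SET commits = commits["
        + "greatest(array_length(commits, 1) - 19, 1) : array_length(commits, 1)] "
        + "WHERE name = ?";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      ps.setString(1, branch);
      ps.executeUpdate();
    }
  }
}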

In terms of your question #2: I would assume this is doable, though you'll probably need to do some optimizations. For example, it may make sense to ultimately collapse all the non-ref objects into a single table so you can do a bulk insert of all of them in a single SQL operation.
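
For example, with a hypothetical single obj table, a JDBC batch lets pgjdbc rewrite all the inserts into one multi-row statement (with reWriteBatchedInserts=true on the connection):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BulkCommitWrite {
  // Assumes: CREATE TABLE obj (id BYTEA PRIMARY KEY, type TEXT NOT NULL,
  //                            payload BYTEA NOT NULL)
  record Obj(byte[] id, String type, byte[] payload) {}

  // Persist every non-ref object produced by one commit in a single batch.
  static void saveAll(Connection conn, List<Obj> objs) throws Exception {
    String sql = "INSERT INTO obj (id, type, payload) VALUES (?, ?, ?) "
        + "ON CONFLICT (id) DO NOTHING"; // objects are content-addressed
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      for (Obj o : objs) {
        ps.setBytes(1, o.id());
        ps.setString(2, o.type());
        ps.setBytes(3, o.payload());
        ps.addBatch();
      }
      ps.executeBatch();
    }
  }
}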

For question #3, for simplicity I'd start with the read path. You can start the same way we made the Iceberg catalog impl and HMS bridge impl work: in those cases you can do queries like select * from t1@mybranch without having to extend the underlying system. For writes, you'll likely need to create a new session property that can be set to change the default context of the operations within Nessie (change to a different branch, tag, hash, etc.).
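
As a sketch of the write-path idea from a JDBC client (the nessie.ref option name is invented here; the real property would be defined by the Drill plugin):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class VersionedSessionDemo {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
            DriverManager.getConnection("jdbc:drill:drillbit=localhost");
         Statement st = conn.createStatement()) {
      // Hypothetical session option pinning all operations in this
      // session to a Nessie branch (could also be a tag or hash).
      st.execute("ALTER SESSION SET `nessie.ref` = 'mybranch'");
      // Reads would then resolve t1 against that branch by default.
      try (ResultSet rs = st.executeQuery("SELECT * FROM t1")) {
        while (rs.next()) {
          // consume rows...
        }
      }
    }
  }
}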

--
Jacques Nadeau
CTO and Co-Founder, Dremio

jenil shah

unread,
Nov 4, 2020, 5:44:32 AM
to Jacques Nadeau, projectnessie
Hello Jacques,

Thank you very much for pointing us in the right direction. This is really helpful, and I think it is enough to get started with. Will update you once we have something concrete.

Thanks and regards,
Jenil Shah

