SQL Replication on 3.0.3800 to a Postgres table with an array of integers using Npgsql 3.4.1

100 views
Skip to first unread message

Bruno Lopes

unread,
Jun 16, 2016, 2:45:37 PM6/16/16
to RavenDB - 2nd generation document database
Here's the scenario:

- RavenDB 3.0.3800
- Postgresql 9.3.4.3

RavenDB is configured to "sql replicate" to a Postgresql table with a column typed integer[].

Previously this worked with:

replicateToTable({
   this_is_an_array: '{'+this.NumbersAsString.join(',')+'}',
});

We've upgraded Npgsql from 2.1.3 to 3.1.4 and now replication fails with:

 Exception: Npgsql.PostgresException (0x80004005): 42804: column "this_is_an_array" is of type integer[] but expression is of type text

In this case, I might expect it to work by changing the replicationScript to:

replicateToTable({
   this_is_an_array: this.NumbersAsString.map(function(e){ return parseInt(e,10); }),
});

But it fails with JArray not being convertable to something or other (paraphrased because I'm upgrading the client code to 3.0.30143 to see if that helps and can't run the test ATM).

Am I missing a way to get the array to npgsql as an array instead of a string?

Thanks,
Bruno




Chris Marisic

unread,
Jun 16, 2016, 3:12:23 PM6/16/16
to RavenDB - 2nd generation document database
I assume you'll probably have to fork the sql replication.

Seeing as arrays are a nonstandard data type, i don't see how raven would support that across platform.

Bruno Lopes

unread,
Jun 16, 2016, 8:24:00 PM6/16/16
to ravendb
If it's an array or a list of objects, it might work by passing just an object instead of going with .Value<string>() on the writer. I think npgsql might be able to work with that. Other providers would just error on arrays.

I've spiked it at my fork. Npgsql tries to infer the type and chokes on object, but I think I might be able to peek into the
items to get the types.

I'll also investigate if there's an alternate way to send parameters to arrays, perhaps as strings as before.

As for forking, I'm not even sure how to go about replacing *just* the sql replication bits. And having a "personal" fork of ravendb just for that seems overkill.


--
You received this message because you are subscribed to a topic in the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ravendb/B3mqn7u8x8A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to ravendb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Chris Marisic

unread,
Jun 17, 2016, 11:39:11 AM6/17/16
to RavenDB - 2nd generation document database
It's a bundle, you should be able to have all your changes constrained to the single bundle. Then you can deploy your custom bundle.

Since it looks like the bundle is just part of the main repo, you technically are maintaining your own fork of ravendb as a whole but it should be easy to stay in sync since you're not touching outside the bundle.

Chris Marisic

unread,
Jun 17, 2016, 11:40:48 AM6/17/16
to RavenDB - 2nd generation document database
if you reach good success, you might want to consider publishing this bundle as NpgSqlReplication

Bruno Lopes

unread,
Jun 17, 2016, 1:30:14 PM6/17/16
to ravendb
So, I think I managed to get to a workable solution that doesn't depend on npgsql, works with arrays of any type, and is similar to another solution already implemented. Prototype is here. I'd love to get feedback from hibernating rhinos to see if it's something that could be worked into master with some proper tests.

Replication sets up an object like
   Quantities: {$ArrayType: 'System.Int32', Values: this.OrderLines.map(function(l) {return l.Quantity;})}

This creates a new List of the $ArrayType, which npgsql uses to infer which type the column's supposed to be.

A better solution could be making the RelationalDatabaseWriter aware of the provider, so we use other provider specific column types, but that's a much larger change.


Oren Eini (Ayende Rahien)

unread,
Jun 20, 2016, 10:58:45 AM6/20/16
to ravendb
There is the {Type: ..., Value: ... } option, which is what is typically used

Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 


--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.

Bruno Lopes

unread,
Jun 20, 2016, 11:04:32 AM6/20/16
to ravendb
Oren: Type is parsed as a DbType, and there's no way to map a DbType to a postgres/npgsql array.

Chris Marisic

unread,
Jun 20, 2016, 11:17:41 AM6/20/16
to RavenDB - 2nd generation document database
The is crux of the whole problem, none of the generic provide apis have any clue about this.

You would need some way to pass data out of band around this such as a dictionary / custom resolvers that could apply more specific behavior that's database specific

Bruno Lopes

unread,
Jun 20, 2016, 5:21:35 PM6/20/16
to ravendb
I wouldn't mind much doing that, if it meant adding code to support postgres' non-standard types. 
What I would mind doing is replacing the entire bundle.

Currently with this upgrade we lose embedded tests (which use the same npgsql version as the rest of the app), and if nothing changes, in the future we'll lose that support in the standalone server since I can expect to need to upgrade npgsql sometime.

So we either expect support for this in RavenDB in the near(ish?) future (which might be in the shape of the spike I've shared), or we'll have to rewrite our replication tables (and the queries that use those tables) to not use arrays or any other non-standard table construct. Which really sucks since some of these datatypes are quite useful and a large plus to using pgsql.


Chris Marisic

unread,
Jun 21, 2016, 4:37:02 PM6/21/16
to RavenDB - 2nd generation document database
Why do embedded tests care about sql replication?

Bruno Lopes

unread,
Jun 21, 2016, 4:44:54 PM6/21/16
to ravendb
Those are automated tests that use an embedded ravendb server. Some hit the sql replication feature.

1) regression testing of issues with replicated data (wrong names in the columns, typos, a couple of more exotic cases)
2) smoke tests for replication, like "push doc into raven, replicate, read row from postgres, are the values consistent?"
3) a couple larger tests that hit the entire pipeline (call the "create endpoint" on the api, check the "analytics enpoint" on the api, are there any errors and are the values correct).

Chris Marisic

unread,
Jun 21, 2016, 5:42:55 PM6/21/16
to RavenDB - 2nd generation document database
I would say that smoke test doesn't really belong in your application test suite but i completely understand why you'd really want a metal test like that

Throwing out wild ideas, assuming you inherit from the existing bundle. Since you're running your own process. You theoretically could bust into the app domain and replace the original type with your modified type. As long as you adhered to the existing contract, this might just be almost reasonable

Bruno Lopes

unread,
Jun 22, 2016, 4:57:05 AM6/22/16
to ravendb
What I want is for my tests to run as fast as possible and in isolation. Having an embedded ravendb server as the document store gives me that.

Monkeypatching the sql replication bundle would not be easy or simple. There aren't that many virtual members, so I'd probably have to duplicate a lot of code just to replace the method I want.
I might look into that if we know that we'll be getting support for custom types/postgres arrays in the future, but if not, then I think the less bad option would be to either not use anything that's not standard/supported natively, or figure out a way to interpret the data in postgres before inserting it into the tables. Calculated columns, perhaps.

Bruno Lopes

unread,
Jun 23, 2016, 5:48:07 AM6/23/16
to ravendb
@hibernating-rhinos: is supporting this scenario of "sql replicate this array of strings/ints/booleans into a postgresql column of type text[]/int[]/bool[]" something that's expected to be supported now/soon/in the future?

Here's a starting point: https://github.com/ravendb/ravendb/compare/v3.0...brunomlopes:postgres-arrays which I might be able to work on to get into a PR for you guys.

I really don't want to fork raven/sql replication, and I feel like arrays are so useful in psql that it would be a shame to not be able to use them.


Oren Eini (Ayende Rahien)

unread,
Jun 24, 2016, 12:36:22 AM6/24/16
to ravendb
We would like to support it, yes.
But it seems it would be simply to add a "don't convert" flag, that would just shove the data as it to the provider, no? Do you really need a typed list there?

Bruno Lopes

unread,
Jun 24, 2016, 2:41:44 AM6/24/16
to ravendb

These are ravenjobjects, which npgsql doesn't know how to handle.
There's a npgsqldbtype, like dbtype but with the extra stuff, but that requires references to npgsql (or reflection stuff ).
The typed list was the most straightforward way I could find to do it

Oren Eini (Ayende Rahien)

unread,
Jun 25, 2016, 1:09:12 PM6/25/16
to ravendb
Can we just put the full npgsqldbtype type name, instead?

Bruno Lopes

unread,
Jun 25, 2016, 2:01:29 PM6/25/16
to ravendb
For arrays it's not just the type name, it's OR'ed together:
"""
Notes when using Range and Array, bitwise-or NpgsqlDbType.Range or NpgsqlDbType.Array with the child type.
For example, to construct the NpgsqlDbType for a int4range, write NpgsqlDbType.Range | NpgsqlDbType.Integer.
To construct the NpgsqlDbType for an int[], write NpgsqlDbType.Array | NpgsqlDbType.Integer.
"""
From http://www.npgsql.org/doc/types.html

Oren Eini (Ayende Rahien)

unread,
Jun 26, 2016, 4:06:28 AM6/26/16
to ravendb
That is a valid type name for enum, I think.
Or, we can do Type and then EnumValue ?

Bruno Lopes

unread,
Jun 29, 2016, 9:12:12 AM6/29/16
to ravendb
That's set on `NpgsqlDbType`, which is a property of class Npgsql.NpgsqlParameter.
The vanilla SqlParameter doesn't have that.

Unless the sql replication bundle starts referencing Npgsql directly, or we reflect over the type and property, we can't use that.

With reflection it would be something like:
{
 Type : "NpgsqlTypes.NpgsqlDbType",
 EnumValue: "NpgsqlTypes.Array | NpgsqlTypes.Bit",
 EnumProperty: "NpgsqlDbType",
 Value: this.OrderLines.map(function(l) {return l.Quantity;})}
}

?


Oren Eini (Ayende Rahien)

unread,
Jun 29, 2016, 12:09:04 PM6/29/16
to ravendb
We don't want to reference it directly (the versioning issues are a killer), so I think a PR with this option for the dynamic mapping is probably better.
Can you send it against 3.5 ?

Bruno Lopes

unread,
Jun 30, 2016, 5:25:13 AM6/30/16
to ravendb
Sure, I can give it a go.

I'd want to include some tests, so I'll add npgsql as a dep to the tests, ok?

Oren Eini (Ayende Rahien)

unread,
Jun 30, 2016, 6:30:03 AM6/30/16
to ravendb
That is fine, just make sure that it has the same behavior as SQL Server, that it is skipped if it isn't there.

Bruno Lopes

unread,
Jun 30, 2016, 10:59:09 AM6/30/16
to ravendb
Ends up being a bit verbose, had to use Double instead of Integer because numbers from javascript are all doubles, and special case lists (with Values instead of Value)
I'd like to clean it up a bit, avoid quoting members on the script and other stuff like that, before merging.

Would this approach be acceptable for a the release of 3.5?

Oren Eini (Ayende Rahien)

unread,
Jun 30, 2016, 1:53:38 PM6/30/16
to ravendb
Thanks, we'll review it early next week
Reply all
Reply to author
Forward
0 new messages