[Haskell-cafe] ANNOUNCE: hdbi-1.0.0 and hdbi-postgresql-1.0.0

45 views
Skip to first unread message

Alexey Uimanov

unread,
Jul 30, 2013, 11:45:50 PM7/30/13
to haskel...@haskell.org
Hello, haskellers. This is the first release of HDBI (Haskell Database Independent interface). It is the fork of HDBC. 
HDBI has some improvements in design, it also has better testing and performance (mainly because of using Text instead of String anywhere).
HDBI designed to be more flexible and correct database interface.

You can find out more information in the documentation.

I am planning to implement MySql and Sqlite drivers as well

Issues and pull requests are welcome, as well as collaboration.

Bardur Arantsson

unread,
Jul 31, 2013, 1:11:36 AM7/31/13
to haskel...@haskell.org
Looks interesting.

Just a couple of comments from skimming the documentation. Apologies, if
these are already addressed -- didn't see any mention of it from my
read-through.

Regard parameterized SQL: It might be worth using named parameters (e.g.
":foo" and ":bar" or something like that) rather than "?" as
placeholders in SQL/prepared SQL. This will make it slightly more
flexible if you need to provide different SQL strings for different
databases, but want to reuse the code which does the actual running of
the SQL. It's also more flexible if you need to repeat parameters -- the
latter is typical with PostgreSQL if you want to emulate
"update-or-insert" in a single SQL statement.

Regarding migrations: If you haven't already, please have a look at
Liquibase (http://www.liquibase.org/documentation/index.html) before
attempting to implement migrations. The most important attributes of
Liquibase are:

a) migrations are *manually* specified deltas as opposed to deltas
automatically derived from two snapshots of the database. Anything
based on automatically getting from snapshot A to snapshot B *will*
break or do undesirable/unpredictable things to data at some point.
b) It deliberately does NOT provide migrations in a Turing Complete
language -- IME, if you need TC migrations on a routine basis
you're already in deep trouble and need to think more about what
you're doing (at a "process" or "engineering" level). If you
*really* need to, you *can* extend it do custom migrations via
code, but the barrier to entry is sufficiently high that you'll
rarely be tempted unless you *really* need it and the migration
step is likely to be reusable.
c) the concept of "contexts" (see the documentation) which allows
you to vary the migrations across different environments as
needed. (This is definitely open to abuse, but when you need it
you *really* need it, IME.)
d) It can also generate full SQL for "changes-about-to-be-applied"
so that you can audit and/or apply them manually -- some devops
teams need this, others may not care. A minor but important
detail is that the generated SQL includes all the metadata for
tracking the application of the migrations themselves.

Liquibase is the only system *I've* ever seen that is even close to
doing migrations the Right Way According To Me(TM). (I've used it since
the 1.x days using the XML format and still haven't come across anything
that can really compete.) Liquibase certainly has flaws(*), but one
should think *really* hard about whether there isn't some existing
migrations system which is "good enough" before foisting yet another
migrations system on the world since it's highly likely to be at least
as flawed as all the existing systems in one way or another.

(Of course, being implemented in Java, there is one aspect of Liquibase
which is annoying if using it for Haskell projects: It's not very
convenient to auto-apply all migrations at startup time without some
sort of "run-this-executable-script-before-running-the-main-program"
hack. That, and you need a Java Runtime Environment installed.)

Anyway, just throwing that out there...

Regards,

/b

(*) Sub-par internal consistency checking across combinations of
contexts for changelog files being one of them. There also the fact that
it was until recently XML-based and that the the new syntaxes all have
their own problems. Maybe a Haskell DSL could lead to some real
improvements here...



_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Alexey Uimanov

unread,
Jul 31, 2013, 3:22:42 AM7/31/13
to haskel...@haskell.org
Regard parameterized SQL: It might be worth using named parameters (e.g.
":foo" and ":bar" or something like that) rather than "?" as
placeholders in SQL/prepared SQL. This will make it slightly more
flexible if you need to provide different SQL strings for different
databases, but want to reuse the code which does the actual running of
the SQL. It's also more flexible if you need to repeat parameters -- the
latter is typical with PostgreSQL if you want to emulate
"update-or-insert" in a single SQL statement

Named parameters might be more flexible, but it is need to think hard about how to implement this.
If you are using named parameters you need to pass not just list [SqlValue] as parameters, 
but Map Text SqlValue or something. So named parameters will not be compatible with unnamed and will need
separate query parser.
 
Regarding migrations: If you haven't already, please have a look at
Liquibase (http://www.liquibase.org/documentation/index.html) before
attempting to implement migrations. The most important attributes of
Liquibase are:

What I am trying to implement is not a new migration system, but just the common interface for 
simple schema actions, here is my in-mind draft:

newtype TableName = TableName Text

data TableDescription = TableDescription 
                        {tableName :: TableName 
                        ,tableFields :: [FieldDescription]
                        }

class (Connection con) => Introspect con where
  getTableNames:: con -> IO [TableName]
  describeTable :: con -> TableName -> IO TableDescription
  getIndexes :: con -> [IndexDescription]

class (Connection con) => SchemaChange con where
  createTable :: con -> TableDescription -> IO ()
  dropTable :: con -> TableName -> IO ()
  addColumn :: con -> TableName -> FieldDescription -> IO ()
  ...............

This typeclasses must provide database-independent schema introspection and changing. 
Migration system can be anything you want.

I also have the idea do not throw the exceptions in IO but return  (Either SqlError a) from 
all the Connection and Statement methods for safe data processing. What do you think about ?

Tom Ellis

unread,
Jul 31, 2013, 4:12:16 AM7/31/13
to haskel...@haskell.org
On Wed, Jul 31, 2013 at 09:45:50AM +0600, Alexey Uimanov wrote:
> Hello, haskellers. This is the first release of HDBI (Haskell Database
> Independent interface).

Hi, thanks for this Alexey. It's great that there is continued development
of this really important infrustructure for Haskell.

I have a question about variable interpolation, that is, using "?" parameter
placeholders in the query strings, as documented here:

http://hackage.haskell.org/packages/archive/hdbi/1.0.0/doc/html/Database-HDBI.html

I know postgresql-simple does this, and presumably database access libraries
in other languages do this too.

What is the rationale for this when in Haskell we have safer methods of
interpolation at our disposal (for example HoleyMonoid)? Is it simply a
matter of using the most familiar interface, or is there a deeper reason
this is necessary?

Thanks,

Tom

Kirill Zaborsky

unread,
Jul 31, 2013, 4:25:45 AM7/31/13
to haskel...@googlegroups.com, haskel...@haskell.org
Alexey,
Regarding named parameters - another option is to use numbered parameters like :1, :2 etc. It will help with repeated parameters at least. I didn't understandthe first Bardur's point about  "different SQL strings" though.

Kind regards,
Kirill Zaborsky


2013/7/31 Alexey Uimanov <s9gf...@gmail.com>
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

--
You received this message because you are subscribed to a topic in the Google Groups "Haskell-cafe" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/haskell-cafe/9X2J65gXGXs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to haskell-cafe...@googlegroups.com.
To post to this group, send email to haskel...@googlegroups.com.
Visit this group at http://groups.google.com/group/haskell-cafe.
For more options, visit https://groups.google.com/groups/opt_out.




Bardur Arantsson

unread,
Jul 31, 2013, 5:10:03 AM7/31/13
to haskel...@haskell.org
On 2013-07-31 09:22, Alexey Uimanov wrote:
>>
>> Regard parameterized SQL: It might be worth using named parameters (e.g.
>> ":foo" and ":bar" or something like that) rather than "?" as
>> placeholders in SQL/prepared SQL. This will make it slightly more
>> flexible if you need to provide different SQL strings for different
>> databases, but want to reuse the code which does the actual running of
>> the SQL. It's also more flexible if you need to repeat parameters -- the
>> latter is typical with PostgreSQL if you want to emulate
>> "update-or-insert" in a single SQL statement
>>
>
> Named parameters might be more flexible, but it is need to think hard about
> how to implement this.
> If you are using named parameters you need to pass not just list [SqlValue]
> as parameters,
> but Map Text SqlValue or something. So named parameters will not be
> compatible with unnamed and will need
> separate query parser.
>

The use case I'm thinking of it something like this:

reportSQL :: DatabaseType -> SQL
reportSQL MySQL = "
SELECT ... custName = :custName ...
INTERSECTION
SELECT ... custName = :custName
"
reportSQL PostgreSQL = "
SELECT ... AS cust
WHERE cust.custName = :custName
FROM SELECT ... AS foo
WHERE foo.custName = cust.custName
"

For this fictitious example we imagine that PostgreSQL can handle a
nested query of some particular shape where we need an INTERSECTION
query in MySQL. Obviously this is a made up example, but you get the
idea. The point is that the MySQL query may need to refer to the
":custName" parameter multiple times whereas the PostgreSQL one doesn't.
Similarly the positions in the SQL may need to be different.

You perhaps still want to have a way to run both variants using the
exact same code:

runReport :: DatabaseType -> Text -> IO whatever
runReport databaseType customerName = do
result <- runSQLWithParameters (reportSQL databaseType)
[("custName", customerName)]
... do stuff with result ...
return whatever

Of course this being Haskell you can always use higher-order functions
(e.g. a function DatabaseType -> Text -> IO QueryResult which
encompasses the runSQLWithParameters *and* reportSQL function, but then
you're mixing up the running of the query with the query itself) for
similar purposes, but I tend to find named parameters of this type to be
quite useful for readability.

As Kirill mentioned, you can also use numbered parameters, but I tend to
like named parameters for readability.

Implementation should be reasonably simple: Replace all ":xyz" (or
whatever syntax you choose) parameters in input with "$n" and maintain a
map which tells you which parameter "$1", "$2", etc. correspond to.

Anyway, this is an issue I've sometimes run across with JDBC (which uses
?) in particular and it can be very annoying. Perhaps the best thing in
Haskell would be to just avoid raw SQL entirely in favor of combinators,
but then you often end up with suboptimal SQL which can't really exploit
all the features of your chosen database. Even so it would be nice to
have a DB interface/library that can hit that sweet spot where you can
write your own SQL but your program won't be too tied to a single DB
backend (modulo the concrete SQL).

>
>> Regarding migrations: If you haven't already, please have a look at
>> Liquibase (http://www.liquibase.org/documentation/index.html) before
>> attempting to implement migrations. The most important attributes of
>> Liquibase are:
>>
>
> What I am trying to implement is not a new migration system, but just the
> common interface for
> simple schema actions, here is my in-mind draft:
>
> newtype TableName = TableName Text
>
[--snip--]
> This typeclasses must provide database-independent schema introspection and
> changing.
> Migration system can be anything you want.
>

Ah, OK, I see I just misinterpreted the bit in the package description
about migrations then :).

You might end up having a little trouble reconciling metadata from the
different database backends, but certainly there must be *some* useful
common subset of table/index/etc. metadata :).

> I also have the idea do not throw the exceptions in IO but return (Either
> SqlError a) from
> all the Connection and Statement methods for safe data processing. What do
> you think about ?
>

I don't think I'm qualified to have an opinion either way, but perhaps


http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors

and particularly

http://hackage.haskell.org/package/errors

can serve as insipration :).

Regards,

Tom Ellis

unread,
Jul 31, 2013, 5:25:03 AM7/31/13
to haskel...@haskell.org
On Wed, Jul 31, 2013 at 01:22:42PM +0600, Alexey Uimanov wrote:
> I also have the idea do not throw the exceptions in IO but return (Either
> SqlError a) from all the Connection and Statement methods for safe data
> processing. What do you think about ?

I feel very strongly that you should use Either. One of the things I find
worst about postgres-simple is the exceptions it throws. The benefit of
Haskell is that we can do things in a Haskelly way!

Tom

Alexey Uimanov

unread,
Jul 31, 2013, 7:28:02 AM7/31/13
to Tom Ellis, haskel...@haskell.org
The rationale is that the low-level database interface accepts parameters directly instead of 
inserting them inside the query manually (like HoleyMonoid would do). Postgresql-simple 
also does parameter substitution on haskell side. This is not safe and may cause to 
http://en.wikipedia.org/wiki/SQL_injection because of not properly done backquoting. Low-level 
database interface knows better how to work with parameters, so the driver must pass them to it instead
of parameters substitution. 

hdbi-postgresql just replace "?" to "$1" sequence properly parsing and ignoring question marks inside the doublequoted identifiers, quoted literals and even dollar quoted literals  4.1.2.2. Dollar-Quoted String Constants


2013/7/31 Tom Ellis <tom-lists-has...@jaguarpaw.co.uk>

Tom Ellis

unread,
Jul 31, 2013, 7:47:28 AM7/31/13
to haskel...@haskell.org
On Wed, Jul 31, 2013 at 05:28:02PM +0600, Alexey Uimanov wrote:
> The rationale is that the low-level database interface accepts parameters
> directly instead of inserting them inside the query manually.
[...]
> Low-level database interface knows better how to work with parameters, so
> the driver must pass them to it instead of parameters substitution.

Letting the low-level database interface (I'm guessing you're talking about
a C library provided by the database vendor) do the escaping certainly makes
a lot of sense.

However, it would still be possible to make sure that the *number* of
parameters supplied matches the number of placeholders in the query string.
That would make sense, don't you think?
Reply all
Reply to author
Forward
0 new messages