Let's start with N3886

Johann Anhofer

unread,

Feb 11, 2014, 3:04:37 AM2/11/14

to data...@isocpp.org

I'm the author of N3886 and I want do know if my design is basically accepted and if I should invest more time to push it any further.

It turns out that in the next four months I've got enough free time, but I want some ideas and feedback from this group, to steer it, into the right direction.

Johann

N3886.pdf

Ben Hanson

unread,

Feb 15, 2014, 1:53:06 PM2/15/14

to data...@isocpp.org

I would like to see a demo implementation and try this out. I'm happy to help out with this if I can. I wrote a C++ database library that uses ODBC so this is the part that interests me the most.

I introduced SQL parameters to my library last year and I would say named parameters (i.e. "@var") are much better than just ODBC style question marks. I've also got a SQL specific sprintf routine that we use absolutely everywhere now.

I don't mind at all if the interface is a bit of a moving target at this stage - I'm just keen to see what works for everyone and share ideas.

Regards,

Ben

Johann Anhofer

unread,

Feb 15, 2014, 2:50:51 PM2/15/14

to data...@isocpp.org

You can find a working implementation at GitHub:

https://github.com/JohannAnhofer/cpp_db

At this time there are only drivers for SQLITE and Firebird SQL,

so I would be very pleased if you can implement the ODBC driver.

A blank set of classes for the ODBC-Driver already exists.

Johann

Ben Hanson

unread,

Feb 16, 2014, 7:33:24 AM2/16/14

to data...@isocpp.org

Thanks Johann.

I've now got a build (the Windows build was broken) and I have implemented database::open() and database::close() for ODBC. I will fill in the blanks for odbc_transaction next and see how I go from there (it shouldn't take long to fill in the code needed for samples.cpp).

Regards,

Ben

Johann Anhofer

unread,

Feb 16, 2014, 2:49:36 PM2/16/14

to data...@isocpp.org

Sorry for that, I've fixed the cpp_db project and checked it in into github.

Thanks for help, Johann

Ben Hanson

unread,

Feb 17, 2014, 2:51:27 PM2/17/14

to data...@isocpp.org

I've filled out odbc_parameters::bind() except for blob and is_null().

In my experience with SQLBindParameter() in ODBC, null requires a datatype. Can the internals of the library be changed to all the datatype for null to be specified?

Thanks,

Ben

Johann Anhofer

unread,

Feb 17, 2014, 3:37:51 PM2/17/14

to data...@isocpp.org

Is this really necessary? In my opinion null shouldn't have a type because it means undefined (type and value). I've got a lot of problems with QtSql and the QVariant which uses typed nulls.

Is there a way to retrieve the underlying types from the database?

Johann

Ben Hanson

unread,

Feb 17, 2014, 3:58:43 PM2/17/14

to data...@isocpp.org

It is rather annoying to have to specifiy the type - I can't see what possible gain there is for ODBC to enforce this personally.

The trouble is, even if you agree that querying the database for the data type is an acceptable overhead (hey, this is a C++ library we're talking about!), how do you know what to query? You'd need a full blown SQL parser, surely. Those pesky question marks could be anywhere in an arbitrarily complex SQL statement after all.

Regards,

Ben

Johann Anhofer

unread,

Feb 17, 2014, 4:15:25 PM2/17/14

to data...@isocpp.org

You're right, querying is no option here because their can be any database behind the odbc interface,

so we need another solution. Maybe the type doesn't really matter in case of null?

The last resort would be to change the null_type to carry an additional type with it which is void by default and can be ignored by implementation which doesn't need it. But in case of ODBC one would write null_type<int>() for instance to pass a null for int parameters.

Hth, Johann

Roland Bock

unread,

Feb 17, 2014, 4:22:56 PM2/17/14

to data...@isocpp.org

FWIW, in my experience with the mysqlclient library it is also more straight forward if you have a type in addition to the "is_null" information.

I am curious about the problems you mentioned with typed null in QtSql. What kind of problems?

--
You received this message because you are subscribed to the Google Groups "SG11 - Databases" group.
To unsubscribe from this group and stop receiving emails from it, send an email to databases+...@isocpp.org.
To post to this group, send email to data...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/databases/.
For more options, visit https://groups.google.com/a/isocpp.org/groups/opt_out.

Johann Anhofer

unread,

Feb 17, 2014, 4:26:12 PM2/17/14

to data...@isocpp.org

Ok, if more libs need this type, we should add it.

The most annoying I remember is: QVariant(QVariant::Int) == QVariant(0)

The first is a Variant holding a Null of type Int and the latter is a Variant holding an Integer of value zero. Both are equal and they are not able to change this misbehavior, because to much code relies on this.

Johann

Roland Bock

unread,

Feb 17, 2014, 4:40:59 PM2/17/14

to data...@isocpp.org

On 2014-02-17 22:26, Johann Anhofer wrote:

Ok, if more libs need this type, we should add it.

The most annoying I remember is: QVariant(QVariant::Int) == QVariant(0)

The first is a Variant holding a Null of type Int and the latter is a Variant holding an Integer of value zero. Both are equal and they are not able to change this misbehavior, because to much code relies on this.

Ooops, yes, that's going to lead to trouble in many cases.

Hmm. In my library I simply added a flag to each result/parameter type that indicates whether or not a value is NULL. It makes checks, comparisons and also binding of results/parameters quite elegant, IMO.

Regards,

Roland

Ben Hanson

unread,

Feb 18, 2014, 3:41:05 AM2/18/14

to data...@isocpp.org

I was worried I was giving up too easily and so did a search this morning: http://msdn.microsoft.com/en-us/library/ms709448(v=vs.85).aspx

I can't say I'm happy to be proven 'right' in this case.

Regards,

Ben

Johann Anhofer

unread,

Feb 18, 2014, 3:50:27 AM2/18/14

to data...@isocpp.org

Ok, that is what I expected.

I'm already trying to extend the null type, but I've no satisfying solution now.

Johann

Johann Anhofer

unread,

Feb 19, 2014, 5:26:31 AM2/19/14

to data...@isocpp.org

Finally, I came up with a solution. Please fetch the latest changes from my github repository.

Now null_type carries a tag type with it.

Use the function make_tagged_null to created null's for a type.

e.g. cpp_db::make_tagged_null<int>() or cpp_db::make_tagged_null(0) or cpp_db::null_type{0}

Since a template constructor cannot be called with < ... > I've added the make_tagged_null function.

In your driver you can query for the type by accessing the tagged type directly: e.g. null.tag_type

if can be compared to with typeid(...) expressions.

If you need the tag type and there is none provided please throw an exception (e.g. ODBCException)

Hth, Johann

Ben Hanson

unread,

Feb 20, 2014, 9:02:59 AM2/20/14

to data...@isocpp.org

Thanks Johann, will do.

Will you add me as a user to your github project (my username is BenHanson), or should I branch?

Regards,

Ben

Johann Anhofer

unread,

Feb 20, 2014, 9:49:37 AM2/20/14

to data...@isocpp.org

I've added your user as collaborator.

Thanks, Johann.

Leo Carreon

unread,

Mar 26, 2014, 5:28:47 AM3/26/14

to data...@isocpp.org

Hi Johann,

I have been reading your document N3886 together with your sample implementation in github and comparing it with ODBC which I have worked on for 10 years.

Please note that the comments below are a bit revised from the one I emailed you.

Anyway, these are my revised comments:

Is result_interface::is_column_null() really necessary? Isn't null just one of the possible values of a column therefore value::is_null() makes more sense. In an ODBC implementation, the null value of a column can only be obtained after having retrieved the value of the column.
I can see that in the possible types of a value, you have taken into account integers and floating point values. What about date, time and timestamp which most databases support? It might also be a good idea to provide support for boolean although it might just be represented as a single character column in a table.
In my experience, BLOB and CLOB columns are used to store large binary files and large text files, respectively. Maybe you should have special functions like save_column_value() for these types of columns. Also in ODBC, BLOB and CLOB parameters are a different kind of beasts because continuation data is handled as part of the execution of a statement which means they don't fit into the current architecture.
In your sample implementation, you have used a driver_factory class which is not required if you defined a driver_registry class instead of a driver_registry namespace. I'm just trying to reduce the number of classes/source files required.
You also mentioned in your document the need for adding feature queries. FYI, ODBC also has the ability to set/alter features. The only time I needed to set features was enabling/disabling transaction autocommit and setting the cursor name. Now that reminds me, do we have a need for cursors?
Maybe a more appropriate name for is_eof() is has_next().
BTW, it looks like SQLAPI++ (www.sqlapi.com) is well advance in this area because they are now in version 4.x of their product which means they already have a lot of experience in this area. They even have support for BLOB and CLOB. It might be worth inviting them into the standardization table.

I hope the above help.

Kind regards,

Leo

Johann Anhofer

unread,

Mar 26, 2014, 5:43:00 AM3/26/14

to data...@isocpp.org

Hello Leo!

Thanks for your feedback.

* is_column_null isn't absolutely necessary, but, in my opinion, it allows optimizations for some databases.

* The supported types are not complete yet, especially date/time types are really needed.

I'm not sure which type to use, std::tm seems to be not enough, because we need also timezone information.

So this is the reason that no time/date is supported now.

* The blob interface isn't defined yet, because I'm not really sure how to do it.

Maybe we should define a new blob class which handles data transport, e.g. like file operations read/write/seek.

I have to investigate the blob operations for the different databases to define this class and it's public and private interfaces.

* driver_factory is only an implementation detail of driver_registry, so it's not described in the document.

The registry could also be implemented without the factory.

* set/alter features this is an interesting point, I will keep this in my mind.

* For this version, I have decided not to support cursors.

* has_next() seems to be the better name, so I will change the name in future versions.

* SQLAPI++: The library looks well evolved, maybe they are willing to help us here.

Kind regards, Johann

Cleiton Santoia

unread,

May 21, 2014, 10:00:06 AM5/21/14

to data...@isocpp.org

Hi Johann,

I was developing a small SQL framework and somehow an idea pop-up : try to make a "execute" function put directly into result bean, by passing a bean and member ponters:

First try :

class statement {
  Connection conn_;
  std::string sql_;
public:
  template<class _Bean, typename... _Member> 
  void execute_into(std::initializer_list<Parm> parms, _Bean& bean, _Members...mem){...}
};

struct Bean {
  int cod;
  std::string nm;
};

Bean b;
statement X(conn1,"select cod, nm from tb where cod = ?");
X.execute_into({2},b,&Bean::cod,&Bean.nm);

This worked, but we need to put the members in same order then fields from SQL, mostly all sql are aimed to one Bean type so it's not a big problem to us, anyway, we create a

map_element class to bind the member to a name or index of the field and pass the map_element member instead of direct pointer to map.

Second try :

class map_element {
public:
   template<typename _Mem>
   map_element(std::string field_name, _Mem mem_foo) {...}
   template<typename _Mem>
   map_element(int field_index, _Mem mem_foo) {...}
};

and pass s set of map_element to execute_into function:

map_element cod_mapped("cod",&Bean::cod);
map_element nm_mapped(2,&Bean::nm);

Bean b;
X.execute_into({2},b,cod_mapped,nm_mapped);

Now the problem is how to handle nulls and casts, if the result of SQL does not fit exactly in your bean.

Third try:

enum null_handle { // this should be done by policies instead of enum, but this way is easier to understand
  DEFAULT_WHEN_NULL,
  ACCEPT_NULL,
  THROW_EXCEPTION_WHEN_NULL
};

enum cast_handle {
   TRY_BEST_CAST,
   THROW_EXCEPTION_WHEN_DIFFERENT
};

template<typename _Mem>
class map_element {
public:
   map_element(std::string field_name, _Mem mem_foo, null_handle null_h, cast_handle cast_h ) {...}
};

struct Bean {
  int cod;
  nullable<int> index;
  std::string nm;
};

auto cod_mapped = map_element ("cod",&Bean::cod, DEFAULT_WHEN_NULL,  THROW_EXCEPTION_WHEN_DIFFERENT);
auto idx_mapped = map_element ("index",&Bean::index, ACCEPT_NULL,  THROW_EXCEPTION_WHEN_DIFFERENT);
auto nm_mapped = map_element (3,&Bean::nm,DEFAULT_WHEN_NULL, TRY_BEST_CAST );

Bean b;
X.execute_into({2},b,cod_mapped,idx_mapped, nm_mapped);

It's almost good, now we want to use this map idea to iterate over a resultset with standard algorithms, so resultset must be iterable :)

template<class _Bean>
class result {
   _Bean current_; // we maintain the last copy of bean here
public:
   template<typename ..._Maps>
   result(_Maps... maps);

   iterator begin();   // here, the first use may move to first, then subsequent '++' calls go next
   iterator current(); // or we can get a current iterator, pointing to current_ bean
   iterator end();

   void move/next/first/last common result functions 
};

class statement {
  template<typename _Bean, typename ..._Maps> 
  result<_Bean> execute(std::initializer_list<parm> parms, _Maps...maps) 
}; 

now we use:

Bean b;
result<Bean> rs = X.execute({2},b,cod_mapped,idx_mapped, nm_mapped); // here we can derive Bean type from mapped
std::vector<Bean> v;

std::copy( rs.current(), rs.end(), back_inserter(v) );

what you think ?

Johann Anhofer

unread,

May 21, 2014, 3:09:42 PM5/21/14

to data...@isocpp.org

Hi!

Please start a separate thread, this thread is for N3886.

IMHO: it looks rather complicated to use, but if you use a code generator it may be ok.

As described in N3886 iterators a dangerous thing in combination with result sets.

One may write: std::sort(rs.begin(), rs.end()); or std::unique(rs.begin(), rs.end()) which is perfectly legal for normal containers, but result sets are not containers, so I don't like iterators for them.

Maybe you should generate the iterators differently, e.g. std::resultset_iterator i(rs);

to point out, not to abuse them with code that should better be written in SQL.

Regards, Johann

Mateusz Łoskot

unread,

May 21, 2014, 4:07:53 PM5/21/14

to Johann Anhofer, data...@isocpp.org

On 21 May 2014 21:09, Johann Anhofer <johann....@gmail.com> wrote:
>
> As described in N3886 iterators a dangerous thing in combination with result
> sets.
> One may write: std::sort(rs.begin(), rs.end()); or std::unique(rs.begin(),
> rs.end()) which is perfectly legal for normal containers, but result sets
> are not containers, so I don't like iterators for them.

Iterators are not reserved to containers only.

Think of a result set as a view of your data.
A view can have iterators.
This is a very well established concept in C++:

http://www.zib.de/weiser/vtl/
http://www.drdobbs.com/stl-sequences-the-view-concept/184401789

Best regards,
--
Mateusz Łoskot, http://mateusz.loskot.net

Johann Anhofer

unread,

May 21, 2014, 4:21:51 PM5/21/14

to data...@isocpp.org, Johann Anhofer

Am Mittwoch, 21. Mai 2014 22:07:53 UTC+2 schrieb Mateusz Łoskot:

On 21 May 2014 21:09, Johann Anhofer <johann....@gmail.com> wrote:
>
> As described in N3886 iterators a dangerous thing in combination with result
> sets.
> One may write: std::sort(rs.begin(), rs.end()); or std::unique(rs.begin(),
> rs.end()) which is perfectly legal for normal containers, but result sets
> are not containers, so I don't like iterators for them.

Iterators are not reserved to containers only.

For sure, but they work best for in-memory data structures.

Think of a result set as a view of your data.
A view can have iterators.
This is a very well established concept in C++:

Well, but they are not a golden Hammer!

SQL result sets are already a view to data, why squeeze over a different view concept?

Only to std::copy rows out of the result set?

Cleiton Santoia

unread,

May 21, 2014, 4:30:12 PM5/21/14

to data...@isocpp.org

IMHO: it looks rather complicated to use,

Yes, may be complicated, mostly when you involve pointer to members, but it commonly get faster and static typed, this way you don´t even need the "any" type , if you have a map you can "search" for and db low level functions can be optimized to read from net stream directly into the bean members saving some copies ( I think ).

also, would be nice to just ask "fill this vector here"

but if you use a code generator it may be ok.

Not exactly with code generators, but with attribute reflection.

As described in N3886 iterators a dangerous thing in combination with result sets.

Yes, but I stand the effort to overcome this problem, we could use it as a forward iterator ... or even another type "forward not copyable (only movable) iterator"

Regards

Cleiton

Cleiton Santoia

unread,

May 21, 2014, 4:37:38 PM5/21/14

to data...@isocpp.org

Yes, may be complicated, mostly when you involve pointer to members, but it commonly get faster and static typed, this way you don´t even need the "any" type , if you have a map you can "search" for and db low level functions can be optimized to read from net stream directly into the bean members saving some copies ( I think ).

Sorry I forgot a small piece in this line

Yes, may be complicated, mostly when you involve pointer to members, but it commonly get faster and static typed, this way you don´t even need the "any" type , if you have a map you can "search" at first iteration the names of parameters and change them to numbers (or pointers to buffer offset), so you don´t need to search getInt("cod") once per iteration, just at the first and db low level functions can be optimized to read from net stream directly into the bean members saving some copies ( I think ).

That´s it.

Mateusz Łoskot

unread,

May 21, 2014, 4:51:22 PM5/21/14

to Johann Anhofer, data...@isocpp.org

On 21 May 2014 22:21, Johann Anhofer <johann....@gmail.com> wrote:
> Am Mittwoch, 21. Mai 2014 22:07:53 UTC+2 schrieb Mateusz Łoskot:
>> On 21 May 2014 21:09, Johann Anhofer <johann....@gmail.com> wrote:
>> >
>> > As described in N3886 iterators a dangerous thing in combination with
>> > result sets.
>> > One may write: std::sort(rs.begin(), rs.end()); or
>> > std::unique(rs.begin(),
>> > rs.end()) which is perfectly legal for normal containers, but result
>> > sets are not containers, so I don't like iterators for them.
>>
>> Iterators are not reserved to containers only.
>
> For sure, but they work best for in-memory data structures.

They work with any collection of elements.

>> Think of a result set as a view of your data.
>> A view can have iterators.
>> This is a very well established concept in C++:
>
>
> Well, but they are not a golden Hammer!
>
> SQL result sets are already a view to data, why squeeze over a different
> view concept?
> Only to std::copy rows out of the result set?

Iterators and syntax involved give canonical to visit elements of a collection,
regardless their physical organisation in memory.

IMHO, the argument against iterators in N3886 is artificial
and to me it looks like it's based on personal preferences.

std::sort accepts random iterator.
A typical use case of a result set iterator is single pass consumption
of the tuples,
an input iterator is enough, like in iterating over a stream of data.

Johann Anhofer

unread,

May 21, 2014, 5:11:04 PM5/21/14

to data...@isocpp.org

You're right! All a human can say or do, is based on personal preferences!

William La Cholter

unread,

Jun 15, 2014, 9:40:19 PM6/15/14

to data...@isocpp.org

I'll just echo that iterators are important and single pass input/output are appropriate. Iterators are idiomatic for cursors and many database libraries. The tricky part is having an iterator compatible with the input or result type. That's where reflection comes in.

If the iterator has a reflection type that models getting/settings values for a class/type, then the iterator can hold an instance and move it on dereference.

Here's an example of an iterator (from La Carte, a different library) using the Boost Iterators library helper:

namespace detail

{

template <typename Statement, typename Class, class Reflection>

class insert_statement_iterator_proxy
{
public:

    explicit insert_statement_iterator_proxy(Statement & statement):

        statement(&statement)
    {

    }
    Class & operator =(Class & dest)

    {
        assert(statement != nullptr);

        insert<Reflection>(*statement, dest);

        return dest;
    }

private:
    Statement * statement;

};

} // namespace detail



template <typename Statement, typename Class, class Reflection>

class insert_statement_iterator: public boost::iterator_facade<

    insert_statement_iterator<Statement, Class, Reflection>,

    Class,
    std::output_iterator_tag,

    detail::insert_statement_iterator_proxy<Statement, Class, Reflection>>

{
public:
    insert_statement_iterator() : statement(nullptr) {}

    explicit insert_statement_iterator(Statement & statement_ref):

        statement(&statement_ref)
    {

        assert(statement != nullptr);

    }
private:
    friend class boost::iterator_core_access;


    typename std::tuple_element<0, Statement>::type & current_statement() const

    {
        assert(statement != nullptr);

        return std::get<0>(*statement);

    }

    detail::insert_statement_iterator_proxy<Statement, Class, Reflection>

    /*reference*/ dereference() const {

        assert(statement != nullptr);

        return detail::insert_statement_iterator_proxy<Statement, Class, Reflection>(*statement);

    }

    bool equal(insert_statement_iterator const & other) const

    {
        return statement == other.statement;

    }

    // all the action is in the proxy assignment
    void increment() {}


    Statement * statement;

};

Per my reply above, the reflection defines type list tuples, with each element instantiated as a functor for setting a value.

One can also envision caching proxy iterators that can operate over large result sets.

William La Cholter

William La Cholter

unread,

Jun 16, 2014, 9:46:05 PM6/16/14

to data...@isocpp.org

On Tuesday, February 11, 2014 3:04:37 AM UTC-5, Johann Anhofer wrote:

I'm the author of N3886 and I want do know if my design is basically accepted and if I should invest more time to push it any further.
It turns out that in the next four months I've got enough free time, but I want some ideas and feedback from this group, to steer it, into the right direction.

Johann

Cleiton,

I think reflection is a key component to simplify type safety.

A standard database library should look at the best examples from other languages including Java Hibernate, .NET Framework Entity Framework, and Ruby's Active Record, all of which use reflection

I have implemented the outlines of a data mapping library ...

The actual usage would look something like this:


                      std::vector<la_carte_example::system_requirements> rqmts;

                          rqmts.push_back({10,100});

                          sqlite::database
                  database("rqmts.db");

                          auto
                  statement = std::make_tuple(

                              prepare(

                                  database,

                                  insert_into_system_requirements_query()));

                  

                          la_carte::insert<system_requirements_reflection>(statement, rqmts.begin(), rqmts.end());

Note that there's a lot more going on here, but suffice it to say tuples hold type type lists and different tuples also a hierarchy of statements to allow for hierarchical object networks.

I have actual code that I plan on sharing soon as well.

I'll post more specifics on this approach, but just wanted to reply to your suggestion, which I heartily endorse as a way to provide member access.

In regards to the result set, I'm not certain that it's necessary to parameterize it on the bean.Assuming there are different result set types for different databases (at least the underlying implementation must be different to account for wrapping the different APIs), it gets to be more complicated to have a template in each result set. The way I've addressed this is to have overloads for getting from the result set, and when you want to call the setter/mutator for the bean, you know its type and can specify a temporary of that type, which will get picked up by the overload resolution.

Cleiton Santoia

unread,

Jun 21, 2014, 12:25:49 AM6/21/14

to data...@isocpp.org

I think reflection is a key component to simplify type safety.

Indeed, I don't know if you are aware, but we proposed N3951/N4027 and N3984, you can use attributes and reflected types to help make the maps.

A standard database library should look at the best examples from other languages including Java Hibernate, .NET Framework Entity Framework, and Ruby's Active Record, all of which use reflection

IMHO, These are not good examples , we can do better in C++ statically check parameter types and even parse the SQL and many other things that are not allowed in those languages, the only thing they did correctly is to use reflection itself.

I have implemented the outlines of a data mapping library ...

Good, another way to do it, instead of put each getter/setter pair in one class, you make a package with all of them, ok... no problem...

The actual usage would look something like this:
std::vector<la_carte_example::system_requirements> rqmts; rqmts.push_back({10,100}); sqlite::database database("rqmts.db"); auto statement = std::make_tuple( prepare( database, insert_into_system_requirements_query())); la_carte::insert<system_requirements_reflection>(statement, rqmts.begin(), rqmts.end());

Note that there's a lot more going on here, but suffice it to say tuples hold type type lists and different tuples also a hierarchy of statements to allow for hierarchical object networks.

I think this is a prepared insert statement, am I right ? So, you are setting the parameters by position, instead of name, you should provide a way to set them via name too, I don't like this approach, but many people do.

I have actual code that I plan on sharing soon as well.

Looking forward to it :)

I'll post more specifics on this approach, but just wanted to reply to your suggestion, which I heartily endorse as a way to provide member access.

Yeah ! Function access can help a lot on type safety, but object member access can allow the library to skip the need for buffering the result in the resultset memory, you can parse form I/O and set them directly in the bean member, and vice-versa.

In regards to the result set, I'm not certain that it's necessary to parameterize it on the bean.Assuming there are different result set types for different databases (at least the underlying implementation must be different to account for wrapping the different APIs), it gets to be more complicated to have a template in each result set. The way I've addressed this is to have overloads for getting from the result set, and when you want to call the setter/mutator for the bean, you know its type and can specify a temporary of that type, which will get picked up by the overload resolution.

I don't think that attach the mapping to resultset is a good idea too, there are

1 - majority cases for 1 statement x 1 bean type,
2 - many cases for n statements x 1 bean type,
3 - few cases for 1 statement x n beans types ( joins )
4 - fewer cases n statements x n beans types ( more complex joins ).

I think that 1 and 2 are ok to put maps into beans ( even if you use statements in different databases), 3 and 4 no...

And about template resultset, you don't need to have them as template nor a SQLStatement class as a template, only resultset::iterator class must be a template class over the mapping class.

My dream is SQL without resultset: if i could just pick a scalar, a single bean, fill an array or apply a function to all elements, I don't need even the resultset:

class Bean { const std::string& get_description() {...}; void set_description( std::string desc ) {...};public: [[db_attrib("cod", DEFAULT_WHEN_NULL)]] int cod; [[db_attrib("name", DEFAULT_WHEN_NULL)]] std::string name; [[db_attrib("description", DEFAULT_WHEN_NULL), get_description, set_description ]];};class Statement {...public: Statement(std::string sql); template<class _Container, class _Map, typename..._Parms > void fill(_Container c, _Map m, _Parms...parms);// here we can ask for _Container::value_type to instantiate new beans template<class _Algorithm, class _Map, typename..._Parms> void apply(_Algorithm a, _Map m, _Parms parms ); //here we apply an algorithm ( or lambda )...};vector<Bean> v;auto m = BeanMap<Bean>(); // Here I used the default bean-db mapping, but you could create your own detached from the bean if you wish.Statement s = "SELECT cod, name, description FROM db.bean_table WHERE cod BETWEEN ? AND ?";// execute statement 's', pick each result tuple and fill the bean using map 'm' and push each into vector 'v'.s.fill( v, m, 1, 10 ); or
// execute the lambda for all elements of result of statement ss.apply( [this]( const BeanMap<Bean>::tuple_type& t ) { this->do_something(t); }, m, 1, 10 );

But, I think also that a range library could be good also, but involve too many copies:

vector<Bean> v;auto m = BeanMap<Bean>(); // Here I used the default bean-db mapping, but you could create your own detached from the bean if you wish.Statement s = "SELECT cod, name, description FROM db.bean_table WHERE cod BETWEEN ? AND ?";SQLResultsetRange<Bean>range = s.exec( m, 1, 10 ); std::copy(range.begin(), range.end(), back_inserter(v)); // back_inserter copies the bean

In this case, range owns the resultset, and when if comes out of scope, it can delete the underline resultset.

Roland Bock

unread,

Jun 22, 2014, 6:40:02 AM6/22/14

to data...@isocpp.org

Hi,

On 2014-06-21 06:25, Cleiton Santoia wrote:

I think reflection is a key component to simplify type safety.

Indeed, I don't know if you are aware, but we proposed N3951/N4027 and N3984, you can use attributes and reflected types to help make the maps.

Maybe I misread the proposals or overlooked something. But I think these proposals are missing their counterpart which allows to construct types (think analysis and synthesis).

Oversimplified, with reflection you can figure out that

struct X
{
   int i;
   std::string s;
};

contains an int member called "i" and a string called s.

This can be very helpful if you want to store or retrieve objects from a database, no doubt about that.

But, for instance, if you wanted to extract data from joined tables, say X and Y, selecting (int i) from X and (string name) from Y, then it would be really cool to be able to construct a result object which basically looks like this

template<typedef... columns>
struct result
{
internal_result_representation;
columns...; // puts the content of each column into the struct.
};

result r{internal_result}; //
int i = result.i;
std::string name = result.name;

It is possible to achieve this today with multiple inheritance, but I would really like to have a language feature for this and use it to make sqlpp11 much leaner and easier to comprehend.

In particular, the combination of such type analysis and synthesis might work wonders when aiming for something like LINQ in C#.

Best,

Roland

William La Cholter

unread,

Jun 22, 2014, 11:11:34 PM6/22/14

to data...@isocpp.org

On Saturday, June 21, 2014 12:25:49 AM UTC-4, Cleiton Santoia wrote:

I think reflection is a key component to simplify type safety.

Indeed, I don't know if you are aware, but we proposed N3951/N4027 and N3984, you can use attributes and reflected types to help make the maps.

I have looked at this proposal, and it looks like it covers member functions, which is crucial because many people write object oriented code that hides data members.

I'm not suggesting that result set is tied to the type.

There should be (at least) four basic CRUD operations, which can be defined as free function templates, parameterized on data source/sink (e.g., SQL database), user domain type (i.e., bean), and a mapping. With type deduction, it is possible to eliminate two explicit template parameters if the two are passed by reference, such as:

select<mapping>(sql_statement, data_obj)

Mapping is orthogonal to the statically defined data source ("statement") and to the bean type (user domain type). The data source may expose a variety of type-specific functions (e.g., sqlite3_bind_int(), sqlite3_column_int(), and sqlite3_column_text16()) some of which may not map to standard types (e.g., sqlite3_column_text16) and there may also be gaps (e.g., unsigned long long). Necessarily at least some aspect of the result set's type is based on the data source's type collection.

My dream is SQL without resultset: if i could just pick a scalar, a single bean, fill an array or apply a function to all elements, I don't need even the resultset:

class Bean { const std::string& get_description() {...}; void set_description( std::string desc ) {...};public: [[db_attrib("cod", DEFAULT_WHEN_NULL)]] int cod; [[db_attrib("name", DEFAULT_WHEN_NULL)]] std::string name; [[db_attrib("description", DEFAULT_WHEN_NULL), get_description, set_description ]];};

Annotations are problematic if the text names are specific column names because a change in the column names could mean recompilation. That said there is benefit if reflection per the proposal could allow the database mapping to construct DDL (e.g., SQL "create table").

William La Cholter

unread,

Jun 22, 2014, 11:29:50 PM6/22/14

to data...@isocpp.org

On Tuesday, February 11, 2014 3:04:37 AM UTC-5, Johann Anhofer wrote:

I'm the author of N3886 and I want do know if my design is basically accepted and if I should invest more time to push it any further.
It turns out that in the next four months I've got enough free time, but I want some ideas and feedback from this group, to steer it, into the right direction.

Johann

General comments:

I'm glad to see that you've put together a proposal. We need a standard in C++. I've attached a text file with comments (original text indented with ">>"). I'm happy to post the entire file, but it is 53K.

It certainly expands the scope greatly to go beyond SQL databases, but a Standard C++ library proposal without "SQL" in the title should not represent a bias towards SQL relational databases.

The standard would benefit from a data(base) abstraction layer that includes other data repositories with or without query languages:
- Pure XML
- XML with XPath for querying
- NoSQL databases
- Arbitrary key/value stores (could even just be something like std::map<string, string>)
- Parsed data (e.g., Boost Spirit Qi)

Some provisions for a query language are critical but should be almost orthogonal to the data mapping.

I understand that it's a lot to swallow and a proposal could start with part of the solution while allowing for future expansion

Some requirements I envision for a Standard database library are below.

Goals:
- Be as easy to use as The Java Hibernate library, Microsoft Entity Framework (EF), or Ruby's Active Record library (I understand that these are not ideal C++ models but their ease of use can't be ignored)
- Minimize user code changes to have persistence, i.e., keep data model out of domain model.
- Impose no unnecessary costs.
- Maintain the basic guarantee with respect to exceptions
- Data store model-neutral, but work with lots of models

Definitions:
- Repository model - a collection of classes, templates, functions, and types that encapsulate the operations for a specific repository brand or type
   Examples include: SQLite, PostgreSQL, CSV, XML, JSON, ODBC, OleDB, std::map<string, boost::any>>, std::tuple, CouchDB, MySQL
   (Note that it's important that the Standard not require any specific database models that incure licensing problems)
- User data model - a collection classes, functions, and types that define a library user's data model, commonly referred to as a business model, beans
- Reflection meta-model - a collection of classes, types, and methods that enable the mapping of a user data model to an arbitrary reporitory model
- CRUD operations - create, read, update, delete, i.e., the basic operations for data manipulation
- DDL - data definition language, the mechanism to define/create the storage representation in the repository model
- Tuple - not to be confused with std::tuple, is a sequence of data typically representing one-row in a relational database

Requirements
- The library shall not require that a concrete repository model have a vtable, in accordance with the principle of pay for what you use.
- However, understanding that many applications must support multiple databases, there should be support:
   - A connection factory
   - A vtable-based repository model that can wrap an arbitrary repository model
- The library shall enable the user definition of a reflection through the reflection model.
- The library shall provide iterators, ranges, and container like template classes to support integration with standard algorithms
   - I understand this is contrary to the text below. Even though one may not be able to make guarantees about execution time in the same sense, one cannot do that with the IOStream-based iterators, either, when the underlying stream does not make guarantees. However, there may be cases where a database can make certain guarantees. This suggest that there could be iterators with explicit guarantees when they are statically associated with operations that have guaranteed runtime. For example, a b-tree/RB-tree based repository collection provides O(log n) access, and a hash table based store repository collection provides O(1). Now the bindings to get the repository back end are another matter--but one can imagine a SQL or NoSQL database and query builder
- The library shall include an escape hatch to allow arbitrary underling implementation-specific operations (this is like the std::thread::native_handle)
   - The escape hatch shall cover queries and data bindings
- The library shall include a reflection meta-model (if no such Standard library exists) to enable mapping arbitrary user data models to operations on the repository model
   - The operations shall include:
     - defining a uniform compile-time representation of user model accessors ("getters")
     - defining a uniform compile-time representation of user model mutators ("setters")
     - defining a unfirom compile-time representation of user model data members
     - defining user model element annotation for implementation-specific metadata, such as primary key definition, read-only, auto-calculated check constraints, etc.
     - defining compile-time association of column names
     - supporting nullable values (e.g., proposal's null type, nullptr_t, experimental::optional, boost::optional)
     - navigation over a class hierarchy/class graph to represent parent/child, master/detail, foreign-key relationships, etc.
   - The meta-model shall be compile-time type graph (this sounds like implementation, but a user data model graph is unlikely to map cleanly to any other reflection model)
   - The meta-model shall impose no runtime costs over direct access to any candidate libraries
   - The meta-model shall be orthogonal to the repository model to enable arbitrary coupling of user model and repositories
- A repository model shall include the following capabilities:
   - Data source access (e.g., "connection")
     - To provide container access
   - Collection access (e.g., "table," "statement," or "query")
   - Inputs (e.g., "parameters") - to provide inputs for collections
   - Results - to get outputs from collections
   - Tuple (e.g., "row," "tuple") - to manipulate individual, like aggregates from a collection
   - Element access with a tuple (not std::tuple) which may be the following, depending on the repository model and interface chosen within
     - Sequential access
     - Index access (e.g., sqlite3_column_int(sqlite3_stmt *, int index))
     - Key access (e.g., XML tag name as text string)
     - Tag dispatch access
   - Support for some core Standard C++ data types: basic integral values and text
   - Each repository model shall be in its own namespace
   - core non-member functions for access:
     - prepare (or some more generic term?)
     - execute(database, query) - execute an arbitrary query
     - set() - set individual values in data store target
     - get() - retrieve individual values from data store target
     - size() - number of columns in result tuple
     - advance() - advance a row
- A repository model may include the following:
   - Support for all Standard C++ data types (that are otherwise a bad fit)
   - Support for arbitrary Standard C++ types
   - Support for binary objects ("BLOBS")
     - std::vector<char>
     - a new BLOB type
   - Transaction wrapper
     - optional nested transactions
     - RAII
     - begin/end functions to allow close/reopen
- A repository model-neutral mapping interface that performs CRUD, query building, and DDL operations over the repository model and user data model shall include
   - insert - enable updates of database auto-generated fields (e.g., primary key or timestamp)
   - select/select one - also include arbitrary input parameter binding (e.g., SELECT id, name, city, country FROM customer WHERE country == $1)
   - update
   - delete (not using the reserved word "delete")
   - All operations shall have iterator and single item interfaces
   - insert/update/delete may also include arbitrary
- A query builder interface shall
   - Use the repository model and user data model to construct queries
- A query builder interface may be part of the repository model
- It is preferable that queries may be built at compile-time, but limitations for text processing in constexpr and template meta-programming may make that hard
- The query builder shall also support run-time definitions (likely through a different set of classes having the same interfaces)

N3886-feedback.txt

Johann Anhofer

unread,

Jun 25, 2014, 4:35:02 AM6/25/14

to data...@isocpp.org

Thank you for the exhaustive Review!

My intend for the proposal was always to provide a simple to use abstraction layer over commonly used C-Libraries for SQL based relational Databases. I was looking for a better title, but it was already too long and so I decided not to append "SQL based relational" as well.

Your thoughts/requirements are for sure very interesting but, IMHO, such a huge library is impossible to standardize.

Only helpful components from which it's build could be pass standardization (it would be certainly too much work).

So, my focus was on the part about SQL based relational databases.

I will work through your review if I have more time left.

Thanks, Johann

Csaba Csoma

unread,

Jun 25, 2014, 2:24:25 PM6/25/14

to data...@isocpp.org

Johann and William,

I think standardizing just the SQL interface is already a big task.

A similar effort can be done for the other types of persistence (document store like Amazon S3, key-value pair etc).

After that it might be possible to come up with a high level, reduced functionality interface that handles all data sources.

It would be easy to get started for new apps and still drop down to specific interface later on, when the app grows big enough.

Csaba

Mateusz Łoskot

unread,

Jun 25, 2014, 4:57:24 PM6/25/14

to Csaba Csoma, data...@isocpp.org

On 25 June 2014 20:24, Csaba Csoma <csaba...@gmail.com> wrote:
> Johann and William,
>
> I think standardizing just the SQL interface is already a big task.
>
> A similar effort can be done for the other types of persistence (document
> store like Amazon S3, key-value pair etc).

Ad-hoc thought, as it's easy to get carried away...
So, either we speak of an SQL-based data access abstraction or extend
the initial
proposal to a data access in generic data store.

For the former, I think Roland Bock's propsal as in sqlpp11 is a good
place to start.
For the latter, we should probably move back and restart discussion
from the point
of generic C++11/14 API for querying (selecting, filtering,
searching), adding, updating and
deleting data records/items/objects where implementations encapsulate
details like
SQL standarisation and vendor-specific features support.

Cleiton Santoia

unread,

Jun 25, 2014, 5:56:40 PM6/25/14

to data...@isocpp.org

I'm not suggesting that result set is tied to the type.

There should be (at least) four basic CRUD operations, which can be defined as free function templates, parameterized on data source/sink (e.g., SQL database), user domain type (i.e., bean), and a mapping. With type deduction, it is possible to eliminate two explicit template parameters if the two are passed by reference, such as:

select<mapping>(sql_statement, data_obj)

Mapping is orthogonal to the statically defined data source ("statement") and to the bean type (user domain type). The data source may expose a variety of type-specific functions (e.g., sqlite3_bind_int(), sqlite3_column_int(), and sqlite3_column_text16()) some of which may not map to standard types (e.g., sqlite3_column_text16) and there may also be gaps (e.g., unsigned long long). Necessarily at least some aspect of the result set's type is based on the data source's type collection.

In this case, you want to use the same sql_statement to different types of data_obj ?

select<mapping>(sql_statement, data_obj);
select<mapping>(sql_statement, data_obj2);

And sql_statement need to be static ? what if I have a dynamic SQL ?

Annotations are problematic if the text names are specific column names because a change in the column names could mean recompilation.

Do you want to generate beans based on the table definition when compiling ?

William La Cholter

unread,

Jun 29, 2014, 2:04:35 PM6/29/14

to data...@isocpp.org

You can take a look at the proof-of-concept library: https://bitbucket.org/la_carte/la_carte/src/

This code is very early, and there's lots of clean-up and harmonizing before considering it for any publicly maintained library. It depends on SQLite, PostgreSQL, and Boost (Function Traits, Iterators, and MPL). It makes moderate use of C++11 features including variadic templates.

So far, I've tested it with Visual C++ 2013, g++ 4.6 on Ubuntu 12 LTS, and g++ 4.8 PPA on Ubuntu 12 LTS.

The general features include:

per-vendor namespaces with type collections:

database - moveable, not copyable
statement - moveable, not copyable
free functions:

database::statement prepare(database & d, query_type q)
execute(database & d, query_type q) - simple query execution to support empty result set operations, such as DDL and transactions (though should be expanded to support full results)
reset(statement & s) - reset a query for re-invocation - basis for Iterator design pattern
advance(statement & s) - execute a query/advance through results - basis for Iterator design pattern
template <typename T> get<T> (statement & s, index_type i) - retrieve a value from a result set
template <typename T> set(statement & s, index_type i, value_type_and_modifier v) - bind a value for input to a query - note this should eventually support && and optimize for small values

SQLite support for fundamental types
PostgreSQL support for fundamental types

Mapping and reflection support

reflection functors for accessing class data:

getters and setters:

non-type members function pointers
free functions
nested class/detail access including simple composition and iterator-based association

CRUD free functions:

use hierarchical mapping classes that define reflection for getting and setting
recursive and nested operations:

each template function uses recursive meta-function calls to enable complete eliding of function calls to have an identical call graph of hand-coded versions

insert<Mapping, Statement, Target>

results extraction for auto-generated fields, such as primary keys and timestamps
iterator-based

select<Mapping, Statement, Target>

output iterator
select_one - bind key value and retrieve rest of object
variadic template parameter list of input values for binding (e.g., select<Mapping>(stmt, target, in_1, in_2, ...)

update<Mapping, Statement, Target>

iterator-based

delete_from<Mapping, Statement, Target>

iterator-based
may clear primary key field
cascading delete

Various helpers

iterator wrappers, including wrapping of select as output iterator and insert as input iterator
common types in database-oriented code, such as keys

Example code

some basic example code to test all CRUD and nesting types
some example as I worked through the syntax

A somewhat elaborate example of nested master/detail table use is in https://bitbucket.org/la_carte/la_carte/src/15a997c49df1e4171c8a76644ad78d07a9e5c2da/test/basic_crud/basic_crud.cpp

On Wednesday, June 25, 2014 5:56:40 PM UTC-4, Cleiton Santoia wrote:

I'm not suggesting that result set is tied to the type.

There should be (at least) four basic CRUD operations, which can be defined as free function templates, parameterized on data source/sink (e.g., SQL database), user domain type (i.e., bean), and a mapping. With type deduction, it is possible to eliminate two explicit template parameters if the two are passed by reference, such as:

select<mapping>(sql_statement, data_obj)

Mapping is orthogonal to the statically defined data source ("statement") and to the bean type (user domain type). The data source may expose a variety of type-specific functions (e.g., sqlite3_bind_int(), sqlite3_column_int(), and sqlite3_column_text16()) some of which may not map to standard types (e.g., sqlite3_column_text16) and there may also be gaps (e.g., unsigned long long). Necessarily at least some aspect of the result set's type is based on the data source's type collection.

In this case, you want to use the same sql_statement to different types of data_obj ?

select<mapping>(sql_statement, data_obj); select<mapping>(sql_statement, data_obj2);

Each sql_statement instance ties to a specific query. Each "statement" class is defined in its own namespace (e.g., la_carte::postgresql). The statement is a resource holding object. It is moveable and not copyable, owing to it encapsulating an expensive quantity: a database query/statement/etc.

In the case of SQLite, it's holds a sqlite3_stmt. In the case of PostgreSQL, it holds a lot more due to the PostgreSQL text-based interface: a cache of input values (vectors of vector<char>), results (PQresult handle), and a prepared statement name.

Within a per-database namespace (type collection), there must be one or more free get() and set() functions that retrieve and bind values, respectively.

The database class encapsulates a database connection, and each "vendor-specific" namespace must define a database class that takes a single connection string constructor. The database class is moveable, and not copyable, owing to it encapsulating an expensive quantity: a database connection. The database class is a statement factory. Commonly, statements are created from the prepare(query_text) free function, which returns a statement.

And sql_statement need to be static ? what if I have a dynamic SQL ?

The statement is not static, per above. Each instance represents a query. In practice, statements are part of a hierarchical tuple, where a class hierarchy (e.g., foo has many bar) must be mapped by multiple statements (e.g., foo's statement and bar's statement)

Annotations are problematic if the text names are specific column names because a change in the column names could mean recompilation.

Do you want to generate beans based on the table definition when compiling ?

I'm not quite sure what you mean by generating beans. I envision a few things given time to work it out:

creating DDL from class reflections to create databases from code (on a per-vendor basis of course) (going the other way) - and of course CRUD queries from class reflections
support for std::tuples in addition to user-defined classes (this is rather easy compared to the messiness of all the ways one must get/set data in a class)
maybe: some generic variant-type classes, like boost::any class hierarchies, like the old TinyJSON C++ library (https://code.google.com/p/couchdbpp/issues/attachmentText?id=3&aid=30001000&name=tinyjson.hpp&token=mLmHbuQEtLSTUcLjaF0yJwfi8oU%3A1361181136441 ?)

William La Cholter

unread,

Jun 29, 2014, 2:15:59 PM6/29/14

to data...@isocpp.org

On Wednesday, June 25, 2014 2:24:25 PM UTC-4, Csaba Csoma wrote:

Johann and William,

I think standardizing just the SQL interface is already a big task.

Agreed. The problem with many attempts at standardizing is that "perfect is the enemy of the good." And the good that is out there isn't nearly good enough compared to the excellent examples of STL and many of the Boost Libraries. The database libraries tend to be highly locked into vendor-specifics, add unnecessary runtime overhead, and just not as general as what we've become accustomed to. This isn't just databases--there still isn't a standard socket library.

A similar effort can be done for the other types of persistence (document store like Amazon S3, key-value pair etc).
After that it might be possible to come up with a high level, reduced functionality interface that handles all data sources.

Those are probably an easier place to start because they tend to be NoSQL-ish and you can define a lot of operations without worrying about the messiness of query dialects.

It would be easy to get started for new apps and still drop down to specific interface later on, when the app grows big enough.

I think a reasonable proof-of-concept Boost quality library including SQL, that is dialect-free and sufficient for people to use it in production code would take would take 3 - 6 person years to develop. I've written libraries in 4 months labor that were good enough for limited production use and compelling enough over alternative libraries (e.g., SQLite C API, PostgreSQL, and SOCI). But it's a lot of work, and I'm not sure who is willing to pay for it. So, we're left with this as a hobby/part-time effort in most cases.

Roland Bock

unread,

Jun 30, 2014, 1:08:37 AM6/30/14

to data...@isocpp.org

On 2014-06-29 20:15, William La Cholter wrote:

On Wednesday, June 25, 2014 2:24:25 PM UTC-4, Csaba Csoma wrote:

Johann and William,

I think standardizing just the SQL interface is already a big task.

Agreed. The problem with many attempts at standardizing is that "perfect is the enemy of the good." And the good that is out there isn't nearly good enough compared to the excellent examples of STL and many of the Boost Libraries. The database libraries tend to be highly locked into vendor-specifics, add unnecessary runtime overhead, and just not as general as what we've become accustomed to. This isn't just databases--there still isn't a standard socket library.

Right. Any attempt should be as vendor neutral as possible. In particular this means that it should have no connection to any specific database. Those connections need to be built based on the core library.

A similar effort can be done for the other types of persistence (document store like Amazon S3, key-value pair etc).

After that it might be possible to come up with a high level, reduced functionality interface that handles all data sources.

Those are probably an easier place to start because they tend to be NoSQL-ish and you can define a lot of operations without worrying about the messiness of query dialects.

It would be easy to get started for new apps and still drop down to specific interface later on, when the app grows big enough.

I think a reasonable proof-of-concept Boost quality library including SQL, that is dialect-free and sufficient for people to use it in production code would take would take 3 - 6 person years to develop. I've written libraries in 4 months labor that were good enough for limited production use and compelling enough over alternative libraries (e.g., SQLite C API, PostgreSQL, and SOCI). But it's a lot of work, and I'm not sure who is willing to pay for it. So, we're left with this as a hobby/part-time effort in most cases.

Similar with sqlpp11. It is a hobby/part-time effort, but a game changer already. A slightly older version is already being used in production.
I think that library has received about a person year's effort (not counting pre-C++11 attempts). Judging from feedback on the boost mailing list, talking to boosters and my own requirements for the library, of course, I think it will take about another 0.5 person years to make it ready to be proposed for boost and/or the standard.

I will try to go that way, but it would certainly help to have support, especially with connector libraries showing that sqlpp11 can be used with all kinds of databases.

Regards,

Roland

Reply all

Reply to author

Forward