infi.clickhouse_orm - a simple Python ORM for working with ClickHouse

Itai Shirav

Jun 28, 2016, 9:00:16 AM
to ClickHouse
Hi,

I've developed a simple ORM (Object Relational Mapping) library in Python for working with ClickHouse: https://github.com/Infinidat/infi.clickhouse_orm/

It lets you do things such as this:

from infi.clickhouse_orm import models, fields, engines
from infi.clickhouse_orm.database import Database

# Define a simple model class
class Person(models.Model):

    first_name = fields.StringField()
    last_name = fields.StringField()
    birthday = fields.DateField()
    height = fields.Float32Field()

    engine = engines.MergeTree('birthday', ('first_name', 'last_name', 'birthday'))

# Create a table for this model
db = Database('my_test_db')
db.create_table(Person)

# Insert some data
db.insert([
    Person(first_name='David', last_name='Miles', birthday='1975-05-05', height=1.77),
    Person(first_name='Mona', last_name='Miles', birthday='1976-09-21', height=1.74)
])

# Read data
for person in db.select("SELECT * FROM my_test_db.person", model_class=Person):
    print person.first_name, person.last_name


See the GitHub page for full documentation.

Ideas, bug reports and pull requests welcome...

-- Itai


man...@gmail.com

Jun 29, 2016, 2:29:18 PM
to ClickHouse
Nice! Your work is much appreciated.
Added a star on GitHub.

Evgeni Makarov

Aug 28, 2016, 11:51:03 AM
to ClickHouse
Hello

Any plans to add support for all ClickHouse field types?

On Tuesday, June 28, 2016 at 16:00:16 UTC+3, Itai Shirav wrote:

Itai Shirav

Aug 29, 2016, 3:58:45 AM
to ClickHouse
Hi Evgeni,

I think it should be pretty easy to add support for FixedString (probably not very useful) and Enum.
Supporting Array is also doable.
Nested fields probably won't be supported.

Which of these types do you see as the most important to support?

Thanks

Itai Shirav

Sep 13, 2016, 12:41:13 AM
to ClickHouse
Hi everyone,

I'm happy to inform you that infi.clickhouse_orm v0.7.0 now supports array fields as well as enum fields.
The documentation includes source code examples that show how to use these fields - https://github.com/Infinidat/infi.clickhouse_orm/
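
For readers unfamiliar with these column types, here is a rough, library-independent sketch of what enum and array fields correspond to on the ClickHouse side. It is not taken from infi.clickhouse_orm; the helper names (enum8_type, array_type, array_literal) and the Gender enum are made up for illustration, and the code is written for Python 3.

```python
from enum import Enum


class Gender(Enum):
    # Hypothetical enum used only for this illustration
    male = 1
    female = 2


def enum8_type(enum_cls):
    """Build a ClickHouse Enum8 column type from a Python Enum class."""
    for member in enum_cls:
        # Enum8 stores each value in a signed 8-bit integer
        if not -128 <= member.value <= 127:
            raise ValueError("value out of Enum8 range: %r" % (member,))
    members = ", ".join("'%s' = %d" % (m.name, m.value) for m in enum_cls)
    return "Enum8(%s)" % members


def array_type(inner_type):
    """Wrap a ClickHouse type name in Array(...)."""
    return "Array(%s)" % inner_type


def quote_string(value):
    """Quote a string for use in SQL, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def array_literal(values):
    """Render a list of strings as a ClickHouse array literal."""
    return "[" + ",".join(quote_string(v) for v in values) + "]"
```

With these helpers, enum8_type(Gender) gives the column type for an enum field and array_type("String") the type for an array of strings. For the ORM's actual field classes, see the documentation linked above.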

- Itai

Ivan Ladelschikov

Sep 14, 2016, 3:14:07 PM
to ClickHouse
Hello! Does anyone have any ideas about how to improve the insert speed of the ORM? I've prepared a simple script to benchmark the insertion using:
a) clickhouse-client from file
b) plain ORM insert after model creation
c) requests POST from file
d) create model with ORM and then insert with requests
e) create model with ORM and then insert with requests using Transfer-Encoding: chunked (stream upload), which is what the ORM currently does
The results vary widely:

$ python2 ch-insert-bench.py gen 30000
Done: 30000 lines
$ python2 ch-insert-bench.py ins-orm
Done: 6.59307599068 sec.
$ python2 ch-insert-bench.py ins-client
Done: 0.175565958023 sec.
$ python2 ch-insert-bench.py ins-req
Done: 0.089348077774 sec.
$ python2 ch-insert-bench.py ins-orm-req
Data prepared in 3.65072703362 sec.
Done: 0.181367874146 sec.
$ python2 ch-insert-bench.py ins-orm-req-chunked
Done: 6.20508003235 sec.
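
To make the difference between variants (d) and (e) concrete, here is a minimal sketch, written for Python 3, of preparing TabSeparated data in large batches instead of streaming it row by row. It is not the benchmark script or the ORM's code. The function names, the batch size and the URL (localhost on ClickHouse's default HTTP port 8123) are assumptions. The requests library sends a bytes body with a Content-Length header and a generator body with Transfer-Encoding: chunked, so a generator that yields one small chunk per row causes many small writes. Whether that explains the whole gap measured above is not established in this thread.

```python
def escape_tsv(value):
    """Escape backslash, tab and newline for the TabSeparated format."""
    return value.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def to_tsv_line(row):
    """Serialize one row (a sequence of values) as a TabSeparated line."""
    return "\t".join(escape_tsv(str(v)) for v in row) + "\n"


def batched_bodies(rows, batch_size=10000):
    """Yield one UTF-8 encoded request body per batch of rows."""
    buf = []
    for row in rows:
        buf.append(to_tsv_line(row))
        if len(buf) >= batch_size:
            yield "".join(buf).encode("utf-8")
            buf = []
    if buf:
        # Flush the final, possibly smaller, batch
        yield "".join(buf).encode("utf-8")


def post_batches(rows, table, url="http://localhost:8123/", batch_size=10000):
    """Send each batch as a separate, non-chunked POST (URL is an assumption)."""
    import requests  # imported here so the helpers above work without it

    query = "INSERT INTO %s FORMAT TabSeparated" % table
    for body in batched_bodies(rows, batch_size):
        # A bytes body is sent with Content-Length, not chunked encoding
        response = requests.post(url, params={"query": query}, data=body)
        response.raise_for_status()
```

For example, post_batches(rows, "my_test_db.person") would insert rows that are already in column order, assuming a server is listening at the default URL.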



Itai Shirav

Sep 15, 2016, 12:43:56 AM
to ClickHouse
Hi,

Thanks for taking the time to do this benchmark!
It's not surprising that there's an overhead associated with the ORM, but let's see if this can be improved... I'll run your benchmark with a profiler and try to find possible optimizations.

man...@gmail.com

Sep 15, 2016, 9:17:12 PM
to ClickHouse

Itai Shirav

Sep 29, 2016, 4:44:05 AM
to ClickHouse
Hi everyone,

Version 0.7.1 of infi.clickhouse_orm was just released.

It includes several performance improvements that make the benchmark run 5x faster.

Additionally it fixes a couple of bugs:
* parsing of numeric arrays
* handling "0000-00-00 00:00:00" as a valid datetime value (thanks tsionyx!)
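
As a rough illustration of the two parsing cases above (not the library's actual code), the sketch below reads a numeric array and a datetime from their text form. Python's datetime cannot represent year 0, so the all-zero value needs special handling. Mapping it to the Unix epoch and treating all timestamps as UTC are assumptions made for this example, as are the function names. The code is written for Python 3.

```python
from datetime import datetime, timezone

ZERO_DATETIME = "0000-00-00 00:00:00"
# Assumption: represent the all-zero value as the Unix epoch in UTC
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_datetime(text):
    """Parse 'YYYY-MM-DD HH:MM:SS', accepting the all-zero value."""
    if text == ZERO_DATETIME:
        return EPOCH
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def parse_number(token):
    """Parse a token as int when possible, otherwise as float."""
    token = token.strip()
    try:
        return int(token)
    except ValueError:
        return float(token)


def parse_numeric_array(text):
    """Parse a numeric array such as '[1,2,3]' into a list of numbers."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("not an array: %r" % text)
    inner = text[1:-1].strip()
    if not inner:
        return []
    return [parse_number(token) for token in inner.split(",")]
```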

-- Itai

Alexey Polyakov

Feb 24, 2017, 7:32:29 AM
to ClickHouse
Hello.


> Nested fields probably won't be supported

Why not? 