Hello everybody.
This is the first message I send to the group, so I'll introduce
myself. My name is Olemis Lang. I have 10+ years of experience in
the software development / engineering business, most of the time
doing something related to Python development. I participated in
PyCon 2014: the tutorial sessions for SQLAlchemy, the conference
and the sprints.
For further details please see my LinkedIn profile [1]_.
To the point now ...
My team has to start working on two web/mobile projects, and in the
process we are required to develop at least one standalone platform
(i.e. it is not possible to use Django or any other web framework).
On the server side, one of the requirements is to support (1)
relational databases as well as (2) NoSQL DBs, initially with an
emphasis on Apache HBase, Apache Cassandra and MongoDB.
I'd really (<= yes, I mean it ;) like to use SQLAlchemy, but it's been
hard for me to find support for NoSQL. I guess this is to be expected,
considering that the framework seems to be, more than anything else,
a SQL toolkit.
As such I was about to turn the SQLAlchemy page and continue looking
for alternatives. Nevertheless, during my research I've come across
other solutions which make me wonder whether there's still any hope.
The first one is Ming, inspired by SQLAlchemy [3]_ and providing
convenience methods on query results [2]_. If the two were "compatible
enough" at the API level, so that both APIs satisfy the same or a
similar contract, then it might be possible to use them side by side
while maintaining a single code base. Is this a feasible approach?
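
To make the "same contract" idea concrete, here is a minimal sketch of
what I have in mind. It uses neither SQLAlchemy nor Ming: all the names
(UserStore, SqlUserStore, DocumentUserStore, register_and_lookup) are
hypothetical, sqlite3 merely stands in for an RDBMS and a plain dict
stands in for a document store. It only illustrates application code
written once against a shared interface.

```python
import sqlite3
from typing import Any, Dict, List, Optional, Protocol


class UserStore(Protocol):
    """Hypothetical contract that every backend adapter must satisfy."""

    def add(self, user_id: str, name: str) -> None: ...
    def get(self, user_id: str) -> Optional[Dict[str, Any]]: ...
    def find_by_name(self, name: str) -> List[Dict[str, Any]]: ...


class SqlUserStore:
    """Relational adapter; sqlite3 is an assumption standing in for an RDBMS."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT)"
        )

    def add(self, user_id: str, name: str) -> None:
        # INSERT OR REPLACE gives the same overwrite semantics as the dict below.
        self._conn.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        self._conn.commit()

    def get(self, user_id: str) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else None

    def find_by_name(self, name: str) -> List[Dict[str, Any]]:
        rows = self._conn.execute(
            "SELECT id, name FROM users WHERE name = ? ORDER BY id", (name,)
        ).fetchall()
        return [{"id": r[0], "name": r[1]} for r in rows]


class DocumentUserStore:
    """Document-style adapter; the dict is an assumption standing in for a
    NoSQL collection keyed by document id."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def add(self, user_id: str, name: str) -> None:
        self._docs[user_id] = {"id": user_id, "name": name}

    def get(self, user_id: str) -> Optional[Dict[str, Any]]:
        doc = self._docs.get(user_id)
        return dict(doc) if doc is not None else None

    def find_by_name(self, name: str) -> List[Dict[str, Any]]:
        # Sort by id so both adapters return results in the same order.
        return [dict(d) for _, d in sorted(self._docs.items()) if d["name"] == name]


def register_and_lookup(store: UserStore) -> List[Dict[str, Any]]:
    """Application code written once, against the contract only."""
    store.add("u1", "Ada")
    store.add("u2", "Grace")
    store.add("u3", "Ada")
    return store.find_by_name("Ada")
```

Calling register_and_lookup(SqlUserStore(sqlite3.connect(":memory:")))
and register_and_lookup(DocumentUserStore()) exercises both backends
through the same function. Whether SQLAlchemy and Ming are close enough
to avoid writing such adapters by hand is exactly what I am asking.
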
The second one is Apache Trafodion [5]_, which claims to be a fully
functional SQL-on-Hadoop solution providing "comprehensive ANSI SQL
language support including full-functioned data definition (DDL), data
manipulation (DML), transaction control (TCL) and database utility
support" [4]_. Is there any integration with SQLAlchemy?
There are a number of other SQL-on-Hadoop tools out there. Below I list
the ones I've found that might be serious candidates for me to
consider:
- Apache Hive - https://hive.apache.org/ => HBase
- Stinger - http://hortonworks.com/labs/stinger/ => HBase
- Apache Drill - https://drill.apache.org/ => HBase, MongoDB ... Cassandra [7]_ ?
- Spark SQL - https://spark.apache.org/sql/ => supported by PySpark
- Apache Phoenix - http://phoenix.apache.org/ => HBase
- Presto - http://prestodb.io/ => HBase/Hive, Cassandra
I'd appreciate it if you could provide me with any suggestions or hints.
In the end I'd like to know whether SQLAlchemy (or another library with
an "almost-compatible" API) can be considered as a DB abstraction layer
supporting traditional RDBMS as well as the NoSQL DBs mentioned above
(i.e. Apache HBase, Apache Cassandra, MongoDB). Thanks in advance
for your time.
.. [1] https://ca.linkedin.com/pub/olemis-lang/3b/696/b59
.. [2] http://merciless.sourceforge.net/tour.html#querying-the-database
.. [3] http://blog.mongodb.org/post/27907941873/using-the-python-toolkit-ming-to-accelerate-your
.. [4] https://wiki.apache.org/incubator/TrafodionProposal#preview
.. [5] http://trafodion.incubator.apache.org/
.. [6] https://drill.apache.org/
.. [7] http://www.confusedcoders.com/bigdata/apache-drill/sql-on-cassandra-querying-cassandra-via-apache-drill
--
Regards,
Olemis - @olemislc
Apache™ Bloodhound contributor
http://issues.apache.org/bloodhound
http://blood-hound.net
Brython committer
http://brython.info
Blog ES:
http://simelo-es.blogspot.com/
Blog EN:
http://simelo-en.blogspot.com/