A taste of OrientDB 2.1

Luigi Dell'Aquila

unread,

Feb 16, 2015, 4:54:10 AM2/16/15

to orient-...@googlegroups.com

Hi,

OrientDB 2.0 was released just a few week ago, but we are already working hard to the next release.

You know that two of the oldest pieces of code inside OrientDB core are the SQL parser and the query executor, and you know that both have some flaws.

Some of you struggled to make queries work with additional blank spaces(see https://github.com/orientechnologies/orientdb/issues/3559 and https://github.com/orientechnologies/orientdb/issues/3519 )

or to pass named/unnamed parameters to functions and inner queries (see https://github.com/orientechnologies/orientdb/issues/1069).

After releasing 2.0 we started a new development path to have:

- strict query language definition

- strict query validation with better syntax error messages

- full support to prepared statements and parameters (also where now it fails)

- new command/query executor structure (stateless, less memory consuming)

- better query execution plans and index management

- some additional operators (suspense here ;-) you have to wait for 3.0)

This path will end with 3.0, but in 2.1 we are releasing the first step, that is the new query parser.

The new parser is based on well known JavaCC. It processes all the SQL statements and produces a tree structure that will be the future query executor. So the final query execution structure will be the following:

1. Get the SQL statement

2. look for a cached executor

3. if it exists goto 7

4. parse the statement

5. optimize the statement based on schema and indexes

6. put it in the statement cache

7. run the (new) executor with input parameters

8. send the (streamed) result to the client

At this step, the SQL parser is ready, but the query executor is too complex to be refactored in time for 2.1, so we decided to go with a hybrid solution, to let you have a taste of the new features.

Here's how the query execution changes in 2.1:

1. Get the SQL statement

2. pre-parse the statement with the new SQL parser <- this gives you strict SQL syntax and much better syntax errors

3. replace named/unnamed parameters <- this solves problems in parameter passing to subqueries, functions etc.

4. rewrite the query based on the parsed structure <- this solves problems with white spaces and so on... I know it's dirty stuff, but the old executor is too coupled with the old parser to be completely split in such a short time

5. use the old parser to parse the rewritten query and generate the old executor <- a very small fraction of old parsing problems will remain for now, sorry...

6. run the (old) executor

** AN EYE TO THE PAST **

Advantages in the short term are obvious, but you can also experience small problems in the migration, eg.

- old parser lets you write things like

SELECT FROM FOO ORDER BY A ASC ORDER BY A DESC GROUP BY A ORDER BY A DESC

or even

SELECT FROM FOO ORDER BY A ASC, , , , A ASC

with the new parser you will not be allowed to do this, now every clause has its own position in the statement (see docs)

- old parser supports some old (deprecated) operators like traverse() function (different from TRAVERSE statement)

eg. SELECT FROM Profile WHERE any() traverse(0,3) (city = 'Rome')

This syntax will not be supported anymore, so with the new parser you will get a syntax error

- in old parser IN and CONTAINS operators validation is somehow lazy, eg. you can write

SELECT FROM FOO WHERE a IN 'bar'

but it's supposed to be incorrect, because 'bar' is a string, not a collection, so the new parser will throw an exception

** AND HOW ABOUT BACKWARD COMPATIBILITY...? **

To allow a smooth transition, we decided to apply the following policy to 2.1:

- old databases will continue to work with the old parser

- databases created with 2.1 will have the new parser enabled by default

- for backward compatibility, you will be able to enable/disable the new parser (for now) setting the "strictSql" parameter to the db (ALTER DATABASE docs will be updated shortly)

In next releases we will probably drop a lot of old code, so I suggest you to get used to the new strict SQL validation ;-)

** HOW CAN I TRY IT NOW? **

All this is now in a GIT branch named "javacc_parser", but today we will merge it in "develop" branch, so you will be able to try it in next 2.1-SNAPSHOT releases.

FOR NOW the new parser is enabled only on SELECT and TRAVERSE statements, in next days we will also cover UPDATE, INSERT, DELETE, CREATE EDGE, CREATE VERTEX and so on...

If you have some time, please try it out and give us your feedback, WE NEED YOUR HELP TO MAKE IT PERFECT!!!

Thanks

Luigi

--

Luigi Dell'Aquila

Orient Technologies LTD

Riccardo Tasso

unread,

Feb 16, 2015, 5:36:15 AM2/16/15

to orient-...@googlegroups.com

Hi, this is very good!

Riccardo

--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Leng Sheng Hong

unread,

Feb 26, 2015, 1:21:32 AM2/26/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

sounds fantastic!

Dário Marcelino

unread,

Feb 27, 2015, 1:14:52 PM2/27/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

This sounds very good Luigi! One question regarding "5. optimize the statement based on schema and indexes". If I have a query like:

select from OUser where @rid=#4:0

Will the parser be able to optimize it to something like the below?

select from #4:0

And if so, will it be able to ensure that it only returns `#4:0` if it is indeed a record from `OUser`?

Thanks!

Dário

Thomas Müller

unread,

Apr 27, 2015, 3:14:19 PM4/27/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

ORDER BY does not work anymore in OrientDB 2.1RC1... so it's not implemented yet and not a bug to open a ticket?

Following querry works:

SELECT orderNr, in, publishedAt, publishedBy FROM (SELECT expand(outE('E_BE')) FROM #24:0) SKIP 0 LIMIT 10

following querry doesent work anymore:

SELECT orderNr, in, publishedAt, publishedBy FROM (SELECT expand(outE('E_BE')) FROM #24:0) SKIP 0 LIMIT 10 ORDER BY orderNr DESC

W. Craig Trader

unread,

Apr 27, 2015, 3:24:36 PM4/27/15

to orient-...@googlegroups.com

Try it again with 'SKIP 0 LIMIT 10' after the 'ORDER BY orderNbr DESC' clause.

- Craig -

Thomas Müller

unread,

Apr 28, 2015, 5:52:37 AM4/28/15

to orient-...@googlegroups.com

Hi Craig,

That worked! And named paramters works also now in subquerries. Great!

Thanks a lot.

Thomas

Dário Marcelino

unread,

Apr 28, 2015, 6:36:32 AM4/28/15

to orient-...@googlegroups.com

Hey everyone,

There is an open issue opened for that ORDER BY problem, in case you want to track it: https://github.com/orientechnologies/orientdb/issues/3141

Cheers

machak

unread,

May 19, 2015, 11:02:17 AM5/19/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

Hi Luigi,

in a fairly simple app, I see performance drop between 2.0.9 and 2.1-SNAPSHOT, mainly due to time spent in orient parser classes,... 5000 req/sec drops to 3400 req/sec (34% of time is spent in query parsing/tokenizing).

This is a small dataset and whole request is executing only one query, but still may be worth investigating...

cheers

/m

Luigi Dell'Aquila

unread,

May 20, 2015, 3:24:59 AM5/20/15

to orient-...@googlegroups.com, Luigi Dell'Aquila

Hi,

thank you for reporting. I'd like to have some more details to understand if it's a corner case or if it's the average.

Anyway I'll investigate it.

In the roadmap we have the caching of SQL statements (like prepared statements) but it will not be released in 2.1

Luigi

--

machak

unread,

May 20, 2015, 8:52:26 AM5/20/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

Hi Luigi,

On Wednesday, May 20, 2015 at 9:24:59 AM UTC+2, Luigi Dell'Aquila wrote:

Hi,

thank you for reporting. I'd like to have some more details to understand if it's a corner case or if it's the average.
Anyway I'll investigate it.
In the roadmap we have the caching of SQL statements (like prepared statements) but it will not be released in 2.1

please see attached simple testcase...my query is similar to one below,

timings on 2.0.9 are around 1.3

and for 2.1-SNAPSHOT 3.2

Normally I wouldn't care that much, but for one of my projects I need quite low latencies (it's not anything like trading platform, but I need to keep it quite low) so I am optimizing anything I can.. :)

import java.util.Calendar;

import com.orientechnologies.orient.core.command.OCommandRequest;
import com.orientechnologies.orient.core.metadata.schema.OType;
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;
import com.tinkerpop.blueprints.impls.orient.OrientVertex;
import com.tinkerpop.blueprints.impls.orient.OrientVertexType;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

@Test
public class SimpleTest  {
    private static final Logger log = LogManager.getLogger(SimpleTest.class);

    private OrientGraphNoTx graph;
    int counter = 0;

    @Test
    public void testQuery() {

        final long start = System.nanoTime();
        for (int i = 0; i < 10000; i++) {
            final OCommandRequest command = new OSQLSynchQuery<>("select from  Game where (user1 = 'user1' or user2 = 'user1') and result = -1 order by gamedate asc");
            final Iterable<OrientVertex> it = graph.command(command)
                    .setFetchPlan("*:-2")
                    .setLimit(10)
                    .execute();
            for (final OrientVertex anIt : it) {
                counter++;
            }
        }
        final long end = System.nanoTime();
        log.info("# {}", ((end - start) / 1000000));
        log.info("counter {}", counter);

        // 3280 @2.1-SNAPSHOT
        // 1304 @2.0.9

    }

    @BeforeClass
    public void setup() {


        graph = new OrientGraphNoTx("memory:/TestDB", "admin", "admin");
        // game
        final OrientVertexType game = graph.createVertexType("Game");
        game.createProperty("result", OType.INTEGER);
        game.createProperty("user1", OType.STRING);
        game.createProperty("user2", OType.STRING);
        game.createProperty("gamedate", OType.DATETIME);
        // data
        createGame("user1", "user2");
        createGame("user2", "user1");


    }

    private void createGame(final String user1, final String user2) {
        final OrientVertex g1 = graph.addVertex("class:Game");
        g1.setProperty("result", -1);
        g1.setProperty("user1", user1);
        g1.setProperty("user2", user2);
        g1.setProperty("gamedate", Calendar.getInstance().getTime());
        graph.commit();
    }


}

machak

unread,

May 20, 2015, 9:38:12 AM5/20/15

to orient-...@googlegroups.com, l.dell...@orientechnologies.com

On Wednesday, May 20, 2015 at 2:52:26 PM UTC+2, machak wrote:

Hi Luigi,
On Wednesday, May 20, 2015 at 9:24:59 AM UTC+2, Luigi Dell'Aquila wrote:
Hi,

thank you for reporting. I'd like to have some more details to understand if it's a corner case or if it's the average.
Anyway I'll investigate it.
In the roadmap we have the caching of SQL statements (like prepared statements) but it will not be released in 2.1

please see attached simple testcase...my query is similar to one below,
timings on 2.0.9 are around 1.3
and for 2.1-SNAPSHOT 3.2

forgot to say, you can test it easily on snapshot version by switching strictsql property..e.g.

final OCommandRequest command = new OCommandSQL("alter database custom strictSql = false");
graph.command(command).execute();

Luigi Dell'Aquila

unread,

May 20, 2015, 9:56:15 AM5/20/15

to machak, orient-...@googlegroups.com, Luigi Dell'Aquila

Thank you very much for the details. I'll work on this soon and let you know

Luigi

Reply all

Reply to author

Forward