Query performance issues

A. Hébert

unread,
Dec 17, 2019, 3:50:24 AM12/17/19
to Warp 10 users
Hi senx team,

    We have a single application with 2 or 3 class names, and each of them has between 4 and 5 million unique time series. Each series has only 3 labels. This makes querying a single time series slow on the user side: ~5 s for each FIND statement (which is of course still fast given the search across 5 million series). How could we speed up those kinds of use cases on Warp 10?

Best,

Mathias Herberts

unread,
Dec 19, 2019, 5:23:06 AM12/19/19
to Warp 10 users
If you know the values associated with those labels for the GTS you wish to fetch you can use the 'gts' map key in the FETCH parameter map to specify an explicit list of GTS to fetch instead of performing a prior search on Directory.
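
Something along these lines should do it (a minimal sketch; '$token', the class name, the label values and the 'end'/'count' values are placeholders to adapt):

{
  'token' $token
  // explicit list of GTS to fetch, no prior Directory search is performed
  'gts' [
    NEWGTS 'some.class' RENAME { 'label0' 'value0' 'label1' 'value1' } RELABEL
  ]
  'end' NOW
  'count' 100
}
FETCH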

A. Hébert

unread,
Dec 19, 2019, 12:27:03 PM12/19/19
to Warp 10 users
I just tried that and got the following exception: `ERROR line #8: Exception at '=>FETCH<=' in section [TOP]`

Mathias Herberts

unread,
Dec 20, 2019, 6:01:18 AM12/20/19
to Warp 10 users
There was an issue with key initialization, which had been moved too early. It is fixed on master.

A. Hébert

unread,
Jan 8, 2020, 4:17:12 AM1/8/20
to Warp 10 users
I have tried a simple FETCH based on a FIND result again, with the current Warp 10 master version, and I get an empty list with the 'gts' key. When I do the same fetch with selectors, I do get my results.

Mathias Herberts

unread,
Jan 8, 2020, 8:20:58 AM1/8/20
to Warp 10 users
Does your token have a single app/owner/producer?

The FETCH will inject .owner, .producer and .app labels from the value in the READ token provided.

A. Hébert

unread,
Jan 8, 2020, 8:54:14 AM1/8/20
to Warp 10 users
To my knowledge, our tokens are forged for a single application only. For example, with TOKENINFO I get:

[{"application":"unicorn","issuance":1569....,"expiry":1591...,"type":"READ","apps":["unicorn"]}]

Mathias Herberts

unread,
Jan 9, 2020, 5:26:33 PM1/9/20
to Warp 10 users
Are you setting all the labels present in the GTS?

A. Hébert

unread,
Jan 13, 2020, 4:42:12 AM1/13/20
to Warp 10 users
All the GTS in the 'gts' list come directly from a FIND statement.
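
Roughly like this (a sketch; the selector and '$token' are placeholders):

[ $token '~some.class.*' { 'id' '=42' } ] FIND
'series' STORE

{
  'token' $token
  'gts' $series  // the GTS metadata returned by FIND
  'end' NOW
  'count' 1
}
FETCH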

Mathias Herberts

unread,
Jan 13, 2020, 7:27:26 PM1/13/20
to Warp 10 users
What Warp 10 revision is the problem occurring on?

A. Hébert

unread,
Jan 15, 2020, 4:08:23 AM1/15/20
to Warp 10 users

A. Hébert

unread,
Jan 15, 2020, 4:16:11 AM1/15/20
to Warp 10 users
I see the same behavior on a standalone as well as on a distributed setup.

Mathias Herberts

unread,
Jan 15, 2020, 5:53:47 PM1/15/20
to Warp 10 users
Are you performing the FETCH with the same token used for the FIND?

A. Hébert

unread,
Jan 29, 2020, 9:08:53 AM1/29/20
to Warp 10 users
Yes, of course. As soon as I find some time, I will provide more tests and details here :). I use Worf to generate both standalone and distributed tokens.

A. Hébert

unread,
Mar 11, 2020, 9:46:39 AM3/11/20
to Warp 10 users
I found the answer about this empty list. In our READ token, there is no "producer" set. However, it is set in the labels of our stored series. Applying this small change (https://github.com/senx/warp10-platform/compare/master...aurrelhebert:fix/gts/fetch?expand=1) fixes it.

Mathias Herberts

unread,
Mar 11, 2020, 11:52:34 AM3/11/20
to Warp 10 users
Did you see an improvement in terms of performance?

A. Hébert

unread,
Mar 11, 2020, 12:01:50 PM3/11/20
to Warp 10 users
For sure: on a batch of a hundred queries, the average was 150 ms, the best resolved in only 28 ms, and the worst in 1 s.

A. Hébert

unread,
Apr 27, 2020, 5:54:41 AM4/27/20
to Warp 10 users
Hello Mathias,

I am back on this story of query performance due to the index size.

First of all, I wanted to say that in many use cases, FETCH with the "gts" key greatly improves queries on our side.

However, we still have massively slow queries, for example when we want to run a FIND on a specific label on one of our largest applications. The selector is '~.*{id=42}' and this query takes more than 70 seconds (server side). FINDSTATS gives the following result: {"classes.estimate":14,"gts.estimate":14,"labelnames.estimate":13,"labelvalues.estimate":12,"error.rate":0.008125}.
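
For reference, the calls behind those timings look roughly like this (a sketch; '$token' is a placeholder):

[ $token '~.*' { 'id' '=42' } ] FIND
[ $token '~.*' { 'id' '=42' } ] FINDSTATS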

Even when setting a single class name in the selector, 'test{id=42}', the query still takes more than 10 seconds to execute (server side).

I would like some help improving those queries. What could I look at to start with?

Mathias Herberts

unread,
Apr 28, 2020, 3:34:03 PM4/28/20
to Warp 10 users
Hi,

assuming you know the classes of those 14 series you want to access, you could very well use the same approach using the 'gts' parameter with a list of GTS you build manually, i.e.:

'gts' [
  NEWGTS 'class-1' RENAME { 'id' '42' } RELABEL
  NEWGTS 'class-2' RENAME { 'id' '42' } RELABEL
  ...
]

Of course you need to specify the full set of labels so this approach will only work if you know what the labels are.

Another approach, assuming you do not know what the labels are but you do know the class names: you could select the GTS using a single class name, then use the 'extra' parameter to specify the other 13 classes.
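
A rough sketch of that second approach (the class names, label value and '$token' are placeholders, and the 'end'/'count' values are arbitrary):

{
  'token' $token
  'class' 'class-1'
  'labels' { 'id' '=42' }
  // also fetch these classes, with the labels of the GTS selected above
  'extra' [ 'class-2' 'class-3' ]
  'end' NOW
  'count' 1
}
FETCH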

A. Hébert

unread,
May 14, 2020, 10:42:02 AM5/14/20
to Warp 10 users
Thanks for your proposal. Do you have an example for the 'extra' parameter? I couldn't make it work (only one series in the result) on Warp 10 2.5.
As a matter of fact, this time I am really in the second case (I can't know the full set of labels, only the id), and I would really like to be able to reduce the 10 seconds it takes to FIND a single series.

A. Hébert

unread,
May 14, 2020, 10:53:33 AM5/14/20
to Warp 10 users
I do apologise, I ran my tests too fast: the 'extra' parameter doesn't work with a FIND statement, only with FETCH. I couldn't reduce the time by much using this method (from 70-80 seconds down to 40-50 seconds) to retrieve the last 14 values of those series. I had to split my statement into 3 or 4 FETCHes, as the label keys differ across several classes.

Mathias Herberts

unread,
May 15, 2020, 3:45:12 AM5/15/20
to Warp 10 users
Can you provide the directory log lines related to the searches you are performing? They should look like:

Search returned AAA results in BBB ms, inspected CCC metadatas in DDD classes (EEE matched) and performed FFF comparisons

A. Hébert

unread,
May 15, 2020, 11:35:21 AM5/15/20
to Warp 10 users
I can't really provide those lines with certainty... We are using HA sharded Directories. I could find an example of what looks like a slow query:
Search returned 14 results in 25218.834073 ms, inspected 435067 metadatas in 1 classes (1 matched) and performed 1306154 comparisons.

For information, in all the sharded Directory logs, I didn't find a log line with 14 results [...] in 14 classes, or with 14 matched either.

Mathias Herberts

unread,
May 18, 2020, 2:56:07 AM5/18/20
to Warp 10 users
1.3M comparisons on 435k metadatas usually do not take 25 s; this is a typical log line from one of our own systems:

Search returned 27 results in 183.488054 ms, inspected 182304 metadatas in 1 classes (1 matched) and performed 546912 comparisons


What is the load average on your Directory? How long does a /find request actually take? (The code path is slightly different, so the result may give some additional info.)

A. Hébert

unread,
May 18, 2020, 3:26:54 AM5/18/20
to Warp 10 users

A. Hébert

unread,
May 18, 2020, 3:42:50 AM5/18/20
to Warp 10 users
We do have fast queries on our indexes too (on average our find queries take 150 to 200 ms). Our Directories are not too heavily loaded (between 40 and 60% CPU load). In our use cases, we see specific slow queries when class name cardinality is high, for instance when 20 to 30 accounts have pushed the same class more than 1 million times (e.g. my.cpu, my.mem, ...). Then even a new account pushing this specific class is impacted by slow index queries too.

I was searching more for the root cause of this behaviour, or a hint on how to improve this kind of data retrieval.

For information, with the /find request it took the same time (real: 1m13.770s).

Mathias Herberts

unread,
May 18, 2020, 8:47:22 AM5/18/20
to Warp 10 users
You can try fetching directly from one of your Directory using the class at https://gist.github.com/hbs/30bf6921b5145e0474adae3273027ef3

It is called like this:

java -cp warp10/build/libs/warp10-x.y.z.jar:. DirectoryStreamClient "{ 'urls' [ 'http://HOST:PORT/directory-streaming' ] 'psk' 'HEX ENCODED PRESHARED KEY' 'class' [ 'CLASS_SELECTOR' ] 'labels' [ { /* LABEL SELECTOR */ } ] }"


It will output the list of fetched GTS (their Metadata) and the time taken for the request in ns.


This should help you determine whether the Directory instance itself is the bottleneck, or something else down the line.

A. Hébert

unread,
May 29, 2020, 10:34:56 AM5/29/20
to Warp 10 users

I ran this test and I have the following results:

- Retrieving 1 series takes 6269551620 ns (6 s)
- Retrieving all 14 series takes 197156864266 ns (3 m 17 s); I had to use a regular expression as I couldn't specify several classes (maybe that's why it's a bit slower than the previous find queries).

However, I didn't get 1 or 14 series as results, but 3 and 42.

This test leads me to think that the bottleneck is related to the Directory component, or at least to our Directory architecture. In production we have:
- 6 Directories
- 3 replicas
- 2 shards per replica
- between 210 and 230 million series per Directory

Mathias Herberts

unread,
May 29, 2020, 1:37:40 PM5/29/20
to Warp 10 users
You can specify several selectors: 'class' and 'labels' are lists, see the definition of the DirectoryRequest Thrift structure.
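
For instance, the parameter map passed to the client could look like this (a sketch; the selectors are placeholders, with one 'labels' entry per 'class' entry):

{ 'urls' [ 'http://HOST:PORT/directory-streaming' ] 'psk' 'HEX ENCODED PRESHARED KEY' 'class' [ 'class-1' 'class-2' ] 'labels' [ { 'id' '=42' } { 'id' '=42' } ] }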

Can you share the number of comparisons which were performed for each request?

A. Hébert

unread,
Jun 2, 2020, 5:13:54 AM6/2/20
to Warp 10 users
How can I get that information? I would say that at least 3 applications have more than 1 million occurrences of those specific classes.

Mathias Herberts

unread,
Jun 6, 2020, 8:17:20 AM6/6/20
to Warp 10 users
The log output should appear in the directory logs.

Having an idea of the number of comparisons and the actual time spent should guide you towards a possible explanation of the slowness you experience: either a lack of resources or a genuinely huge number of comparisons.

A. Hébert

unread,
Jun 8, 2020, 8:07:51 AM6/8/20
to Warp 10 users
As we do not experience Directory latencies (the average answer time is 200 ms), I would say it's related to the number of comparisons. Our Directories are too chatty (around 500 requests per second per Directory) to find the output of that specific query in their logs.

Mathias Herberts

unread,
Jun 9, 2020, 5:10:48 PM6/9/20
to Warp 10 users
The log extract you gave a few weeks ago does not support the assumption that the culprit is the number of comparisons; if it is still 1.3M, then they should not take 300 s.

Unfortunately, without more info from the logs, diagnosing the issue will be hard.