InMemory


Kevin GEORGES

Jan 9, 2017, 11:32:31 AM
to Warp 10 users
Hello,

I am running Warp 10 in in-memory mode.
Everything is running fine, but I have a couple of issues:
  - FETCH/FIND is pretty slow - about 1800 ms per request
  - Memory consumption is high - the garbage collector has already collected around 40 GB

Have you already encountered these issues?

Please find attached a screenshot of the memory usage.
Screen Shot 2017-01-09 at 16.50.16.png

Mathias Herberts

Jan 9, 2017, 11:54:51 AM
to Warp 10 users
FIND requests do not access the data, only the directory; FETCH requests access the directory and then the datastore.
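For example (illustrative only - the class and labels below are placeholders, and FETCH is shown in its [ token class labels end timespan ] list form):

// directory only: returns the matching GTS without any datapoints
[ $RTOKEN 'some.class' { 'dc' 'paris' } ] FIND
// directory + datastore: returns the matching GTS with their datapoints over the last hour
[ $RTOKEN 'some.class' { 'dc' 'paris' } NOW 1 h ] FETCH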

How many GTS does the directory manage?

Kevin GEORGES

Jan 9, 2017, 12:04:52 PM
to Warp 10 users
Hello,

25M

Performance is the same for FIND and FETCH.

Mathias Herberts

Jan 9, 2017, 1:58:51 PM
to Warp 10 users
This means the time is mostly spent in the directory.

How many classes? How many producers/owners?

What type of GTS selectors?

Kevin GEORGES

Jan 9, 2017, 2:25:09 PM
to Warp 10 users
Around 40 classes. Only one producer/owner.

Non-regex selectors:
[ $RTOKEN 'haproxy.stats.scur' { 'haproxy' 'myHAProxy' 'type' '0' } ] FIND
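For comparison, a regex selector would look something like this (selectors prefixed with '~' are treated as regular expressions, which are typically more expensive for the directory to evaluate):

[ $RTOKEN '~haproxy\.stats\..*' { 'haproxy' 'myHAProxy' 'type' '0' } ] FIND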

Mathias Herberts

Jan 9, 2017, 2:32:31 PM
to Warp 10 users
How many GTS in this specific class?

Kevin GEORGES

Jan 9, 2017, 2:48:40 PM
to Warp 10 users
1099245

Mathias Herberts

Jan 9, 2017, 2:51:05 PM
to Warp 10 users
I'll patch the standalone directory so it uses the same optimizations as the distributed directory. Once that's pushed you can give it a try; it should improve the time spent in FIND.

We also have another PR in the pipe which will speed it up tremendously if you're willing to compromise on freshness a bit.

Kevin GEORGES

Jan 9, 2017, 2:58:18 PM
to Warp 10 users
Glad to hear that :)
Do you have an ETA for the patch?

What do you mean by "compromise on freshness a bit"?

Mathias Herberts

Jan 10, 2017, 10:32:53 AM
to Warp 10 users
PR #141 (https://github.com/cityzendata/warp10-platform/pull/141) modifies the way search is performed on the directory.

As for your memory consumption, you may want to try out the sharded memstore in PR #143 (https://github.com/cityzendata/warp10-platform/pull/143): it removes a lot of locking and, in our own tests, performs better than the previous one.

Eager to see what it does on your own use case.
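If you build from source, one way to test a PR before it is merged (sketch; GitHub exposes pull requests under the pull/<id>/head refs) is:

git clone https://github.com/cityzendata/warp10-platform.git
cd warp10-platform
git fetch origin pull/141/head:pr-141
git checkout pr-141
# then rebuild the standalone distribution and restart Warp 10 with the resulting jar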

Kevin GEORGES

Jan 13, 2017, 1:05:39 PM
to Warp 10 users
We have combined PR #141 and #143. Query performance is much better than before (around 5x) :D

We still have issues with datapoint ingestion :/
We are stuck around 80k dps | 0.4 req/s | 90 Mb/s.
The load is pretty low (7 out of 56).

Is there room for improvement on this issue?

Mathias Herberts

Jan 13, 2017, 3:28:15 PM
to Warp 10 users


That ingestion performance is low. What is your batch size? How many parallel threads? Do your batches only contain new GTS? How many chunks? What chunk width?

Kevin Georges

Jan 13, 2017, 4:28:03 PM
to Mathias Herberts, Warp 10 users
I have 6 acceptors and 8 selectors. Batch size is between 5 and 300 MB.
The GTS count is constant, so no new GTS at all.

36 chunks of 1 h.

The ingestion rate is slower now - around 50k dps. The Java GC is under pretty heavy stress :/
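For reference, this corresponds to the in-memory chunk settings in warp.conf, along these lines (exact property names should be double-checked against the standalone configuration template; the chunk length is expressed in platform time units, microseconds by default):

in.memory = true
in.memory.chunked = true
in.memory.chunk.count = 36
# 1 hour expressed in microseconds
in.memory.chunk.length = 3600000000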

Mathias Herberts

Jan 13, 2017, 4:54:45 PM
to Warp 10 users, mathias....@gmail.com
Are you running G1?


Kevin GEORGES

Jan 13, 2017, 5:02:01 PM
to Warp 10 users, mathias....@gmail.com
Yep

opt/java8/bin/java -javaagent:/opt/warp/bin/jmx_prometheus_javaagent-0.7.jar=127.0.0.1:9101:/opt/warp/etc/jmx_prometheus.yml -Djava.net.preferIPv4Stack=true -Djava.security.egd=file:/dev/./urandom -Djava.awt.headless=true -Dlog4j.configuration=file:/opt/warp/etc/log4j.properties -Dsensision.server.port=9100 -Dsensision.events.dir=/opt/sensision/data/metrics/ -Dsensision.default.labels=cell=inmemory -Xms64g -Xmx300g -XX:+UseG1GC -cp etc:/opt/warp/bin/warp10-1.2.5-rc6-16-g4abacc1.jar io.warp10.standalone.Warp /opt/warp/etc/warp.conf >> /opt/warp/nohup.out 2>&1



GC is episodic... 



Mathias Herberts

Jan 13, 2017, 5:07:49 PM
to Warp 10 users, mathias....@gmail.com
You should launch the JVM with -Xms300g, i.e. with Xms equal to Xmx.
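On your current command line that just means changing the heap flags (sketch):

-Xms300g -Xmx300g -XX:+UseG1GC

Committing the full heap up front avoids the resizing work the JVM otherwise performs while growing the heap from 64g to 300g.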

Your Grafana dashboard indicates 2.6*10**4 update requests - is that per second? Is the unit wrong?

Even 2.6k requests/s is probably the root cause of your problems. How many cores does your server have? What is Jetty's thread pool size? What is the average number of datapoints per request?

Kevin Georges

Jan 13, 2017, 5:15:23 PM
to Mathias Herberts, Warp 10 users
The dashboard indicates 2.6k requests per 10**4 seconds, so 0.26 req/s.

The server has 2*14 hyper-threaded cores -> 56 threads.
Requests have a payload of 155k datapoints.

Jetty has 6 acceptors and 12 selectors.

Fixing Xms :)

Kevin GEORGES

Jan 13, 2017, 5:20:29 PM
to Warp 10 users, mathias....@gmail.com
CPU usage is asymmetric.


During a GC:




Mathias Herberts

Jan 13, 2017, 5:42:53 PM
to Warp 10 users, mathias....@gmail.com
What does the gc log actually show?

Try disabling the javaagent.
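To get a detailed GC log on Java 8, you can add something like the following to the launch command (the log path is just an example):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:/opt/warp/logs/gc.log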

Mathias Herberts

Jan 13, 2017, 5:43:07 PM
to Warp 10 users, mathias....@gmail.com
How many concurrent connections to ingress? (netstat -na|grep 8882|grep ESTABLISHED|wc -l)

How many threads in Jetty's pool?

Kevin GEORGES

Jan 13, 2017, 5:58:36 PM
to Warp 10 users, mathias....@gmail.com
Disabling the agent and enabling GC logs ;)

Kevin GEORGES

Jan 13, 2017, 6:05:14 PM
to Warp 10 users, mathias....@gmail.com

# netstat -na | grep 8080|grep ESTABLISHED|wc -l

33


Jetty has 6 acceptors and 12 selectors. 


Kevin GEORGES

Jan 15, 2017, 1:52:55 PM
to Warp 10 users, mathias....@gmail.com
Please find attached the GC logs.

The ingestion rate is near 75k dps.

# netstat -na | grep 8080|grep ESTABLISHED|wc -l

33



gc.log

Mathias Herberts

Jan 15, 2017, 2:06:43 PM
to Warp 10 users, mathias....@gmail.com
There are no full GCs in your logs, but the number of GC worker threads seems a bit high (38). You can try lowering that number via -XX:ParallelGCThreads.

Also, did you enable DataLogging in your standalone instance?

The chunked store calls getUnsafeDecoder, so there is no extra allocation performed when storing the incoming data, and given the amount of memory you dedicated to your JVM there is no reason the GC should become that aggressive. Any special plugins you've deployed?
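For what it's worth, 38 is roughly what the JVM picks by default on a machine with 56 hardware threads (about 8 plus 5/8 of the threads beyond 8), so it has to be overridden explicitly, e.g. (value to be tuned):

-XX:ParallelGCThreads=16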

Kevin GEORGES

Jan 15, 2017, 4:00:51 PM
to Warp 10 users, mathias....@gmail.com
Setting -XX:ParallelGCThreads to 8 causes a massive drop in the ingestion rate, from 75k to 30k dps :/
I will retry with 24.

No DataLogging and no special plugins.

Please find attached the GC log.
gc.log

Kevin GEORGES

Jan 15, 2017, 4:53:05 PM
to Warp 10 users, mathias....@gmail.com
Same behaviour...

Mathias Herberts

Jan 15, 2017, 5:01:14 PM
to Warp 10 users, mathias....@gmail.com
What overall performance do you achieve when you limit the number of parallel ingress requests to 1 / 2 / 4 / 8 / 16 instead of the 33 you mentioned earlier?

Can you share your configuration file?

How many labels do your GTS have?
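A quick way to run that concurrency test (sketch only - token, file names and port are placeholders) is to push your batches to /api/v0/update with a fixed parallelism, e.g. 4 concurrent requests:

ls batch-*.txt | xargs -P 4 -I {} curl -s -H 'X-Warp10-Token: WRITE_TOKEN' --data-binary @{} 'http://127.0.0.1:8080/api/v0/update'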

Mathias Herberts

Jan 15, 2017, 5:13:09 PM
to Kevin GEORGES, Warp 10 users
Can you also check your ingestion performance when using the null or plasma backends?

Set in.memory to false.

For the null backend, add 'null=true' to your configuration.
For the plasma backend, add 'pureplasma=true'.

The null backend will let you benchmark the parsing of the ingested data, since the data won't be stored nor its metadata registered.
The plasma backend will let you benchmark parsing + directory registration.
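In warp.conf that would look something like this, enabling one of the two lines depending on the backend being benchmarked:

in.memory = false
null = true
# or, for the plasma backend:
#pureplasma = true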


Kevin GEORGES

Jan 15, 2017, 6:28:34 PM
to Warp 10 users, mathias....@gmail.com
Please find attached the configuration file.

The GTS have 10 labels.

Starting the parallel benchmarks.
warp.conf

Mathias Herberts

Jan 15, 2017, 6:39:46 PM
to Kevin GEORGES, Warp 10 users
Ok, so 72 chunks of 1 day.

One thing you could also try is to modify StandaloneDirectoryClient so the register method is no longer synchronized. This is an upcoming change anyway, so you might as well test it now.

Given that each GTS has 10 labels, this may speed up ingestion, as the labels id is computed in 'register' and may consume quite some resources with 10 labels.



Kevin GEORGES

Jan 15, 2017, 7:08:56 PM
to Warp 10 users, k4ge...@gmail.com
We sustain a pretty steady rate on the null backend, ~400k dps.
The plasma backend rate is way slower, ~80k dps.

Why does the directory have such a huge impact?

Mathias Herberts

Jan 16, 2017, 1:14:14 AM
to Kevin GEORGES, Warp 10 users
If that's the case, then remove the 'synchronized' on register and tell us whether it's better.

It is probably linked to the fact that your GTS have 10 labels and that their labels id is currently computed in the synchronized section, thus slowing down all ingesters.



Mathias Herberts

Jan 16, 2017, 2:17:47 AM
to Warp 10 users, k4ge...@gmail.com
That commit was reverted because GTS accounting is incorrect with it; I will push a corrected one this morning. Apart from that, the removal of synchronized is fine.

Kevin GEORGES

Jan 16, 2017, 5:36:59 AM
to Warp 10 users, k4ge...@gmail.com
Removing synchronized has a huge impact on performance!
We went from 80k dps to 400k dps on the plasma backend :) (400k dps is our nominal rate)

Now trying to enable chunked storage.

Mathias Herberts

Jan 16, 2017, 5:56:00 AM
to Warp 10 users, k4ge...@gmail.com
The latest version of PR #143 removes the synchronized and fixes the GTS accounting.

Kevin GEORGES

Jan 16, 2017, 6:29:43 AM
to Warp 10 users, k4ge...@gmail.com
Just pushed PR #143 to prod :)

I have enabled chunked storage, which causes issues with the ingestion rate.
The GC is a bit aggressive too.
gc.log
Screen Shot 2017-01-16 at 12.24.10.png

Mathias Herberts

Jan 16, 2017, 8:28:08 AM
to Warp 10 users, k4ge...@gmail.com
You can try varying the size of the young generation and the number of gc threads to see if it helps.

Kevin GEORGES

Jan 17, 2017, 7:07:27 AM
to Warp 10 users, k4ge...@gmail.com
Adjusting the GC parameters allows me to keep up with the ingestion rate, but we still have issues with memory management.
Warp 10 hits its out-of-memory limit in a seemingly random way :/
Screen Shot 2017-01-17 at 13.05.02.png

Mathias Herberts

Jan 17, 2017, 8:56:05 AM
to Warp 10 users
Could you share your GC logs?

Is your dataset supposed to fit in memory? Are you executing some WarpScript which fiddles with lots of data?

Kevin GEORGES

Jan 19, 2017, 2:49:51 PM
to Warp 10 users
Please find attached my GC logs.

Yes - assuming seven bytes per datapoint: 36 buckets * 6 min * 400k dps * 7 bytes ≈ 37 GB.

Same behaviour with or without queries.
gc.log

Mathias Herberts

Jan 19, 2017, 5:29:52 PM
to Kevin GEORGES, Warp 10 users
Thanks for your gc.log.

The memory consumption of the actual datapoints (the memory reported via sensision) does not take into account the memory overhead of the various objects.

The memory footprint of a GTSEncoder is a little less than 200 bytes. Assuming 36 chunks per GTS and the 25M GTS you mentioned in a previous message, the GTSEncoders alone account for a total just south of 200 * 36 * 25M = 180 GB, not counting the actual data. But once your number of GTS has stabilized, this won't grow.

If you reduce your chunk count to 18 chunks of 12 minutes, you will save ~90 GB of memory.

During the period that your gc log covers, how did the number of GTS evolve?
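With the chunked memstore, that would be along these lines (exact property names to be checked against your configuration template; chunk length in platform time units, microseconds by default):

in.memory.chunk.count = 18
# 12 minutes expressed in microseconds
in.memory.chunk.length = 720000000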




David Morin

Jan 20, 2017, 5:21:08 AM
to Warp 10 users, k4ge...@gmail.com
Most of the GC time is spent on the Remembered Set (RSet) within the young generation's evacuation pauses.
-XX:G1NewSizePercent has been set: -XX:G1NewSizePercent=10.
By default, -XX:G1MaxNewSizePercent=60.
Could you try with G1NewSizePercent = G1MaxNewSizePercent = 60?
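Note that on Java 8 these are experimental options, so they need to be unlocked, e.g.:

-XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=60 -XX:G1MaxNewSizePercent=60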



Kevin GEORGES

Jan 20, 2017, 6:45:26 AM
to Warp 10 users, k4ge...@gmail.com
The number of GTS does not evolve.

I ran a test with 3 chunks of 5 min: the Warp GC cleans up some points, but the overall JVM memory still increases at a steady rate :/
I also set both G1NewSizePercent and G1MaxNewSizePercent to 60 at the same time, as David suggested.
On the plus side, the ingestion rate is more stable and the average load is halved.

Please find attached some metrics about the instance's performance (the test started at 10:45 and crashed at 11:35).

Screen Shot 2017-01-20 at 12.39.07.png

Mathias Herberts

Jan 20, 2017, 10:00:34 AM
to Warp 10 users, k4ge...@gmail.com
When you say the JVM crashes, what do you mean?

Kevin Georges

Jan 20, 2017, 5:07:20 PM
to Mathias Herberts, Warp 10 users
The JVM restarts because it runs out of memory.
I am currently running Warp 10 with 5 chunks of 1 hour for 25M GTS and it is pretty stable (10 hours so far). Setting G1NewSizePercent to 60% reduces JVM GC activity but causes an earlier crash :/

Mathias Herberts

Jan 21, 2017, 2:04:09 PM
to Warp 10 users, mathias....@gmail.com
Do you have the actual error message?

What version of the JVM are you running?

Mathias Herberts

Jan 29, 2017, 12:04:40 PM
to Warp 10 users, mathias....@gmail.com
Are you still experiencing your OOM issues? If not, what did you do to solve them?

Kevin Georges

Jan 30, 2017, 6:26:24 PM
to Mathias Herberts, Warp 10 users
Yes, I still have OOM issues. I have to inspect the JVM memory to look for suspicious allocations. 

Mathias Herberts

Jan 31, 2017, 1:51:20 AM
to Warp 10 users, mathias....@gmail.com
Is the JVM crashing or simply throwing OOMs?

If it crashes, is it crashing on its own or being killed by the OOM killer?