Issues upgrading to 2.3.0RC1: OpenTSDB high CPU under heavy write load


raghu...@gmail.com

May 19, 2017, 11:36:21 AM
to OpenTSDB
Hi All,

We have been running OpenTSDB 2.2.1 successfully for more than 7 months on a Cloudera cluster dedicated to HBase and OpenTSDB.

CDH version 5.10.0,
Java version 1.8.0_121

We are very interested in the new math functions in 2.3.0 and want to take advantage of them, but after the upgrade, under heavy load (700K writes/sec into HBase), OpenTSDB hangs with high CPU, around 95-98%, for 15 to 20 minutes.

Here are a couple of the errors I see in the logs:

2017-04-29 16:11:39,210 [OpenTSDB I/O Worker #6] ERROR [ConnectionManager.exceptionCaught] - Unexpected exception from downstream for [id: 0x405992f2, hbase:51186 :> /opentsdb:9999]
java.lang.IllegalStateException: received HttpChunk without HttpMessage


2017-04-29 15:59:20,712 [OpenTSDB I/O Worker #10] ERROR [ConnectionManager.exceptionCaught] - Unexpected exception from downstream for [id: 0x216c7bc9, /ipaddress:54723 => /ipaddress:9999]
java.io.IOException: Broken pipe
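
For context, here is a minimal sketch of writing a single datapoint to the TSD over HTTP via the standard /api/put endpoint; the host, metric, tags and value are placeholders, not our real traffic:

# Minimal sketch of an HTTP write to the TSD's /api/put endpoint.
# Host, port, metric, tags and value below are placeholders.
import json
import urllib.request

TSD_URL = "http://opentsdb-host:9999/api/put"

datapoints = [{
    "metric": "sys.cpu.user",        # placeholder metric
    "timestamp": 1493481600,         # Unix epoch seconds
    "value": 42.5,
    "tags": {"host": "web01"},       # placeholder tags
}]

req = urllib.request.Request(
    TSD_URL,
    data=json.dumps(datapoints).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.status)               # /api/put returns 204 on success by default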


Here is the /api/config output:

{
"tsd.network.bind": "",
"tsd.core.auto_create_metrics": "true",
"tsd.core.auto_create_tagks": "true",
"tsd.core.auto_create_tagvs": "true",
"tsd.core.connections.limit": "0",
"tsd.core.enable_api": "true",
"tsd.core.enable_ui": "true",
"tsd.core.meta.cache.enable": "false",
"tsd.core.meta.enable_realtime_ts": "false",
"tsd.core.meta.enable_realtime_uid": "true",
"tsd.core.meta.enable_tsuid_incrementing": "false",
"tsd.core.meta.enable_tsuid_tracking": "true",
"tsd.core.plugin_path": "/usr/share/opentsdb/plugins",
"tsd.core.preload_uid_cache": "false",
"tsd.core.preload_uid_cache.max_entries": "300000",
"tsd.core.socket.timeout": "0",
"tsd.core.stats_with_port": "false",
"tsd.core.storage_exception_handler.enable": "false",
"tsd.core.tree.enable_processing": "false",
"tsd.core.uid.random_metrics": "false",
"tsd.http.cachedir": "/tmp/opentsdb",
"tsd.http.query.allow_delete": "false",
"tsd.http.request.cors_domains": "",
"tsd.http.request.cors_headers": "Authorization, Content-Type, Accept, Origin, User-Agent, DNT, Cache-Control, X-Mx-ReqToken, Keep-Alive, X-Requested-With, If-Modified-Since",
"tsd.http.request.enable_chunked": "true",
"tsd.http.request.max_chunk": "524288000",
"tsd.http.show_stack_trace": "true",
"tsd.http.staticroot": "/usr/share/opentsdb/static/",
"tsd.mode": "rw",
"tsd.network.async_io": "true",
"tsd.network.keep_alive": "true",
"tsd.network.port": "9999",
"tsd.network.reuse_address": "true",
"tsd.network.tcp_no_delay": "true",
"tsd.network.worker_threads": "",
"tsd.no_diediedie": "false",
"tsd.query.allow_simultaneous_duplicates": "true",
"tsd.query.enable_fuzzy_filter": "true",
"tsd.query.filter.expansion_limit": "4096",
"tsd.query.skip_unresolved_tagvs": "false",
"tsd.query.timeout": "0",
"tsd.rtpublisher.enable": "false",
"tsd.rtpublisher.plugin": "",
"tsd.search.enable": "false",
"tsd.search.plugin": "",
"tsd.startup.enable": "false",
"tsd.startup.plugin": "",
"tsd.stats.canonical": "false",
"tsd.storage.compaction.flush_interval": "10",
"tsd.storage.compaction.flush_speed": "2",
"tsd.storage.compaction.max_concurrent_flushes": "10000",
"tsd.storage.compaction.min_flush_threshold": "100",
"tsd.storage.enable_appends": "false",
"tsd.storage.enable_compaction": "false",
"tsd.storage.fix_duplicates": "true",
"tsd.storage.flush_interval": "1000",
"tsd.storage.hbase.data_table": "tsdb",
"tsd.storage.hbase.meta_table": "tsdb-meta",
"tsd.storage.hbase.prefetch_meta": "false",
"tsd.storage.hbase.scanner.maxNumRows": "128",
"tsd.storage.hbase.tree_table": "tsdb-tree",
"tsd.storage.hbase.uid_table": "tsdb-uid",
"tsd.storage.hbase.zk_basedir": "/hbase",
"tsd.storage.hbase.zk_quorum": "",
"tsd.storage.repair_appends": "false",
"tsd.timeseriesfilter.enable": "false",
"tsd.uidfilter.enable": "false"
}
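
For reference, the dump above was taken from the TSD's /api/config endpoint; a minimal sketch of fetching it, with the host as a placeholder:

# Sketch: fetch the running config from the TSD's /api/config endpoint.
# Host and port are placeholders.
import json
import urllib.request

with urllib.request.urlopen("http://opentsdb-host:9999/api/config", timeout=10) as resp:
    print(json.dumps(json.load(resp), indent=2, sort_keys=True))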


Thanks for the help in advance.
Raghu

ManOLamancha

May 27, 2017, 6:16:51 PM
to OpenTSDB
On Friday, May 19, 2017 at 8:36:21 AM UTC-7, raghu...@gmail.com wrote:

Here are a couple of the errors I see in the logs:

2017-04-29 16:11:39,210 [OpenTSDB I/O Worker #6] ERROR [ConnectionManager.exceptionCaught] - Unexpected exception from downstream for [id: 0x405992f2, hbase:51186 :> /opentsdb:9999]
java.lang.IllegalStateException: received HttpChunk without HttpMessage


2017-04-29 15:59:20,712 [OpenTSDB I/O Worker #10] ERROR [ConnectionManager.exceptionCaught] - Unexpected exception from downstream for [id: 0x216c7bc9, /ipaddress:54723 => /ipaddress:9999]
java.io.IOException: Broken pipe

Did you also run the same queries as before against the TSDs, or were you trying out the new paths? The "without HttpMessage" error is really strange; it's almost as if a bad request came into the TSD.
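
To be concrete about what I mean by the old paths: a classic query against /api/query, roughly like the sketch below (the host, metric and time range are placeholders, not anything from this thread), as opposed to the 2.3 expression endpoint at /api/query/exp:

# Rough sketch of a "classic" POST query against /api/query.
# Host, metric and time range are placeholders.
import json
import urllib.request

query = {
    "start": "1h-ago",
    "queries": [
        {"aggregator": "sum", "metric": "sys.cpu.user", "tags": {}},
    ],
}
req = urllib.request.Request(
    "http://opentsdb-host:9999/api/query",
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
    print(json.load(resp))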

raghu...@gmail.com

July 24, 2017, 11:27:01 AM
to OpenTSDB, raghu....@mining.komatsu
Apologies that I lost track of this.
We still see this issue at high write loads; we are still using the same old endpoints as in the old version.

Thanks,
Raghu 

ManOLamancha

August 11, 2017, 3:21:21 PM
to OpenTSDB, raghu....@mining.komatsu
On Monday, July 24, 2017 at 8:27:01 AM UTC-7, raghu...@gmail.com wrote:

We still see this issue at high write loads; we are still using the same old endpoints as in the old version.

Would you be able to dump a stack trace or a full heap when it hits the high CPU load? And do you have compactions turned off in the TSD config? Thanks!
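
If it helps, something along these lines can grab a handful of thread dumps while the CPU is pegged; this is a rough sketch that assumes the JDK's jstack is on the PATH, and the pid and file names are placeholders. A heap dump can be taken separately with jmap, e.g. jmap -dump:live,format=b,file=tsd-heap.hprof <pid>.

# Rough sketch: capture a few jstack thread dumps from the TSD while CPU is high.
# Assumes the JDK's jstack is on the PATH; pid and output file names are placeholders.
import subprocess
import sys
import time

pid = sys.argv[1]                          # TSD process id
for i in range(5):                         # five dumps, ten seconds apart
    dump = subprocess.run(["jstack", "-l", pid],
                          capture_output=True, text=True, check=True).stdout
    with open("tsd-jstack-%d.txt" % i, "w") as f:
        f.write(dump)
    time.sleep(10)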

Raghu G

August 28, 2017, 12:22:31 PM
to ManOLamancha, OpenTSDB, raghu....@mining.komatsu
Yes, we have compactions turned off.
A stack trace might take a bit of time (maybe 2 weeks), as I need to create a whole new test environment, given that I cannot touch production.

Thanks
Raghu