Increase Bytes Processed per Second

גיל מרי‎

Jul 20, 2021, 2:29:53 AM

to lokiproject

I'm running 3 queriers with 12 CPUs each behind 2 query frontends, but I'm only getting around 500 MB of bytes processed per second. This speed makes Loki metric queries unusable, and after testing Loki for 2 months we are highly disappointed with the performance we are getting.

I suspect there is a bottleneck between S3 and the queriers, but I'm not sure about that at all.

What goes into increasing query speed? I removed all caches from the configuration so I can better test the S3-to-querier speed.

Logcli stats results of a 4 hour query:

Ingester.TotalReached 0

Ingester.TotalChunksMatched 0

Ingester.TotalBatches 0

Ingester.TotalLinesSent 0

Ingester.HeadChunkBytes 0B

Ingester.HeadChunkLines 0

Ingester.DecompressedBytes 0

Ingester.DecompressedLines 0

Ingester.CompressedBytes 0B

Ingester.TotalDuplicates 0

Store.TotalChunksRef 450

Store.TotalChunksDownloaded 450

Store.ChunksDownloadTime 25.736060467s

Store.HeadChunkBytes 0B

Store.HeadChunkLines 0

Store.DecompressedBytes 7.5GB

Store.DecompressedLines 42448956

Store.CompressedBytes 1.5GB

Store.TotalDuplicates 6294356

Summary.BytesProcessedPerSecond 586MB

Summary.LinesProcessedPerSecond 3297466

Summary.TotalBytesProcessed 7.5GB

Summary.TotalLinesProcessed 42448956

Summary.ExecTime 12.873201912s
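For anyone reading these stats, a few derived ratios make them easier to interpret. The sketch below is not part of Loki or logcli. It hard-codes the values printed above, assumes "GB" means 10^9 bytes (logcli rounds its output, so the results are approximate), and uses a made-up helper name, `derive`.

```python
# Values copied from the logcli stats output above.
# Assumption: "GB" means 10**9 bytes; the printed sizes are rounded,
# so every derived figure is approximate.
GB = 10**9

stats = {
    "chunks_downloaded": 450,
    "chunks_download_time_s": 25.736060467,
    "decompressed_bytes": 7.5 * GB,
    "compressed_bytes": 1.5 * GB,
    "total_lines": 42_448_956,
    "duplicate_lines": 6_294_356,
    "exec_time_s": 12.873201912,
}


def derive(s):
    """Return derived ratios from the raw counters (hypothetical helper)."""
    return {
        # Decompressed bytes scanned per second of wall-clock query time.
        "decompressed_mb_per_s": s["decompressed_bytes"] / s["exec_time_s"] / 1e6,
        # Compressed bytes fetched from the store per second of query time.
        "compressed_mb_per_s": s["compressed_bytes"] / s["exec_time_s"] / 1e6,
        "compression_ratio": s["decompressed_bytes"] / s["compressed_bytes"],
        "avg_compressed_chunk_mb": s["compressed_bytes"] / s["chunks_downloaded"] / 1e6,
        "duplicate_fraction": s["duplicate_lines"] / s["total_lines"],
        "download_to_exec_ratio": s["chunks_download_time_s"] / s["exec_time_s"],
    }


if __name__ == "__main__":
    for name, value in derive(stats).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the query fetched roughly 117 MB/s of compressed data in total, about 15% of the lines were duplicates, and Store.ChunksDownloadTime is about twice Summary.ExecTime, which suggests the download time is summed over overlapping downloads rather than measured serially.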

My Loki Configuration:
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  grpc_server_max_concurrent_streams: 1000
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
  http_server_idle_timeout: 120s
  http_server_write_timeout: 1m

distributor:
  ring:
    kvstore:
      store: memberlist

ingester:
  chunk_encoding: snappy
  chunk_block_size: 262144
  chunk_target_size: 4000000
  chunk_idle_period: 15m
  lifecycler:
    heartbeat_period: 5s
    join_after: 30s
    num_tokens: 512
    ring:
      heartbeat_timeout: 1m
      kvstore:
        store: memberlist
      replication_factor: 3
    final_sleep: 0s
  max_transfer_retries: 60

ingester_client:
  grpc_client_config:
    max_recv_msg_size: 67108864
  remote_timeout: 1s

frontend:
  compress_responses: true
  log_queries_longer_than: 5s
  max_outstanding_per_tenant: 1024

frontend_worker:
  frontend_address: frontend:9096
  grpc_client_config:
    max_send_msg_size: 104857600
  parallelism: 12

limits_config:
  enforce_metric_name: false
  ingestion_burst_size_mb: 10
  ingestion_rate_mb: 5
  ingestion_rate_strategy: local
  max_global_streams_per_user: 10000
  max_query_length: 12000h
  max_query_parallelism: 254
  max_streams_per_user: 0
  reject_old_samples: true
  reject_old_samples_max_age: 168h

querier:
  query_ingesters_within: 2h

query_range:
  align_queries_with_step: true
  cache_results: false
  max_retries: 5
  parallelise_shardable_queries: false
  split_queries_by_interval: 15m

compactor:
  working_directory: /opt/loki/compactor
  shared_store: s3
  compaction_interval: 30m

schema_config:
  configs:
    - from: 2020-06-10
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: test_
        period: 24h

storage_config:
  aws:
    bucketnames: bucket_name
    endpoint: endpoint
    region: region
    access_key_id: mysecret_key_id
    secret_access_key: mysecret_access_key
    http_config:
      idle_conn_timeout: 90s
      response_header_timeout: 0s
      insecure_skip_verify: true
    s3forcepathstyle: true
  boltdb-shipper:
    active_index_directory: /opt/loki/boltdb-shipper-active
    cache_location: /opt/loki/boltdb-shipper-cache
    shared_store: s3

memberlist:
  abort_if_cluster_join_fails: false
  bind_addr:
    - the_bind_ip_address
  bind_port: 7946
  join_members:
    - ip_address:7946
    - off_all_the_loki_components:7946
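The splitting settings in this configuration bound how much of the querier fleet a single query can keep busy. The sketch below is plain arithmetic, not Loki code, and rests on loudly stated assumptions: each `split_queries_by_interval` slice becomes one unit of work, each querier offers `parallelism` worker slots (the real count may differ, for example if the setting applies per frontend or is capped elsewhere), and no sharding happens because `parallelise_shardable_queries` is false. The function name `split_plan` is hypothetical, and the querier count comes from the post, not from the configuration.

```python
import math
from datetime import timedelta


def split_plan(query_range, split_interval, queriers,
               worker_parallelism, max_query_parallelism):
    """Estimate sub-query count and worker-slot usage (hypothetical helper).

    Assumes a query range aligned to the split interval; an unaligned
    range can produce one extra slice.
    """
    # timedelta / timedelta yields a float number of intervals.
    subqueries = math.ceil(query_range / split_interval)
    # Assumed slot count: one slot per unit of parallelism per querier.
    worker_slots = queriers * worker_parallelism
    # A query cannot run more units at once than it has slices, slots,
    # or the per-query parallelism limit allows.
    concurrent = min(subqueries, worker_slots, max_query_parallelism)
    return {
        "subqueries": subqueries,
        "worker_slots": worker_slots,
        "concurrent": concurrent,
        "slot_utilisation": concurrent / worker_slots,
    }


if __name__ == "__main__":
    plan = split_plan(
        query_range=timedelta(hours=4),        # the 4 hour test query
        split_interval=timedelta(minutes=15),  # split_queries_by_interval
        queriers=3,                            # from the post
        worker_parallelism=12,                 # frontend_worker.parallelism
        max_query_parallelism=254,             # limits_config
    )
    for name, value in plan.items():
        print(f"{name}: {value}")
```

With the values from the post, a 4 hour range splits into 16 sub-queries against 36 assumed worker slots, so under these assumptions fewer than half of the slots could be busy for this one query, whatever the S3 throughput is.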
