The idea was to store NetFlow (IPFIX) data in ClickHouse.
I have a table:
CREATE TABLE IF NOT EXISTS default.netflow (
FlowDate Date DEFAULT toDate(FlowDateTime),
FlowDateTime DateTime,
SourceIPV4 UInt32,
DestinationIPV4 UInt32,
SourcePort UInt16,
DestinationPort UInt16,
tcpControlBits UInt16,
tcpOptions UInt64,
protocolIdentifier UInt8,
octetDeltaCount UInt64,
packetDeltaCount UInt64
) ENGINE=MergeTree(FlowDate, (FlowDateTime, SourceIPV4, DestinationIPV4, SourcePort, DestinationPort), 8192);
It worked great with a small amount of NetFlow data. However, when we tried it with production data sizes and rates, ClickHouse just died.
It suddenly stopped listening on its ports while the process was still running.
It won't stop (except with SIGKILL), and when started again it behaves the same way.
There is nothing suspicious in the logs.
Also, there are over 6M files in the data directory, which, IMHO, is a really large number of small files.
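For anyone trying to reproduce this: a file count like that may come from a very large number of unmerged data parts, since (as far as I understand) each MergeTree part keeps a couple of files per column. A minimal sketch of how the part count could be checked, assuming the server accepts queries at all and that the `system.parts` table is available in this version:

```sql
-- Count active data parts and rows per partition for the netflow table.
-- A very high active_parts value would suggest inserts are arriving in
-- batches too small for background merges to keep up with.
SELECT
    partition,
    count() AS active_parts,
    sum(rows) AS total_rows
FROM system.parts
WHERE database = 'default'
  AND table = 'netflow'
  AND active
GROUP BY partition
ORDER BY active_parts DESC;
```

If this shows many thousands of parts with only a few rows each, the insert pattern is more likely the problem than the key or the granularity.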
I'm pretty sure I'm doing something wrong (keys? granularity?).