Feed large CSV file into Prometheus


Mehrdad

Sep 5, 2024, 9:48:13 AM
to Prometheus Users
Hi
How can I feed a large CSV file into Prometheus?
Thanks

Brian Candler

Sep 5, 2024, 10:57:54 AM
to Prometheus Users
Prometheus is very specific to timeseries data, and normally new data is ingested as of the current time.

If you have historical timeseries data that you need to import as a one-time activity, then there is "backfilling"; see the "Backfilling from OpenMetrics format" section of the Prometheus storage documentation.
This is not something you would want to do on a regular basis though.
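
For what it's worth, the conversion step is usually just a small script. A minimal sketch, assuming the CSV has columns named timestamp (unix seconds), metric, host and value (those column names and file names are placeholders for whatever your export actually contains):

    import csv

    # Rewrite a CSV export as OpenMetrics text that promtool can backfill.
    # Assumed columns: timestamp (unix seconds), metric, host, value.
    with open("input.csv", newline="") as src, open("data.om", "w") as dst:
        for row in csv.DictReader(src):           # assumes a header row
            labels = '{host="%s"}' % row["host"]  # one example label
            # OpenMetrics timestamps are in seconds
            dst.write("%s%s %s %s\n" % (row["metric"], labels,
                                        float(row["value"]), int(row["timestamp"])))
        dst.write("# EOF\n")                      # terminator promtool expects

If promtool complains about missing metadata, add "# TYPE <metric> gauge" lines before the samples for each metric name.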

If the reason for the CSV import is that you are trying to gather data from remote sites which don't have continuous connectivity, then another option is to run Prometheus at those sites in "agent" mode and have it upload data to another Prometheus server using "remote write".
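
If you go the agent route, the heart of it is a remote_write section pointing at the central server. A minimal sketch of the remote site's prometheus.yml (the scrape target and URL are placeholders), started with "prometheus --enable-feature=agent":

    scrape_configs:
      - job_name: "node"
        static_configs:
          - targets: ["localhost:9100"]

    remote_write:
      - url: "https://central-prometheus.example.com/api/v1/write"

The central server then needs to be started with --web.enable-remote-write-receiver so it will accept the pushed samples.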

Mehrdad

Sep 5, 2024, 11:49:19 AM
to Prometheus Users
Hi Brian, I want to import a 10GB CSV file into Prometheus, and after that run different queries to find out how it performs with high-cardinality data.
This process needs to run only once, and the data covers the last 24 hours from another monitoring tool.
Which option is more suitable? And faster?

Thanks

Brian Candler

Sep 5, 2024, 1:37:07 PM
to Prometheus Users
> Hi Brian, I want to import a 10GB CSV file into Prometheus, and after that run different queries to find out how it performs with high-cardinality data.

In Prometheus, the timeseries data consists of float values and there's no "cardinality" as such. But each timeseries is determined by its unique set of labels, and if those labels have high cardinality, Prometheus will perform very poorly (due to an explosion in the number of distinct timeseries).
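
As a purely hypothetical illustration, a label carrying something like a user or request ID turns every distinct value into its own timeseries:

    http_requests_total{path="/home",user_id="10001"} 1
    http_requests_total{path="/home",user_id="10002"} 1
    http_requests_total{path="/home",user_id="10003"} 1

A million distinct user_id values means a million separate timeseries for a single metric name, and that is exactly the kind of cardinality Prometheus copes with badly.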

> Which option is more suitable? And faster?

More suitable for ingestion into Prometheus? Backfilling via OpenMetrics format is the only approach.
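
Concretely, once the CSV has been rewritten as an OpenMetrics text file (ending with a "# EOF" line), something like this turns it into TSDB blocks (the file and directory names are just examples):

    promtool tsdb create-blocks-from openmetrics data.om ./blocks

The generated blocks then get moved into the Prometheus data directory. For 24 hours of data you may also want the --max-block-duration flag so fewer, larger blocks are produced.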

More suitable for your application? I don't know what that application is, or anything about the data you're importing, so I can't really say.

If you have high cardinality and/or non-numeric data then you might want to look at logging systems (e.g. Loki, VictoriaLogs), document databases (e.g. OpenSearch/Elasticsearch, MongoDB), columnar databases (e.g. ClickHouse, Druid) or various other "analytics/big data" platforms.