Gangplank: Exporting Keras Metrics to Prometheus


Carl Meijer

Jun 18, 2025, 6:56:42 AM
to Keras-users
Prometheus is a popular software tool for monitoring and alerting. It gathers metrics from applications and infrastructure, and its query language, PromQL, can be used to define alerts that fire when things go wrong.

Chapter 7 of François's "Deep Learning with Python, second edition" covers Keras callbacks and the TensorBoard callback for feedback during model training. I thought it would be fun (and possibly useful) to write a callback for exposing training/testing loss and other metrics (accuracy, mean absolute error, histograms of model weights, etc.) to Prometheus. So I wrote Gangplank (https://github.com/hammingweight/gangplank). TensorBoard is more than a monitoring tool: it's useful for visualizing how a model works and behaves. Gangplank is intended solely for monitoring.
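
To make the idea concrete, here is a minimal sketch (not Gangplank's actual API) of how the `logs` dictionary that Keras passes to a callback's `on_epoch_end` hook could be turned into Prometheus's text exposition format. The names `render_epoch_metrics`, `EpochMetricsRecorder`, and the `keras_training_` prefix are made up for this example, and only the standard library is used. A real callback would subclass `keras.callbacks.Callback` and would more likely update gauges from a Prometheus client library and serve them over HTTP.

```python
import re


def render_epoch_metrics(logs, prefix="keras_training_"):
    """Render a Keras-style logs dict as Prometheus text exposition lines.

    `logs` is assumed to look like what Keras passes to on_epoch_end,
    e.g. {"loss": 0.25, "val_loss": 0.5}. The prefix is hypothetical.
    """
    lines = []
    for key in sorted(logs):
        # Replace anything outside [a-zA-Z0-9_] so the name is a valid
        # Prometheus metric name.
        name = prefix + re.sub(r"[^a-zA-Z0-9_]", "_", key)
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(logs[key])}")
    return "\n".join(lines) + "\n"


class EpochMetricsRecorder:
    """Mimics the on_epoch_end hook of a Keras callback (hypothetical class).

    Kept dependency-free here; in real use this would subclass
    keras.callbacks.Callback so it can be passed to model.fit().
    """

    def __init__(self):
        self.latest = ""

    def on_epoch_end(self, epoch, logs=None):
        metrics = dict(logs or {})
        metrics["epoch"] = epoch
        # Keep the most recent snapshot; a scrape endpoint would serve this.
        self.latest = render_epoch_metrics(metrics)
```

Each metric is rendered as a gauge because values such as loss can go up as well as down between epochs.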

Once I'd implemented the training and testing callbacks, I wanted to get Gangplank to emit inference metrics as well. The two metrics that (almost) every service should emit are the number of requests and the duration per request, so Gangplank exposes those to Prometheus. Additionally, if you have sufficient data to implement a meaningful statistical test for drift (data drift or prediction drift), Gangplank can expose the test results, such as a p-value or test statistic, to Prometheus.
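
The request-count and duration metrics can be illustrated with a small standard-library sketch. This is not how Gangplank implements them; the `InferenceMetrics` class and the metric names `inference_requests_total` and `inference_duration_seconds` are assumptions chosen for the example. It wraps a prediction function, counts each call, accumulates the time spent, and renders the result in Prometheus's text exposition format.

```python
import functools
import time


class InferenceMetrics:
    """Tracks request count and cumulative duration (hypothetical helper)."""

    def __init__(self):
        self.requests_total = 0
        self.duration_seconds_sum = 0.0

    def track(self, predict_fn):
        """Wrap a prediction function so each call is counted and timed."""

        @functools.wraps(predict_fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return predict_fn(*args, **kwargs)
            finally:
                # Record the call even if the prediction raised.
                self.requests_total += 1
                self.duration_seconds_sum += time.perf_counter() - start

        return wrapper

    def render(self):
        """Return the metrics in Prometheus text exposition format."""
        return (
            "# TYPE inference_requests_total counter\n"
            f"inference_requests_total {self.requests_total}\n"
            "# TYPE inference_duration_seconds summary\n"
            f"inference_duration_seconds_sum {self.duration_seconds_sum}\n"
            f"inference_duration_seconds_count {self.requests_total}\n"
        )
```

Usage would look like `predict = metrics.track(model.predict)`, with `metrics.render()` served on a scrape endpoint. The sketch is not thread-safe, so a multi-threaded service would need a lock around the updates.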

Carl