Scraping applications deployed on Cloud Foundry


Tommy Ludwig

Feb 7, 2018, 3:48:38 AM
to Prometheus Users
Given the below assumptions, I am having a difficult time figuring out the best way to get metrics into Prometheus from my applications deployed on Cloud Foundry.
  1. Prometheus does not recommend using a push gateway for non-batch applications (documentation)
  2. Multiple instances of applications deployed on Cloud Foundry are generally not individually addressable without using a header with the instance ID (documentation)
  3. There is no way to configure Prometheus to send the necessary header (GitHub issue #1724)
  4. There is no Cloud Foundry service discovery configuration (documentation)
Given the popularity of Prometheus and Cloud Foundry, I imagine there must be others wanting to use the two together - in fact, there were some Cloud Foundry users that commented on GitHub issue #1724.

How are others accomplishing this? Is the recommendation to (ab)use a pushgateway?

I appreciate any help anyone can offer on this.

Brian Brazil

Feb 7, 2018, 4:04:06 AM
to Tommy Ludwig, Prometheus Users
You could write a small proxy server that converted a URL parameter into a header. Unfortunately options are limited with solutions that don't allow direct network access, which is what Prometheus generally presumes. 
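
A minimal sketch of such a proxy, using only the Python standard library. The URL parameter name `cf_instance`, the upstream `http://some.domain`, and the listen port 9100 are assumptions chosen for illustration; only the `X-CF-APP-INSTANCE` header comes from the Cloud Foundry documentation mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError, URLError
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request, urlopen

UPSTREAM = "http://some.domain"  # assumed public route of the app
PARAM = "cf_instance"            # hypothetical URL parameter name
HEADER = "X-CF-APP-INSTANCE"     # header Cloud Foundry routes on


def translate(raw_path, upstream=UPSTREAM):
    """Turn an incoming request path into (upstream_url, headers)."""
    parts = urlsplit(raw_path)
    query = parse_qs(parts.query, keep_blank_values=True)
    # Remove the instance parameter from the query; it travels as a header.
    instance = query.pop(PARAM, [None])[0]
    url = upstream + parts.path
    if query:
        url += "?" + urlencode(query, doseq=True)
    headers = {HEADER: instance} if instance else {}
    return url, headers


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        url, headers = translate(self.path)
        try:
            with urlopen(Request(url, headers=headers), timeout=10) as resp:
                body = resp.read()
                status = resp.status
                ctype = resp.headers.get("Content-Type", "text/plain")
        except HTTPError as err:
            # Pass upstream error statuses through to Prometheus.
            body, status, ctype = err.read(), err.code, "text/plain"
        except URLError as err:
            body, status, ctype = str(err).encode(), 502, "text/plain"
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 9100), Handler).serve_forever()
```

With this running, a request to `http://127.0.0.1:9100/appA/metrics?cf_instance=<guid>:0` would be forwarded to the upstream with the header set. This is a sketch, not a hardened proxy: it handles only GET and does no authentication.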

--

Tommy Ludwig

Feb 22, 2018, 4:47:26 AM
to Prometheus Users
Thank you for the response.

When actually looking at how I might do this, even with some hacks, I'm still not seeing how it would be possible with file_sd. Assuming I can get the information that needs to go in the headers and I have a proxy that converts URL parameters into headers, how can I dynamically set URL parameters per target? Using file_sd, I can only write a JSON file with targets (host:port) and labels, if I understand correctly. I cannot specify URL parameters in there; those are specified at the job level, which is static.

Suppose I have 3 instances of appA, with the metrics scrape endpoint at some.domain/appA/metrics. I need Prometheus to scrape each instance like the following (example using curl):

curl -H "X-CF-APP-INSTANCE:b1aae7ce-89bd-4ea3-9d2e-6324b64d0076:0" http://some.domain/appA/metrics
curl -H "X-CF-APP-INSTANCE:b1aae7ce-89bd-4ea3-9d2e-6324b64d0076:1" http://some.domain/appA/metrics
curl -H "X-CF-APP-INSTANCE:b1aae7ce-89bd-4ea3-9d2e-6324b64d0076:2" http://some.domain/appA/metrics

Note that the APP_GUID will change when we deploy new versions of appA, so hard-coding that in the config file doesn't work. That is, it needs to somehow be included in the dynamic service discovery, but as far as I can tell, file_sd only allows host, port, and labels.

It seems like we would need to dynamically update the config file itself in order to specify the URL parameters we would need to give to the proxy that would convert them to headers. Even then, we would need a job per instance to set the right URL parameter, it seems? Is dynamically changing the config file often viable?

Rather than doing something like above, it seems like a better idea to just use the push gateway or otherwise collect metrics somewhere intermediary that can be scraped with Prometheus' current configuration capabilities.

Let me know if I'm missing anything here or there are any better approaches.

Thank you,
Tommy Ludwig


Brian Brazil

Feb 22, 2018, 5:21:02 AM
to Tommy Ludwig, Prometheus Users
On 22 February 2018 at 09:47, Tommy Ludwig <tommy.lu...@gmail.com> wrote:
> It seems like we would need to dynamically update the config file itself in order to specify the URL parameters we would need to give to the proxy that would convert them to headers. Even then, we would need a job per instance to set the right URL parameter, it seems? Is dynamically changing the config file often viable?

Service discovery always needs to be updated when things change. It's normal to update file_sd regularly. You can set the label __param_foo to set the foo URL parameter.
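
One way to act on this, sketched under assumptions: a small script that regenerates the file_sd JSON whenever the app GUID or instance count changes, attaching a `__param_` label to each instance. The parameter name `cf_instance`, the proxy address `localhost:9100`, and the function name are hypothetical; the GUID is the one from the curl example earlier in the thread. The `cf_index` label is included so that the instances, which all share one address, do not end up with identical label sets.

```python
import json


def build_targets(proxy_addr, app_guid, instance_count):
    """Build one file_sd target group per Cloud Foundry app instance."""
    return [
        {
            "targets": [proxy_addr],
            "labels": {
                # Visible label that keeps the per-instance series apart.
                "cf_index": str(index),
                # Prometheus sends this as the URL parameter cf_instance.
                "__param_cf_instance": f"{app_guid}:{index}",
            },
        }
        for index in range(instance_count)
    ]


if __name__ == "__main__":
    groups = build_targets(
        "localhost:9100", "b1aae7ce-89bd-4ea3-9d2e-6324b64d0076", 3
    )
    print(json.dumps(groups, indent=2))
```

The output would be written to whatever path the job's file_sd configuration watches, and rewritten after each deploy that changes the GUID.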
 
> Rather than doing something like above, it seems like a better idea to just use the push gateway or otherwise collect metrics somewhere intermediary that can be scraped with Prometheus' current configuration capabilities.

This is not what the pushgateway is for.

Brian
 

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/3aa4e3c7-552b-436b-bad5-f73109979262%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--

Tommy Ludwig

Feb 22, 2018, 5:34:30 AM
to Prometheus Users
Thank you for the super quick response.

> You can set the label __param_foo to set the foo URL parameter.

I didn't realize this was a possible way to set a URL parameter for a target. I will try to do things that way and report back.

Thank you.


anja...@gmail.com

Mar 28, 2018, 5:17:52 PM
to Prometheus Users
Hey Tommy, were you able to get it to work?

I have a service discovery file for CF apps like this.

[
  {
    "targets": [ "my.cf.app.com" ],
    "labels": {
      "cf_index": "0",
      "__param_x-cf-app-instance": "my-app-guid:0"
    }
  },
  {
    "targets": [ "my.cf.app.com" ],
    "labels": {
      "cf_index": "1",
      "__param_x-cf-app-instance": "my-app-guid:1"
    }
  }
]

But this produces the error: 

time="2018-03-28T21:10:06Z" level=error msg="Error reading file "/sd_data/cf_targets.json": "__param_x-cf-app-instance" is not a valid label name" source="file.go:200"

The problem appears to be that Prometheus does not accept dashes in the name of the parameter. Unfortunately, that parameter appears to be the only way to accomplish CF service discovery: https://docs.cloudfoundry.org/devguide/deploy-apps/routes-domains.html

Am I using the "__param_foo" label in the way Brian implied in his response?

Brian Brazil

Mar 28, 2018, 6:58:37 PM
to anja...@gmail.com, Prometheus Users
It's expected that things using params were designed to work with Prometheus, so hyphens in param names aren't going to work. That wouldn't help you with Cloud Foundry anyway, as that requires a header.
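
The error above follows from the label-name rule. A minimal sketch of that check, assuming the classic pattern `[a-zA-Z_][a-zA-Z0-9_]*` from the Prometheus data model documentation (newer releases may be more permissive); the function name is hypothetical:

```python
import re

# Classic Prometheus label-name pattern (assumed to be what applies here).
LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_name(name):
    """Return True if the whole name matches the label-name pattern."""
    return LABEL_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("__param_x-cf-app-instance", "__param_cf_instance"):
        print(name, is_valid_label_name(name))
```

A parameter name with underscores passes, but as noted, the value still arrives at Cloud Foundry as a URL parameter and not as the header it needs.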

--

anja...@gmail.com

Mar 28, 2018, 7:04:03 PM
to Prometheus Users
Ah, yes. I misread the discussion before me, sorry.

Matt Doughty

Mar 31, 2018, 2:02:30 AM
to Brian Brazil, Prometheus Users, anja...@gmail.com
FYI, we just solved this problem for CF. We used a GraphQL interface to CF to figure out the GUID and number of instances for the app/route. We used that to generate targets that looked like:
<instance>.<guid>.public.route.to.app e.g. 0.12ffgds03.myapp.example.com.

For consumption by the file-based service discovery for Prometheus.

Then we would rewrite the address to send it to a locally running nginx proxy that takes the first two parts of the target to set the X-CF-APP-INSTANCE header and then sends that request to myapp.example.com.

The end result had two extra moving parts: the service discovery tool, and the local nginx proxy. It has worked very well for us so if you are still trying to figure this out feel free to ask me about it.
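
The parsing step the proxy performs can be sketched as follows. This is not the nginx configuration used here, only an illustration in Python; the function name is hypothetical, and the `GUID:INDEX` header format is taken from the curl examples earlier in the thread.

```python
def split_virtual_host(virtual_host):
    """Split '<instance>.<guid>.<route>' into (header_value, real_host)."""
    # Raises ValueError on unpacking if there are fewer than three parts.
    instance, guid, real_host = virtual_host.split(".", 2)
    if not instance.isdigit():
        raise ValueError(f"instance index is not a number: {instance!r}")
    # X-CF-APP-INSTANCE expects the app GUID followed by the instance index.
    return f"{guid}:{instance}", real_host


if __name__ == "__main__":
    header_value, host = split_virtual_host("0.12ffgds03.myapp.example.com")
    print(f"X-CF-APP-INSTANCE: {header_value} -> {host}")
```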

—Matt


git...@schmoigl-online.de

Mar 31, 2018, 5:08:29 AM
to Prometheus Users

> FYI, we just solved this problem for CF. We used a GraphQL interface to CF to figure out the GUID and number of instances for the app/route. We used that to generate targets that looked like:
> <instance>.<guid>.public.route.to.app e.g. 0.12ffgds03.myapp.example.com.
>
> For consumption by the file-based service discovery for Prometheus.
>
> Then we would rewrite the address to send it to a locally running nginx proxy that takes the first two parts of the target to set the X-CF-APP-INSTANCE header and then sends that request to myapp.example.com.
>
> The end result had two extra moving parts: the service discovery tool, and the local nginx proxy. It has worked very well for us so if you are still trying to figure this out feel free to ask me about it.


Nice approach using "virtual hostnames", sounds quite tricky to me.
It's much the same approach as Promregator uses with its Single Target Scraping mode. For several reasons, however, the Single Endpoint Scraping mode is preferred there.
I'd be interested in hearing your point of view on the pros and cons of your approach compared with Promregator's.

Brian Brazil

Mar 31, 2018, 5:33:47 AM
to git...@schmoigl-online.de, Prometheus Users
Prometheus should be the one determining when scrapes happen and determining the target labels. This avoids a big load spike when everything is scraped at once, and gives the user choice over what they want the target labels to be via relabelling.

-- 

git...@schmoigl-online.de

Mar 31, 2018, 5:52:45 AM
to Prometheus Users

> Prometheus should be the one determining when scrapes happen and determining the target labels. This avoids a big load spike when everything is scraped at once, and gives the user choice over what they want the target labels to be via relabelling.


Yes, I know. That's why Promregator now also provides the Single Target Scraping mode, as a sort of compensation for Prometheus' inability to handle HTTP headers with target-dependent values. Not everyone has a use case that requires scraping hundreds of targets all the time, though; some are more worried about complex configuration.

Oleg Mayko

Aug 14, 2018, 10:32:40 AM
to Prometheus Users
Hi, this may be obsolete by now, but I found https://github.com/promregator/promregator and it looks like exactly what you are looking for.

nikko...@gmail.com

Sep 13, 2018, 3:09:46 PM
to Prometheus Users
I managed to do this with Prometheus DNS service discovery plus a Cloud Foundry network policy (*.apps.internal route).

You will need a configuration something like this:

scrape_configs:
- job_name: 'dns'
  metrics_path: '/actuator/prometheus'
  dns_sd_configs: 
  - names:
    - backend.apps.internal
    port: 8080
    type: A

  1. deploy prometheus (go to root folder and execute cf push prometheus-service -b binary_buildpack -c './prometheus --web.listen-address=:8080' -m 64m)
  2. deploy app
  3. add policy, ie. cf add-network-policy prometheus-service --destination-app backend --protocol tcp --port 8080
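
To check the setup from inside a container, a rough sketch of what an A-record lookup for the internal route would yield as scrape targets. This only illustrates the idea and is not Prometheus' own discovery code; the function name and the injectable `resolve` argument are hypothetical, while the hostname and port are the ones from the configuration above.

```python
import socket


def discover_targets(name, port, resolve=socket.gethostbyname_ex):
    """Resolve a name to its IPv4 addresses and build host:port targets."""
    # gethostbyname_ex returns (hostname, aliases, ip_addresses).
    _, _, addresses = resolve(name)
    return [f"{ip}:{port}" for ip in sorted(set(addresses))]


if __name__ == "__main__":
    # Expected to list one address per running instance of the backend app.
    for target in discover_targets("backend.apps.internal", 8080):
        print(target)
```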
hope this helps.
Nikola
