Cryostat automated rules expected behavior


Dimitris Koumantos

Oct 12, 2022, 11:31:27 AM
to Cryostat Development List
Hi,

I am currently working on a PoC with Cryostat in Kubernetes and I am having trouble understanding the expected behavior of the automated rules.

Cryostat detects my pods (JVM targets) using the `jfr-jmx` port of the respective service.
When I create an automated rule, it is indeed applied to existing JVM targets based on the matching expression, and new recordings are created.

However, if I restart the (Kubernetes) deployment and new pods get created under the same service, no recordings are created for them, despite Cryostat detecting them as JVM targets.

My question is this:

Should automated rules create new recordings for pods/JVM targets (under an existing service that is already watched by Cryostat) that are deployed *after* the rule is created?

Kind regards,
Dimitris.

Andrew Azores

Oct 12, 2022, 11:42:09 AM
to cryostat-d...@googlegroups.com
Hi Dimitris,
Yes, target JVMs that appear after a rule is created are intended to
cause the rule to trigger against those new JVMs. A new active recording
should get created, and if you've configured it then that recording's
contents should also be periodically copied to archives.

I will try to reproduce this problem and see if there is a bug where
this isn't working as expected.

In the meantime, would you be able to share some logs from the Cryostat
deployment's main container during one of these runs where it fails to
activate the rule?

Which version of Cryostat are you testing?

And finally, what is the matchExpression of the rule that you defined?
It's a silly question, but are you sure that the newly appearing Pods
still do match the expression as intended?

Thanks,
--
Andrew Azores
Principal Software Engineer, Java Monitoring
Red Hat Canada

Andrew Azores

Oct 12, 2022, 11:43:14 AM
to cryostat-d...@googlegroups.com, dkoum...@verge.capital
Re-posting to include Dimitris - original response only went on-list.

Dimitris Koumantos

Oct 12, 2022, 12:19:15 PM
to Cryostat Development List
Hi Andrew,

Thank you for the quick reply.
I've installed the Cryostat Helm chart and, as I can see from Cryostat's pod,
this is the image used: `quay.io/cryostat/cryostat:2.1.1`.
So I guess I am using Cryostat v2.1.1.

My matching expression is inspired by your article and is like this:

```
target.annotations.platform['jfr'] == 'enabled'
```

If I create a new rule with the same matching expression, the newly created pod is matched.
Either way, I can see the custom annotation in the newly created pod's description, so yes, I am sure the new pod should match the expression.
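
For readers unfamiliar with the expression, here is a rough Python analogue of what it checks. This is only a sketch: the dictionary layout is inferred from the expression itself (`target.annotations.platform['jfr']`), and `rule_matches` and the `alias` values are hypothetical names used for illustration, not part of Cryostat.

```python
from typing import Any, Dict


def rule_matches(target: Dict[str, Any]) -> bool:
    """Python analogue of: target.annotations.platform['jfr'] == 'enabled'."""
    platform = target.get("annotations", {}).get("platform", {})
    return platform.get("jfr") == "enabled"


# Hypothetical target carrying the custom pod annotation jfr=enabled.
annotated_target = {
    "alias": "my-app-new-pod",
    "annotations": {"platform": {"jfr": "enabled"}},
}

# Hypothetical target without the annotation.
plain_target = {
    "alias": "other-pod",
    "annotations": {"platform": {}},
}
```

With these inputs, `rule_matches(annotated_target)` is `True` and `rule_matches(plain_target)` is `False`.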

Please find attached the logs of the main container you requested.

Best,
Dimitris.
cryostat_logs.txt

Andrew Azores

Oct 12, 2022, 1:30:44 PM
to Dimitris Koumantos, Cryostat Development List
Hi Dimitris,

On 2022-10-12 12:19, Dimitris Koumantos wrote:
> Hi Andrew,
>
> Thank you for the quick reply.
> I've installed the Cryostat Helm chart and, as I can see from Cryostat's pod,
> this is the image used: `quay.io/cryostat/cryostat:2.1.1`.
> So I guess I am using Cryostat v2.1.1.
>
> My matching expression is inspired by your article
> <https://developers.redhat.com/articles/2021/11/09/automating-jdk-flight-recorder-containers#use_case_2__custom_monitoring_with_kubernetes_labels_or_annotations> and is like this:
>
> ```
> target.annotations.platform['jfr'] == 'enabled'
> ```
>
> If I create a new rule with the same matching expression the newly
> created pod is matched.
> Either way, I can see the custom annotation in the newly created pod's
> description, so yes I am sure the new pod should match the expression.
>
> Please find attached the logs of the main container you requested.
>
> Best,
> Dimitris.
>

Excellent, thanks for all of that information. Your match expression and
use of annotations make good sense.

In your container logs I see what I believe to be the root cause:

```
Exception in thread "main"
io.fabric8.kubernetes.client.KubernetesClientException: Failed to start
websocket
```

Cryostat 2.1.1 uses a k8s API query to populate the full list of target
applications. When you open the Cryostat web UI, it issues a
`GET /api/v1/targets` Cryostat API request, which becomes a k8s API query.
This is how you're able to see your new pods appearing.

But Cryostat also uses a k8s watcher to listen for changes and detect
when target applications come and go. The WebSocket exception above
indicates that this watcher connection failed, so Cryostat isn't
receiving any of these async notifications from the k8s API server.

Unfortunately, this also means that some of Cryostat's internal
machinery that depends on these async notifications won't work -
including the rule processing that triggers rule activation against new
target applications. This same mechanism enables async updates of the
UI, so with the breakage you're experiencing I would expect that the UI
does not auto-update when you redeploy your applications?
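
As a rough illustration of the two paths described above (the one-shot query and the async watcher), here is a toy Python model. It is not Cryostat's code: `FakeCluster`, `matches`, and `simulate` are hypothetical names, and the real watcher is a Kubernetes client WebSocket connection rather than a list of callbacks. The sketch only shows why a failed watcher leaves new pods visible in the target list but without recordings.

```python
from typing import Callable, Dict, List, Tuple

Annotations = Dict[str, str]


class FakeCluster:
    """Hypothetical stand-in for the k8s API server."""

    def __init__(self) -> None:
        self.pods: Dict[str, Annotations] = {}
        self.watchers: List[Callable[[str, Annotations], None]] = []

    def list_pods(self) -> Dict[str, Annotations]:
        # One-shot query: always reflects the current state.
        return dict(self.pods)

    def watch(self, on_found: Callable[[str, Annotations], None], connects: bool) -> None:
        # Async path: only works if the connection can be established.
        if not connects:
            raise ConnectionError("Failed to start websocket")
        self.watchers.append(on_found)

    def add_pod(self, name: str, annotations: Annotations) -> None:
        self.pods[name] = annotations
        for notify in self.watchers:
            notify(name, annotations)


def matches(annotations: Annotations) -> bool:
    # Simplified stand-in for the rule's match expression.
    return annotations.get("jfr") == "enabled"


def simulate(watch_connects: bool) -> Tuple[List[str], List[str]]:
    """Return (targets visible via query, targets that got a recording)."""
    cluster = FakeCluster()
    recordings: List[str] = []
    cluster.add_pod("pod-old", {"jfr": "enabled"})  # exists before startup

    def on_found(name: str, annotations: Annotations) -> None:
        if matches(annotations):
            recordings.append(name)

    try:
        cluster.watch(on_found, watch_connects)  # attempted once at startup
    except ConnectionError:
        pass  # watcher is dead; no async notifications will arrive

    # Creating the rule applies it to targets found by the query.
    for name, annotations in cluster.list_pods().items():
        if matches(annotations):
            recordings.append(name)

    # Redeploy: a new pod appears after the rule was created.
    cluster.add_pod("pod-new", {"jfr": "enabled"})
    return list(cluster.list_pods()), recordings
```

In this model, `simulate(True)` returns `(['pod-old', 'pod-new'], ['pod-old', 'pod-new'])`, while `simulate(False)` returns `(['pod-old', 'pod-new'], ['pod-old'])`: the new pod is listed, but the rule never fires for it.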

In Cryostat 2.2 [0] I have done some work in this area [1] which
improves the reliability of that connection to the k8s API server and
unifies the data model between queries and async updates. This might
help in your case, too, though it's curious that the connection fails to
begin with. I don't know what could be causing the Watcher WebSocket to
fail with "No route to host", while the later queries to the same API
server apparently succeed.

[0] expected GA: November 15th. If you want to try out an upstream
preview build for your PoC before that date please just ask for help
setting that up, but be aware that this would be an unsupported upstream
community preview build
[1] https://github.com/cryostatio/cryostat/pull/1037

Dimitris Koumantos

Oct 13, 2022, 4:41:28 AM
to Andrew Azores, Cryostat Development List
Hi again Andrew,

Your response makes absolute sense.

I forgot to mention that I am running Cryostat on minikube.
I cannot imagine how that could cause the issue,
but it's better to document it here for the record.

After several tests and restarts (of both the Cryostat deployment and minikube),
I have concluded that the issue occurs only occasionally, not every time.
The UI, indeed, does not auto-refresh when the issue appears.

Thank you for the suggestion to build my PoC with an upstream preview build,
but at the moment I can afford to create rules manually after each release/deployment.
Therefore, I am going to watch the main repo for new releases.


Your help is much appreciated!

Best,
Dimitris
 