jaeger has no endpoint


zack

May 30, 2020, 6:01:46 AM
to Jaeger Tracing
I ran into a problem while using the Jaeger Operator. The following warning appears in the logs of the jaeger-agent container:
# k logs -f jaeger-query-99bc97598-7bsqv jaeger-agent

{"level":"warn","ts":1590767929.0646005,"caller":"gr...@v1.27.1/clientconn.go:1223","msg":"grpc: addrConn.createTransport failed to connect to {10.233.75.44:14250 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 10.233.75.44:14250: connect: connection refused\". Reconnecting...","system":"grpc","grpc_log":true}

I then looked up that IP and found no endpoint that matches it.

```
[root@node1 ~]# k get ep | grep jaeger
jaeger-collector-headless 10.233.102.179:9411,10.233.102.179:14268,10.233.102.179:14250 + 1 more... 5h19m
jaeger-operator <none> 5h19m
jaeger-operator-metrics <none> 169m
jaeger-query 10.233.71.56:16686 5h19m
```

I also found that the jaeger-operator and jaeger-operator-metrics services have no endpoints. What is wrong, and how can I fix it?
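
For anyone who wants to spot such services quickly, here is a minimal Python sketch that scans the plain-text output of `kubectl get ep` for rows whose endpoints column is `<none>`. The function name `services_without_endpoints` is made up for illustration, and the sketch assumes the default table layout (name first, endpoints second).

```python
from typing import List


def services_without_endpoints(get_ep_output: str) -> List[str]:
    """Return names of services whose ENDPOINTS column is `<none>`.

    Expects the plain-text table printed by `kubectl get endpoints`.
    """
    missing = []
    for line in get_ep_output.splitlines():
        columns = line.split()
        # Skip blank lines and the header row.
        if len(columns) < 2 or columns[0] == "NAME":
            continue
        name, endpoints = columns[0], columns[1]
        if endpoints == "<none>":
            missing.append(name)
    return missing


# Sample rows copied from the output quoted in this message.
sample = """\
jaeger-collector-headless 10.233.102.179:9411,10.233.102.179:14268,10.233.102.179:14250 + 1 more... 5h19m
jaeger-operator <none> 5h19m
jaeger-operator-metrics <none> 169m
jaeger-query 10.233.71.56:16686 5h19m
"""

print(services_without_endpoints(sample))
# ['jaeger-operator', 'jaeger-operator-metrics']
```

This only reports which services are empty; it says nothing about why, which is what the rest of the thread digs into.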

I installed Jaeger with the chart from the official repository, following the official steps.

My version is 1.17.

[root@node1 ~]# helm list | grep jaeger
jaeger-operator istio-system 1 2020-05-30 07:06:43.535284201 +0000 UTC deployed jaeger-operator-2.14.2 1.17.1

Here is my Jaeger custom resource:

k get jaeger -o yaml

```
apiVersion: v1
items:
- apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
annotations:
{"apiVersion":"jaegertracing.io/v1","kind":"Jaeger","metadata":{"annotations":{},"name":"jaeger","namespace":"istio-system"},"spec":{"agent":{"image":"jaegertracing/jaeger-agent:1.17"},"collector":{"image":"jaegertracing/jaeger-collector:1.17"},"query":{"image":"jaegertracing/jaeger-query:1.17"},"storage":{"dependencies":{"enabled":false},"esIndexCleaner":{"enabled":false},"options":{"es":{"index-prefix":"logstash","server-urls":"http://elasticsearch-logging-data.kubesphere-logging-system.svc.cluster.local:9200"}},"type":"elasticsearch"},"strategy":"production"}}
creationTimestamp: "2020-05-30T04:36:43Z"
generation: 3
labels:
jaegertracing.io/operated-by: istio-system.jaeger-operator
name: jaeger
namespace: istio-system
resourceVersion: "224456"
uid: d65d2ed4-0a84-4061-ac8a-85c858ce5da9
spec:
agent:
image: jaegertracing/jaeger-agent:1.17
options: {}
resources: {}
allInOne:
options: {}
resources: {}
collector:
image: jaegertracing/jaeger-collector:1.17
options: {}
resources: {}
ingester:
options: {}
resources: {}
ingress:
openshift: {}
options: {}
...
resources:
limits:
memory: 16Gi
requests:
cpu: "1"
memory: 16Gi
storage: {}
esIndexCleaner:
enabled: false
image: jaegertracing/jaeger-es-index-cleaner:1.17.1
numberOfDays: 7
resources: {}
schedule: 55 23 * * *
esRollover:
image: jaegertracing/jaeger-es-rollover:1.17.1
resources: {}
schedule: 0 0 * * *
options:
es:
index-prefix: logstash
type: elasticsearch
strategy: production
ui:
options:
dependencies:
menuEnabled: false
menu:
- items:
- label: Documentation
label: About
status:
phase: Running
version: 1.17.1
kind: List
metadata:
resourceVersion:
```

Could someone help me?
Thanks.

Gary Brown

Jun 1, 2020, 3:28:55 AM
to zack, Jaeger Tracing
This is just a warning, probably due to the collector not being available when the agent was initially starting up.

When you access the Jaeger UI multiple times, after a page refresh on the search page, do you end up seeing the "jaeger-query" service in the list of services, with traces? If so, then it got resolved.

Or do you continue to see the connection refused messages?
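
If it helps to check this outside of the agent logs, below is a rough Python sketch that tests whether a TCP port accepts connections. It uses only the standard `socket` module; the function name `port_accepts_connections` is hypothetical, and the check has to run from somewhere that can reach pod IPs (for example, from a pod inside the cluster).

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

For example, `port_accepts_connections("10.233.75.44", 14250)` probes the address from the warning above. A `False` result only means the port was not reachable at that moment, so it is worth repeating the check a few times before drawing conclusions.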

Regards
Gary


--
You received this message because you are subscribed to the Google Groups "Jaeger Tracing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jaeger-tracin...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jaeger-tracing/33a7a82b-b4b1-4d26-bd2a-5fd66ef9d081%40googlegroups.com.

zack

Jun 18, 2020, 5:29:49 AM
to Jaeger Tracing
@Gary Thanks a lot for your reply.

I am asking because the endpoints existed with the old version (v1.13.1), but they disappeared after I upgraded to the new version (v1.17.1).


Before upgrade:

```
 # helm list|grep jaeger
jaeger-operator 2 Sat May 30 08:32:57 2020 DEPLOYED jaeger-operator-2.9.0 1.13.1

 # k get ep
jaeger-operator 10.233.99.145:8383 13d
jaeger-operator-metrics 10.233.99.145:8383   
```

After upgrade:



I am not sure whether it is OK that the endpoints have disappeared.

Gary Brown

Jun 18, 2020, 7:40:11 AM
to zack, Juraci Paixão Kröhling, Jaeger Tracing
@Juraci Paixão Kröhling may be better placed to answer this, but as far as I am aware, the metrics service was added as part of the operator-sdk.

Other than that, I don't think the operator had a service. How are you deploying the operator?


Juraci Paixão Kröhling

Jun 18, 2020, 7:59:37 AM
to jaeger-...@googlegroups.com
On 6/18/20 1:39 PM, Gary Brown wrote:
> @Juraci Paixão Kröhling <mailto:jpkro...@redhat.com>
> May be better to answer this - but as far as I am aware, the metrics
> service was added as part of the operator-sdk.
>

That's correct. Both ports are metrics-related and are being made
available via the Operator SDK. Looking at the code for the
operator-sdk, I'd expect an Endpoint to exist, as the service has a
selector:

https://github.com/operator-framework/operator-sdk/blob/e35ec7b722ba095e6438f63fafb9e7326870b486/pkg/metrics/metrics.go#L101-L131

Zack: could you please check whether your `-metrics` service has a
selector? Or better yet, could you please share your service definition?
Something like: `kubectl get service jaeger-operator-metrics -o yaml`.
One extra thing you could try is to remove the service and start the
Jaeger Operator again. The service should be re-created upon startup,
perhaps the endpoint might be created as well.
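
As a rough illustration of the selector check, the Python sketch below turns a Service's `spec.selector` into a label query that can be passed to `kubectl get pods -l ...`. It assumes the input is the text printed by `kubectl get service <name> -o json`; the function name and the sample object are hypothetical, not taken from a real cluster.

```python
import json


def selector_to_label_query(service_json: str) -> str:
    """Build a `-l` label query from a Service's spec.selector.

    `service_json` is the text printed by
    `kubectl get service <name> -o json`.
    """
    service = json.loads(service_json)
    selector = service.get("spec", {}).get("selector") or {}
    # Equality-based selectors are joined with commas: k1=v1,k2=v2
    return ",".join(f"{key}={value}" for key, value in sorted(selector.items()))


# Hypothetical, trimmed-down Service object for illustration only.
example = json.dumps({
    "kind": "Service",
    "metadata": {"name": "jaeger-operator-metrics"},
    "spec": {"selector": {"name": "jaeger-operator"}},
})

print(selector_to_label_query(example))  # name=jaeger-operator
```

An empty result means the Service has no selector at all. Otherwise, running `kubectl get pods -l <query>` with the result shows which pods, if any, back the service.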

- Juca.


zack

Jun 19, 2020, 2:17:11 AM
to Jaeger Tracing
The jaeger-operator-metrics service has no matching pods.

```
hugo@zack:/Users/hugo $ k get svc jaeger-operator-metrics -o yaml | grep select -A 5
...
service
  selector:
    name: jaeger-operator
  sessionAffinity: None
  type: ClusterIP

hugo@zack:/Users/hugo $ k get po -l name=jaeger-operator
No resources found in istio-system namespace.
```

The jaeger-operator service has matching pods, but it also has no endpoints.

```
hugo@zack:/Users/hugo $ k get svc jaeger-operator -o yaml | grep select -A 10
  selector:
    app.kubernetes.io/instance: jaeger-operator
    app.kubernetes.io/name: jaeger-operator
  sessionAffinity: None
  type: ClusterIP

hugo@zack:/Users/hugo $ k get po -l app.kubernetes.io/name=jaeger-operator
NAME                              READY   STATUS    RESTARTS   AGE
jaeger-operator-bfb5f956f-gmkqg   1/1     Running   0          19h

hugo@zack:/Users/hugo $ k get ep |grep jaeger-operator
jaeger-operator             <none>                                                                    19h
jaeger-operator-metrics     <none>                                                                    43m
``` 

Juraci Paixão Kröhling

Jun 19, 2020, 3:34:10 AM
to jaeger-...@googlegroups.com
Where is your `jaeger-operator` _service_ coming from? We don't create
such a service in the jaeger-operator itself. Based on the selector
snippet you shared, looks like you are provisioning a jaeger-operator
via Helm? Are you using the official Helm charts? They also don't seem
to create a `jaeger-operator` service...

Whatever is provisioning your jaeger-operator deployment is *not* using
the right labels, which is why you aren't seeing any pods backing the
service (and hence, no endpoints). Your operator deployment should look
as close as possible to our official reference. In particular, your
deployment seems to be missing the `name: jaeger-operator` label, like this:

https://github.com/jaegertracing/jaeger-operator/blob/ec535126223c87c7358ef1dfb6c4af588d4e07b3/deploy/operator.yaml#L12-L13

When provisioning the Jaeger Operator following the official
instructions (no Helm involved), here's what I get:

```console
$ kubectl get pods -l name=jaeger-operator
NAME READY STATUS RESTARTS AGE
jaeger-operator-78ff8688c6-zp9z9 1/1 Running 0 7m17s

$ kubectl get services jaeger-operator-metrics
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
jaeger-operator-metrics   ClusterIP   10.111.63.253   <none>        8383/TCP,8686/TCP   9m12s

$ kubectl get endpoints jaeger-operator-metrics
NAME ENDPOINTS AGE
jaeger-operator-metrics 10.88.0.5:8686,10.88.0.5:8383 9m27s
```

You are not seeing any endpoints for the jaeger-operator service
probably because the pods don't contain *all* the labels from its
selector. It requires the pod to contain all these labels:

> app.kubernetes.io/instance: jaeger-operator
> app.kubernetes.io/managed-by: Tiller
> app.kubernetes.io/name: jaeger-operator
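
To make the "all labels must match" rule concrete, here is a small Python sketch of equality-based selector matching. It is not the Kubernetes implementation, just the same rule in a few lines. The pod labels are an assumption: the thread only confirms `app.kubernetes.io/name` on the pod, so the `instance` label is a guess.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True if every key/value pair in the selector is present on the pod."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Assumed pod labels (see the note above); there is no plain `name` label.
pod_labels = {
    "app.kubernetes.io/instance": "jaeger-operator",
    "app.kubernetes.io/name": "jaeger-operator",
}

# Selector of the jaeger-operator-metrics service, as reported in this thread.
metrics_selector = {"name": "jaeger-operator"}

# The three labels listed above for the jaeger-operator service.
operator_selector = {
    "app.kubernetes.io/instance": "jaeger-operator",
    "app.kubernetes.io/managed-by": "Tiller",
    "app.kubernetes.io/name": "jaeger-operator",
}

print(selector_matches(metrics_selector, pod_labels))   # False: no `name` label
print(selector_matches(operator_selector, pod_labels))  # False: `managed-by` is missing
print(selector_matches({"app.kubernetes.io/name": "jaeger-operator"}, pod_labels))  # True
```

A single missing or differing label is enough for the pod to be skipped, which leaves the service with no endpoints.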

- Juca.

zack

Jun 19, 2020, 4:17:20 AM
to Jaeger Tracing
In the chart, the selector of the jaeger-operator-metrics service is as follows:

  selector:
    app.kubernetes.io/instance: jaeger-operator
    app.kubernetes.io/name: jaeger-operator

When I get the manifest of the chart, it also matches the definition in the chart.





My setup is the same as your environment, so why does the selector become "name=jaeger-operator"?


Juraci Paixão Kröhling

Jun 19, 2020, 4:33:15 AM
to jaeger-...@googlegroups.com
On 6/19/20 10:17 AM, zack wrote:
> In the chart, the selector of jaeger-operator-metrics service  is as
> follows:

This service is created by the operator and should not be created by any
Helm chart.

- Juca.

zack

Jun 19, 2020, 5:11:27 AM
to Jaeger Tracing
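I mean the official Jaeger repo:

https://github.com/jaegertracing/helm-charts/blob/master/charts/jaeger-operator/templates/service.yaml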





But I cannot understand why the selector of the service is not the same as in its template.

The selector of the jaeger-operator-metrics service is "name=jaeger-operator".
But the pod label is "app.kubernetes.io/name=jaeger-operator", the same as in the manifest of the chart release.

Why did the selector of the jaeger-operator-metrics service change?




zack

Jun 19, 2020, 5:25:37 AM
to Jaeger Tracing
I installed it with Helm following the official instructions and ran into the same problem.

Juraci Paixão Kröhling

Jun 19, 2020, 5:26:35 AM
to jaeger-...@googlegroups.com
On 6/19/20 11:11 AM, zack wrote:
> I mean the jaeger officiall repo.
>
> https://github.com/jaegertracing/helm-charts/blob/master/charts/jaeger-operator/templates/service.yaml

It doesn't make it right :)

https://github.com/jaegertracing/helm-charts/issues/126

- Juca.

Juraci Paixão Kröhling

Jun 19, 2020, 5:27:52 AM
to jaeger-...@googlegroups.com
On 6/19/20 11:25 AM, zack wrote:
> I install helm as official . I met with the same problem.

Try to delete the service and the running jaeger-operator pod. Once the
jaeger-operator starts again, it should attempt to create the
jaeger-operator-metrics service again, with the proper configuration.

- Juca.

zack

Jun 19, 2020, 5:33:09 AM
to Jaeger Tracing
When I delete the service, it is not re-created automatically.

Juraci Paixão Kröhling

Jun 19, 2020, 5:56:04 AM
to jaeger-...@googlegroups.com
Have you deleted the jaeger-operator pod as well?

```console
$ kubectl get service jaeger-operator-metrics
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
jaeger-operator-metrics   ClusterIP   10.111.63.253   <none>        8383/TCP,8686/TCP   152m

$ kubectl get pods -l name=jaeger-operator
NAME READY STATUS RESTARTS AGE
jaeger-operator-78ff8688c6-zp9z9 1/1 Running 0 151m

$ kubectl delete service jaeger-operator-metrics
service "jaeger-operator-metrics" deleted

$ kubectl delete pods -l name=jaeger-operator
pod "jaeger-operator-78ff8688c6-zp9z9" deleted

$ kubectl get pods -l name=jaeger-operator
NAME READY STATUS RESTARTS AGE
jaeger-operator-78ff8688c6-x799j 1/1 Running 0 9s

$ kubectl get service jaeger-operator-metrics
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
jaeger-operator-metrics   ClusterIP   10.106.178.130   <none>        8383/TCP,8686/TCP   9s
```

- Juca.


On 6/19/20 11:33 AM, zack wrote:
> when I delete the service , the service cannot create by itself.

zack

Jun 19, 2020, 6:10:20 AM
to Jaeger Tracing
I deleted the pod and the service. The pod was re-created automatically, but the service was not.


Juraci Paixão Kröhling

Jun 19, 2020, 7:01:37 AM
to jaeger-...@googlegroups.com
On 6/19/20 12:10 PM, zack wrote:
> I delete pod and svc, then pod can create by itself. But the service cannot.

Could you please share the logs in debug mode? From this point, I'd ask
you to open an issue against the jaeger-operator repository, as it's
easier to keep track of the issue.

- Juca.

zack

Jun 19, 2020, 7:07:59 AM
to Jaeger Tracing
I also think this is a bug. I tried to modify the chart's label variables so that the pod, deployment, and service labels stay the same, but that failed.

The label definition in the chart doesn't take effect.

```
root@ks-allinone:/root # k logs -f jaeger-operator-5dbf4647d5-l465m -v 7
I0619 19:07:45.167345   22628 loader.go:375] Config loaded from file:  /root/.kube/config
I0619 19:07:45.176011   22628 round_trippers.go:420] GET https://ks-allinone:6443/api/v1/namespaces/default/pods/jaeger-operator-5dbf4647d5-l465m
I0619 19:07:45.176040   22628 round_trippers.go:427] Request Headers:
I0619 19:07:45.176046   22628 round_trippers.go:431]     Accept: application/json, */*
I0619 19:07:45.176052   22628 round_trippers.go:431]     User-Agent: kubectl/v1.16.7 (linux/amd64) kubernetes/be3d344
I0619 19:07:45.187440   22628 round_trippers.go:446] Response Status: 200 OK in 11 milliseconds
I0619 19:07:45.193902   22628 round_trippers.go:427] Request Headers:
I0619 19:07:45.193908   22628 round_trippers.go:431]     Accept: application/json, */*
I0619 19:07:45.193914   22628 round_trippers.go:431]     User-Agent: kubectl/v1.16.7 (linux/amd64) kubernetes/be3d344
I0619 19:07:45.204565   22628 round_trippers.go:446] Response Status: 200 OK in 10 milliseconds
time="2020-06-19T11:04:18Z" level=info msg=Versions arch=amd64 identity=default.jaeger-operator jaeger=1.18.0 jaeger-operator=v1.18.0 operator-sdk=v0.15.1 os=linux version=go1.13.3
time="2020-06-19T11:04:18Z" level=info msg="Consider running the operator in a cluster-wide scope for extra features"
time="2020-06-19T11:04:19Z" level=info msg="Auto-detected the platform" platform=kubernetes
time="2020-06-19T11:04:19Z" level=info msg="Auto-detected ingress api" ingress-api=networking
time="2020-06-19T11:04:19Z" level=info msg="Automatically adjusted the 'es-provision' flag" es-provision=no
time="2020-06-19T11:04:19Z" level=info msg="Automatically adjusted the 'kafka-provision' flag" kafka-provision=no
time="2020-06-19T11:04:19Z" level=info msg="The service account running this operator does not have the role 'system:auth-delegator', consider granting it for additional capabilities"

```

zack

Jun 19, 2020, 7:17:33 AM
to Jaeger Tracing
I have already opened an issue for this before.
