Investigating lost messages in JetStream


Oleg Trautvein

Jun 8, 2021, 8:13:29 PM6/8/21
to nats
Hello,
I generate 3x100 log lines using a simple bash script:
oleg@mac:~/temp$ kubectl delete deployments.apps/loggenerator ;  kubectl apply -f loggenerator.yaml && sleep 3 && kubectl get po | grep loggenerator | grep Running
deployment.apps "loggenerator" deleted
deployment.apps/loggenerator created
loggenerator-74d56cbd79-bzkv2   1/1     Running       0          4s
loggenerator-74d56cbd79-f2p8g   1/1     Running       0          4s
loggenerator-74d56cbd79-fx6l8   1/1     Running       0          4s

Sample (loggenerator stdout):
 93 2021-06-08 23:30:46 <----- I2PBa DpqO
 94 2021-06-08 23:30:46 <----- I2PBa MamC
 95 2021-06-08 23:30:46 <----- I2PBa eYKb
 96 2021-06-08 23:30:46 <----- I2PBa g0E7
 97 2021-06-08 23:30:46 <----- I2PBa YEvK
 98 2021-06-08 23:30:46 <----- I2PBa CXeI
 100 2021-06-08 23:30:47 <----- I2PBa Loggenerator finished
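The generator itself is not included in this post; a minimal sketch that would produce output in this format (the run ID and suffix generation are my reconstruction, not the actual loggenerator.yaml contents) might look like:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the loggenerator script (the real loggenerator.yaml
# is not shown). Emits 100 numbered lines tagged with a random run ID,
# then a final "finished" marker line, matching the sample above.
generate_logs() {
  local run_id
  run_id=$(LC_ALL=C tr -dc 'A-Za-z0-9' </dev/urandom | head -c 5)
  for i in $(seq 0 99); do
    echo "$i $(date '+%Y-%m-%d %H:%M:%S') <----- $run_id $(LC_ALL=C tr -dc 'A-Za-z0-9' </dev/urandom | head -c 4)"
  done
  echo "100 $(date '+%Y-%m-%d %H:%M:%S') <----- $run_id Loggenerator finished"
}

generate_logs
```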
Sample (fluentbit debug output):
[97] logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2
a0c.log: [1623195046.386101584, {"log"=>"97 2021-06-08 23:30:46 <----- I2PBa YEvK
", "stream"=>"stdout", "time"=>"2021-06-08T23:30:46.386101584Z", "kubernetes"=>{"pod_name"=>"loggenerator-74d56cbd79-f2p8g", "namespace_name"=>"de
fault", "pod_id"=>"3160bb15-c48b-4075-a863-b47f959afc72", "labels"=>{"app"=>"loggenerator", "pod-template-hash"=>"74d56cbd79"}, "annotations"=>{"c
latform-vsp1-group1-0", "container_name"=>"loggenerator", "docker_id"=>"4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2a0c", "contai
ner_hash"=>"bash@sha256:ad5e40db3e26bed0d627e049f83e76988ea7d93133e03ea0649fbce51d161272", "container_image"=>"bash:latest"}}]
[98] logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2
a0c.log: [1623195046.388465421, {"log"=>"98 2021-06-08 23:30:46 <----- I2PBa CXeI
", "stream"=>"stdout", "time"=>"2021-06-08T23:30:46.388465421Z", "kubernetes"=>{"pod_name"=>"loggenerator-74d56cbd79-f2p8g", "namespace_name"=>"de
fault", "pod_id"=>"3160bb15-c48b-4075-a863-b47f959afc72", "labels"=>{"app"=>"loggenerator", "pod-template-hash"=>"74d56cbd79"}, "annotations"=>{"c
latform-vsp1-group1-0", "container_name"=>"loggenerator", "docker_id"=>"4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2a0c", "contai
ner_hash"=>"bash@sha256:ad5e40db3e26bed0d627e049f83e76988ea7d93133e03ea0649fbce51d161272", "container_image"=>"bash:latest"}}]
[99] logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2
a0c.log: [1623195047.390425066, {"log"=>"100 2021-06-08 23:30:47 <----- I2PBa Loggenerator finished

NATS stream info:
~ # nats stream info MAIN
Information for Stream MAIN created 2021-06-07T12:57:31Z

Configuration:

             Subjects: logs.>
     Acknowledgements: true
            Retention: File - WorkQueue
             Replicas: 1
       Discard Policy: Old
     Duplicate Window: 1m0s
     Maximum Messages: unlimited
        Maximum Bytes: unlimited
          Maximum Age: 0.00s
 Maximum Message Size: unlimited
    Maximum Consumers: unlimited


Cluster Information:

                 Name: project-logging-nats
               Leader: project-logging-nats-0

State:

             Messages: 2
                Bytes: 177 KiB
             FirstSeq: 2,054 @ 2021-06-08T23:30:48 UTC
              LastSeq: 2,055 @ 2021-06-08T23:30:48 UTC
     Active Consumers: 0
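(For reference, a stream with this configuration could be created with the nats CLI roughly as follows; this is a sketch reconstructed from the info output above, not the actual creation command we used:)

```shell
nats stream add MAIN \
  --subjects "logs.>" \
  --storage file \
  --retention work \
  --replicas 1 \
  --discard old \
  --dupe-window 1m \
  --max-msgs=-1 --max-bytes=-1 --max-age=-1 \
  --max-msg-size=-1 --max-consumers=-1
```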

The stream contains these messages (output truncated):
[2054] Subject: logs.kube.var.log.containers.loggenerator-74d56cbd79-bzkv2_default_loggenerator-d485539c18bc62a54c9ad8e757cbbaf3ea9c36d4bfc409d1e8c1f8f0d9df9f77.log Received: 2021-06-08T23:30:48Z

[[1623195043.884139,{"tag":"logs.kube.var.log.containers.loggenerator-74d56cbd79-bzkv2_default_loggenerator-d485539c18bc62a54c9ad8e757cbbaf3ea9c36d4bfc409d1e8c1f8f0d9df9f77.log","log":"0 2021-06-08 23:30:43

[2055] Subject: logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2a0c.log Received: 2021-06-08T23:30:48Z

[[1623195044.137557,{"tag":"logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2a0c.log","log":"0 2021-06-08 23:30:44


There are only two messages here, from pods %bzkv2% and %f2p8g%, but judging by the server trace, NATS received all three publishes:

kubectl logs  -lapp.kubernetes.io/name=project-logging-nats -c nats --follow --tail=0 | grep "PUB logs.kube.var.log.containers.loggenerator"

[7] 2021/06/08 23:30:48.952711 [TRC] 100.96.1.42:45098 - cid:84 - "v1.7.8:c:fluent-bit" - <<- [PUB logs.kube.var.log.containers.loggenerator-74d56cbd79-bzkv2_default_loggenerator-d485539c18bc62a54c9ad8e757cbbaf3ea9c36d4bfc409d1e8c1f8f0d9df9f77.log 90801]
[10] 2021/06/08 23:30:48.969185 [TRC] 100.96.1.42:56886 - cid:88 - "v1.7.8:c:fluent-bit" - <<- [PUB logs.kube.var.log.containers.loggenerator-74d56cbd79-f2p8g_default_loggenerator-4e7ce1a86fffc0c01cdcda7cd9eff0ff9f4f143c5fb0278fffd82c996f0f2a0c.log 90393]
[7] 2021/06/08 23:30:51.227563 [TRC] 100.96.2.17:37398 - cid:146 - "v1.7.8:c:fluent-bit" - <<- [PUB logs.kube.var.log.containers.loggenerator-74d56cbd79-fx6l8_default_loggenerator-821ed7be6690839da31be58a9c31ac76f3f463da57be21f7b882288f9d5bf9fd.log 90806]

All three pod logs appear here (bzkv2, f2p8g, fx6l8), yet the message from the second fluent-bit instance (100.96.2.17) is absent from the stream.
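One way to cross-check is to count PUB frames per subject in a saved copy of the trace output. A sketch (`trace.log` is a hypothetical file captured beforehand with `kubectl logs ... -c nats > trace.log`):

```shell
# Hypothetical helper: count PUB frames per loggenerator subject in a saved
# NATS trace log. Each distinct subject should appear once per publish.
count_pubs() {
  grep -o 'PUB logs\.kube[^ ]*' "$1" | sort | uniq -c | sort -rn
}
# Usage: count_pubs trace.log
```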

The problem is intermittent: sometimes all the logs do arrive in the NATS stream. When my loggenerator produces 1000 lines, a portion of the lines (i.e. some messages, in NATS terms) can be lost.

Is there anything else I should look for in the NATS trace logs?


NATS version 20210511 (image synadia/nats-server:nightly), 3-node cluster.
We do not use security because fluent-bit does not support it.
Helm chart values:
  init:
    image:
      registry: docker.io
      name: natsio/nats-box
      tag: 0.6.0
      pullPolicy: IfNotPresent
    resources:
      limits:
        memory: 32Mi
      requests:
        cpu: 100m
        memory: 32Mi
  maxBytes: -1
  discardPolicy: old

  nameOverride: project-logging-nats
  fullnameOverride: project-logging-nats
  nats:
    image: synadia/nats-server:nightly
    jetstream:
      enabled: true
      memStorage:
        enabled: true
        size: "64Mi"
      fileStorage:
        enabled: true
        size: "1Gi"
        storageDirectory: /data/
        storageClassName: project-system

  cluster:
    enabled: true
    # Cluster name is required, by default will be release name.
    # name: "nats"
    replicas: 3

  affinity: # I have only 2 nodes now
    podAntiAffinity: {}

Fluent-bit version 1.7.
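For context, the fluent-bit side is a plain NATS output; our actual fluent-bit config is not shown here, and the host value below is a placeholder. As far as I can tell, in 1.7 the nats output is configured with just a host and port (the tag becomes the subject), which is why we run without authentication:

```
[OUTPUT]
    Name   nats
    Match  *
    Host   project-logging-nats
    Port   4222
```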

Thanks

Derek Collison

Jun 8, 2021, 8:27:17 PM6/8/21
to nat...@googlegroups.com
Best to post an issue on https://github.com/nats-io/nats-server/issues. Then we can take a look.
