Cluster communication failure


HeinHtet Aung

Jul 3, 2025, 11:55:09 AM
to Tinode General
I am getting a "cluster unreachable" error on the chatbot side. When I checked the Tinode server code, I saw that it is caused by a failure of intra-cluster communication.
Could it be related to an infrastructure issue?

Thanks


Gene

Jul 3, 2025, 12:13:40 PM
to Tinode General
It's probably a network problem. Or something is misconfigured.

HeinHtet Aung

Jul 4, 2025, 3:18:06 AM
to Tinode General
OK, thanks for your reply. When I checked the server log, one node in the cluster was restarting, and I saw this error in the log of the restarting node: "fatal error: concurrent map iteration and map write"

Gene

Jul 4, 2025, 3:29:35 AM
to Tinode General
Which version of Go did you use to build the server?

Gene

Jul 4, 2025, 3:32:09 AM
to Tinode General
Also, please post the whole stack trace.

HeinHtet Aung

Jul 4, 2025, 8:55:10 PM
to Tinode General
I am using the Docker image "tinode-postgres:0.23.0". Here is the log (not the complete stack trace):


p2mSenderLoop: call failed tinode-1 read tcp 127.0.0.1:37274->127.0.0.1:12001: read: connection reset by pee
ws: session started 4pponOwPj7E 202.58.91.57 128

fatal error: concurrent map iteration and map write

goroutine 26565 [running]:
main.serveStatus.func2({0xc00106f8f0?, 0xc000d68930?}, {0x197b1e0?, 0xc00082ef00})
/Users/gene/go/src/github.com/tinode/chat/server/http.go:466 +0x411
sync.(*Map).Range(0xc00119e2a0?, 0xc00106fad0)
/usr/local/go/src/sync/map.go:501 +0x1f8
main.serveStatus({0x1f95230, 0xc002043500}, 0xc00106fb38?)
/Users/gene/go/src/github.com/tinode/chat/server/http.go:455 +0x2bd
net/http.HandlerFunc.ServeHTTP(0xc0005ae0e0?, {0x1f95230?, 0xc002043500?}, 0x76e656?)
/usr/local/go/src/net/http/server.go:2220 +0x29
net/http.(*ServeMux).ServeHTTP(0x46c379?, {0x1f95230, 0xc002043500}, 0xc001283b80)
/usr/local/go/src/net/http/server.go:2747 +0x1ca
net/http.serverHandler.ServeHTTP({0xc00135b890?}, {0x1f95230?, 0xc002043500?}, 0x6?)
/usr/local/go/src/net/http/server.go:3210 +0x8e
net/http.(*conn).serve(0xc000fe67e0, {0x1f98b18, 0xc000fd8960})
/usr/local/go/src/net/http/server.go:2092 +0x5d0
created by net/http.(*Server).Serve in goroutine 59
/usr/local/go/src/net/http/server.go:3360 +0x485

goroutine 1 [select, 14 minutes]:
main.listenAndServe({0xc000b0d076, 0x5}, 0xc0005ae0e0, 0x0, 0xc00118c3f0)
/Users/gene/go/src/github.com/tinode/chat/server/http.go:104 +0x1e6
main.main()


I also saw these log entries after the crash, but I'm not sure whether they are normal:

W2025/07/04 01:45:56 p2mSenderLoop: call failed tinode-1 cluster: node 'tinode-1' not connected
E2025/07/04 01:45:57 cluster: request to route to self

W2025/07/04 01:45:57 proxy topic[p2pIMLZMerJoQatQCA2KyBwxw] shutdown: failed to notify master - node for topic not found

Gene

Jul 5, 2025, 2:04:33 AM
to Tinode General
It looks like you configured Docker to execute container health checks by calling the 'server_status' URL. Do not do it. That's not what this endpoint is for.

The other messages are a side effect of the crash.
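
For context on the fatal error itself: the Go runtime aborts the process when it detects one goroutine iterating over a plain map while another writes to it, and this abort cannot be caught with recover(). The trace above places the failure inside the callback that serveStatus passes to sync.Map.Range, which suggests the status page reads state that the rest of the server is updating at the same moment. The sketch below is not Tinode's code; statusRegistry, Set, and Snapshot are hypothetical names used only to illustrate one common way to read such state safely, by copying it under a lock before iterating.

```go
package main

import (
	"fmt"
	"sync"
)

// statusRegistry is a hypothetical stand-in for server state that a status
// page reads while other goroutines keep updating it.
type statusRegistry struct {
	mu       sync.RWMutex
	sessions map[string]int // hypothetical: topic name -> attached sessions
}

func newStatusRegistry() *statusRegistry {
	return &statusRegistry{sessions: make(map[string]int)}
}

// Set records the session count for a topic under the write lock.
func (r *statusRegistry) Set(topic string, n int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sessions[topic] = n
}

// Snapshot copies the map under the read lock, so callers can iterate the
// copy without racing against writers.
func (r *statusRegistry) Snapshot() map[string]int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make(map[string]int, len(r.sessions))
	for k, v := range r.sessions {
		out[k] = v
	}
	return out
}

func main() {
	reg := newStatusRegistry()
	var wg sync.WaitGroup

	// Writer: simulates normal server activity.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			reg.Set(fmt.Sprintf("topic-%d", i%10), i)
		}
	}()

	// Reader: simulates a frequently polled status page.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = reg.Snapshot()
		}
	}()

	wg.Wait()
	fmt.Println("topics:", len(reg.Snapshot()))
}
```

Running it prints "topics: 10". Removing the lock calls would make the same program liable to the fatal error shown in the log.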

HeinHtet Aung

Jul 6, 2025, 2:16:07 AM
to Tinode General
I didn't use "/debug/status" as the health check; I used the web app URL "/". For deployment, I use a Kubernetes StatefulSet.

Gene

Jul 6, 2025, 6:45:45 AM
to Tinode General
Well, something in your setup is calling "/debug/status". Find it and make it stop.

zhipeng wang

Jul 6, 2025, 8:39:46 PM
to Tinode General

Is there no commercial version? I didn't see the background management.

Gene

Jul 7, 2025, 2:38:23 AM
to Tinode General
On Monday, July 7, 2025 at 3:39:46 AM UTC+3 zhipeng wang wrote:

Is there no commercial version?

 Please take a look here for commercial options:

I didn't see the background management.

I don't understand the question. Please clarify what you mean by "background management".

zhipeng wang

Jul 7, 2025, 6:20:56 AM
to Tinode General

Does the commercial version provide source code?

Gene

Jul 7, 2025, 10:28:29 AM
to Tinode General
The entire source code is already available at https://github.com/tinode. If you think something is missing, please be specific.

HeinHtet Aung

Jul 8, 2025, 12:24:38 AM
to Tinode General
Thanks, Gene. The issue is now solved. You were right: our load balancer was calling the server_status endpoint as a health check.