How to correctly create a ScyllaDB cluster?


flywingangel@gmail.com

<flywingangel@gmail.com>
Dec 3, 2015, 6:38:19 AM12/3/15
to ScyllaDB users
Hi, guys
     Is there a document that describes how to create a ScyllaDB cluster?
     I am trying to build a ScyllaDB cluster with two nodes but failed. The two nodes do not seem to be able to communicate with each other, although the network between them is fine.
     I can use cqlsh to connect to the first node from the server that runs the second node.

     For scalability reasons, I set both nodes as seeds. However, after starting the second node I got the message "[shard 0] gossip - Fail to send EchoMessage to 168.192.0.1:0: rpc::closed_error (connection is closed)" when checking the second node's status.
     Also, nodetool could not connect to the second node while the scylla service was running; the error message was "nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused'.".

     nodetool works well on the first node but cannot see the second node's information:
nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                               Rack
UN  168.192.0.1  6.86496e+08  256     ?       b12168fc-b3e1-49a4-ac78-4aa6d3cc378d  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless


     Note that I copied scylla.yaml from the first node to the second and only changed listen_address from 168.192.0.1 to 168.192.0.2.

     Below are some settings from the scylla.yaml file:

# seed_provider class_name is saved for future use.
# seeds address are mandatory!
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Scylla nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "168.192.0.1,168.192.0.2"

# Address or interface to bind to and tell other Scylla nodes to connect to.
# You _must_ change this if you want multiple nodes to be able to communicate!
#
# Setting listen_address to 0.0.0.0 is always wrong.
#listen_address: localhost
listen_address: 168.192.0.1

rpc_address: 0.0.0.0
broadcast_rpc_address: 168.192.0.1

     All other settings are defaults.

      By the way, before adding the second node I tested the first node with the cassandra-stress tool, so there is some data on the first node.

Asias He

<asias@scylladb.com>
Dec 3, 2015, 6:54:00 AM12/3/15
to scylladb-users@googlegroups.com
On Thu, Dec 3, 2015 at 7:38 PM, <flywin...@gmail.com> wrote:
Hi, guys
     Is there a document that describes how to create a ScyllaDB cluster?
     I am trying to build a ScyllaDB cluster with two nodes but failed. The two nodes do not seem to be able to communicate with each other, although the network between them is fine.
     I can use cqlsh to connect to the first node from the server that runs the second node.

     For scalability reasons, I set both nodes as seeds. However, after starting the second node I got the message "[shard 0] gossip - Fail to send EchoMessage to 168.192.0.1:0: rpc::closed_error (connection is closed)" when checking the second node's status.
     Also, nodetool could not connect to the second node while the scylla service was running; the error message was "nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused'.".

Is your port 7000 blocked by the firewall?

Try

$ sudo iptables -F
Note: a seed node does not bootstrap, so if you have data on node 1 and then start node 2 as a seed node, the data on node 1 will not be streamed to node 2 automatically.
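Concretely, this means node 2's scylla.yaml would list only node 1 as a seed, so that node 2 is allowed to bootstrap and stream data. A sketch, assuming node 1 is 168.192.0.1 and node 2 is 168.192.0.2 (note that broadcast_rpc_address, like listen_address, has to be node 2's own address, not a copy of node 1's):

```yaml
# node 2's scylla.yaml (sketch; addresses assumed from the thread)
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # only node 1 is a seed, so node 2 will bootstrap from it
          - seeds: "168.192.0.1"

listen_address: 168.192.0.2
rpc_address: 0.0.0.0
broadcast_rpc_address: 168.192.0.2
```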
  

--
You received this message because you are subscribed to the Google Groups "ScyllaDB users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scylladb-user...@googlegroups.com.
To post to this group, send email to scyllad...@googlegroups.com.
Visit this group at http://groups.google.com/group/scylladb-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/scylladb-users/8020205b-3ac5-4066-8195-9d89a22084bb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Asias

flywingangel@gmail.com

<flywingangel@gmail.com>
Dec 3, 2015, 7:09:06 AM12/3/15
to ScyllaDB users
No, I didn't start the firewall. I will try removing the second seed setting tomorrow.
Thanks a lot, Asias.

On Thursday, December 3, 2015 at 7:54:00 PM UTC+8, Asias He wrote:

flywingangel@gmail.com

<flywingangel@gmail.com>
Dec 3, 2015, 9:31:25 PM12/3/15
to ScyllaDB users
I have removed 168.192.0.2 from the seeds setting on both nodes. The second node still cannot communicate with the first node.

Dec 04 10:29:12 tempt1045 scylla[3082]: [shard 0] gossip - Sleep 1 second and connect seeds again ... (250 seconds passed)
Dec 04 10:29:13 tempt1045 scylla_run[3078]: WARNING: exceptional future ignored of type 'rpc::error': bad response frame header

Does anyone have any ideas? Thanks a lot.

On Thursday, December 3, 2015 at 7:54:00 PM UTC+8, Asias He wrote:

Asias He

<asias@scylladb.com>
Dec 3, 2015, 9:37:54 PM12/3/15
to scylladb-users@googlegroups.com
On Fri, Dec 4, 2015 at 10:31 AM, <flywin...@gmail.com> wrote:
I have removed 168.192.0.2 from the seeds setting on both nodes. The second node still cannot communicate with the first node.

Dec 04 10:29:12 tempt1045 scylla[3082]: [shard 0] gossip - Sleep 1 second and connect seeds again ... (250 seconds passed)
Dec 04 10:29:13 tempt1045 scylla_run[3078]: WARNING: exceptional future ignored of type 'rpc::error': bad response frame header

Does anyone have any ideas? Thanks a lot.


I still suspect it is a firewall issue.

Run the following commands on both nodes.

$ sudo iptables -nL

$ sudo netstat -natlp|grep 7000


Try 

$ sudo iptables -F

on both nodes.
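If iptables looks clean, another quick check is raw TCP reachability of the inter-node port from the other node. A sketch using bash's /dev/tcp redirection; run it on node 2 (the 168.192.0.1 address comes from the messages above, so adjust it for your setup):

```shell
# check_port HOST PORT: exit 0 if a TCP connection opens within 2 seconds
check_port() {
  timeout 2 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null
}

# run on node 2, aiming at node 1's inter-node (gossip/streaming) port
if check_port 168.192.0.1 7000; then
  echo "port 7000 reachable"
else
  echo "port 7000 closed or filtered"
fi
```

If this reports the port as closed while scylla is listening on node 1 (see the netstat output above), something between the nodes is dropping the traffic.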
 




--
Asias

flywingangel@gmail.com

<flywingangel@gmail.com>
Dec 4, 2015, 3:20:37 AM12/4/15
to ScyllaDB users
Hi, Asias
     Thanks a lot.
     It seems to be a network issue. I can telnet to 192.168.0.1:7000 from 192.168.0.2, but 192.168.0.1 cannot join the cluster now, whether as a seed or as a regular node.
     I used two other servers as nodes, with 192.168.0.2 as the seed, and they work well; but 192.168.0.1 still cannot join the cluster even after restarting it. The status log shows it joining the cluster, but shortly afterwards the scylla service on 192.168.0.1 exited.
     I will check the detailed logs for more information. I guess yesterday's stress test might have broken the system on 192.168.0.1. I plan to reinstall CentOS 7.1 and try again.

logs:
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] gossip - net_address 192.168.0.4 is now UP
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - Starting up server gossip
Dec 04 16:00:39 tempt1044 scylla_run[10785]: Start gossiper service ...
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - Detected previous bootstrap failure; retrying
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: waiting for ring information
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: schema complete, ready to bootstrap
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: waiting for pending range calculation
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: calculation complete, ready to bootstrap
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: getting bootstrap token
Dec 04 16:00:39 tempt1044 scylla[10788]: [shard 0] storage_service - JOINING: sleeping 5000 ms for pending range setup

[root@tempt1044 /home/longjun]# systemctl status scylla-server
scylla-server.service - Scylla Server
   Loaded: loaded (/usr/lib/systemd/system/scylla-server.service; disabled)
   Active: failed (Result: exit-code) since Fri 2015-12-04 16:00:44 CST; 9s ago
  Process: 10840 ExecStopPost=/usr/lib/scylla/scylla_stop (code=exited, status=0/SUCCESS)
  Process: 10785 ExecStart=/usr/lib/scylla/scylla_run (code=exited, status=3)
  Process: 10782 ExecStartPre=/usr/lib/scylla/scylla_prepare (code=exited, status=0/SUCCESS)
 Main PID: 10785 (code=exited, status=3)

Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 9] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 6] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 18] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 20] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 10] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 11] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 16] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 20] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 3] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 scylla[10788]: [shard 7] compaction_manager - compaction task handler stopped due to shutdown
Dec 04 16:00:44 tempt1044 systemd[1]: scylla-server.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Dec 04 16:00:44 tempt1044 systemd[1]: Unit scylla-server.service entered failed state.


On Friday, December 4, 2015 at 10:37:54 AM UTC+8, Asias He wrote:

Asias He

<asias@scylladb.com>
Dec 4, 2015, 3:22:28 AM12/4/15
to scylladb-users@googlegroups.com
On Fri, Dec 4, 2015 at 10:37 AM, Asias He <as...@scylladb.com> wrote:


On Fri, Dec 4, 2015 at 10:31 AM, <flywin...@gmail.com> wrote:
I have removed 168.192.0.2 from the seeds setting on both nodes. The second node still cannot communicate with the first node.

Dec 04 10:29:12 tempt1045 scylla[3082]: [shard 0] gossip - Sleep 1 second and connect seeds again ... (250 seconds passed)
Dec 04 10:29:13 tempt1045 scylla_run[3078]: WARNING: exceptional future ignored of type 'rpc::error': bad response frame header

Does anyone have any ideas? Thanks a lot.


I still suspect it is a firewall issue.

Run the following commands on both nodes.

$ sudo iptables -nL

$ sudo netstat -natlp|grep 7000


Try 

$ sudo iptables -F

on both nodes.


In addition, you can run the following tshark command to check whether any packets flow from node 2 to node 1 on port 7000.


$ sudo tshark -i eth0  -f "tcp and (port 7000)"


Change eth0 to your network interface.



--
Asias

Mac

<sarmadys@gmail.com>
Jul 24, 2019, 4:51:47 AM7/24/19
to ScyllaDB users
Hello,

These commands fixed my nodes' communication problem. May I ask what exactly they did?



On Friday, December 4, 2015 at 6:07:54 AM UTC+3:30, Asias He wrote:

$ sudo iptables -F