redis sentinel issue (master-slave replication)


hima...@trulymadly.com

Jun 22, 2015, 3:44:22 AM
to redi...@googlegroups.com
Hello All,


I have a master-slave Redis setup on AWS with one master and two slaves (all three on different instances in the same region). Redis replication is working fine, but Sentinel is not able to discover the master or the other Sentinels. It shows SDOWN on the slave side.
Below is the log from the Sentinel on a slave:

[5603] 22 Jun 07:22:40.379 # Sentinel runid is 8c4baf6fc315ed2d6920f0ce26eba6b03bfd02fa
[5603] 22 Jun 07:23:10.388 # +sdown master mymaster <master-ip> 6379
[5603] 22 Jun 07:23:10.388 # +sdown slave <slave-ip>:6379 <slave-ip> 6379 @ mymaster <master-ip> 6379
[5603] 22 Jun 07:23:10.441 # +odown master mymaster <master-ip> 6379 #quorum 2/2
[5603] 22 Jun 07:23:10.441 # +new-epoch 1
[5603] 22 Jun 07:23:10.441 # +try-failover master mymaster <master-ip> 6379
[5603] 22 Jun 07:23:10.441 # +vote-for-leader 8c4baf6fc315ed2d6920f0ce26eba6b03bfd02fa 1
[5603] 22 Jun 07:23:10.441 # 127.0.0.1:26379 voted for 8c4baf6fc315ed2d6920f0ce26eba6b03bfd02fa 1
[5603] 22 Jun 07:23:10.503 # +elected-leader master mymaster <master-ip> 6379
[5603] 22 Jun 07:23:10.503 # +failover-state-select-slave master mymaster <master-ip> 6379
[5603] 22 Jun 07:23:10.586 # -failover-abort-no-good-slave master mymaster <master-ip> 6379
[5603] 22 Jun 07:29:10.448 # +new-epoch 2
[5603] 22 Jun 07:29:10.448 # +try-failover master mymaster <master-ip> 6379
[5603] 22 Jun 07:29:10.448 # +vote-for-leader 8c4baf6fc315ed2d6920f0ce26eba6b03bfd02fa 2
[5603] 22 Jun 07:29:10.448 # 127.0.0.1:26379 voted for 8c4baf6fc315ed2d6920f0ce26eba6b03bfd02fa 2
[5603] 22 Jun 07:29:10.548 # +elected-leader master mymaster <master-ip> 6379
[5603] 22 Jun 07:29:10.549 # +failover-state-select-slave master mymaster <master-ip> 6379
[5603] 22 Jun 07:29:10.601 # -failover-abort-no-good-slave master mymaster <master-ip> 6379 
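
A log like the one above can be summarized by counting the Sentinel events (the tokens starting with `+` or `-` after the `#`). The following is only a minimal sketch, assuming the 2.8-style line layout shown in this log; the regular expression and the `count_events` helper are illustrative names, not part of Redis.

```python
import re
from collections import Counter

# Matches lines such as:
# [5603] 22 Jun 07:23:10.388 # +sdown master mymaster <master-ip> 6379
# Assumption: "[pid] day month time # event ..." as in the log above.
EVENT_RE = re.compile(r"^\[\d+\] \d+ \w+ [\d:.]+ # (?P<event>[+-][a-z-]+)")


def count_events(log_text):
    """Count Sentinel events (+sdown, +try-failover, ...) in a log excerpt."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:  # lines without a +event/-event token are skipped
            counts[match.group("event")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[5603] 22 Jun 07:23:10.441 # +try-failover master mymaster <master-ip> 6379\n"
        "[5603] 22 Jun 07:23:10.586 # -failover-abort-no-good-slave master mymaster <master-ip> 6379\n"
    )
    for event, n in count_events(sample).items():
        print(event, n)
```

Run over the full log, this would make it easy to see that every `+try-failover` is followed by a `-failover-abort-no-good-slave`, i.e. the Sentinel never finds a slave it considers usable.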

Querying the Sentinel on the slave gives this:
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
(error) IDONTKNOW I have not enough information to reply. Please ask another Sentinel.

Have I misconfigured something on my slave, or is this a common problem with AWS? Each Sentinel runs on the same instance as its Redis server.
Any help would be greatly appreciated.


Regards,
Himanshu

Josiah Carlson

Jun 23, 2015, 1:23:08 AM
to redi...@googlegroups.com
From the servers that your Sentinels are running on, can you connect to your master and slave instances? If not, check your security groups; you will likely need to create a custom security group if you haven't done so already. And yes, configuring security groups is a typical part of setting up servers and services running in AWS.
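
One way to run this check from each Sentinel host is a plain TCP connect test against every Redis and Sentinel port. The sketch below uses only the Python standard library; the `can_connect` and `check_endpoints` helpers and the addresses in the example are placeholders to be replaced with the real master and slave IPs.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, DNS failure, ...
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map 'host:port' to reachability for each (host, port) pair."""
    return {
        "{}:{}".format(host, port): can_connect(host, port, timeout)
        for host, port in endpoints
    }


if __name__ == "__main__":
    # Placeholder addresses: substitute the IPs used in redis.conf and
    # sentinel.conf (6379 = Redis, 26379 = Sentinel, as in this thread).
    endpoints = [("127.0.0.1", 6379), ("127.0.0.1", 26379)]
    for name, ok in check_endpoints(endpoints).items():
        print(name, "reachable" if ok else "UNREACHABLE")
```

If a port is unreachable from a Sentinel host but reachable locally on the target machine, that points at security groups, the bind address, or the public/private IP choice rather than at Sentinel itself.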

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

hima...@trulymadly.com

Jun 23, 2015, 1:39:20 AM
to redi...@googlegroups.com
Hi Josiah,

Thanks for revert. 
Our master-slave replication is working on AWS, so security groups should not really be an issue. The Sentinels do not even work on their respective machines; they show SDOWN status. Even on an individual machine, this is the output:

127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
(error) IDONTKNOW I have not enough information to reply. Please ask another Sentinel.

Regards,
Himanshu

Josiah Carlson

Jun 23, 2015, 2:03:46 AM
to redi...@googlegroups.com
> Thanks for revert.

I think you mean "reply". Revert is something else entirely.

I wasn't asking about master/slave replication. I was asking if you could connect to all of your servers from your sentinel machines. If the sentinel isn't able to connect to your master, then you obviously have a connectivity problem. This is usually caused by security group problems, which is why I asked, but it can also be caused by trying to connect to public IPs instead of private IPs.

Can you provide INFO output from your servers? Can you provide topology information about your setup? Can you provide Redis configuration settings that you are using? What have you done to try to fix your problem?

 - Josiah

Himanshu Jain

Jun 23, 2015, 2:50:44 AM
to redi...@googlegroups.com


Security groups are open as of now just to make sure things are working.
On Tue, Jun 23, 2015 at 11:33 AM, Josiah Carlson <josiah....@gmail.com> wrote:
> I wasn't asking about master/slave replication. I was asking if you could connect to all of your servers from your sentinel machines. If the sentinel isn't able to connect to your master, then you obviously have a connectivity problem. This is usually caused by security group problems, which is why I asked, but it can also be caused by trying to connect to public IPs instead of private IPs.

No, I am not able to connect to the respective servers from Sentinel. I am using the public IP, but we have tried the private IP as well. (In the private IP case, we changed the bind address to the private IP in redis.conf and made the same change in the monitor line of sentinel.conf: monitor master <private IP>.)

> Can you provide INFO output from your servers? Can you provide topology information about your setup? Can you provide Redis configuration settings that you are using? What have you done to try to fix your problem?


Structure is like:

Machine A: redis-master, sentinel-master (both present in this machine)
Machine B: redis-slave1, sentinel-slave1 (both present in this machine)
Machine C: redis-slave2, sentinel-slave2 (both present in this machine)

The Sentinels on the respective machines are not able to connect to their Redis servers.

Slaves:
redis info:
bind 0.0.0.0
slaveof <public IP of AWS instance>

sentinel.conf:
In the monitor directive we are using the public IP of the master machine.

So, we have tried both the private IP and the public IP, and we have tried configuring the security groups, which are now fully open. We have also tried running the same setup without replication, that is, on individual machines, and tried adding logging to the Sentinel configuration and running it as a daemon.

Redis info:
# Server
redis_version:2.8.4
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:a44a05d76f06a5d9
redis_mode:standalone
os:Linux 3.13.0-48-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.2
process_id:2885
run_id:caa138e0212688171e3e25555a9678e8b7fb935d
tcp_port:6379
uptime_in_seconds:61883
uptime_in_days:0
hz:10
lru_clock:897701
config_file:/etc/redis/redis.conf

# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:1591600
used_memory_human:1.52M
used_memory_rss:9641984
used_memory_peak:1664368
used_memory_peak_human:1.59M
used_memory_lua:33792
mem_fragmentation_ratio:6.06
mem_allocator:jemalloc-3.4.1

# Persistence
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1434978487
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok

# Stats
total_connections_received:716
total_commands_processed:123516
instantaneous_ops_per_sec:1
rejected_connections:0
sync_full:2
sync_partial_ok:0
sync_partial_err:2
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:545

# Replication
role:master
connected_slaves:2
slave0:ip=52.74.xx.xxx,port=6379,state=online,offset=86451,lag=1
slave1:ip=54.169.xx.xxx,port=6379,state=online,offset=86451,lag=1
master_repl_offset:86451
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:86450

# CPU
used_cpu_sys:15.74
used_cpu_user:7.57
used_cpu_sys_children:0.00
used_cpu_user_children:0.00

# Keyspace
db0:keys=1,expires=0,avg_ttl=0


sentinel info:
127.0.0.1:26379> info
# Server
redis_version:2.8.4
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:a44a05d76f06a5d9
redis_mode:sentinel
os:Linux 3.13.0-48-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.2
process_id:4153
run_id:f408576ad0e8d34cee7292fb79783bca86cc62d5
tcp_port:26379
uptime_in_seconds:144
uptime_in_days:0
hz:12
lru_clock:897782
config_file:/etc/redis/sentinel.conf

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=sdown,address=127.0.0.1:6379,slaves=0,sentinels=1

 


Josiah Carlson

Jun 28, 2015, 2:01:32 AM
to redi...@googlegroups.com
Replies inline.

On Mon, Jun 22, 2015 at 11:50 PM, Himanshu Jain <hima...@trulymadly.com> wrote:

> Security groups are open as of now just to make sure things are working.

Something is strange then. Are they in the same VPC? Even then, you should be able to connect to the public IP.
 
> No, I am not able to connect to the respective servers from Sentinel. I am using the public IP, but we have tried the private IP as well. (In the private IP case, we changed the bind address to the private IP in redis.conf and made the same change in the monitor line of sentinel.conf: monitor master <private IP>.)

Can you connect to the servers from a normal Redis client running on the same servers that Sentinel is running on? Because if not, then it's either a network configuration problem, or you might be on bad hardware (this happened to me once). In the case of bad hardware, there is usually a notice within a few hours that they are going to replace the machine.
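
If no Redis client is handy on the Sentinel host, the same test can be approximated with a raw socket, since Redis accepts the inline command `PING` and answers `+PONG`. This is a minimal sketch using only the standard library; `redis_ping` is a hypothetical helper name, and the address in the example is a placeholder.

```python
import socket


def redis_ping(host, port=6379, timeout=3.0):
    """Send an inline PING; return the first reply line or a failure note."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")
            reply = sock.recv(1024)
    except OSError as exc:  # refused, unreachable, timed out, ...
        return "connection failed: {}".format(exc)
    # Keep only the first line of the reply, without the trailing CRLF.
    return reply.split(b"\r\n", 1)[0].decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder address: use the IP from the sentinel.conf monitor line.
    print(redis_ping("127.0.0.1", 6379))
```

A reply of `+PONG` means the server is reachable and answering. A line starting with `-` is an error reply from Redis (for example when a password is required), and `connection failed` points at the network path instead.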


> Redis info:
> # Server
> redis_version:2.8.4

That version of Redis was released on January 13, 2014. You should upgrade Redis and the sentinel to 2.8.21 if you can't move to 3.0.2 (you don't need to use Cluster to use 3.0.2). There is a fairly long list of critical fixes pushed to 2.8, which you can read here:

Okay, your slaves can connect, but your sentinels can't. Are you using passwords? Or... it's possible that your sentinels can't find each other because they don't actually know their own IPs - especially if they are connecting directly to 127.0.0.1:6379 to connect to the master and slaves.
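
The `master0:` line from the Sentinel INFO output quoted earlier in this thread (`address=127.0.0.1:6379,slaves=0,sentinels=1`) can be checked mechanically for exactly these symptoms. The sketch below is illustrative only: the helper names are made up, and it assumes the `sentinels` field counts the reporting Sentinel itself, so a value of 1 means no peers were discovered.

```python
import ipaddress


def parse_sentinel_master_line(line):
    """Parse a 'masterN:name=...,status=...,address=host:port,...' line."""
    _, _, fields = line.strip().partition(":")
    info = dict(item.split("=", 1) for item in fields.split(","))
    host, _, port = info["address"].rpartition(":")
    info["host"], info["port"] = host, int(port)
    return info


def warnings_for(info):
    """Return human-readable warnings for a parsed master line."""
    problems = []
    try:
        if ipaddress.ip_address(info["host"]).is_loopback:
            problems.append("monitored address is loopback")
    except ValueError:
        pass  # a hostname rather than an IP literal; nothing to check
    if int(info["sentinels"]) <= 1:  # assumed to include this Sentinel
        problems.append("no other Sentinels discovered")
    if int(info["slaves"]) == 0:
        problems.append("no slaves discovered")
    return problems


if __name__ == "__main__":
    line = ("master0:name=mymaster,status=sdown,"
            "address=127.0.0.1:6379,slaves=0,sentinels=1")
    for problem in warnings_for(parse_sentinel_master_line(line)):
        print(problem)
```

A loopback address in the monitored master entry is worth fixing first, because that address is only meaningful on the machine that actually hosts the master.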

 - Josiah