Voldemort: Getting voldemort.store.InsufficientOperationalNodesException


Anandh Kumar

Jan 29, 2018, 10:18:13 AM
to project-voldemort
Hi,

We occasionally get the following exception when calling voldeMortStoreClient.getAll(). It worked fine for a couple of days. After that, a few of these exceptions started appearing per day, and the number increased day by day. Eventually all the connections in the pool became invalid, and now every request fails with this exception.

Server Version: 1.10.24 
Cluster size: 1

voldemort.store.InsufficientOperationalNodesException: 1 get alls required, but 0 succeeded. Failing nodes : []
    at voldemort.store.routed.action.PerformSerialGetAllRequests.execute(PerformSerialGetAllRequests.java:197)
    at voldemort.store.routed.Pipeline.execute(Pipeline.java:212)
    at voldemort.store.routed.PipelineRoutedStore.getAll(PipelineRoutedStore.java:497)
    at voldemort.store.routed.PipelineRoutedStore.getAll(PipelineRoutedStore.java:418)
    at voldemort.store.DelegatingStore.getAll(DelegatingStore.java:59)
    at voldemort.store.DelegatingStore.getAll(DelegatingStore.java:59)
    at voldemort.store.stats.StatTrackingStore.getAll(StatTrackingStore.java:133)
    at voldemort.store.serialized.SerializingStore.getAll(SerializingStore.java:122)
    at voldemort.store.DelegatingStore.getAll(DelegatingStore.java:59)
    at voldemort.store.versioned.InconsistencyResolvingStore.getAll(InconsistencyResolvingStore.java:57)


Server side Configuration:

stores.xml:

<stores>
  <store>
    <name>test</name>
    <persistence>bdb</persistence>
    <description>test</description>
    <owners>test</owners>
    <routing-strategy>consistent-routing</routing-strategy>
    <routing>client</routing>
    <replication-factor>1</replication-factor>
    <required-reads>1</required-reads>
    <required-writes>1</required-writes>
    <key-serializer>
      <type>string</type>
    </key-serializer>
    <value-serializer>
      <type>string</type>
    </value-serializer>
  </store>
</stores>


Server.properties

node.id=0

# configs
admin.enable=true
admin.max.threads=40
bdb.cache.evictln=true
bdb.cache.size=12GB
bdb.checkpoint.interval.bytes=2147483648
bdb.checkpointer.off.batch.writes=true
bdb.cleaner.interval.bytes=15728640
bdb.cleaner.lazy.migration=false
bdb.cleaner.min.file.utilization=0
bdb.cleaner.threads=1
bdb.enable=true
bdb.evict.by.level=true
bdb.expose.space.utilization=true
bdb.lock.nLockTables=94
bdb.minimize.scan.impact=true
bdb.one.env.per.store=true
enable.server.routing=false
enable.verbose.logging=false
http.enable=true
nio.connector.selectors=100
num.scan.permits=2
request.format=vp3
restore.data.timeout.sec=1314000
scheduler.threads=24
slop.frequency.ms=300000
socket.enable=true
storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration
stream.read.byte.per.sec=209715200
stream.write.byte.per.sec=78643200



cluster.xml

<cluster>
  <name>myprodcluster</name>
  <server>
    <id>0</id>
    <host>192.168.1.10</host>
    <http-port>8081</http-port>
    <socket-port>6666</socket-port>
    <admin-port>7777</admin-port>
    <partitions>0, 1</partitions>
  </server>
</cluster>


Client side configuration:

voldemort.max.per.node.connection=500
voldemort.connection.timedout=5000
voldemort.socket.timedout=10000
voldemort.idle.connection.timedout=10



Spring xml for creating store client

 
<bean id="config" class="voldemort.client.ClientConfig">
  <constructor-arg>
    <props>
      <prop key="bootstrap_urls">${voldemort.url}</prop>
      <prop key="max_connections">${voldemort.max.per.node.connection}</prop>
      <prop key="connection_timeout_ms">${voldemort.connection.timedout}</prop>
      <prop key="idle_connection_timeout_minutes">${voldemort.idle.connection.timedout}</prop>
      <prop key="socket_timeout_ms">${voldemort.socket.timedout}</prop>
    </props>
  </constructor-arg>
</bean>

<bean id="clientFactory" class="voldemort.client.SocketStoreClientFactory" destroy-method="close">
  <constructor-arg index="0" ref="config" />
</bean>

<bean id="storeClient" factory-bean="clientFactory" factory-method="getStoreClient">
  <constructor-arg value="${voldemort.store.name}" />
</bean>



Can you please help us fix this issue?

Thanks,
Anand

Arunachalam

Jan 29, 2018, 2:31:31 PM
to project-...@googlegroups.com
Do you have a firewall between the client and the server? What is the Voldemort client version? Try reducing idle_connection_timeout_minutes and increasing max_connections.
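
One quick way to rule out basic reachability problems from the client host is a plain TCP connect to the server's socket port. The following is a minimal sketch and not part of Voldemort: the class and method names are hypothetical, and the default host and port are simply the values from the cluster.xml quoted above. It only checks whether a new connection can be opened, so it cannot detect a firewall that silently drops idle connections.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unroutable: all count as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the host and socket-port from the cluster.xml in this thread.
        String host = args.length > 0 ? args[0] : "192.168.1.10";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6666;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 5000));
    }
}
```

Running this from the application host at the time the exceptions occur would at least show whether new connections to the socket port still succeed.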

Thanks,
Arun.

--
You received this message because you are subscribed to the Google Groups "project-voldemort" group.
To unsubscribe from this group and stop receiving emails from it, send an email to project-voldemort+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/project-voldemort.
For more options, visit https://groups.google.com/d/optout.

Anandh Kumar

Jan 30, 2018, 12:33:49 AM
to project-voldemort
Arun,

Thanks for your reply.

Client version: 1.10.23
We have already reduced idle_connection_timeouts to its minimum value of 10 minutes; the Voldemort client does not support anything lower than 10 minutes.
We increased max_connections per node from 100 to 500, but the issue still occurs.

Can you help me answer the following questions?
1. How can I refresh the Voldemort client connection pool?
2. Can I reset invalid socket connections at a regular interval?
3. Can you explain in detail why this exception occurs?
4. How should we go about debugging this issue?

Regards,
-Anandh Kumar

Arunachalam

Jan 30, 2018, 1:25:01 AM
to project-...@googlegroups.com
It looks like either the server is overloaded or a firewall sitting between the client and the server is dropping the connections.

The exception indicates that the server did not respond within the specified time, which the client treats as a failure. Also, you have a single server. Voldemort is designed for multiple servers: when one server fails, the others are still alive and taking requests, and the client handles this seamlessly for you.
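
To make the message "1 get alls required, but 0 succeeded" concrete, here is a simplified model of the required-reads check. It is not Voldemort's actual routing code, and all class and method names are hypothetical. The point it illustrates: with a single node and required-reads set to 1, one timed-out response is enough to fail the whole request, because there is no replica to fall back on.

```java
import java.util.Arrays;
import java.util.List;

public class RequiredReadsModel {
    /** Counts the nodes that answered within the timeout (true = answered in time). */
    public static int countSuccesses(List<Boolean> nodeAnsweredInTime) {
        int successes = 0;
        for (boolean answered : nodeAnsweredInTime) {
            if (answered) {
                successes++;
            }
        }
        return successes;
    }

    /** Throws if fewer nodes answered than the store's required-reads setting. */
    public static void checkRequiredReads(int requiredReads, List<Boolean> nodeAnsweredInTime) {
        int successes = countSuccesses(nodeAnsweredInTime);
        if (successes < requiredReads) {
            throw new IllegalStateException(
                    requiredReads + " get alls required, but " + successes + " succeeded");
        }
    }

    public static void main(String[] args) {
        // Three replicas, required-reads=1: one slow node is tolerated, no exception.
        checkRequiredReads(1, Arrays.asList(false, true, true));

        // Single node, required-reads=1: one timeout fails the request.
        try {
            checkRequiredReads(1, Arrays.asList(false));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "1 get alls required, but 0 succeeded"
        }
    }
}
```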

Check the server logs for errors. You also need to look at the JMX stats on the server and the client to identify where the bottleneck is and fix it.

Thanks,
Arun.


Anandh Kumar

Jan 31, 2018, 4:49:20 AM
to project-voldemort
Arun,

Thanks for your reply.

I am still getting the same exception. It worked fine for a couple of days. After that, a few of these exceptions started appearing per day, and the number increased day by day. Eventually all the connections in the pool became invalid, and now every request fails with this exception. Please suggest how I can resolve this issue.

How can I destroy a single Voldemort socket connection?

Regards,
-Anandh Kumar

Arunachalam

Jan 31, 2018, 2:06:29 PM
to project-...@googlegroups.com
Most likely, your server is hitting its maximum capacity. Voldemort relies on client-side heuristics to determine whether a connection is still valid. If the server does not respond within a specified time, the client assumes the server is busy, dying, or dead.

Also, Voldemort is built around the assumption that peer servers are alive and healthy. The client is not designed to run against a single Voldemort server.

Without JMX metrics recorded from the server, we are discussing many possibilities with no data to back them up. But my bet is that your server is reaching its capacity.
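
As a starting point for collecting those metrics, the sketch below lists the MBeans visible through a JMX connection, using only the standard javax.management API. It is an illustration, not a Voldemort tool: the class name is hypothetical, the "voldemort" domain prefix is an assumption about how the server names its MBeans, and the example inspects the local JVM rather than a remote server.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class JmxLister {
    /** Returns the sorted names of all MBeans whose domain starts with domainPrefix. */
    public static Set<String> listMBeans(MBeanServerConnection conn, String domainPrefix)
            throws Exception {
        Set<String> names = new TreeSet<>();
        // Passing null for both arguments selects every registered MBean.
        for (ObjectName name : conn.queryNames(null, null)) {
            if (name.getDomain().startsWith(domainPrefix)) {
                names.add(name.getCanonicalName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Local JVM only. For a remote server, the connection would instead come
        // from javax.management.remote.JMXConnectorFactory.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        // "voldemort" is an assumed domain prefix, not confirmed in this thread.
        String prefix = args.length > 0 ? args[0] : "voldemort";
        for (String name : listMBeans(conn, prefix)) {
            System.out.println(name);
        }
    }
}
```

Once the relevant MBeans are known, their attributes can be sampled over time to see whether the server is saturating before the exceptions start.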

Thanks,
Arun.




Anandh Kumar

Feb 1, 2018, 12:59:18 AM
to project-voldemort
Arun,

Thanks for your reply.

I will try clustering.

How can I destroy a single Voldemort socket connection?

Also, I would like to discuss this in depth on a call. Could you please tell me how we can arrange that?
Regards,
-Anandh Kumar

Arunachalam

Feb 1, 2018, 2:28:55 AM
to project-...@googlegroups.com
What do you mean by destroying a Voldemort socket connection? That is just a TCP connection.

Thanks,
Arun.


