Getting exception voldemort.store.InsufficientOperationalNodesException in production systems

Anandh Kumar

Aug 9, 2017, 5:49:42 AM

to project-...@googlegroups.com

We are occasionally getting the following exception in our production system. For example, out of 84 million calls, 10,000 fail with this exception:

voldemort.store.InsufficientOperationalNodesException: 1 get alls required, but 0 succeeded. Failing nodes : []

I tried increasing the Voldemort ClientConfig connection_timeout and sockettimeout, but we are still getting this error.

Our config:

stores.xml:

<store>
  <routing-strategy>consistent-routing</routing-strategy>
  <routing>client</routing>
  <replication-factor>1</replication-factor>
  <required-reads>1</required-reads>
  <required-writes>1</required-writes>
  <key-serializer>
    <type>string</type>
  </key-serializer>
  <value-serializer>
    <type>string</type>
  </value-serializer>
</store>
</stores>

server.properties:

node.id=0
# configs
admin.enable=true
admin.max.threads=40
bdb.cache.evictln=true
bdb.cache.size=12GB
bdb.checkpoint.interval.bytes=2147483648
bdb.checkpointer.off.batch.writes=true
bdb.cleaner.interval.bytes=15728640
bdb.cleaner.lazy.migration=false
bdb.cleaner.min.file.utilization=0
bdb.cleaner.threads=1
bdb.enable=true
bdb.evict.by.level=true
bdb.expose.space.utilization=true
bdb.lock.nLockTables=94
bdb.minimize.scan.impact=true
bdb.one.env.per.store=true
enable.server.routing=false
enable.verbose.logging=false
http.enable=true
nio.connector.selectors=100
num.scan.permits=2
request.format=vp3
restore.data.timeout.sec=1314000
scheduler.threads=24
slop.frequency.ms=300000
socket.enable=true
storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration
stream.read.byte.per.sec=209715200
stream.write.byte.per.sec=78643200

cluster.xml:

  <name>myprodcluster</name>
    <http-port>8081</http-port>
    <socket-port>6666</socket-port>
    <admin-port>7777</admin-port>
  </server>
</cluster>

We have a single-node cluster.

Can you please help us fix this issue?

Arunachalam

Aug 11, 2017, 5:26:58 PM

to project-...@googlegroups.com

Try increasing the number of connections. From what I can infer, there are no failing nodes, so the requests were just waiting in the client queue and timing out.
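To illustrate the idea, here is a minimal sketch of how a request can time out while every node is healthy. It does not use Voldemort at all: `TinyPool` and `countTimedOutCheckouts` are hypothetical names, and a `Semaphore` stands in for a per-node connection pool. It assumes the worst case where no in-flight request returns its connection before the later ones give up.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolTimeoutSketch {

    /** Hypothetical stand-in for a per-node connection pool (not Voldemort code). */
    static final class TinyPool {
        private final Semaphore permits;

        TinyPool(int maxConnections) {
            this.permits = new Semaphore(maxConnections);
        }

        /** Waits up to timeoutMs for a free connection; false means the wait timed out. */
        boolean checkout(long timeoutMs) {
            try {
                return permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        void checkin() {
            permits.release();
        }
    }

    /**
     * Issues concurrentRequests checkouts against a pool of maxConnections
     * without ever returning a connection, and counts how many give up.
     */
    static int countTimedOutCheckouts(int maxConnections, int concurrentRequests, long timeoutMs) {
        TinyPool pool = new TinyPool(maxConnections);
        int timedOut = 0;
        for (int i = 0; i < concurrentRequests; i++) {
            if (!pool.checkout(timeoutMs)) {
                timedOut++; // the node never failed; the client just ran out of connections
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        // 5 overlapping requests, only 2 connections available
        System.out.println("timed out: " + countTimedOutCheckouts(2, 5, 10) + " of 5");
        // same load with 5 connections
        System.out.println("timed out: " + countTimedOutCheckouts(5, 5, 10) + " of 5");
    }
}
```

With 2 connections, 3 of the 5 requests give up waiting; with 5 connections, none do. That would match an exception that reports no failing nodes.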

The other possible reason is that you are throwing more load at it than the client or server can handle. You should monitor the p90/p99 response times, check whether you see a spike, and find out where the contention is coming from.
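If you do not already have latency metrics, a rough way to get p90/p99 is to record per-call latencies on the client and compute nearest-rank percentiles. This is only a sketch: the class name, the method name and the sample values are made up for illustration, and a real setup would more likely use your existing metrics library.

```java
import java.util.Arrays;

public class LatencyPercentiles {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds; one slow call hides in the tail.
        long[] latenciesMs = {2, 3, 2, 4, 3, 2, 250, 3, 2, 3};
        System.out.println("p50=" + percentile(latenciesMs, 50) + "ms"); // p50=3ms
        System.out.println("p90=" + percentile(latenciesMs, 90) + "ms"); // p90=4ms
        System.out.println("p99=" + percentile(latenciesMs, 99) + "ms"); // p99=250ms
    }
}
```

A tail like this (p90 fine, p99 far higher) is the kind of spike worth lining up against the times the exception shows up.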

Thanks,

Arun.


Anandh Kumar

Sep 1, 2017, 2:01:22 AM

to project-voldemort

Arun,

We are still getting this exception after increasing ConnectionTimedout and SocketTimedout.

Can you please help us fix this?

Regards,

-Anandh Kumar


Félix GV

Sep 1, 2017, 10:12:24 AM

to project-voldemort

Did you try increasing the number of connections like Arun suggested?
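For reference, a minimal sketch of what that change could look like on the client side. The keys `bootstrap_urls` and `max_connections` are the property names as I recall them from `ClientConfig`, so please verify them against the Voldemort version you run. The host and the value 100 are placeholders, not recommendations; port 6666 is the socket port from the cluster.xml above. The snippet only builds a `java.util.Properties` object so it runs without the Voldemort jar.

```java
import java.util.Properties;

public class VoldemortClientProps {

    /**
     * Builds client properties with a larger per-node connection pool.
     * Key names are assumptions based on ClientConfig; check your version.
     */
    static Properties tunedClientProperties(String bootstrapUrl, int maxConnectionsPerNode) {
        if (maxConnectionsPerNode <= 0) {
            throw new IllegalArgumentException("maxConnectionsPerNode must be positive");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap_urls", bootstrapUrl);
        props.setProperty("max_connections", Integer.toString(maxConnectionsPerNode));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder host and pool size.
        Properties props = tunedClientProperties("tcp://localhost:6666", 100);
        System.out.println(props.getProperty("max_connections"));

        // With the Voldemort client jar on the classpath (not imported here),
        // the properties would be handed to the client roughly like this:
        //   StoreClientFactory factory =
        //       new SocketStoreClientFactory(new ClientConfig(props));
    }
}
```

Since the timeouts were already raised without effect, the pool size is the setting that has not been tried yet.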
