kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”


Mike

Apr 26, 2018, 10:15:26 AM
to Confluent Platform

I recently upgraded from Kafka 0.10.1 to 1.1.0.

My streams app builds streams by subscribing to changes from our database via Confluent Connect, does some calculation, and then publishes each result as its own stream/topic.

When starting the app, I attempt to get the state store for each stream the app publishes. This code simply tries to get the store using the KafkaStreams.store method in a try/catch loop (I retry up to 300 times to give the stream time in case it is rebalancing or truly migrating). This all worked fine on Kafka 0.10.2.
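For readers following along, a retry loop of the kind described above might look roughly like the sketch below. It is not the code from the app: `StoreRetry` and `waitForStore` are hypothetical names, the lookup is passed in as a plain `Supplier` so the example runs without Kafka on the classpath, and it catches any `RuntimeException`, whereas a real app would presumably catch only `InvalidStateStoreException`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Kafka Streams. */
public final class StoreRetry {

    private StoreRetry() {}

    /**
     * Calls lookup until it returns without throwing, sleeping between
     * attempts, and rethrows the last failure once maxAttempts is used up.
     */
    public static <T> T waitForStore(Supplier<T> lookup, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.get();
            } catch (RuntimeException e) {
                // Assumption: in a real app this would be InvalidStateStoreException only.
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting for store", ie);
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in lookup that fails twice before succeeding.
        String store = waitForStore(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("store not ready");
            }
            return "store-handle";
        }, 300, 10L);
        System.out.println(store + " after " + calls.get() + " attempts");
    }
}
```

In an actual Kafka Streams 1.1.0 app the supplier would be something like `() -> streams.store(storeName, QueryableStoreTypes.keyValueStore())`, where `streams` and `storeName` are the app's own objects.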

After upgrading to Kafka 1.1.0, the app starts fine the first time. However, if I restart the app, streams that consume multiple topics from Connect always throw InvalidStateStoreException. This does not happen for streams that subscribe to a single Connect topic. To fix it, I must delete the logs and the store; when I then restart my streams app, it works fine. But I pretty much have to wipe the logs and store each time I restart.

I debugged into the source a bit and found the issue is this call in org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider:


    public <T> List<T> stores(final String storeName, final QueryableStoreType<T> queryableStoreType) {
        if (streamThread.state() == StreamThread.State.DEAD) {
            return Collections.emptyList();
        }
        if (!streamThread.isRunningAndNotRebalancing()) {
            throw new InvalidStateStoreException("the state store, " + storeName + ", may have migrated to another instance.");
        }
        final List<T> stores = new ArrayList<>();
        for (Task streamTask : streamThread.tasks().values()) {
            final StateStore store = streamTask.getStore(storeName);
            if (store != null && queryableStoreType.accepts(store)) {
                if (!store.isOpen()) {
                    throw new InvalidStateStoreException("the state store, " + storeName + ", may have migrated to another instance.");
                }
                stores.add((T) store);
            }
        }
        return stores;
    }


For streams that consume multiple Connect topics and produce a single stream/topic, when I restart the app, the above code does not find the store for the topic it is supposed to publish. The store has to exist, given that the app starts and works fine the first time I start it after clearing the logs and store (I'm manually deleting those folders for now). What is even stranger is that, despite not finding a store, the app still receives the Connect-produced topics and apparently produces the calculated stream just fine.

Does anyone have any ideas on what might be happening here after the upgrade?

jpava...@gmail.com

Apr 26, 2018, 2:31:10 PM
to Confluent Platform
Hello Mike,

I have a similar issue with v4.1. Resetting everything solved my issue. Please see this issue 


Thanks
Pavan Jadda

Mike

Apr 26, 2018, 3:53:19 PM
to Confluent Platform
Hi Jadda, 

What do you mean by resetting everything?

The issue with my code is that the first time, after a clean start (meaning I wipe the Kafka logs and offsets), all works well: data flows in and out of the stream as expected.
When I restart the streaming app, however, I get the InvalidStateStoreException. This is because the code I provided is not able to get the store I'm looking for. The function I pointed out returns an empty list and, consequently, the calling function QueryableStoreProvider.getStore() throws an exception if the returned list is empty.

I do have a try/catch loop plus a sleep to accommodate stores that may take a while to become queryable (due to such things as rebalancing).
I also check the stream state, and it is shown as RUNNING for these problem streams.
Additionally, this only seems to occur for streams that consume multiple Connect topics and produce a single aggregated stream. It doesn't happen for streams that consume a single topic.

Mike

Apr 30, 2018, 4:56:51 PM
to Confluent Platform
Any further thoughts on this?



jpava...@gmail.com

Apr 30, 2018, 5:15:36 PM
to Confluent Platform
Mike, I do have the same issue. For the single consumer it's working fine, but for multiple consumers, I got the same exception.

jpava...@gmail.com

Apr 30, 2018, 5:38:56 PM
to Confluent Platform



dizzy0ny

Apr 30, 2018, 7:58:36 PM
to confluent...@googlegroups.com
Thanks, jp. I don't seem to have access to that site. It states:

"Don't have an account on this workspace yet? Contact the workspace administrator for an invitation"

But it provides no option to contact the admin...


--
You received this message because you are subscribed to a topic in the Google Groups "Confluent Platform" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/confluent-platform/g0_PcVh6buA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/ac55ef91-ee4b-4e10-8f7d-da4ec83e3d37%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jpava...@gmail.com

May 1, 2018, 10:36:58 AM
to Confluent Platform

Mike MBS

May 1, 2018, 9:14:02 PM
to confluent...@googlegroups.com
I need an invite, or I need to contact the workspace admin. There is no option for the latter. Is this workspace administered by Confluent? If so, I will try contacting them.
