rc1 crash when connecting a new session


b_tho...@lycos.com

Feb 13, 2015, 1:14:45 PM
to cpp-dri...@lists.datastax.com

Hi,

I am running:

rc1
CentOS 6.5
OpenSSL 1.0.1e-fips 11 Feb 2013
libuv 0.10.28

When connecting a new session I have seen a core dump with the following backtrace:

(gdb) bt

#0  0x00007f1d2b76f625 in raise () from /lib64/libc.so.6
#1  0x00007f1d2b770d8d in abort () from /lib64/libc.so.6
#2  0x00007f1d29e2048f in uv_mutex_lock (mutex=0x201dad0) at src/unix/thread.c:78
#3  0x00007f1d2b197751 in cass::Mutex::lock() () from /usr/local/test/lib/libcassandra.so.1
#4  0x00007f1d2b197b7b in cass::ScopedLock<cass::Mutex>::lock() () from /usr/local/test/lib/libcassandra.so.1
#5  0x00007f1d2b19787a in cass::ScopedLock<cass::Mutex>::ScopedLock(pthread_mutex_t*, bool) () from /usr/local/test/lib/libcassandra.so.1
#6  0x00007f1d2b1b5d82 in cass::Session::connect_async(cass::Config const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cass::Future*) () from /usr/local/test/lib/libcassandra.so.1
#7  0x00007f1d2b1b4bae in cass_session_connect_keyspace () from /usr/local/test/lib/libcassandra.so.1
#8  0x00007f1d2b1b4afd in cass_session_connect () from /usr/local/test/lib/libcassandra.so.1
#9  0x0000000000429c2b in setupCassandraSession (seedList="127.0.0.1", port=9042, io_thread_count=0)
#10 0x000000000042b600 in _main (argc=5, argv=0x7fff230c6558)
#11 0x00007f1d2b75bd5d in __libc_start_main () from /lib64/libc.so.6
#12 0x0000000000427cc9 in _start ()

To connect the session I use the following code:

    s_pSession = cass_session_new();
    while(true) {
        CassFuture *pFuture = cass_session_connect(s_pSession, s_pCluster);

        cass_future_wait(pFuture);
        CassError rc;
        if (CASS_OK != (rc = cass_future_error_code(pFuture))) {
            cass_future_free(pFuture);
        } else {
            cass_future_free(pFuture);
            break;
        }
    }

Is it possible that cass_future_wait is returning before the mutex is unlocked?

Any suggestions?

Thanks
Brenda

Michael Penick

Feb 13, 2015, 3:49:50 PM
to cpp-dri...@lists.datastax.com
Thanks. Working to reproduce: https://gist.github.com/mpenick/701c3b1235be548f89d1

I'm trying with the cluster completely down, then starting it. Unable to reproduce so far.

Mike


b_tho...@lycos.com

Feb 16, 2015, 3:34:07 PM
to cpp-dri...@lists.datastax.com, Michael Penick

Hi Mike

I have more information on the crash I emailed you about on Friday.
It occurs during our upgrade process.
The driver is using SSL; the Cassandra node is not.

The Cassandra logs contain this error: “ProtocolException: Invalid or unsupported protocol version: 22”

See apache-cassandra-2.1.2-src/src/java/org/apache/cassandra/transport/Frame.java for the following code:

    if (version > Server.CURRENT_VERSION)
        throw new ProtocolException("Invalid or unsupported protocol version: " + version);

Cassandra reads the first byte of the SSL record header (22, the handshake content type) as a protocol version, compares it to the current Cassandra protocol version (3), and rejects the connection.
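
To illustrate the check described above, here is a minimal C sketch of how a TLS handshake byte ends up being rejected as a protocol version. It is only an illustration, not the actual Cassandra or driver code: the helper names `frame_version` and `version_supported` are hypothetical, and the 0x7F mask (treating the top bit as a direction flag) is an assumption about how the version is extracted from the first byte.

```c
#include <stdint.h>

/* Values taken from this thread, plus one assumption (the mask). */
#define PROTOCOL_VERSION_MASK 0x7F /* assumption: top bit is a direction flag */
#define CURRENT_VERSION       3    /* highest version the server accepts here */
#define TLS_HANDSHAKE_RECORD  0x16 /* first byte of a TLS handshake record (22) */

/* Hypothetical helper: interpret the first byte of a connection as a version. */
static int frame_version(uint8_t first_byte)
{
    return first_byte & PROTOCOL_VERSION_MASK;
}

/* Hypothetical helper: mirrors the "version > CURRENT_VERSION" rejection. */
static int version_supported(uint8_t first_byte)
{
    return frame_version(first_byte) <= CURRENT_VERSION;
}
```

Calling `version_supported(TLS_HANDSHAKE_RECORD)` returns 0, because the byte is read as version 22, which matches the number in the logged error.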

Some of the logs are attached. The core dump corresponding to these logs was generated on Feb 13 at 21:41:09.

Thanks
Brenda

logs-for-rc1-and-session-connect-crash.txt

Michael Penick

Feb 17, 2015, 4:22:00 PM
to b_tho...@lycos.com, cpp-dri...@lists.datastax.com
That's very useful, thanks. I'll work on reproducing the issue so I can get started on a patch.

I'll keep you posted.

Mike