UDT type validation bottlenecking Cassandra writes in cpp-driver

Patrick L

Feb 8, 2023, 12:47:30 PM
to DataStax C++ Driver for Apache Cassandra User Mailing List
Hello Team,

We are using cpp-driver v2.16.0 and are seeing very long validation times for frozen UDTs.

We have a large frozen UDT with ~1,200 fields, and we have observed high validation times when binding it to a prepared statement in the client driver.

We generated a flamegraph for one of our slow operations and found that the majority of our time was spent in this method: https://github.com/datastax/cpp-driver/blob/2.16.0/src/data_type.hpp#L352-L380. That method is reached through this call chain:
  • cass_statement_bind_user_type_by_name_n
  • datastax::internal::core::AbstractData::set<datastax::internal::core::UserTypeValue const*>
  • datastax::internal::core::AbstractData::set
  • datastax::internal::core::AbstractData::check<datastax::internal::core::UserTypeValue const*>
  • datastax::internal::core::IsValidDataType<datastax::internal::core::UserTypeValue const*>::operator
  • datastax::internal::core::UserType::equals
Calls to datastax::internal::core::UserType::equals account for approximately 33% of our program's execution time, even though we populate only 1-5 fields of the large UDT.

Based on the body of the equals(...) method, it looks like the bound UDT object is compared with the driver's internal UDT type field by field, checking both field name and field type, which results in a large number of string comparisons.

Our main question: is there any reason the driver cannot compare the UDT name and some internal version number of the UDT type instead of doing a deep comparison of the UDT structure? The number of string comparisons required seems excessive and severely hurts performance for our use case.
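
To make the idea concrete, here is a minimal sketch of the two comparison strategies. Everything in it (`Field`, `UdtType`, `schema_version`, `deep_equals`, `shallow_equals`, `make_udt`) is a hypothetical, simplified stand-in and not the driver's actual classes or API; in particular, the driver does not expose a version number like this today, as far as we know.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins. These are NOT cpp-driver types.
struct Field {
  std::string name;
  std::string type_name;
};

struct UdtType {
  std::string keyspace;
  std::string type_name;
  std::uint64_t schema_version;  // assumption: bumped whenever the UDT definition changes
  std::vector<Field> fields;
};

// Deep comparison: the cost grows with the number of fields,
// because every field name and field type is compared as a string.
bool deep_equals(const UdtType& a, const UdtType& b) {
  if (a.keyspace != b.keyspace || a.type_name != b.type_name) return false;
  if (a.fields.size() != b.fields.size()) return false;
  for (std::size_t i = 0; i < a.fields.size(); ++i) {
    if (a.fields[i].name != b.fields[i].name) return false;
    if (a.fields[i].type_name != b.fields[i].type_name) return false;
  }
  return true;
}

// Shallow comparison: one integer comparison plus two string comparisons,
// regardless of how many fields the UDT has.
bool shallow_equals(const UdtType& a, const UdtType& b) {
  return a.schema_version == b.schema_version &&
         a.keyspace == b.keyspace &&
         a.type_name == b.type_name;
}

// Builds a UDT with `field_count` text fields, for illustration only.
UdtType make_udt(std::size_t field_count, std::uint64_t version) {
  UdtType t{"ks", "big_udt", version, {}};
  t.fields.reserve(field_count);
  for (std::size_t i = 0; i < field_count; ++i) {
    t.fields.push_back({"field_" + std::to_string(i), "text"});
  }
  return t;
}

// Small self-check: both strategies agree for two identical 1,200-field UDTs.
void self_check() {
  const UdtType bound = make_udt(1200, 1);
  const UdtType expected = make_udt(1200, 1);
  assert(deep_equals(bound, expected));     // 2 + 2 * 1200 string comparisons
  assert(shallow_equals(bound, expected));  // 2 string comparisons
}
```

The shallow check is only safe if the version is guaranteed to change whenever the UDT definition changes; that guarantee is the part we are unsure the driver can provide.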

Thank you for your time!

Kind regards,
Patrick L

Bret McGuire

Feb 15, 2023, 2:55:39 PM
to DataStax C++ Driver for Apache Cassandra User Mailing List, Patrick L
   Hey Patrick, thanks for the question!

   I've filed CPP-963 to explore the idea of using some kind of versioning scheme for UDTs in the driver. I would suggest following that ticket, as any future updates will likely appear there.

   Also, for the sake of completeness: I note that there's a question on StackOverflow at the moment which sounds very similar to the issue you're describing. Do you know if that question relates to the same issue you describe above? If for some reason it doesn't, I want to make sure to follow up on that issue as well.

   Thanks again!

   - Bret -

Patrick L

Feb 15, 2023, 3:29:50 PM
to DataStax C++ Driver for Apache Cassandra User Mailing List, Bret McGuire, Patrick L
Hi Bret, thank you for the response!

Thank you also for creating the issue to track this item. The StackOverflow post was written by my teammate and describes this same issue.

Kind regards,
Patrick L