[PATCH 0/7] Make indexed queries with pk restrictions non-filtering

17 views
Skip to first unread message

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:46 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
GitHub link: https://github.com/psarna/scylla/commits/si_and_pk_restrictions

Queries that use secondary index and have a full partition key restriction
or full primary key restriction should not require filtering - it's
sufficient to add these restrictions to the index query.
This also adds secondary index tests to cover this case.

Tests: unit (release)

Piotr Sarna (7):
cql3: make primary key restrictions' values unambiguous
cql3: add clone() method to single column restriction
cql3: add explicit conversion between key restrictions
cql3: add is_all_eq to primary key restrictions
cql3: use primary key restrictions in filtering index queries
cql3: make index+primary key restrictions filtering-independent
tests: add index + partition key test

cql3/restrictions/primary_key_restrictions.hh | 1 +
.../single_column_primary_key_restrictions.hh | 24 +++++++++++++++
cql3/restrictions/single_column_restriction.hh | 26 +++++++++++++++++
cql3/restrictions/statement_restrictions.cc | 12 ++++++--
cql3/statements/select_statement.cc | 34 +++++++++++++++++++---
tests/secondary_index_test.cc | 28 ++++++++++++++++++
6 files changed, 119 insertions(+), 6 deletions(-)

--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:47 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
using directive must be used to disambiguate the overridden method.
---
cql3/restrictions/primary_key_restrictions.hh | 1 +
1 file changed, 1 insertion(+)

diff --git a/cql3/restrictions/primary_key_restrictions.hh b/cql3/restrictions/primary_key_restrictions.hh
index f4c7a5667..2437e13f5 100644
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh
@@ -88,6 +88,7 @@ class primary_key_restrictions: public abstract_restriction,

using restrictions::uses_function;
using restrictions::has_supporting_index;
+ using restrictions::values;

bool empty() const override {
return get_column_defs().empty();
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:49 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
This method allows copying single column restrictions,
possibly with new column definition.
---
cql3/restrictions/single_column_restriction.hh | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/cql3/restrictions/single_column_restriction.hh b/cql3/restrictions/single_column_restriction.hh
index 523dec3fe..fc9b071dc 100644
--- a/cql3/restrictions/single_column_restriction.hh
+++ b/cql3/restrictions/single_column_restriction.hh
@@ -95,6 +95,7 @@ class single_column_restriction : public abstract_restriction {
virtual bool is_supported_by(const secondary_index::index& index) const = 0;
using abstract_restriction::is_satisfied_by;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const = 0;
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) = 0;
#if 0
/**
* Check if this type of restriction is supported by the specified index.
@@ -169,6 +170,9 @@ class single_column_restriction::EQ final : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ return ::make_shared<EQ>(cdef, _value);
+ }

#if 0
@Override
@@ -205,6 +209,9 @@ class single_column_restriction::IN : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ throw std::logic_error("IN superclass should never be cloned directly");
+ }

#if 0
@Override
@@ -239,6 +246,10 @@ class single_column_restriction::IN_with_values : public single_column_restricti
virtual sstring to_string() const override {
return sprint("IN(%s)", std::to_string(_values));
}
+
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ return ::make_shared<IN_with_values>(cdef, _values);
+ }
};

class single_column_restriction::IN_with_marker : public IN {
@@ -264,6 +275,10 @@ class single_column_restriction::IN_with_marker : public IN {
virtual sstring to_string() const override {
return "IN ?";
}
+
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ return ::make_shared<IN_with_marker>(cdef, _marker);
+ }
};

class single_column_restriction::slice : public single_column_restriction {
@@ -275,6 +290,11 @@ class single_column_restriction::slice : public single_column_restriction {
, _slice(term_slice::new_instance(bound, inclusive, std::move(term)))
{ }

+ slice(const column_definition& column_def, term_slice slice)
+ : single_column_restriction(column_def)
+ , _slice(slice)
+ { }
+
virtual bool uses_function(const sstring& ks_name, const sstring& function_name) const override {
return (_slice.has_bound(statements::bound::START) && abstract_restriction::term_uses_function(_slice.bound(statements::bound::START), ks_name, function_name))
|| (_slice.has_bound(statements::bound::END) && abstract_restriction::term_uses_function(_slice.bound(statements::bound::END), ks_name, function_name));
@@ -361,6 +381,9 @@ class single_column_restriction::slice : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ return ::make_shared<slice>(cdef, _slice);
+ }
};

// This holds CONTAINS, CONTAINS_KEY, and map[key] = value restrictions because we might want to have any combination of them.
@@ -483,6 +506,9 @@ class single_column_restriction::contains final : public single_column_restricti
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> clone(const column_definition& cdef) override {
+ throw std::logic_error("Cloning 'contains' restriction is not implemented.");
+ }

#if 0
private List<ByteBuffer> keys(const query_options& options) {
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:50 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
Partition and clustering key restrictions sometimes need to be converted
and this commit provides a way to do that.
---
.../single_column_primary_key_restrictions.hh | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh b/cql3/restrictions/single_column_primary_key_restrictions.hh
index 5f6655ab4..94af84b55 100644
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -63,6 +63,8 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
using range_type = query::range<ValueType>;
using range_bound = typename range_type::bound;
using bounds_range_type = typename primary_key_restrictions<ValueType>::bounds_range_type;
+ template<typename OtherValueType>
+ friend class single_column_primary_key_restrictions;
private:
schema_ptr _schema;
bool _allow_filtering;
@@ -80,6 +82,24 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
, _in(false)
{ }

+ // Convert another primary key restrictions type into this type, possibly using different schema
+ template<typename OtherValueType>
+ explicit single_column_primary_key_restrictions(schema_ptr schema, const single_column_primary_key_restrictions<OtherValueType> other)
+ : _schema(schema)
+ , _allow_filtering(other._allow_filtering)
+ , _restrictions(::make_shared<single_column_restrictions>(schema))
+ , _slice(other._slice)
+ , _contains(other._contains)
+ , _in(other._in)
+ {
+ for (const auto& entry : other._restrictions->restrictions()) {
+ const column_definition* other_cdef = entry.first;
+ const column_definition* this_cdef = _schema->get_column_definition(other_cdef->name());
+ ::shared_ptr<single_column_restriction> restriction = entry.second;
+ _restrictions->add_restriction(restriction->clone(*this_cdef));
+ }
+ }
+
virtual bool is_on_token() const override {
return false;
}
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:51 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
is_all_eq is later needed to decide if restrictions can be used
in an indexed query.
---
cql3/restrictions/single_column_primary_key_restrictions.hh | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh b/cql3/restrictions/single_column_primary_key_restrictions.hh
index 94af84b55..080c6fe80 100644
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -120,6 +120,10 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
return _in;
}

+ bool is_all_eq() const {
+ return _restrictions->is_all_eq();
+ }
+
virtual bool has_bound(statements::bound b) const override {
return boost::algorithm::all_of(_restrictions->restrictions(), [b] (auto&& r) { return r.second->has_bound(b); });
}
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:52 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
If both index and partition key is used in a query, it should not
require filtering, because indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to index query.
---
cql3/statements/select_statement.cc | 34 ++++++++++++++++++++++++++++++----
1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/cql3/statements/select_statement.cc b/cql3/statements/select_statement.cc
index 442075d8a..abebcab15 100644
--- a/cql3/statements/select_statement.cc
+++ b/cql3/statements/select_statement.cc
@@ -45,6 +45,7 @@
#include "transport/messages/result_message.hh"
#include "cql3/selection/selection.hh"
#include "cql3/util.hh"
+#include "cql3/restrictions/single_column_primary_key_restrictions.hh"
#include "core/shared_ptr.hh"
#include "query-result-reader.hh"
#include "query_result_merger.hh"
@@ -775,7 +776,7 @@ read_posting_list(service::storage_proxy& proxy,
schema_ptr view_schema,
schema_ptr base_schema,
const secondary_index::index& index,
- const std::vector<::shared_ptr<restrictions::restrictions>>& index_restrictions,
+ ::shared_ptr<restrictions::statement_restrictions> base_restrictions,
const query_options& options,
int32_t limit,
service::query_state& state,
@@ -786,7 +787,7 @@ read_posting_list(service::storage_proxy& proxy,
// FIXME: there should be only one index restriction for this index!
// Perhaps even one index restriction entirely (do we support
// intersection queries?).
- for (const auto& restrictions : index_restrictions) {
+ for (const auto& restrictions : base_restrictions->index_restrictions()) {
const column_definition* cdef = base_schema->get_column_definition(to_bytes(index.target_column()));
if (!cdef) {
throw exceptions::invalid_request_exception("Indexed column not found in schema");
@@ -800,7 +801,32 @@ read_posting_list(service::storage_proxy& proxy,
partition_ranges.emplace_back(range);
}
}
+
partition_slice_builder partition_slice_builder{*view_schema};
+
+ if (!base_restrictions->has_partition_key_unrestricted_components()) {
+ auto single_pk_restrictions = dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<partition_key>>(base_restrictions->get_partition_key_restrictions());
+ if (single_pk_restrictions && single_pk_restrictions->is_all_eq()) {
+ auto clustering_restrictions = ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema, *single_pk_restrictions);
+ // Computed token column needs to be added to index view restrictions
+ const column_definition& token_cdef = *view_schema->clustering_key_columns().begin();
+ auto base_pk = partition_key::from_optional_exploded(*base_schema, base_restrictions->get_partition_key_restrictions()->values(options));
+ bytes token_value = dht::global_partitioner().token_to_bytes(dht::global_partitioner().get_token(*base_schema, base_pk));
+ auto token_restriction = ::make_shared<restrictions::single_column_restriction::EQ>(token_cdef, ::make_shared<cql3::constants::value>(cql3::raw_value::make_value(token_value)));
+ clustering_restrictions->merge_with(token_restriction);
+
+ if (!base_restrictions->has_unrestricted_clustering_columns()) {
+ auto single_ck_restrictions = dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<clustering_key>>(base_restrictions->get_partition_key_restrictions());
+ if (single_ck_restrictions) {
+ auto clustering_restrictions_from_base = ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema, *single_pk_restrictions);
+ clustering_restrictions->merge_with(clustering_restrictions_from_base);
+ }
+ }
+
+ partition_slice_builder.with_ranges(clustering_restrictions->bounds_ranges(options));
+ }
+ }
+
auto cmd = ::make_lw_shared<query::read_command>(
view_schema->id(),
view_schema->version(),
@@ -828,7 +854,7 @@ indexed_table_select_statement::find_index_partition_ranges(service::storage_pro
schema_ptr view = get_index_schema(proxy, _index, _schema, state.get_trace_state());
auto now = gc_clock::now();
auto timeout = db::timeout_clock::now() + options.get_timeout_config().*get_timeout_config_selector();
- return read_posting_list(proxy, view, _schema, _index, _restrictions->index_restrictions(), options, get_limit(options), state, now, timeout).then(
+ return read_posting_list(proxy, view, _schema, _index, _restrictions, options, get_limit(options), state, now, timeout).then(
[this, now, &options, view] (service::storage_proxy::coordinator_query_result qr) {
std::vector<const column_definition*> columns;
for (const column_definition& cdef : _schema->partition_key_columns()) {
@@ -880,7 +906,7 @@ indexed_table_select_statement::find_index_clustering_rows(service::storage_prox
schema_ptr view = get_index_schema(proxy, _index, _schema, state.get_trace_state());
auto now = gc_clock::now();
auto timeout = db::timeout_clock::now() + options.get_timeout_config().*get_timeout_config_selector();
- return read_posting_list(proxy, view, _schema, _index, _restrictions->index_restrictions(), options, get_limit(options), state, now, timeout).then(
+ return read_posting_list(proxy, view, _schema, _index, _restrictions, options, get_limit(options), state, now, timeout).then(
[this, now, &options, view] (service::storage_proxy::coordinator_query_result qr) {
std::vector<const column_definition*> columns;
for (const column_definition& cdef : _schema->partition_key_columns()) {
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:53 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
If full partition key (or full primary key) is used in an indexed
query, it should not require filtering, because queries like that
can be efficiently narrowed down with stricter index restrictions.
---
cql3/restrictions/statement_restrictions.cc | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/cql3/restrictions/statement_restrictions.cc b/cql3/restrictions/statement_restrictions.cc
index 30b272949..dfbe3d395 100644
--- a/cql3/restrictions/statement_restrictions.cc
+++ b/cql3/restrictions/statement_restrictions.cc
@@ -430,7 +430,15 @@ bool statement_restrictions::need_filtering() const {
for (auto&& restrictions : _index_restrictions) {
number_of_restricted_columns_for_indexing += restrictions->size();
}
- int number_of_all_restrictions = _partition_key_restrictions->size() + _clustering_columns_restrictions->size() + _nonprimary_key_restrictions->size();
+
+ int number_of_filtering_restrictions = _nonprimary_key_restrictions->size();
+ // If the whole partition key is restricted, it does not imply filtering
+ if (_partition_key_restrictions->has_unrestricted_components(*_schema)) {
+ number_of_filtering_restrictions += _partition_key_restrictions->size();
+ if (_clustering_columns_restrictions->has_unrestricted_components(*_schema)) {
+ number_of_filtering_restrictions += _clustering_columns_restrictions->size();
+ }
+ }

if (_partition_key_restrictions->is_multi_column() || _clustering_columns_restrictions->is_multi_column()) {
// TODO(sarna): Implement ALLOW FILTERING support for multi-column restrictions - return false for now
@@ -442,7 +450,7 @@ bool statement_restrictions::need_filtering() const {
|| (number_of_restricted_columns_for_indexing == 0 && _partition_key_restrictions->empty() && !_clustering_columns_restrictions->empty())
|| (number_of_restricted_columns_for_indexing != 0 && _nonprimary_key_restrictions->has_multiple_contains())
|| (number_of_restricted_columns_for_indexing != 0 && !_uses_secondary_indexing)
- || (_uses_secondary_indexing && number_of_all_restrictions > 1);
+ || (_uses_secondary_indexing && number_of_filtering_restrictions > 1);
}

void statement_restrictions::validate_secondary_index_selections(bool selects_only_static_columns) {
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 9:24:54 AM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, duarte@scylladb.com
Tests covering querying both index and partition keys are added
- it's checked that such queries do not require filtering.
---
tests/secondary_index_test.cc | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/tests/secondary_index_test.cc b/tests/secondary_index_test.cc
index 8c6fec12c..babab814a 100644
--- a/tests/secondary_index_test.cc
+++ b/tests/secondary_index_test.cc
@@ -317,3 +317,31 @@ SEASTAR_TEST_CASE(test_many_columns) {
});
});
}
+
+SEASTAR_TEST_CASE(test_index_with_partition_key) {
+ return do_with_cql_env_thread([] (auto& e) {
+ e.execute_cql("CREATE TABLE tab (a int, b int, c int, d int, e int, f int, PRIMARY KEY ((a, b), c, d))").get();
+ e.execute_cql("CREATE INDEX ON tab (e)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (1, 2, 3, 4, 5, 6)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (1, 0, 0, 0, 0, 0)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (0, 2, 0, 0, 0, 0)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (0, 0, 3, 0, 0, 0)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (0, 0, 0, 4, 0, 0)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (0, 0, 0, 0, 5, 0)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (0, 0, 0, 7, 0, 6)").get();
+ e.execute_cql("INSERT INTO tab (a, b, c, d, e, f) VALUES (1, 2, 3, 7, 5, 0)").get();
+
+ // Queries that restrict the whole partition key and an index should not require filtering - they are not performance-heavy
+ eventually([&] {
+ auto res = e.execute_cql("SELECT * from tab WHERE a = 1 and b = 2 and e = 5").get0();
+ assert_that(res).is_rows().with_rows({
+ {{int32_type->decompose(1)}, {int32_type->decompose(2)}, {int32_type->decompose(3)}, {int32_type->decompose(4)}, {int32_type->decompose(5)}, {int32_type->decompose(6)}},
+ {{int32_type->decompose(1)}, {int32_type->decompose(2)}, {int32_type->decompose(3)}, {int32_type->decompose(7)}, {int32_type->decompose(5)}, {int32_type->decompose(0)}}
+ });
+ });
+
+ // Queries that restrict only a part of the partition key and an index require filtering, because we need to compute token
+ // in order to create a valid index view query
+ BOOST_REQUIRE_THROW(e.execute_cql("SELECT * from tab WHERE a = 1 and e = 5"), exceptions::invalid_request_exception);
+ });
+}
--
2.14.4

Avi Kivity

<avi@scylladb.com>
unread,
Jul 18, 2018, 10:06:27 AM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com, duarte@scylladb.com


On 2018-07-18 16:24, Piotr Sarna wrote:
> Partition and clustering key restrictions sometimes need to be converted
> and this commit provides a way to do that.
> ---
> .../single_column_primary_key_restrictions.hh | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh b/cql3/restrictions/single_column_primary_key_restrictions.hh
> index 5f6655ab4..94af84b55 100644
> --- a/cql3/restrictions/single_column_primary_key_restrictions.hh
> +++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
> @@ -63,6 +63,8 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
> using range_type = query::range<ValueType>;
> using range_bound = typename range_type::bound;
> using bounds_range_type = typename primary_key_restrictions<ValueType>::bounds_range_type;
> + template<typename OtherValueType>
> + friend class single_column_primary_key_restrictions;
> private:
> schema_ptr _schema;
> bool _allow_filtering;
> @@ -80,6 +82,24 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
> , _in(false)
> { }
>
> + // Convert another primary key restrictions type into this type, possibly using different schema
> + template<typename OtherValueType>
> + explicit single_column_primary_key_restrictions(schema_ptr schema, const single_column_primary_key_restrictions<OtherValueType> other)

Missing "&" before "other"?

> + : _schema(schema)
> + , _allow_filtering(other._allow_filtering)
> + , _restrictions(::make_shared<single_column_restrictions>(schema))
> + , _slice(other._slice)
> + , _contains(other._contains)
> + , _in(other._in)
> + {
> + for (const auto& entry : other._restrictions->restrictions()) {
> + const column_definition* other_cdef = entry.first;
> + const column_definition* this_cdef = _schema->get_column_definition(other_cdef->name());

Missing nullptr check.

Avi Kivity

<avi@scylladb.com>
unread,
Jul 18, 2018, 10:07:34 AM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com, Nadav Har'El, duarte@scylladb.com
Looks okay from a cursory review, Nadav please give it a more detailed
review.


On 2018-07-18 16:24, Piotr Sarna wrote:

Duarte Nunes

<duarte@scylladb.com>
unread,
Jul 18, 2018, 10:14:43 AM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com
On Wed, 18 Jul 2018 at 15:24, Piotr Sarna <sa...@scylladb.com> wrote:
This method allows copying single column restrictions,
possibly with new column definition.

Not sure about the parameterized clone method, but I don’t have a better name suggestion. Maybe apply_to(column_definition*).

Duarte Nunes

<duarte@scylladb.com>
unread,
Jul 18, 2018, 10:56:43 AM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com
Perhaps a comment for the uninitiated, saying we need all eqs to calculate the token key.

> + auto clustering_restrictions = ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema, *single_pk_restrictions);
> + // Computed token column needs to be added to index view restrictions
> + const column_definition& token_cdef = *view_schema->clustering_key_columns().begin();
> + auto base_pk = partition_key::from_optional_exploded(*base_schema, base_restrictions->get_partition_key_restrictions()->values(options));
> + bytes token_value = dht::global_partitioner().token_to_bytes(dht::global_partitioner().get_token(*base_schema, base_pk));
> + auto token_restriction = ::make_shared<restrictions::single_column_restriction::EQ>(token_cdef, ::make_shared<cql3::constants::value>(cql3::raw_value::make_value(token_value)));
> + clustering_restrictions->merge_with(token_restriction);
> +
> + if (!base_restrictions->has_unrestricted_clustering_columns()) {

This is too strict. We only the need restricted clustering columns to form a prefix of the clustering key.

Duarte Nunes

<duarte@scylladb.com>
unread,
Jul 18, 2018, 10:57:20 AM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com
On 18/07/2018 15:24, Piotr Sarna wrote:
> If full partition key (or full primary key) is used in an indexed
> query, it should not require filtering, because queries like that
> can be efficiently narrowed down with stricter index restrictions.
> ---
> cql3/restrictions/statement_restrictions.cc | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/cql3/restrictions/statement_restrictions.cc b/cql3/restrictions/statement_restrictions.cc
> index 30b272949..dfbe3d395 100644
> --- a/cql3/restrictions/statement_restrictions.cc
> +++ b/cql3/restrictions/statement_restrictions.cc
> @@ -430,7 +430,15 @@ bool statement_restrictions::need_filtering() const {
> for (auto&& restrictions : _index_restrictions) {
> number_of_restricted_columns_for_indexing += restrictions->size();
> }
> - int number_of_all_restrictions = _partition_key_restrictions->size() + _clustering_columns_restrictions->size() + _nonprimary_key_restrictions->size();
> +
> + int number_of_filtering_restrictions = _nonprimary_key_restrictions->size();
> + // If the whole partition key is restricted, it does not imply filtering
> + if (_partition_key_restrictions->has_unrestricted_components(*_schema)) {
> + number_of_filtering_restrictions += _partition_key_restrictions->size();
> + if (_clustering_columns_restrictions->has_unrestricted_components(*_schema)) {

Same comment about us only needing a prefix.

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 11:28:07 AM7/18/18
to Duarte Nunes, scylladb-dev@googlegroups.com
I know, it's the same optimization case as with
statement_restrictions::get_clustering_bounds(), and I think it requires
the same amount of work to do both of them at the same time. I could do
it in this series, but it sounds a little out of scope, so I was
thinking about creating a github issue about it and fixing it later.

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 11:49:51 AM7/18/18
to Duarte Nunes, scylladb-dev@googlegroups.com
On 07/18/2018 04:56 PM, Duarte Nunes wrote:
that, aaand I forgot that the same is_all_eq() condition must be used in
statement_restrictions::need_filtering(). Thanks

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:00 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
GitHub link: https://github.com/psarna/scylla/commits/si_and_pk_restrictions_2

v2 changes:
* added missing & in (const T&) function header
* added nullptr check
* changed clone(cdef) to apply_to(cdef) to avoid parametered clone()
* added a comment on why is_all_eq() is needed
- I'd vote for adding support for clustering key prefix later, together
with statement_restrictions::optimizing clustering_bounds() optimization.
Still, I can do it in v3 if needed.

Queries that use secondary index and have a full partition key restriction
or full primary key restriction should not require filtering - it's
sufficient to add these restrictions to the index query.
This also adds secondary index tests to cover this case.

Tests: unit (release)

Piotr Sarna (7):
cql3: make primary key restrictions' values unambiguous
cql3: add apply_to() method to single column restriction
cql3: add explicit conversion between key restrictions
cql3: add is_all_eq to primary key restrictions
cql3: use primary key restrictions in filtering index queries
cql3: make index+primary key restrictions filtering-independent
tests: add index + partition key test

cql3/restrictions/primary_key_restrictions.hh | 4 +++
.../single_column_primary_key_restrictions.hh | 27 +++++++++++++++++
cql3/restrictions/single_column_restriction.hh | 26 ++++++++++++++++
cql3/restrictions/statement_restrictions.cc | 12 ++++++--
cql3/statements/select_statement.cc | 35 +++++++++++++++++++---
tests/secondary_index_test.cc | 28 +++++++++++++++++
6 files changed, 126 insertions(+), 6 deletions(-)

--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:01 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:02 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
This method allows copying single column restriction,
possibly with a new column definition.
---
cql3/restrictions/single_column_restriction.hh | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/cql3/restrictions/single_column_restriction.hh b/cql3/restrictions/single_column_restriction.hh
index 523dec3fe..b41f443ce 100644
--- a/cql3/restrictions/single_column_restriction.hh
+++ b/cql3/restrictions/single_column_restriction.hh
@@ -95,6 +95,7 @@ class single_column_restriction : public abstract_restriction {
virtual bool is_supported_by(const secondary_index::index& index) const = 0;
using abstract_restriction::is_satisfied_by;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const = 0;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) = 0;
#if 0
/**
* Check if this type of restriction is supported by the specified index.
@@ -169,6 +170,9 @@ class single_column_restriction::EQ final : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
+ return ::make_shared<EQ>(cdef, _value);
+ }

#if 0
@Override
@@ -205,6 +209,9 @@ class single_column_restriction::IN : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
+ throw std::logic_error("IN superclass should never be cloned directly");
+ }

#if 0
@Override
@@ -239,6 +246,10 @@ class single_column_restriction::IN_with_values : public single_column_restricti
virtual sstring to_string() const override {
return sprint("IN(%s)", std::to_string(_values));
}
+
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
+ return ::make_shared<IN_with_values>(cdef, _values);
+ }
};

class single_column_restriction::IN_with_marker : public IN {
@@ -264,6 +275,10 @@ class single_column_restriction::IN_with_marker : public IN {
virtual sstring to_string() const override {
return "IN ?";
}
+
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
+ return ::make_shared<IN_with_marker>(cdef, _marker);
+ }
};

class single_column_restriction::slice : public single_column_restriction {
@@ -275,6 +290,11 @@ class single_column_restriction::slice : public single_column_restriction {
, _slice(term_slice::new_instance(bound, inclusive, std::move(term)))
{ }

+ slice(const column_definition& column_def, term_slice slice)
+ : single_column_restriction(column_def)
+ , _slice(slice)
+ { }
+
virtual bool uses_function(const sstring& ks_name, const sstring& function_name) const override {
return (_slice.has_bound(statements::bound::START) && abstract_restriction::term_uses_function(_slice.bound(statements::bound::START), ks_name, function_name))
|| (_slice.has_bound(statements::bound::END) && abstract_restriction::term_uses_function(_slice.bound(statements::bound::END), ks_name, function_name));
@@ -361,6 +381,9 @@ class single_column_restriction::slice : public single_column_restriction {
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
+ return ::make_shared<slice>(cdef, _slice);
+ }
};

// This holds CONTAINS, CONTAINS_KEY, and map[key] = value restrictions because we might want to have any combination of them.
@@ -483,6 +506,9 @@ class single_column_restriction::contains final : public single_column_restricti
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:03 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
Partition and clustering key restrictions sometimes need to be converted
and this commit provides a way to do that.
---
.../single_column_primary_key_restrictions.hh | 23 ++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh b/cql3/restrictions/single_column_primary_key_restrictions.hh
index 5f6655ab4..eb6de4413 100644
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -63,6 +63,8 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
using range_type = query::range<ValueType>;
using range_bound = typename range_type::bound;
using bounds_range_type = typename primary_key_restrictions<ValueType>::bounds_range_type;
+ template<typename OtherValueType>
+ friend class single_column_primary_key_restrictions;
private:
schema_ptr _schema;
bool _allow_filtering;
@@ -80,6 +82,27 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
, _in(false)
{ }

+ // Convert another primary key restrictions type into this type, possibly using different schema
+ template<typename OtherValueType>
+ explicit single_column_primary_key_restrictions(schema_ptr schema, const single_column_primary_key_restrictions<OtherValueType>& other)
+ : _schema(schema)
+ , _allow_filtering(other._allow_filtering)
+ , _restrictions(::make_shared<single_column_restrictions>(schema))
+ , _slice(other._slice)
+ , _contains(other._contains)
+ , _in(other._in)
+ {
+ for (const auto& entry : other._restrictions->restrictions()) {
+ const column_definition* other_cdef = entry.first;
+ const column_definition* this_cdef = _schema->get_column_definition(other_cdef->name());
+ if (!this_cdef) {
+ throw exceptions::invalid_request_exception(sprint("Base column %s not found in view index schema", other_cdef->name_as_text()));
+ }
+ ::shared_ptr<single_column_restriction> restriction = entry.second;
+ _restrictions->add_restriction(restriction->apply_to(*this_cdef));
+ }
+ }
+
virtual bool is_on_token() const override {
return false;
}
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:05 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
is_all_eq is later needed to decide if restrictions can be used
in an indexed query.
---
cql3/restrictions/primary_key_restrictions.hh | 3 +++
cql3/restrictions/single_column_primary_key_restrictions.hh | 4 ++++
2 files changed, 7 insertions(+)

diff --git a/cql3/restrictions/primary_key_restrictions.hh b/cql3/restrictions/primary_key_restrictions.hh
index 2437e13f5..4ba401f23 100644
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh
@@ -100,6 +100,9 @@ class primary_key_restrictions: public abstract_restriction,
bool has_unrestricted_components(const schema& schema) const;

virtual bool needs_filtering(const schema& schema) const;
+ virtual bool is_all_eq() const {
+ return false;
+ }
};

template<>
diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh b/cql3/restrictions/single_column_primary_key_restrictions.hh
index eb6de4413..974fcfac3 100644
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -123,6 +123,10 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
return _in;
}

+ virtual bool is_all_eq() const override {

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:06 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
If both index and partition key is used in a query, it should not
require filtering, because indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to index query.
---
cql3/statements/select_statement.cc | 35 +++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/cql3/statements/select_statement.cc b/cql3/statements/select_statement.cc
index 442075d8a..17b0a7cc0 100644
@@ -800,7 +801,33 @@ read_posting_list(service::storage_proxy& proxy,
partition_ranges.emplace_back(range);
}
}
+
partition_slice_builder partition_slice_builder{*view_schema};
+
+ if (!base_restrictions->has_partition_key_unrestricted_components()) {
+ auto single_pk_restrictions = dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<partition_key>>(base_restrictions->get_partition_key_restrictions());
+ // Only EQ restrictions on base partition key can be used in an index view query
+ if (single_pk_restrictions && single_pk_restrictions->is_all_eq()) {
+ auto clustering_restrictions = ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema, *single_pk_restrictions);
+ // Computed token column needs to be added to index view restrictions
+ const column_definition& token_cdef = *view_schema->clustering_key_columns().begin();
+ auto base_pk = partition_key::from_optional_exploded(*base_schema, base_restrictions->get_partition_key_restrictions()->values(options));
+ bytes token_value = dht::global_partitioner().token_to_bytes(dht::global_partitioner().get_token(*base_schema, base_pk));
+ auto token_restriction = ::make_shared<restrictions::single_column_restriction::EQ>(token_cdef, ::make_shared<cql3::constants::value>(cql3::raw_value::make_value(token_value)));
+ clustering_restrictions->merge_with(token_restriction);
+
+ if (!base_restrictions->has_unrestricted_clustering_columns()) {
+ auto single_ck_restrictions = dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<clustering_key>>(base_restrictions->get_partition_key_restrictions());
+ if (single_ck_restrictions) {
+ auto clustering_restrictions_from_base = ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema, *single_pk_restrictions);
+ clustering_restrictions->merge_with(clustering_restrictions_from_base);
+ }
+ }
+
+ partition_slice_builder.with_ranges(clustering_restrictions->bounds_ranges(options));
+ }
+ }
+
auto cmd = ::make_lw_shared<query::read_command>(
view_schema->id(),
view_schema->version(),
@@ -828,7 +855,7 @@ indexed_table_select_statement::find_index_partition_ranges(service::storage_pro
schema_ptr view = get_index_schema(proxy, _index, _schema, state.get_trace_state());
auto now = gc_clock::now();
auto timeout = db::timeout_clock::now() + options.get_timeout_config().*get_timeout_config_selector();
- return read_posting_list(proxy, view, _schema, _index, _restrictions->index_restrictions(), options, get_limit(options), state, now, timeout).then(
+ return read_posting_list(proxy, view, _schema, _index, _restrictions, options, get_limit(options), state, now, timeout).then(
[this, now, &options, view] (service::storage_proxy::coordinator_query_result qr) {
std::vector<const column_definition*> columns;
for (const column_definition& cdef : _schema->partition_key_columns()) {
@@ -880,7 +907,7 @@ indexed_table_select_statement::find_index_clustering_rows(service::storage_prox
schema_ptr view = get_index_schema(proxy, _index, _schema, state.get_trace_state());
auto now = gc_clock::now();
auto timeout = db::timeout_clock::now() + options.get_timeout_config().*get_timeout_config_selector();
- return read_posting_list(proxy, view, _schema, _index, _restrictions->index_restrictions(), options, get_limit(options), state, now, timeout).then(
+ return read_posting_list(proxy, view, _schema, _index, _restrictions, options, get_limit(options), state, now, timeout).then(
[this, now, &options, view] (service::storage_proxy::coordinator_query_result qr) {
std::vector<const column_definition*> columns;
for (const column_definition& cdef : _schema->partition_key_columns()) {
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:07 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com
If full partition key (or full primary key) is used in an indexed
query, it should not require filtering, because queries like that
can be efficiently narrowed down with stricter index restrictions.
---
cql3/restrictions/statement_restrictions.cc | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/cql3/restrictions/statement_restrictions.cc b/cql3/restrictions/statement_restrictions.cc
index 30b272949..d90aaa370 100644
--- a/cql3/restrictions/statement_restrictions.cc
+++ b/cql3/restrictions/statement_restrictions.cc
@@ -430,7 +430,15 @@ bool statement_restrictions::need_filtering() const {
for (auto&& restrictions : _index_restrictions) {
number_of_restricted_columns_for_indexing += restrictions->size();
}
- int number_of_all_restrictions = _partition_key_restrictions->size() + _clustering_columns_restrictions->size() + _nonprimary_key_restrictions->size();
+
+ int number_of_filtering_restrictions = _nonprimary_key_restrictions->size();
+ // If the whole partition key is restricted, it does not imply filtering
+ if (_partition_key_restrictions->has_unrestricted_components(*_schema) || !_partition_key_restrictions->is_all_eq()) {
+ number_of_filtering_restrictions += _partition_key_restrictions->size();
+ if (_clustering_columns_restrictions->has_unrestricted_components(*_schema)) {
+ number_of_filtering_restrictions += _clustering_columns_restrictions->size();
+ }
+ }

if (_partition_key_restrictions->is_multi_column() || _clustering_columns_restrictions->is_multi_column()) {
// TODO(sarna): Implement ALLOW FILTERING support for multi-column restrictions - return false for now
@@ -442,7 +450,7 @@ bool statement_restrictions::need_filtering() const {
|| (number_of_restricted_columns_for_indexing == 0 && _partition_key_restrictions->empty() && !_clustering_columns_restrictions->empty())
|| (number_of_restricted_columns_for_indexing != 0 && _nonprimary_key_restrictions->has_multiple_contains())
|| (number_of_restricted_columns_for_indexing != 0 && !_uses_secondary_indexing)
- || (_uses_secondary_indexing && number_of_all_restrictions > 1);
+ || (_uses_secondary_indexing && number_of_filtering_restrictions > 1);
}

void statement_restrictions::validate_secondary_index_selections(bool selects_only_static_columns) {
--
2.14.4

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 18, 2018, 12:54:09 PM7/18/18
to scylladb-dev@googlegroups.com, Piotr Sarna, avi@scylladb.com, duarte@scylladb.com

Duarte Nunes

<duarte@scylladb.com>
unread,
Jul 18, 2018, 2:02:04 PM7/18/18
to Piotr Sarna, scylladb-dev@googlegroups.com
I'm not sure I follow. Don't we now have infrastructure to detect this? How do we know whether a query needs filtering?

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 19, 2018, 4:55:09 AM7/19/18
to Duarte Nunes, scylladb-dev@googlegroups.com
I meant that back when I was implementing first ALLOW FILTERING series,
I lacked a neat API to extract the biggest prefix part out of all
primary_key_restrictions<clustering_key>, so I decided to leave it for
later in order to narrow down the scope of my series.
And here in this case, I think there is no API to check that "these
primary key restrictions on a clustering key are also a prefix".
So, since the two look related, I figured that I could just send a
follow-up miniseries that creates some kind of prefix-related API for
primary key restrictions and solves both of these issues. But that was
just my vague idea, I'm not going to insist that this is out of scope
for my current series (because it kind of is in scope) - I can just add
the prefix logic to v3.

Duarte Nunes

<duarte@scylladb.com>
unread,
Jul 19, 2018, 5:10:27 AM7/19/18
to Piotr Sarna, scylladb-dev@googlegroups.com
Alright, can be done as a follow-up.

Piotr Sarna

<sarna@scylladb.com>
unread,
Jul 19, 2018, 5:16:43 AM7/19/18
to Duarte Nunes, scylladb-dev@googlegroups.com

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:22 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: make primary key restrictions' values unambiguous

using directive must be used to disambiguate the overridden method.

---
diff --git a/cql3/restrictions/primary_key_restrictions.hh
b/cql3/restrictions/primary_key_restrictions.hh
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh
@@ -88,6 +88,7 @@ public:

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:23 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: add apply_to() method to single column restriction

This method allows copying single column restriction,
possibly with a new column definition.

---
diff --git a/cql3/restrictions/single_column_restriction.hh
b/cql3/restrictions/single_column_restriction.hh
--- a/cql3/restrictions/single_column_restriction.hh
+++ b/cql3/restrictions/single_column_restriction.hh
@@ -95,6 +95,7 @@ public:
virtual bool is_supported_by(const secondary_index::index& index)
const = 0;
using abstract_restriction::is_satisfied_by;
virtual bool is_satisfied_by(bytes_view data, const query_options&
options) const = 0;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) = 0;
#if 0
/**
* Check if this type of restriction is supported by the specified
index.
@@ -169,6 +170,9 @@ public:
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options&
options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) override {
+ return ::make_shared<EQ>(cdef, _value);
+ }

#if 0
@Override
@@ -205,6 +209,9 @@ public:
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options&
options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) override {
+ throw std::logic_error("IN superclass should never be cloned
directly");
+ }

#if 0
@Override
@@ -239,6 +246,10 @@ public:
virtual sstring to_string() const override {
return sprint("IN(%s)", std::to_string(_values));
}
+
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) override {
+ return ::make_shared<IN_with_values>(cdef, _values);
+ }
};

class single_column_restriction::IN_with_marker : public IN {
@@ -264,6 +275,10 @@ public:
virtual sstring to_string() const override {
return "IN ?";
}
+
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) override {
+ return ::make_shared<IN_with_marker>(cdef, _marker);
+ }
};

class single_column_restriction::slice : public single_column_restriction {
@@ -275,6 +290,11 @@ public:
, _slice(term_slice::new_instance(bound, inclusive,
std::move(term)))
{ }

+ slice(const column_definition& column_def, term_slice slice)
+ : single_column_restriction(column_def)
+ , _slice(slice)
+ { }
+
virtual bool uses_function(const sstring& ks_name, const sstring&
function_name) const override {
return (_slice.has_bound(statements::bound::START) &&
abstract_restriction::term_uses_function(_slice.bound(statements::bound::START),
ks_name, function_name))
|| (_slice.has_bound(statements::bound::END) &&
abstract_restriction::term_uses_function(_slice.bound(statements::bound::END),
ks_name, function_name));
@@ -361,6 +381,9 @@ public:
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options&
options) const override;
+ virtual ::shared_ptr<single_column_restriction> apply_to(const
column_definition& cdef) override {
+ return ::make_shared<slice>(cdef, _slice);
+ }
};

// This holds CONTAINS, CONTAINS_KEY, and map[key] = value restrictions
because we might want to have any combination of them.
@@ -483,6 +506,9 @@ public:

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:24 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: add explicit conversion between key restrictions

Partition and clustering key restrictions sometimes need to be converted
and this commit provides a way to do that.

---
diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh
b/cql3/restrictions/single_column_primary_key_restrictions.hh
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -63,6 +63,8 @@ class single_column_primary_key_restrictions : public
primary_key_restrictions<V
using range_type = query::range<ValueType>;
using range_bound = typename range_type::bound;
using bounds_range_type = typename
primary_key_restrictions<ValueType>::bounds_range_type;
+ template<typename OtherValueType>
+ friend class single_column_primary_key_restrictions;
private:
schema_ptr _schema;
bool _allow_filtering;
@@ -80,6 +82,27 @@ public:

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:26 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: add is_all_eq to primary key restrictions

is_all_eq is later needed to decide if restrictions can be used
in an indexed query.

---
diff --git a/cql3/restrictions/primary_key_restrictions.hh
b/cql3/restrictions/primary_key_restrictions.hh
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh
@@ -100,6 +100,9 @@ public:
bool has_unrestricted_components(const schema& schema) const;

virtual bool needs_filtering(const schema& schema) const;
+ virtual bool is_all_eq() const {
+ return false;
+ }
};

template<>
diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh
b/cql3/restrictions/single_column_primary_key_restrictions.hh
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh
@@ -123,6 +123,10 @@ public:

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:27 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: use primary key restrictions in filtering index queries

If both index and partition key is used in a query, it should not
require filtering, because indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to index query.

---
diff --git a/cql3/statements/select_statement.cc
b/cql3/statements/select_statement.cc
@@ -800,7 +801,33 @@ read_posting_list(service::storage_proxy& proxy,
partition_ranges.emplace_back(range);
}
}
+
partition_slice_builder partition_slice_builder{*view_schema};
+
+ if (!base_restrictions->has_partition_key_unrestricted_components()) {
+ auto single_pk_restrictions =
dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<partition_key>>(base_restrictions->get_partition_key_restrictions());
+ // Only EQ restrictions on base partition key can be used in an
index view query
+ if (single_pk_restrictions && single_pk_restrictions->is_all_eq())
{
+ auto clustering_restrictions
= ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema,
*single_pk_restrictions);
+ // Computed token column needs to be added to index view
restrictions
+ const column_definition& token_cdef =
*view_schema->clustering_key_columns().begin();
+ auto base_pk =
partition_key::from_optional_exploded(*base_schema,
base_restrictions->get_partition_key_restrictions()->values(options));
+ bytes token_value =
dht::global_partitioner().token_to_bytes(dht::global_partitioner().get_token(*base_schema,
base_pk));
+ auto token_restriction
= ::make_shared<restrictions::single_column_restriction::EQ>(token_cdef, ::make_shared<cql3::constants::value>(cql3::raw_value::make_value(token_value)));
+ clustering_restrictions->merge_with(token_restriction);
+
+ if (!base_restrictions->has_unrestricted_clustering_columns())
{
+ auto single_ck_restrictions =
dynamic_pointer_cast<restrictions::single_column_primary_key_restrictions<clustering_key>>(base_restrictions->get_partition_key_restrictions());
+ if (single_ck_restrictions) {
+ auto clustering_restrictions_from_base
= ::make_shared<restrictions::single_column_primary_key_restrictions<clustering_key_prefix>>(view_schema,
*single_pk_restrictions);
+
clustering_restrictions->merge_with(clustering_restrictions_from_base);
+ }
+ }
+
+
partition_slice_builder.with_ranges(clustering_restrictions->bounds_ranges(options));
+ }
+ }
+
auto cmd = ::make_lw_shared<query::read_command>(
view_schema->id(),
view_schema->version(),
@@ -828,7 +855,7 @@
indexed_table_select_statement::find_index_partition_ranges(service::storage_pro
schema_ptr view = get_index_schema(proxy, _index, _schema,
state.get_trace_state());
auto now = gc_clock::now();
auto timeout = db::timeout_clock::now() +
options.get_timeout_config().*get_timeout_config_selector();
- return read_posting_list(proxy, view, _schema, _index,
_restrictions->index_restrictions(), options, get_limit(options), state,
now, timeout).then(
+ return read_posting_list(proxy, view, _schema, _index, _restrictions,
options, get_limit(options), state, now, timeout).then(
[this, now, &options, view]
(service::storage_proxy::coordinator_query_result qr) {
std::vector<const column_definition*> columns;
for (const column_definition& cdef :
_schema->partition_key_columns()) {
@@ -880,7 +907,7 @@

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:28 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

cql3: make index+primary key restrictions filtering-independent

If full partition key (or full primary key) is used in an indexed
query, it should not require filtering, because queries like that
can be efficiently narrowed down with stricter index restrictions.

---
diff --git a/cql3/restrictions/statement_restrictions.cc
b/cql3/restrictions/statement_restrictions.cc

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 11:55:30 AM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: next

tests: add index + partition key test

Tests covering querying both index and partition keys are added
- it's checked that such queries do not require filtering.

---
diff --git a/tests/secondary_index_test.cc b/tests/secondary_index_test.cc

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:13 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

cql3: make primary key restrictions' values unambiguous

using directive must be used to disambiguate the overridden method.

---
diff --git a/cql3/restrictions/primary_key_restrictions.hh
b/cql3/restrictions/primary_key_restrictions.hh
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:14 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:15 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

cql3: add explicit conversion between key restrictions

Partition and clustering key restrictions sometimes need to be converted
and this commit provides a way to do that.

---
diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh
b/cql3/restrictions/single_column_primary_key_restrictions.hh
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:17 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

cql3: add is_all_eq to primary key restrictions

is_all_eq is later needed to decide if restrictions can be used
in an indexed query.

---
diff --git a/cql3/restrictions/primary_key_restrictions.hh
b/cql3/restrictions/primary_key_restrictions.hh
--- a/cql3/restrictions/primary_key_restrictions.hh
+++ b/cql3/restrictions/primary_key_restrictions.hh
@@ -100,6 +100,9 @@ public:
bool has_unrestricted_components(const schema& schema) const;

virtual bool needs_filtering(const schema& schema) const;
+ virtual bool is_all_eq() const {
+ return false;
+ }
};

template<>
diff --git a/cql3/restrictions/single_column_primary_key_restrictions.hh
b/cql3/restrictions/single_column_primary_key_restrictions.hh
--- a/cql3/restrictions/single_column_primary_key_restrictions.hh
+++ b/cql3/restrictions/single_column_primary_key_restrictions.hh

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:18 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:19 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

Commit Bot

<bot@cloudius-systems.com>
unread,
Jul 19, 2018, 2:43:20 PM7/19/18
to scylladb-dev@googlegroups.com, Piotr Sarna
From: Piotr Sarna <sa...@scylladb.com>
Committer: Piotr Sarna <sa...@scylladb.com>
Branch: master

Duarte Nunes

<duarte@scylladb.com>
unread,
Oct 3, 2018, 3:46:31 PM10/3/18
to Piotr Sarna, scylladb-dev@googlegroups.com, Commit Bot
Why is this so different than _partition_key_restrictions->need_filtering(), which does:

inline bool primary_key_restrictions<partition_key>::needs_filtering(const schema& schema) const  {
    return !empty() && !is_on_token() && (has_unrestricted_components(schema) || is_contains() || is_slice());
}

?

Piotr Sarna

<sarna@scylladb.com>
unread,
Oct 4, 2018, 3:47:10 AM10/4/18
to Duarte Nunes, scylladb-dev@googlegroups.com, Commit Bot
By just looking at it I don't see any caveats, so it could be replaced
with primary_key_restrictions<partition_key>::needs_filtering() indeed.
(and as for testing the change, this scenario is tested by
secondary_index_test.cc::test_index_with_partition_key)

Duarte Nunes

<duarte@scylladb.com>
unread,
Oct 4, 2018, 4:17:53 AM10/4/18
to Piotr Sarna, Commit Bot, scylladb-dev@googlegroups.com
Ch 

What I don't understand is why they check different things? Like contains and is token :/

Piotr Sarna

<sarna@scylladb.com>
unread,
Oct 4, 2018, 4:32:53 AM10/4/18
to Duarte Nunes, Commit Bot, scylladb-dev@googlegroups.com

Oh, alright. (is_contains() || is_slice()) is roughly an equivalent of !is_all_eq(), but as for the token part - when we consider filtering + SI working together and the whole partition key is restricted, then we can use it in the indexed query and avoid filtering - by building index view key restrictions from base primary key + computed token. We do not have the code that creates it from base token restriction, so that's why the condition above checks for exactly what we can handle. What should be done is implementing this optimization for a token restriction too, which is conveniently the first clustering column of the index view table, and then the code above should just use primary_key_restrictions<partition_key>::needs_filtering().

Duarte Nunes

<duarte@scylladb.com>
unread,
Oct 4, 2018, 5:09:07 AM10/4/18
to Piotr Sarna, Commit Bot, scylladb-dev@googlegroups.com
I see. Can you open an issue about this so we don't forget?

Piotr Sarna

<sarna@scylladb.com>
unread,
Oct 4, 2018, 5:23:39 AM10/4/18
to Duarte Nunes, Commit Bot, scylladb-dev@googlegroups.com
Reply all
Reply to author
Forward
0 new messages