Hi,
does the max_cached_partition_size_in_bytes setting (found at https://www.scylladb.com/2018/07/26/how-scylla-data-cache-works/) still exist in Scylla? I cannot find it in the source code and wonder if it perhaps got removed? If so, is there an alternative to it?
It was removed, and partitions of any size should be cached. Note that caching happens at row granularity: some rows in a partition can be cached while others are not.
My issue is that I have large partitions that are not cached any more. But I have plenty of memory for them to be cached.
Please provide more details. Any repeated read of a row should hit the cache, unless enough time passed between repetitions for the row to be aged out. You should be able to see whether the cache is hit or not by using tracing.
regards,
Christian
Hi Avi,
perhaps my read pattern is the problem: I am reading random rows from these large partitions, but rarely the same row twice, so there are no cache hits. With a Cassandra-style block cache I would probably have cache hits, because blocks contain multiple rows.
Scylla only reads the rows that are asked for (sometimes it has to over-read to align with sector boundaries, but it doesn't over-parse from sstables, so it never sees those extra rows).
Is there any way to make Scylla's caching more aggressive? E.g. to make it cache all rows it loaded from disk and not just the one that was requested.
Well, you can read all the data in a partition scan or a full scan with CL=ALL. Of course that's not a good solution.
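A sketch in cqlsh, with placeholder names; reading at CL=ALL touches every replica, so each node's cache gets populated:

    CONSISTENCY ALL;
    -- warm the cache for a single partition:
    SELECT * FROM my_ks.my_table WHERE pk = 42;
    -- or warm everything with a full scan:
    SELECT * FROM my_ks.my_table;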
I should have plenty of memory for all the data, but it seems I have to first read everything once to have it cached. And since the application is doing single-row reads, this takes time.
Another issue might be that a lot of queries are for missing rows. I assume misses for missing rows are not cached?
If you read single rows (ck=5), then a miss isn't cached. If you read a range (ck >= 3 AND ck <= 7) and later read a single row, it will detect the missing row in cache (if it has entries for ck=4 and ck=6, it also knows that there is nothing between them).
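To illustrate with the same placeholder table, assuming the partition holds ck=4 and ck=6 but not ck=5:

    -- populates the cache with ck=4, ck=6 and the knowledge
    -- that nothing exists between them:
    SELECT * FROM my_ks.my_table WHERE pk = 42 AND ck >= 3 AND ck <= 7;
    -- this miss can now be answered from cache alone:
    SELECT * FROM my_ks.my_table WHERE pk = 42 AND ck = 5;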
Is the row cache populated also by writes?
Yes, but with limitations. If a row is present in cache, it can be updated. If a row is not present in cache but is present in sstables, it cannot be updated from a write. The reason is that we might need to merge the row with data from sstables, and we don't want to issue an sstable read just for that.
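A sketch of the rule above, using the same placeholder schema:

    -- ck=5 was read earlier and sits in cache, so this write
    -- also updates the cached copy:
    UPDATE my_ks.my_table SET v = 'new' WHERE pk = 42 AND ck = 5;
    -- ck=9 exists only in sstables; this write goes to the memtable
    -- but does not populate a cache entry for ck=9:
    UPDATE my_ks.my_table SET v = 'new' WHERE pk = 42 AND ck = 9;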
I assume there is no way to enable Linux block caching (the page cache) in Scylla?
No.
I will keep an eye on scylla_cache_row_hits/misses...
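Those counters are exported on Scylla's Prometheus endpoint (port 9180 by default, if I remember right), so you can watch them directly:

    curl -s http://localhost:9180/metrics | grep -E 'scylla_cache_row_(hits|misses)'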
How large is your data? Total data size, average row size, average rows per partition? Scylla was designed for workloads where the data is much larger than memory, and so page caching isn't effective.
regards,
Christian
We could handle such workloads better by using the spare memory for sstable block caches. This way the cache would get warmer faster if the workload fits in memory.
https://github.com/scylladb/scylla/issues/363
For such a small table, we can also scan it during startup, based on a table setting (prewarm cache).
But if it's so small, then I expect it will be brought into cache after a short while. Maybe it's the lack of negative entries in some circumstances that prevents the cache from being effective. Maybe we can convert a cache miss for a row or range into a dummy range/row.
On Wednesday, 3 February 2021 at 13:42:58 UTC+1 Avi Kivity wrote:
Scylla only reads the rows that are asked for (sometimes it has to over-read to align with sector boundaries, but it doesn't over-parse from sstables, so it never sees those extra rows).
In my special case it would be good if Scylla could be configured to also parse & cache those over-read rows (more aggressive caching). For me it's reading data from disk like crazy, but throws most of it away. But it might be a very special case :-)
Well, you can read all the data in a partition scan or a full scan with CL=ALL. Of course that's not a good solution.
yes :-)
Do try it, it's an interesting experiment. Also please share cache hit/miss statistics before and after.
If you read single rows (ck=5), then a miss isn't cached. If you read a range (ck >= 3 AND ck <= 7) and later read a single row, it will detect the missing row in cache (if it has entries for ck=4 and ck=6, it also knows that there is nothing between them).
If at least the miss-information from the over-read was available, that would help a lot. Something like this:
- User requests ck=5
- A block of data is read, which contains ck=1, 5, 9
- Data for ck=5 is cached; misses for ck=2-3 and ck=6-8 are cached
If this was the case, then any updates to the other keys would be cached on write.
Unfortunately this is counterproductive for other workloads. Because of the way that data is spread across many sstables, there is a lot of work needed to gather the information.
We might opportunistically over-read and cache, but for other workloads it would just evict stuff from cache. We'll need some way for the user to describe what kind of locality to expect. For your case, do you have anything more specific than "cache everything"?
How large is your data? Total data size, average row size, average rows per partition?
It's quite small.
Table: boe3
SSTable count: 21
SSTables in each level: [21/4]
Space used (live): 3308337317
Space used (total): 3308337317
How much memory do you have? After decompression this is 14GB, and the in-memory representation has significant overhead.
btw, Scylla Enterprise has an in-memory feature which keeps sstables mirrored in memory. This isn't a cache; the sstables are permanently memory-resident (as well as stored on disk).
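If I'm reading the Enterprise docs correctly, it is enabled per table with a schema property along these lines (syntax from memory, so please double-check against the documentation):

    CREATE TABLE my_ks.my_table (
        pk int,
        ck int,
        v  text,
        PRIMARY KEY (pk, ck)
    ) WITH in_memory = 'true'
      AND compaction = {'class': 'InMemoryCompactionStrategy'};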