Does it ever make sense to turn off caching of BigQuery queries?


Tal Shprecher

Jul 19, 2017, 4:18:37 PM
to Google Cloud Developers
I may be seeing stale data returned from BigQuery, but I'm not sure. The BigQuery documentation states that queries are not cached "If any of the referenced tables or logical views have changed since the results were previously cached."


Reading the documentation, I get the impression that caching affects performance and cost but not correctness, so it can only help. If I run a query that is served from the cache, should I still expect to see the latest values from my table, even if the table was updated a few seconds ago? I'd like to get a better idea of when turning off the cache makes sense.

While investigating, I came across this old post and am wondering if there are open issues around cache coherence.


Thanks,

Tal


Jordan (Cloud Platform Support)

Jul 20, 2017, 3:57:20 PM
to Google Cloud Developers
You are correct: a hash is computed from the 'last modified time' of the tables referenced by a cached query result. If the underlying data of a referenced table is modified, that table's 'last modified time' changes, which effectively flushes the now out-of-date cached query result.
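
To make the invalidation idea concrete, here is a toy model in Python. It is only a sketch of the mechanism described above, not BigQuery's actual implementation: the class `ToyQueryCache`, the table name, and the timestamps are all made up for illustration. The cache key is a hash of the query text plus each referenced table's last-modified time, so any table change produces a different key and the old entry is simply never matched again.

```python
import hashlib


class ToyQueryCache:
    """Toy model of a result cache keyed on table 'last modified time'."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def _key(sql, tables):
        # Hash the query text together with every referenced table's
        # last-modified timestamp; a table change yields a new key.
        h = hashlib.sha256(sql.encode("utf-8"))
        for name in sorted(tables):
            h.update(f"|{name}@{tables[name]}".encode("utf-8"))
        return h.hexdigest()

    def run(self, sql, tables, execute):
        """Return (result, cache_hit) for the given query."""
        key = self._key(sql, tables)
        if key in self._results:
            return self._results[key], True
        result = execute()  # cache miss: actually run the query
        self._results[key] = result
        return result, False


# Hypothetical table name mapped to a made-up last-modified timestamp.
tables = {"dataset.events": 1500500000}
cache = ToyQueryCache()
sql = "SELECT COUNT(*) FROM dataset.events"

first = cache.run(sql, tables, lambda: 10)   # miss: query runs
second = cache.run(sql, tables, lambda: 11)  # hit: cached 10 is returned
tables["dataset.events"] = 1500500042        # table modified
third = cache.run(sql, tables, lambda: 11)   # miss again: fresh result
```

In this model a stale read is only possible if the recorded last-modified time fails to change when the data does, which is the kind of case worth reporting.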

If you are seeing stale data (that is, queries that return from the cache even though their referenced tables have changed), I recommend filing an issue report with the BigQuery team so that they can investigate the cause. Providing the job IDs of queries that returned stale data will help their investigation.
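
For gathering that evidence, a rough sketch using the Python client library (`google-cloud-bigquery`) might look like the following. It assumes the library is installed and default credentials are configured; the function name `run_and_report` is just illustrative. It runs a query with the cache enabled or disabled and reports the job ID along with whether the result came from the cache.

```python
def run_and_report(sql, use_cache=True):
    """Run a query and report its job ID and whether it was a cache hit.

    Assumes google-cloud-bigquery is installed and application default
    credentials are available; the function name is illustrative only.
    """
    # Imported lazily so the module loads even without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(use_query_cache=use_cache)
    job = client.query(sql, job_config=job_config)
    rows = list(job.result())  # block until the query finishes
    return {"job_id": job.job_id, "cache_hit": job.cache_hit, "rows": rows}
```

Running the same statement once with `use_cache=True` and once with `use_cache=False`, then comparing the rows, is one way to check whether a cached result really is stale before filing a report.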