Manually invalidating a cached SQL result set

196 views
Skip to first unread message

Vahrokh Vain

unread,
Oct 4, 2016, 3:32:55 PM10/4/16
to Fat-Free Framework
Hello,

I am creating a very complex application that, among other things, has got a quite SQL heavy users privileges system.

Loading user privileges with a cached query sounds a good idea... however administrators have a backend facility where they can change other users privileges. When this happens, the privileges cached query result set should be invalidated and the new privileges loaded (and cached again).

Is there a way to achieve this? Or, at least, a way to get the last query's cache hash code so I can see if I can remove it by hand?

Thanks in advance.

xfra35

unread,
Oct 4, 2016, 5:33:40 PM10/4/16
to Fat-Free Framework
Hi,

In order to invalidate the cached result of a query, you can set the TTL to a negative value. E.g:

function loadPrivileges($refresh=FALSE) {
  $query
= 'SELECT * FROM privileges WHERE ....';
 
return $db->exec($query,$args,$refresh ? -1 : 86400);
}

NB: setting the TTL to 0 would fetch a fresh result without updating existing cache data, so subsequent queries would keep on getting an outdated result.

ikkez

unread,
Oct 5, 2016, 7:57:07 AM10/5/16
to Fat-Free Framework
I did this for a custom DB wrapper once. I'll add this to the F3 db later the day ;)

Vahrokh Vain

unread,
Oct 5, 2016, 8:09:17 AM10/5/16
to Fat-Free Framework
Thank you both!

ikkez

unread,
Oct 11, 2016, 6:36:40 PM10/11/16
to Fat-Free Framework
feature added here:
https://github.com/bcosca/fatfree-core/commit/3122b5881ee792f4ab3ad71522444f1900e7820c

you can now keep track of used cache keys and invalidate them if needed, i.e.:

$results = $mapper->find(['foo > ?',5],null,3600);
$f3
->merge('cache_tags.images',$f3->get('DB')->hash(), TRUE);

and remove them with

if ($f3->exists('cache_tags.images', $keys))
   
foreach($keys as $key)
       
\Cache::instance()->clear($key);

you should care about where to store the cache tags across requests yourself. the new method could also return more than one hash.
Adding TRUE as 1st argument will clear the memorized hash keys in the db object, in case you want to measure a specific action.




Am Mittwoch, 5. Oktober 2016 14:09:17 UTC+2 schrieb Vahrokh Vain:
Thank you both!

xfra35

unread,
Oct 12, 2016, 5:05:58 AM10/12/16
to f3-fra...@googlegroups.com
That doesn't give a fine-grained control such as with the -1 ttl, because:
  • you need to run the query first, before being able to clear it
  • you never know exactly which queries are actually removed from the cache (all since the beginning of the request, not only the last one)
What's wrong with the -1 ttl? It's easy to use, fits perfectly well with mappers and gives choice between two slightly different behaviors (0 and -1 ttl):

$mapper->find($filter,$options,86400); // usual setting for front page

$mapper
->find($filter,$options,0); // get a fresh set, but don't invalidate existing cached data

$mapper
->find($filter,$options,-1); // get a fresh set + invalidate existing cached data

It just needs to be documented ^^

ikkez

unread,
Oct 12, 2016, 7:16:47 AM10/12/16
to f3-fra...@googlegroups.com
I do not say that the -1 ttl approach is not good. It's there for a reason and it's a good thing. But I think there are cases when this is not easy to manage.
For example you have a complex frontend, like a shop, and a lot of widgets displaying products. Probably your models are built to use caching all over the place. But when you now change the product price, how do you want to solve to call all queries again where you filtered or received that specific product, like list-view, best-sellers widget, special-sales widgets, etc. simply calling the model load/find with -1 ttl seems like an impossible task in such a case (only call it once! and run it with the same $args as the widgets, which probably used random values). Instead you could have a cached array variable that contains all relevant cache entries, keyed by usage tags or  product ids and you are ready to clear those specific cache entries when needed (I'm currently doing it this way for a shop script, though it does not use the SQL db, but works neat)

Vahrokh Vain

unread,
Oct 12, 2016, 10:05:19 AM10/12/16
to Fat-Free Framework
Hello

First of all thank you again!

I have a little curiosity: in which situations would I get an array by calling:

$f3->get('DB')->hash()

?

Query and subquery?

xfra35

unread,
Oct 12, 2016, 11:30:10 AM10/12/16
to f3-fra...@googlegroups.com
@Vahrokh:

$customer=new DB\SQL\Mapper($db,'customers');
$customer
->find($filter);
echo count
($db->hash()); // 2 (one for the schema and one for the query)
$customer
->getOrders();
echo count
($db->hash()); // 3 (one for the schema, two for the queries)


@ikkez

What you're saying is that we need to add an extra statement after each cached query. Right?

Like:
// controller1
$mapper
->find($filter1,$opts,$ttl);

$f3
->merge('cache_tags.images',$f3->get('DB')->hash(),TRUE);
// controller2
$mapper
->find($filter2,$opts,$ttl);

$f3
->merge('cache_tags.images',$f3->get('DB')->hash(),TRUE);
// controller3
$mapper
->find($filter3,$opts,$ttl);

$f3
->merge('cache_tags.images',$f3->get('DB')->hash(),TRUE);

That looks heavy. Not to mention that we need to persist the 'cache_tags' somehow. Also it doesn't seem reliable as there could be some unexpected queries embedded in hash(), such as the ones induced by schema() or the onload() hook.

What about giving a bit of control on the cache key suffix, such as:

$db->exec($sql,$args,$ttl,'product'); // <-- tag that query as 'product' related
$mapper
->find($filter1,$opts,$ttl,'product'); // idem
$mapper
->find($filter2,$opts,$ttl,'product');
$mapper
->find($filter3,$opts,$ttl,'product');

Now all product-related queries would be cached with the suffix .product.sql and could easy be invalidated with Cache->reset().

Ideally it would be nice to be able to give multiple tags (for join queries for example), but Cache->reset() can't handle it.


ikkez

unread,
Oct 12, 2016, 5:31:22 PM10/12/16
to Fat-Free Framework
What you're saying is that we need to add an extra statement after each cached query. Right?

yes, but of course only when you need a cache tag for a reason.

Not to mention that we need to persist the 'cache_tags' somehow.

This could depend on the business code, since the hash is always the same for the same $args that are used. So clearing could also happen in an afterroute hook for example, if the use case fit, but yes maybe persisting them is needed.
 
 
Also it doesn't seem reliable as there could be some unexpected queries embedded in hash(), such as the ones induced by schema() or the onload() hook.

yes that's true, hence it could become even heavier by appending an additional call to clear the tags first, like:

$f3->get('DB')->hash(TRUE);
$mapper
->find($filter2,$opts,$ttl);

$f3
->merge('cache_tags.images',$f3->get('DB')->hash(),TRUE);
 
What about giving a bit of control on the cache key suffix

I really like the idea, and it makes much more sense. What about adjusting the $ttl for this purpose to accept an array? maybe like:
 
$db->exec($sql,$args,[$ttl,'product']);
 
$mapper->find($filter1,$opts,[$ttl,'product']);

xfra35

unread,
Oct 13, 2016, 2:46:04 AM10/13/16
to Fat-Free Framework
+1 for the array

This way, we don't break backward compatibility.

Vahrokh Vain

unread,
Oct 13, 2016, 1:06:49 PM10/13/16
to Fat-Free Framework
Hello,

I don't know about the Mapper a lot. I am using exec() because I need the top possible speed with no middle layer.
My question comes up because if we have got an array that grows by each new query, how would I deal in case I need to perform a lot of simple queries? Am I going to get a gargantuan array of hash codes?

Vahrokh Vain

unread,
Nov 13, 2016, 1:29:24 PM11/13/16
to f3-fra...@googlegroups.com
Hello again,

I am using this magnificent feature since a while.

However today I have implemented a CLI based script and have noticed that this simple code works when called from a web page but not under CLI:

function getBaseCarPlateCodes(&$cacheTags)
{
    $db
= $this->getDb();
    $sql
= 'SELECT CodiceProvinciaID, cp_SiglaAuto ' .
       
'FROM BASE_CodiciProvince ' .
       
'ORDER BY cp_SiglaAuto';
    $result
= $db->exec($sql, null, 3600);
    $cacheTags
= $db->hash();
   
// var_dump($cacheTags);
   
return $result;
}

In fact, when being called from a web page, $cacheTags gets populated, when called from CLI it does not and the var_dump($cacheTags); shows that.
$result gets correctly filled by the query results regardless of being called from CLI or not.

ikkez

unread,
Nov 13, 2016, 5:39:48 PM11/13/16
to Fat-Free Framework
Not sure what's wrong here, but this feature has been refactored and DB->hash is not available anymore... instead it's now:

$db->exec($sql, null, [3600,'customCacheTag']);

Vahrokh Vain

unread,
Nov 14, 2016, 7:29:52 AM11/14/16
to f3-fra...@googlegroups.com
Hello,

I am not sure about how to delete a tag created this new way.
The old way, all I had to do was to store the generated tag name somewhere.

I have looked in the changelog and I see: "NEW: support for cache tags in SQL" but no further details.
I have looked at the source code and I see the tag seems to get some leading prefix generated, then the "customCacheTag" appended.
I have looked at the source code of the "clear()" function but I have found nothing new.

So... how do I get the generated tag?

Or, better, what's the code to invalidate / force delete customCacheTag from the cache?

Thanks again in advance!

ikkez

unread,
Nov 14, 2016, 7:24:04 PM11/14/16
to Fat-Free Framework
creating goes like this:


$results
= $mapper->find(null,null,[3600,'customCacheTag']);


and deleting them:


Cache::
instance()->reset('customCacheTag');

Vahrokh Vain

unread,
Nov 14, 2016, 7:59:37 PM11/14/16
to f3-fra...@googlegroups.com
What happens in case two users call the same query with different parameters?

customCacheTag, unlike the old "hash()" name, is fixed, what about collisions?


Hmm let me explain better with a simplified use case.

I have a companies table, each user can only see his own company's customer companies:

company_id, company_name and :master_company_id

I know it's not the optimal normalized schema, but maintaining legacy software is just like this.

I have users A, B and C browsing each their own companies listing, identified by :master_company_id = 35. All 3 of them belong to a company whose ID = 35.
I have users D, E and F browsing each their own companies listing, identified by :master_company_id = 47
I have users G, H and I browsing each their own companies listing, identified by :master_company_id = 93

So far so good. With the old system, each of the users groups get their own cached query results based on that :master_company_id.

At this point, user B, deletes / adds / changes one of the companies whose :master_company_id = 35.

With the old system, I'd just go and delete the unique cache tag I earlier got by callling hash(). The first query cache would be reset for users A, B and C but nothing would happen to users D..I 's cached values.

How does this new system deal with this situation?
Does memcached (I use this cache storage) and / or F3 transparently store "
customCacheTag" + "something additional" I don't see, which still keeps those 3 queries cached each on its own?
Or do I end up overwriting each other's users groups cached content because
"customCacheTag" is taken "as is" and reused at every query call which specifies "customCacheTag" as tag?

Thanks again, I acknowledge I am really abusing of your time!

ikkez

unread,
Nov 15, 2016, 3:13:43 AM11/15/16
to Fat-Free Framework
the cache tag is just a little suffix that is added to the existing cache keys. So no problem with collision here, it just helps identifying the cache entries.
if you want to specifc the cache tags per company, nothing prevents you from setting something like

$mapper->load(['id = ?', $cID],null,[3600,'company_'+$cID]);

so you can identify single company cache entries.

Vahrokh Vain

unread,
Nov 15, 2016, 5:53:45 AM11/15/16
to Fat-Free Framework
Thank you!

Vahrokh Vain

unread,
Nov 15, 2016, 5:39:44 PM11/15/16
to f3-fra...@googlegroups.com
Hello,

sorry for bothering you again.

The cache tag works perfectly, but the suggested way to clear it



Cache::
instance()->reset('customCacheTag');



it does not, for me.

Here are the steps to reproduce:

1) Set this in config.ini:

[globals]
DEBUG
=3
CACHE
="memcache=localhost:11211" | 2592000
...
[routes]
GET
|POST /cache-tag-reset=<SNIP>\Helper\DebugClass->cacheTagReset



2) Create a very simple cache clear function:

class DebugClass
{
...
   
public function cacheTagReset($f3, $params)
   
{
       
Cache::instance()->reset('nlo_car_plate_codes');
        echo
"Cache cleared\n";
   
}
}


3) Create a simple database consumer class. I have this (from the examples I posted above).
The following (I am using MVC, so this is the Model file) gets called by a controller that may as well just consist of a call to this function + a render.

class blahModel
{
   
function getCarPlateCodes($ttl = 7200)
   
{
        $db
= $this->getDB(); // this returns the PDO SQL instance

        $sql
= 'SELECT CodiceProvinciaID, cp_SiglaAuto ' .
           
'FROM BASE_CodiciProvince ' .
           
'ORDER BY cp_SiglaAuto';

        $cacheTags
= 'nlo_car_plate_codes';
       
return $db->exec($sql, null, [$ttl, $cacheTags]);
   
}
}

Now, if I call 3), it correctly creates the memcached tag. Using a memcached backend tool, I can see it and it's as shown in the screenshot attached to this post.


Then I call 2), which uses the same cache tag and should remove the cached entry. I see the Cache cleared message.

However the key stays in memcached. The memcached screenshot actually has been taken after I have called the


Cache::instance()->reset('nlo_car_plate_codes');

Multiple times.


Is there anything I am missing here?



On Tuesday, November 15, 2016 at 8:13:43 AM UTC, ikkez wrote:

2016-11-15 22_23_56-phpMemcachedAdmin 1.2.2.png

Vahrokh Vain

unread,
Nov 15, 2016, 6:35:02 PM11/15/16
to f3-fra...@googlegroups.com
Hello again,

By using XDebug in a reset() call, I have found out where F3's base.php code shows an issue.

I have attached two pictures: the exact base.php line where the issue is located and a snapshot of the various variables and statements values right at that line.
The problem is here, exactly at the highlighted condition:

if (is_array($data))
     
foreach ($data as $key=>$val)
         
if (preg_match($regex,$key) &&
             
$val[1]+$lifetime<time())
              memcache_delete
($this->ref,$key);


As you can see by yourself (Watches screenshot), the $data is found.
Inside $data you can see the key and the nlo_car_plate_codes cache tag being properly included (See my previous post).

preg_match($regex,$key)

returns 1 (Watches screenshot) so the first portion of the AND statement is fulfilled.

However, if you look at:

$val[1]+$lifetime<time()

this statement is always false (see Watches screenshot).
In fact, $lifetime is set to zero (that is, irrelevant).
$val[1] is set to a timestamp that is always GREATER than time() (see Watches screenshot).

Therefore since $val[1] is greater than time(), the above statement fails because it checks for LESS THAN.

I have no control over $val[1], so the issue seems related to the key expiration timestamp being somehow improperly set up for memcache.


I have an idea. Considering I want to ALWAYS remove the key, so why have the

$val[1]+$lifetime<time()

check at all?
2016-11-15 23_11_22-fatfree-core-editor.png
2016-11-15 23_07_21-fatfree-core-watches.png
Reply all
Reply to author
Forward
0 new messages