Cache Proposals


Robert Hafner

Feb 25, 2013, 10:48:07 PM
to php...@googlegroups.com

Sorry to jump into this after such a long absence. Last year I had to have surgery, and that threw me out of things for a while. Now that I've recovered and caught up with work, I want to get back into this properly.

First, I'd like to point out that there are three different proposals that currently seem to be the most active.

1. Evert's "Object Cache" proposal, which defines a single Cache class. - https://github.com/evert/fig-standards/blob/master/proposed/objectcache.md

2. Florin's "Yet Another" proposal, which uses a few different classes and tries to combine a few approaches discussed on this list. - https://github.com/php-fig/fig-standards/pull/63

3. My own proposal, which is focused on two classes (the Pool and Item classes). - https://github.com/php-fig/fig-standards/pull/17



The biggest drawback to my proposal is that it was seen as too complicated. After discussing this with a few people, it seemed that the "extensions" portion of it was distracting from the core proposal. I've taken their advice and removed the extensions from the proposal, putting that portion into a different branch ( https://github.com/tedivm/fig-standards/blob/Cache-Extensions/proposed/PSR-CacheExtensions.md ) to be discussed later.

What that leaves is a simple proposal focusing just on the Pool and Item classes. This differs from Evert's proposal in that there are two separate classes, and from Florin's proposal in where the functions live (Florin's tends to load all of its functions into a single class, with the Item class being more of a return object than a representation of a Cache Object in the sense that mine is).

Now that I'm in better health, I'd like to push forward with my proposal, including any feedback from Florin or Evert as well as ideas from their proposals. I feel that Florin's in particular is moving closer to what mine represents, and would be happy to work to merge in anything that makes sense.

Robert

Paul Dragoonis

Feb 26, 2013, 5:13:05 PM
to php...@googlegroups.com
Thanks for coming back to us Robert, I hope you're well after surgery.

Your PR is massive and I can't see the end product. Could you link me to it?


--
You received this message because you are subscribed to the Google Groups "PHP Framework Interoperability Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to php-fig+u...@googlegroups.com.
To post to this group, send email to php...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Robert Hafner

Feb 26, 2013, 6:12:26 PM
to php...@googlegroups.com

Paul Dragoonis

Feb 26, 2013, 6:26:47 PM
to php...@googlegroups.com
Hi Robert,

Why does Pool->getItemIterator() return an Iterator? OK, I know the method name kind of implies this, but it is really a getMultiple() call, which is a better name, and it should just return an array || Traversable object.

Terminology needs to be reworked though: flush() needs to be something like empty() or clear() or wipe(). The recent argument against flush is valid, because flush in Doctrine, for example, pushes changes, and in C fflush() pushes buffered data out to the stream. The expectation is of pushing changes rather than wiping them.

Also, a "Pool" carries the expectation of an existing pool of cache objects, but that's not the case here, because the cache data is actually held in the caching server, not here. This is just a 'Driver' or a 'Handler' to talk to that caching server. I think the word Driver is quite an accurate name here, but feel free to be creative with this if you can think of something better.

isMiss() is a strange direction of logic because it implies isFalse(). We discussed this a lot in the past, and for intuitiveness it should be isValid() or isExists() or something that says "is this here? is this valid? has this expired?".


That's my feedback on your proposal, which I do like, but it needs some love as outlined above.



Florin Patan

Feb 26, 2013, 6:34:25 PM
to php...@googlegroups.com
Hi Robert,


Good to see you are back, hope you feel much better now.

I think this line should not be present: "Implementations are allowed to use a lower time than passed, but should not use a longer one.". It opens a potential can of worms in terms of implementations, as switching from one to another will cause the system to behave differently. Also, I like to be in control of the things I write, so whenever a library, application, whatnot thinks it's smarter than me it usually ends up bad for the said thingy. If I want to store something for 1 minute and the caching library decides to store it for 30 seconds only, I'd be pissed. Then if I compensated for it by adding 30 seconds and the library decided to store the item for 45 seconds, I'd probably look up the author in the phone book. Caching shouldn't be something that users don't have full control over.

I'll try to come up with a working implementation of both your PSR and mine, see if there are any other drawbacks on either, and come back with feedback.


Best regards,
Florin

Florin Patan

Feb 26, 2013, 6:48:23 PM
to php...@googlegroups.com
Also, the current format doesn't specify how to save the item once you create it and set the value nor how to set the key for it.


Regards,
Florin

On Wednesday, February 27, 2013 1:12:26 AM UTC+2, Robert Hafner wrote:

Robert Hafner

Feb 27, 2013, 1:17:43 AM
to php...@googlegroups.com
> Terminology needs to be reworked though: flush() needs to be something like empty() or clear() or wipe(). The recent argument against flush is valid, because flush in Doctrine, for example, pushes changes, and in C fflush() pushes buffered data out to the stream. The expectation is of pushing changes rather than wiping them.

I changed this to "empty", which should work better.


> isMiss() is a strange direction of logic because it implies isFalse(). We discussed this a lot in the past, and for intuitiveness it should be isValid() or isExists() or something that says "is this here? is this valid? has this expired?".

I changed this to "isValid".


> Why does Pool->getItemIterator() return an Iterator? OK, I know the method name kind of implies this, but it is really a getMultiple() call, which is a better name, and it should just return an array || Traversable object.

I just thought this would be cleaner, but I'm not particularly set on it. I do feel that implementing libraries should be able to provide an object, not just an array, but a traversable object would fit that. I'd like to get some more input  or thoughts on this before I make the change though.
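
To make the array-versus-Traversable question concrete, here is a minimal sketch of a multi-key lookup that returns a Traversable. Everything in it is an assumption for illustration: the class names SketchPool and SketchItem, the plain array standing in for a backend, and the getItems() name (one of the names floated in this thread) are not taken from the proposal text.

```php
<?php
// Hypothetical sketch: none of these class names come from the proposal.
final class SketchItem
{
    private $key;
    private $value;
    private $hit;

    public function __construct(string $key, $value, bool $hit)
    {
        $this->key = $key;
        $this->value = $value;
        $this->hit = $hit;
    }

    public function getKey(): string { return $this->key; }
    public function get() { return $this->value; }
    public function isValid(): bool { return $this->hit; }
}

final class SketchPool
{
    /** @var array plain array standing in for a real cache backend */
    private $storage;

    public function __construct(array $storage)
    {
        $this->storage = $storage;
    }

    /**
     * Yields one item per requested key, keyed by the cache key.
     * Because the method body uses yield, the returned object is a Generator,
     * which satisfies the Traversable return type.
     */
    public function getItems(array $keys): Traversable
    {
        foreach ($keys as $key) {
            $hit = array_key_exists($key, $this->storage);
            yield $key => new SketchItem($key, $hit ? $this->storage[$key] : null, $hit);
        }
    }
}
```

A caller can foreach over the result either way. Returning a Traversable just leaves room for an implementation to fetch lazily or in one batched backend call.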


> Also, a "Pool" carries the expectation of an existing pool of cache objects, but that's not the case here, because the cache data is actually held in the caching server, not here. This is just a 'Driver' or a 'Handler' to talk to that caching server. I think the word Driver is quite an accurate name here, but feel free to be creative with this if you can think of something better.

I think Driver and Handler are also misnomers. I tend to view the "Pool" object similar to the way you'd view a PDO or Twig object: the underlying drivers (pgsql, sqlite, mysql for PDO, or the array, filesystem and string loaders for Twig) are abstracted away underneath the main class. In other words, the "driver" for the cache (whether it be memcache, filesystem, apc, or something else) should not affect or change the way the caching library is used. In fact, I can imagine that some implementations of this will use a single Pool class that gets its drivers injected into it (obviously outside of the scope of this proposal).

Robert Hafner

Feb 27, 2013, 1:26:36 AM
to php...@googlegroups.com

Users will never have full control over their caches, which is why caching should be so fault tolerant. There are a lot of reasons why an object would be cached for less time than provided. If you attempted to cache an object in memcache for 30 minutes, but it was never retrieved and the memcache servers filled up 20 minutes later, then a call for that item at 25 minutes would probably fail to retrieve it as memcache would have evicted it. 

An example that fits more into "userland" would be the way some libraries prevent cache stampedes from occurring. If your default TTL is 15 minutes and you start with an empty or fresh cache, then every 15 minutes you're going to see a huge spike in resources as all of those items become stale. By altering the TTL by removing a random amount of time from it you can distribute those misses more evenly (especially after running for some time), and spread out the load.
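
The TTL-shortening idea can be sketched as a small helper. This is only an illustration under stated assumptions: the function name jitteredTtl and the 10% default are made up here, and real libraries may choose the reduction differently.

```php
<?php
/**
 * Hypothetical helper (not part of any proposal): shortens a requested TTL
 * by a random amount so items written together do not all expire together.
 *
 * @param int   $ttl          requested lifetime in seconds
 * @param float $maxReduction largest fraction of the TTL that may be removed (0.0 to 1.0)
 */
function jitteredTtl(int $ttl, float $maxReduction = 0.1): int
{
    if ($ttl <= 0) {
        return $ttl; // nothing sensible to shorten
    }

    // Clamp the fraction so the cut can never be negative or exceed the TTL.
    $fraction = max(0.0, min(1.0, $maxReduction));
    $maxCut = (int) floor($ttl * $fraction);

    // Only ever subtract, so the requested TTL stays the upper bound.
    // The outer max() keeps a positive TTL from collapsing to zero.
    return max(1, $ttl - random_int(0, $maxCut));
}
```

With a 900-second request and the default fraction, the stored lifetime falls somewhere between 810 and 900 seconds, which matches the rule that an implementation may use a lower time than passed but not a longer one.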

Robert


To view this discussion on the web visit https://groups.google.com/d/msg/php-fig/-/GD2mw65ZYyAJ.

Robert Hafner

Feb 27, 2013, 1:28:44 AM
to php...@googlegroups.com

I don't follow, are you saying your proposal doesn't or mine doesn't? My proposal saves the item on "set", sets the item using "set", and the Item class is returned by the "getItem($key)" function of the Pool class (meaning it's up to the Pool to create the Item, allowing Implementing Authors to handle that how they please).

Robert




Florin Patan

Mar 1, 2013, 3:03:20 PM
to php...@googlegroups.com
Hello,


Sorry for not being clear and for missing the 'save' part of the set function.
There's currently no way for the user to specify the key for the item in your proposal.
One could implement it via the constructor, via setKey($key) (which would be a bit unnatural, but let's say it would be possible), or the key could be autogenerated by the implementation. This part is currently missing.
Hope this is a bit clearer now.


Regards,
Florin

Florin Patan

Mar 1, 2013, 3:37:55 PM
to php...@googlegroups.com
I understand the problem that you are trying to solve but I'm sure that this should not be done by a library by default, without the explicit consent from the user.

Allowing authors to write such things is dangerous, as the end users might not grasp the intention or the concept.

This also brings unpredictable behavior into a system in which predictability is the most needed thing. While people are striving to deliver predictable responses from systems, this will induce an unpredictable result.

If the X cache instance gets full then the memory pool should be increased. This should not be the job of a library to assume things happen.

If, and I stress if, I expected my systems to go down without caching, I'd solve this problem in a totally different manner (I won't give the details, as I might infringe some legal points in my contract), but let's just say that a couple of well-designed layers and a cronjob were involved.

People/users should be taught how to think about their architecture and solve problems like this, and not have magical things done for them by various libraries in such cases, imho.

I strive hard to have my software as predictable as possible as it makes everything easier, like testing and debugging. If I were to want instability and unpredictability I'd use Chaos Monkey, which is an awesome tool :)

Also, if I read this right, the pool doesn't connect to the cache system but the item does so. Then this will totally deny the advantage of having an iterator / getMultiple equivalent.

From a performance point of view, this also implies that if I want to retrieve/store/delete more than one item from cache I'll need to make separate requests, which may not be fun ;)

And finally, from a 'don't surprise your users' point of view, there's the issue that the actual operations are done by the Item class rather than the Pool class, which is not very intuitive at first look / without reading the actual code (and it implies again a predictability issue).


Best regards,
Florin

Josh Hall-Bachner

Mar 1, 2013, 6:45:57 PM
to php...@googlegroups.com
"Implementations are allowed to use a lower time than passed, but should not use a longer one."

This is just describing an existing behavior restriction inherited from the general set of caching systems. If you are utilizing Memcached directly and you store an item with a specified TTL, you cannot operate under any guarantee that it will be stored for at least that long -- if the system is restarted, or your item was purged due to a full store, it won't be available regardless of the TTL you set.

In other words: you already have to treat any black-box cache implementation as having the potential to choose lower TTLs than you request, so whether that happens at the PHP level or the underlying level shouldn't matter.

Robert Hafner

Mar 1, 2013, 8:58:49 PM
to php...@googlegroups.com

> Also, if I read this right, the pool doesn't connect to the cache system but the item does so. Then this will totally deny the advantage of having an iterator / getMultiple equivalent.

You're not reading this right. Of course the Pool will connect to the cache system. How would it be able to empty/purge it otherwise? 

I'm assuming that the rest of your points are based on this misconception, so I'm not going to give a point-by-point breakdown. I don't quite understand where you got this idea, but it's not accurate.


> If the X cache instance gets full then the memory pool should be increased. This should not be the job of a library to assume things happen.

This is *exactly* my point. It's not the job of the library to handle the actual back end caching. That means that there are circumstances where the caching platform itself decides to remove items- either because of some internal algorithms or due to system admin intervention, or some other reason. That's why the TTL is a *maximum*, not an exact number, because those systems aren't in the user's control.


> I understand the problem that you are trying to solve but I'm sure that this should not be done by a library by default, without the explicit consent from the user.

These interfaces aren't the only part of the system that the developer will have access to; they're the common API between different frameworks and systems. Explicit consent for various features and functionality can easily exist in the caching libraries themselves without being addressed by this standard, as long as those libraries follow the interoperability guidelines set forth in this standard. In other words, this proposal shouldn't limit what the libraries can do, it should just make them more cross-compatible with each other.


Robert





Robert Hafner

Mar 1, 2013, 9:06:26 PM
to php...@googlegroups.com
Florin,


> One could implement it via the constructor, via setKey($key) (which would be a bit unnatural, but let's say it would be possible), or the key could be autogenerated by the implementation. This part is currently missing.

This part is missing on purpose. The way someone retrieves an item is through the Pool->getItem() call. That call is supposed to return an item that is already locked to a key. Since Items aren't generated directly, there's no need to define how they're made: libraries won't do it, so they don't need a common way to do it (other than the Pool->getItem call of course).

This allows libraries to handle it however they want. They may just add the key to the constructor, or to another function, they may add some sort of dependency-injected driver, or have an "options" object that gets passed in behind the scenes. The point is that we don't really care about that, because it's not a problem of interoperability so much as the library author's own preferences.
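
As one possible reading of this, the sketch below has the Pool hand out Items that are already bound to a key, with a shared backend passed through the constructor. The names ArrayBackend, BoundItem and BindingPool are hypothetical, and the constructor wiring is exactly the kind of detail that would be left to each library.

```php
<?php
// All names below are hypothetical illustrations, not the proposal's API.
final class ArrayBackend
{
    /** @var array in-memory stand-in for memcache, APC, the filesystem, etc. */
    public $data = [];
}

final class BoundItem
{
    private $key;
    private $backend;

    // Implementation detail: only the pool calls this, never the cache user.
    public function __construct(string $key, ArrayBackend $backend)
    {
        $this->key = $key;
        $this->backend = $backend;
    }

    public function getKey(): string
    {
        return $this->key;
    }

    // Saving happens on set(); there is no separate save step.
    public function set($value): bool
    {
        $this->backend->data[$this->key] = $value;
        return true;
    }

    // Returns the stored value, or null when nothing is stored under the key.
    public function get()
    {
        return $this->backend->data[$this->key] ?? null;
    }
}

final class BindingPool
{
    private $backend;

    public function __construct(ArrayBackend $backend)
    {
        $this->backend = $backend;
    }

    // The only public way to obtain an Item: it comes back locked to $key.
    public function getItem(string $key): BoundItem
    {
        return new BoundItem($key, $this->backend);
    }
}
```

Since calling code only ever goes through getItem(), a different library could build its items in a completely different way without that code noticing.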

Robert





Beau Simensen

Mar 1, 2013, 11:48:45 PM
to php...@googlegroups.com
Hi Robert,

Thanks for this. :) I'm going to try and provide some feedback for each, though mostly seems like people are discussing Robert's proposal in this thread currently.


Robert's Proposal

I've always been a fan of your proposal but always got lost in the extended details. Glad you have made it easier for people to follow. I have a few comments and questions to clarify my understanding on your revised proposal.

I'd probably prefer naming Pool interface Cache interface. It would have been far less confusing to me upon first looking at your proposal last year and I seem to recall a lot of people have asked what "pool" means since then. This alone might be a good reason to change it. On the user side, I think that type hinting (CacheInterface $cache) and using it by way of $cache->get('thing') is going to make more sense to people (at least initially) than (PoolInterface $pool) and $pool->get('thing'). This isn't a huge deal to me because I've become pretty comfortable with the Pool terminology but I wanted to point out that I had to become comfortable with the Pool terminology. :) (I agree with your assessment on Driver and Handler as not being good alternative names for this interface.)

On the getItemIterator() naming question, I'd like to suggest getItems() as an alternate name.

On usage, I want to make sure I understand how one would set a value. Given a completely empty Pool (read: completely cold, no data cached whatsoever), the method I would use to do this conceptual thing:

    $cache->set('foo', 'bar', 300);

... would be:

    $item = $pool->get('foo');
    $item->set('bar', 300);

... or

    $pool->get('foo')->set('bar', 300);


Especially after just having reviewed the other two proposals (I'm writing this about half an hour after writing the above comments, specifically the one about the name Pool) I think that if all three were set out side by side other people would be more likely to choose yours if they knew what Pool represented and how they would use it to set a value for a key in a pool. Without that knowledge I can easily see people making another decision, especially if they haven't been able to spend as much time looking at it as those of us who have been super interested in getting a cache proposal through the queue have spent. Maybe a handful of examples on how common things would be done using the pool terminology would help make things more clear?



Evert's Proposal

After having seen Evert's proposal again, I have some comments on it as well. Most notably, I'm finding myself more and more in favor of a "cache item" approach. So I'll try to keep this constructive, as suggesting "please do it with a cache item?" means you'd basically be writing either Florin's proposal or Robert's. Or even Spring's. So I won't go there, 'k? :)

How common is the bulk set use case in the target audience? In looking at Florin's proposal over the last month or so I've been wanting to suggest that the multiple stuff be broken out into its own interface but seeing it in practice here makes me feel like it doesn't belong. Should supporting bulk operations be an extension to the cache proposal down the line or should it just be a part of the Base interface to begin with? I'm not sure, but if you're going for a very basic bare bones cache in your proposal you could cut your proposal to 1/3 its size by dropping the Multiple interface and its not-even-a-part-of-the-standard trait.

It has no clear() functionality that I can see.



Florin's Proposal

I've said a lot on this one already. I'm not sure where we're at on it. :)

Based on having a fresh look at the other proposals, I'd say it is safe to just skip $options and have set's method signature be $key, $value, $ttl and be done with the whole notion of TtlAwareAnything's. The $options thing was a fun rabbit hole, but since nobody else is worried about supporting anything more than ttl at this point (or ever) let's just go with that.

This looks really close to Spring's cache with a TTL at this point. I'm sure this is a good thing for some people and a really bad thing for others. This is mostly just an observation. :)

I'm not sure about the bulk operations. I haven't said anything about them to this point because nobody else has talked about it. But after seeing it on Evert's proposal I'm finally going to voice my thoughts on this. Is this a common enough use case that it should be baked into the core cache interface?

Robert Hafner

Mar 4, 2013, 1:48:18 AM
to php...@googlegroups.com

With regards to the Cache versus Pool issue, I actually *did* have it as Cache at one point but that confused people as well- they thought that, being the Cache class, it would behave more like the current Item class does. I'm not stuck on the name (and just started another conversation about it) so if something better exists I'll happily change it.

I'm getting close to settling on getItems, but still am not sure about its return value. I'd love some more input on this.

Change the pool->get to pool->getItem and you've got it. The Pool has a "getItem" function which returns an Item object with the appropriate Key, while the Item object has a "get" function that returns the value of that particular item. Otherwise your example is dead on.

    $item = $pool->getItem('foo');
    $item->set('bar', 300);

... or

    $pool->getItem('foo')->set('bar', 300);
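
Putting the corrected calls together, a typical read-through pattern might look like the sketch below. MemoryStore, MemoryItem and MemoryPool are hypothetical in-memory stand-ins written only so the example runs on its own. isValid() is the name adopted earlier in this thread, and the 300-second TTL and the loadReport() function are arbitrary choices for illustration.

```php
<?php
// Hypothetical in-memory stand-ins so the usage pattern can run on its own.
final class MemoryStore
{
    /** @var array key => ['value' => mixed, 'expires' => int timestamp] */
    public $entries = [];
}

final class MemoryItem
{
    private $key;
    private $store;

    public function __construct(string $key, MemoryStore $store)
    {
        $this->key = $key;
        $this->store = $store;
    }

    // True only when an entry exists and has not yet expired.
    public function isValid(): bool
    {
        return isset($this->store->entries[$this->key])
            && $this->store->entries[$this->key]['expires'] > time();
    }

    public function get()
    {
        return $this->isValid() ? $this->store->entries[$this->key]['value'] : null;
    }

    public function set($value, int $ttl): bool
    {
        $this->store->entries[$this->key] = [
            'value' => $value,
            'expires' => time() + $ttl,
        ];
        return true;
    }
}

final class MemoryPool
{
    private $store;

    public function __construct()
    {
        $this->store = new MemoryStore();
    }

    public function getItem(string $key): MemoryItem
    {
        return new MemoryItem($key, $this->store);
    }
}

function loadReport(MemoryPool $pool): string
{
    $item = $pool->getItem('report');

    if ($item->isValid()) {
        return $item->get();
    }

    // Miss: rebuild the value, then store it for at most 300 seconds.
    $value = 'expensive result';
    $item->set($value, 300);

    // Return the rebuilt value directly instead of reading it back, since
    // a cache is never guaranteed to still hold what was just stored.
    return $value;
}
```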


You mention that examples might help- what's the best way to handle that? In the proposal itself, or as a separate thing?

Robert




Paul Dragoonis

Mar 4, 2013, 7:06:35 AM
to php...@googlegroups.com
We're trying not to confuse the proposals with implementation details, but I think there should be an 'examples' section at the bottom with usage examples. It would help a lot.

Moisa Teodor

Mar 4, 2013, 8:04:29 AM
to php...@googlegroups.com
Yes, if the specification provides interfaces, I would even go further and add a reference implementation (it could even be published as a package), and link to it from the specification. The RI can serve as a hands-on example for discussions on the spec itself, or as guideline for future specification implementors.
--
Doru Moisa
web: http://moisadoru.wordpress.com
tel: +40 720 861 922
Bucharest, Romania

Beau Simensen

Mar 4, 2013, 11:57:01 AM
to php...@googlegroups.com
I am mostly concerned that without usage examples people might vote for another proposal. I'm not sure if the usage examples belong in the proposal itself but I think that examples of how to use this particular API would be quite helpful.

The underlying concern I have is that even though I think Robert's proposal is technically the best of the crop, it might be difficult for people to follow. I'm not exactly sure what this means. Simplicity is important, and if something looks so complicated that people never use it, it won't matter that it is technically the better solution. I'd love to see this concern addressed before it goes up for a vote.

Paul Dragoonis

Mar 4, 2013, 2:45:18 PM
to php...@googlegroups.com
On Mon, Mar 4, 2013 at 4:57 PM, Beau Simensen <sime...@gmail.com> wrote:
> I am mostly concerned that without usage examples people might vote for another proposal. I'm not sure if the usage examples belong in the proposal itself but I think that examples of how to use this particular API would be quite helpful.
>
> The underlying concern I have is that even though I think Robert's proposal is technically the best of the crop, it might be difficult for people to follow. I'm not exactly sure what this means. Simplicity is important, and if something looks so complicated that people never use it, it won't matter that it is technically the better solution. I'd love to see this concern addressed before it goes up for a vote.

I second this point: Robert's proposal is very well designed, but its complexities threw me off a bit.
The Item class and the Pool class both have the cache driver inside of them to make cache calls. That didn't seem right at all, not to mention the duplication of cache objects. I'm sure we can figure it out. :-)
 