Ticket #2149 (new patch)

Opened 12 months ago

Last modified 7 months ago

Write a single interface for Caching

Reported by: wakeless Owned by: sminnee
Priority: medium Milestone:
Component: Sapphire Framework Version:
Severity: medium effort / impact Keywords:
Cc: Hours:

Description

A single interface is needed for caching. Currently, plain arrays are used, which makes debugging difficult. With a central interface, a two-stage cache could be created in which, at the end of the request, entries are pushed to a more permanent cache, e.g. Memcache or files.

Example:

Cache::setBackingStore("memcache", "127.0.0.1");

Cache::set("SiteTree-34", $siteTreeObj);

Cache::invalidate("SiteTree-34");

Cache::get("SiteTree-34");
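
The two-stage idea described above could look roughly like the sketch below. It is not taken from the attached patch: the TwoStageCache class, its method names, and the ArrayObject standing in for Memcache or files are all hypothetical.

```php
<?php
// Hypothetical sketch of a two-stage cache; none of these names come from the attached patch.
class TwoStageCache {
    /** @var array First stage: lives only for the current request */
    private $local = array();
    /** @var array Keys written during this request that still need persisting */
    private $dirty = array();
    /** @var ArrayObject Stand-in for a permanent store such as Memcache or files */
    private $store;

    public function __construct(ArrayObject $store) {
        $this->store = $store;
    }

    public function set($key, $value) {
        $this->local[$key] = $value;
        $this->dirty[$key] = true;
    }

    public function get($key) {
        if (array_key_exists($key, $this->local)) {
            return $this->local[$key];
        }
        // Fall back to the permanent store and remember the result locally
        if ($this->store->offsetExists($key)) {
            return $this->local[$key] = $this->store[$key];
        }
        return null;
    }

    public function invalidate($key) {
        unset($this->local[$key], $this->dirty[$key]);
        if ($this->store->offsetExists($key)) {
            unset($this->store[$key]);
        }
    }

    /** Push everything written during the request to the permanent store */
    public function flushToStore() {
        foreach (array_keys($this->dirty) as $key) {
            $this->store[$key] = $this->local[$key];
        }
        $this->dirty = array();
    }
}

$store = new ArrayObject();
$cache = new TwoStageCache($store);
$cache->set('SiteTree-34', 'page data');
// At the end of the request, e.g. from a shutdown handler:
$cache->flushToStore();
```

Because every read and write goes through get() and set(), there is one place to put a breakpoint or a log call, which is the debugging benefit the description asks for.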

Attachments

Cache.php (1.8 kB) - added by wakeless 7 months ago.
CacheInterface.php (183 bytes) - added by wakeless 7 months ago.
DefaultCacheBackend.php (1.7 kB) - added by wakeless 7 months ago.
DBCacheBackend.php (1.6 kB) - added by wakeless 7 months ago.

Change History

Changed 12 months ago by sminnee

Perhaps we could use different classes to implement the different kinds of caching?

interface CacheRepository {
  /**
    * Return the cached item with the given identifier
    */
  function get($identifier);

  /**
    * Return the unix timestamp when the given cached item was created
    */
  function createdOn($identifier);

  /**
    * Add an item to the cache
    */
  function set($identifier, $value);

  /**
    * Remove the given item from the cache
    */
  function clear($identifier);
}

I don't think we want to have expiry dates; it will be up to the caller of the caching system to figure out a more clever way of knowing when not to use the cache.
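
As a rough illustration of how this interface might be used without expiry dates, here is a minimal in-memory sketch. The ArrayCacheRepository class and the freshness check in the usage lines are assumptions made for illustration; only the interface itself comes from the comment above.

```php
<?php
// Interface as proposed in the comment above.
interface CacheRepository {
    function get($identifier);
    function createdOn($identifier);
    function set($identifier, $value);
    function clear($identifier);
}

// Hypothetical in-memory implementation, for illustration only.
class ArrayCacheRepository implements CacheRepository {
    private $items = array();
    private $created = array();

    function get($identifier) {
        return array_key_exists($identifier, $this->items) ? $this->items[$identifier] : null;
    }

    function createdOn($identifier) {
        return array_key_exists($identifier, $this->created) ? $this->created[$identifier] : null;
    }

    function set($identifier, $value) {
        $this->items[$identifier] = $value;
        $this->created[$identifier] = time();
    }

    function clear($identifier) {
        unset($this->items[$identifier], $this->created[$identifier]);
    }
}

$cache = new ArrayCacheRepository();
$cache->set('SiteTree-34', 'page data');

// The caller, not the cache, decides whether the entry is still usable.
$sourceChangedAt = time() - 60; // pretend the underlying data changed a minute ago
$fresh = $cache->createdOn('SiteTree-34') >= $sourceChangedAt;
$value = $fresh ? $cache->get('SiteTree-34') : null;
```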

Changed 12 months ago by wakeless

I'd be inclined to duplicate the memcached interface. It's solid and has been proven to work for a lot of different situations.

http://www.danga.com/memcached/

Changed 12 months ago by sminnee

That sounds good, with the exception of all the stuff about adding/removing Memcache servers - http://nz.php.net/memcache

  • add
  • set
  • get
  • increment/decrement
  • flush

I'm not sure if we need separate add & set functions, though.

Changed 7 months ago by wakeless

Is anyone interested in this? I've got a far more mature version to add if you are.

Changed 7 months ago by tcopeland

Hey Michael

Would love to see what you're working on here. We are doing a large amount of performance related work at the moment as well (http://www.silverstripe.com/extending-hacking-silverstripe-forum/flat/82778?showPost=82854) and I think the work you are doing could be quite complementary.

- Tim

Changed 7 months ago by sminnee

  • type changed from enhancement to patch

Changed 7 months ago by wakeless

Right. I'm attaching the most up-to-date version of this stuff. Currently I'm just using it to cache generated JSON, as that has been a performance hotspot for a site we are working on. However, I foresee a cache being used within DataObject::get (and wherever else data is cached).

This way, when tracking down memory leaks etc., we can track calls to Cache->get and Cache->set. Currently the caches are just arrays, which are untrackable.

Also using a memcache backend we would be able to cache across servers.

You will also note that this isn't a singleton/static design. This way different backends can be used for different purposes, i.e. DataObject will initially use an array, and SSViewer would use memcache/db/array.

Changed 7 months ago by sminnee

Have you patched any of the core systems to make use of this new cache, Michael?

Changed 7 months ago by wakeless

Just responding to a question asked in IRC regarding why there is a separate cache object which just passes stuff through to the backend. This is so that other types of specialised caches with the same backends could be built. This could be a logging cache that logs all references to the cache, or something that may want to override timeouts or keys. I realise this may be over-engineering, but I think the added flexibility could be very useful.
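
A rough sketch of the separation described here is shown below. Every name in it (BackendSketch, ArrayBackend, PassThroughCache, LoggingCacheSketch) is hypothetical, and the attached Cache.php may be structured differently.

```php
<?php
// Hypothetical names throughout; the attached Cache.php may differ.
interface BackendSketch {
    public function get($key);
    public function set($key, $value);
    public function delete($key);
}

class ArrayBackend implements BackendSketch {
    private $items = array();

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }
    public function set($key, $value) {
        $this->items[$key] = $value;
    }
    public function delete($key) {
        unset($this->items[$key]);
    }
}

// The plain cache only forwards calls to whichever backend it was given.
class PassThroughCache {
    protected $backend;

    public function __construct(BackendSketch $backend) {
        $this->backend = $backend;
    }
    public function get($key) {
        return $this->backend->get($key);
    }
    public function set($key, $value) {
        $this->backend->set($key, $value);
    }
    public function delete($key) {
        $this->backend->delete($key);
    }
}

// A specialised cache reuses the same backends but records every access.
class LoggingCacheSketch extends PassThroughCache {
    public $log = array();

    public function get($key) {
        $this->log[] = "get $key";
        return parent::get($key);
    }
    public function set($key, $value) {
        $this->log[] = "set $key";
        parent::set($key, $value);
    }
    public function delete($key) {
        $this->log[] = "delete $key";
        parent::delete($key);
    }
}

$cache = new LoggingCacheSketch(new ArrayBackend());
$cache->set('json-42', '{"ok":true}');
$json = $cache->get('json-42');
```

The backend never needs to know about logging, so the same ArrayBackend could sit behind either cache class.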

Changed 7 months ago by sminnee

I would suggest that we take a couple of the most common use-cases and work out how best to implement them.

In my mind, the two most likely use-cases would be:

  • Debug logging
  • Cache expiry

Cache expiry

I would suggest that cache expiry would probably be better implemented as part of the back-end, since it will involve storing additional meta-data, and storage of that meta-data would need to be backend-specific. An alternative would be to wrap the cached resource in another structure that tracked resource and expiry, but that strikes me as unnecessary layering.
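
To make the backend-level option concrete, here is a minimal sketch of a backend that stores the expiry meta-data next to each value. The ExpiringArrayBackend class, the $ttl parameter, and the optional $now parameter (added only so the behaviour can be shown with fixed timestamps) are assumptions, not part of the attached patch.

```php
<?php
// Hypothetical backend that keeps expiry meta-data next to each value.
class ExpiringArrayBackend {
    /** @var array key => array('value' => mixed, 'expires' => int|null) */
    private $items = array();

    /** $ttl is in seconds; 0 means the entry never expires */
    public function set($key, $value, $ttl = 0, $now = null) {
        $now = ($now === null) ? time() : $now;
        $this->items[$key] = array(
            'value' => $value,
            'expires' => $ttl > 0 ? $now + $ttl : null,
        );
    }

    public function get($key, $now = null) {
        if (!isset($this->items[$key])) {
            return null;
        }
        $now = ($now === null) ? time() : $now;
        $item = $this->items[$key];
        if ($item['expires'] !== null && $now >= $item['expires']) {
            unset($this->items[$key]); // lazily drop the expired entry
            return null;
        }
        return $item['value'];
    }
}

$backend = new ExpiringArrayBackend();
$backend->set('menu', 'cached html', 300, 1000); // expires at t = 1300
$hit = $backend->get('menu', 1200);              // still valid
$miss = $backend->get('menu', 1300);             // expired
```

A database or memcache backend would keep the same meta-data in its own way, which is the reason given above for leaving expiry to the backend.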

One good reason to abstract out the cache expiry code is if people can envision that the caching system would have more sophisticated expiry logic than simple time-outs. For example, SSViewer's cache expires based on the modification times of the dependent template files.

It might be worthwhile to implement this kind of "complex expiry" in the cache, but we'd need to figure out how.

Debug logging

Debug logging is something that gets turned on and off within the system; it makes sense that it is a flag on existing objects - it needs to be really easy to enable it for uses within core code without altering that code. Cache::enable_logging() or a config option (if/when we have those) seems like a good way of doing that.

Having a separate LoggingCache would mean that you would need to replace all calls to Cache::* with LoggingCache::*, or add some kind of a cache factory, which would add a 3rd layer.

I don't think that the separation adds value for debug logging, but it might if you wanted to have multiple logging implementations to choose from. Is this realistic?

Changed 7 months ago by wakeless

First of all, all your examples here reference static calls. I don't believe this is the correct approach; it should be possible to instantiate different caches. I'm sure we spoke about that on IRC, but I thought I should post it here for the record.

Cache expiry

I based this exactly on the memcache API. I really don't think we should overcomplicate this. In the case of the complex expiry that you are speaking of, the application should invalidate the sections of the cache it needs to. So it should do the modification checks for the template file and call Cache->delete(templatekey);

I don't believe there is a better way to do this. The timeout can be set to unlimited, of course.
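
A sketch of what application-side invalidation might look like follows. The TemplateCacheSketch class, the renderWithCache() function, the key format, and the use of strtoupper() as a stand-in for template rendering are all hypothetical; only the idea of checking the template's modification time and then calling delete comes from the comment above.

```php
<?php
// Hypothetical minimal cache with the get/set/delete calls discussed in this thread.
class TemplateCacheSketch {
    private $items = array();

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }
    public function set($key, $value) {
        $this->items[$key] = $value;
    }
    public function delete($key) {
        unset($this->items[$key]);
    }
}

// The application, not the cache, decides when a template entry is stale.
function renderWithCache(TemplateCacheSketch $cache, $templatePath) {
    $key = 'template-' . md5($templatePath);
    clearstatcache(); // make sure filemtime() is not served from PHP's stat cache
    $mtime = filemtime($templatePath);

    $entry = $cache->get($key);
    if ($entry !== null && $entry['mtime'] < $mtime) {
        $cache->delete($key); // template changed since it was cached
        $entry = null;
    }
    if ($entry === null) {
        // strtoupper() stands in for real template rendering
        $entry = array('mtime' => $mtime, 'output' => strtoupper(file_get_contents($templatePath)));
        $cache->set($key, $entry);
    }
    return $entry['output'];
}

$path = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($path, 'hello');
touch($path, time() - 100); // pretend the template was last edited a while ago
$cache = new TemplateCacheSketch();
$first = renderWithCache($cache, $path);

file_put_contents($path, 'bye'); // editing the template bumps its modification time
$second = renderWithCache($cache, $path);
unlink($path);
```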

Logging/Debugging

I agree, this logging needs to be able to be turned on or off regardless of the backend. This is really important, and is one major reason to insert this into the backend (apart from easing into different backends)

Changed 7 months ago by sminnee

How do we want to control logging? Currently, we're using static methods for configuring this kind of thing. In the interests of consistency, I would suggest that we stick with that, with a view to migrating it to a new configuration system if & when that new configuration system is implemented.

Given that, there are a number of different ways you could enable logging:

if(Director::isDev()) {
  Cache::enable_logging(); // Logging as an all-or-nothing option
  MemCache::enable_logging(); // Turn logging on & off on a backend-by-backend basis
  Cache::enable_logging('MemCache'); // The same, but means that you don't have to reimplement enable_logging() on each backend
  Cache::enable_logging('DataObject.get_one'); // Select logging by the service being cached
}

A related question is how to map cache back-ends to the different things that need caching. For example, if you were wanting to use a memcache backend for caching DataObject::get_one responses, you could do this:

static function get_one($a, $b, $c) {
   $cache = new MemCacheBackend();
   $cache->get('bla');
   $cache->set('bla', 'val');
}

However, if we did something more like this:

$cache = Cache::factory('DataObject.get_one', 'MemCacheBackend');
$cache->get('bla');
$cache->set('bla', 'val');

Then you would be able to reconfigure your site to use different back-ends. This would be particularly handy for MemCache, since I don't think requiring a memcache server is appropriate for smaller sites. In this example I have replaced the default MemCacheBackend with a FileCacheBackend.

In _config.php:

Cache::use_backend('DataObject.get_one', 'FileCacheBackend');

This whole idea of having "named cache services" fits in nicely with the enabling of cache logging on a service by service basis.

So, where are we at after all of that? Something like this would work:

  • The Cache class is a collection of static methods for registering the mapping between cache services and cache backends.
  • The CacheBackend class is a self-contained cache object; it's not really a back-end at all.

class Cache {
  static function factory($service, $default) {
  }
  static function use_backend($service, $backend) {
  }
  static function enable_logging($service, $enabled = true) {
  }
}

abstract class CacheBackend implements ArrayAccess {
  // abstract functions set, get, delete, etc. - like you've defined them

  // include methods for ArrayAccess
}

class MemCacheBackend extends CacheBackend {
}

class ArrayCacheBackend extends CacheBackend {

}

It might be better to rename Cache to CacheFactory and CacheBackend to Cache.
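
One possible way to fill in the skeleton above is sketched here. The class names carry a "Sketch" suffix to make clear they are hypothetical, and two details are assumptions beyond what the outline specifies: factory() shares one backend instance per service, and is_logging() is an extra helper added so the flag can be read back.

```php
<?php
// Hypothetical fill-in of the outline above; not the attached patch.
abstract class CacheBackendSketch {
    abstract public function get($key);
    abstract public function set($key, $value);
    abstract public function delete($key);
}

class ArrayBackendSketch extends CacheBackendSketch {
    private $items = array();

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }
    public function set($key, $value) {
        $this->items[$key] = $value;
    }
    public function delete($key) {
        unset($this->items[$key]);
    }
}

// Stands in for a file-based backend; it reuses the array storage to stay short.
class FileBackendSketch extends ArrayBackendSketch {
}

class CacheSketch {
    private static $backends = array();  // service name => backend class name
    private static $instances = array(); // service name => backend instance
    private static $logging = array();   // service name => bool

    /** Return the backend for a service, creating it from the override or the default */
    public static function factory($service, $default) {
        if (!isset(self::$instances[$service])) {
            $class = isset(self::$backends[$service]) ? self::$backends[$service] : $default;
            self::$instances[$service] = new $class();
        }
        return self::$instances[$service];
    }

    /** Override the backend class for a service, e.g. from _config.php */
    public static function use_backend($service, $backend) {
        self::$backends[$service] = $backend;
        unset(self::$instances[$service]); // force re-creation with the new backend
    }

    public static function enable_logging($service, $enabled = true) {
        self::$logging[$service] = $enabled;
    }

    public static function is_logging($service) {
        return !empty(self::$logging[$service]);
    }
}

// In _config.php:
CacheSketch::use_backend('DataObject.get_one', 'FileBackendSketch');
CacheSketch::enable_logging('DataObject.get_one');

// In the code that needs caching:
$cache = CacheSketch::factory('DataObject.get_one', 'ArrayBackendSketch');
$cache->set('bla', 'val');
```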

What do you think of this approach?
