Craig Rowe

Techlead / Developer

25th October 2008

Cheap Scheduling with Cache Expiration

Often some activity needs to run independently of page requests. Creating a Windows service, or even old-school VBScript task scheduling, can provide the answer. However, there is another way...

When building this site I always intended to hook into an abundance of APIs. Delicious, Twitter, Brightkite and Google Maps can already be seen here. However, their usage brings a number of issues.

Continuing with the cargowire example, a user who hits the interesting links page is requesting information from Twitter for the header status, Delicious for the links, and Brightkite and Google for the map under the nav. All of this is wrapped up in the request for images, CSS and content direct from cargowire, so a single page view can end up waiting on several third-party services.

Caching is the Answer

The obvious approach, therefore, is to cache the API responses. This not only speeds up page responses but also avoids breaking any API request limits if your site gains a lot of attention.

In the example below I'm using a static property as the accessor to a cached XPathDocument. RandomAPI is merely a placeholder for what could be any API wrapper (Twitter etc.) that returns the XML after accessing the web service.

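                  // Requires: using System; using System.Web; using System.Web.Caching; using System.Xml.XPath;
                  private static XPathDocument CachedXPath
                  {
                      get
                      {
                          // Try the cache first; 'as' yields null when the item is missing.
                          XPathDocument xpathCache = HttpContext.Current.Cache[CacheKey] as XPathDocument;

                          if (xpathCache == null)
                          {
                              // Cache miss: call the web service (RandomAPI is a placeholder wrapper).
                              xpathCache = new RandomAPI(ConfigSettings.RandomAPIUsername,
                                                         ConfigSettings.RandomAPIPassword).ToXML();

                              // Absolute expiration, no sliding window, no removal callback.
                              HttpContext.Current.Cache.Add(CacheKey, xpathCache, null,
                                              DateTime.Now.AddMinutes(ConfigSettings.RandomAPITimeoutMinutes),
                                              Cache.NoSlidingExpiration,
                                              CacheItemPriority.Default, null);
                          }
                          return xpathCache;
                      }
                  }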
              
fig. 1.0

The code implements a well-used pattern: try to read the item from the cache; if it is there, return it; if not, create it and put it in the cache for the next request. The expiration is absolute so that, no matter how frequent the visitors, the cache expires at regular intervals and up-to-date content is retrieved from the web service. If sliding expiration were used the cache might never expire, as each visitor restarts the timer.
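To make the difference between the two expiration modes concrete, here is a minimal sketch. The ExpirationExamples class and its method names are hypothetical and not part of this site's code; only Cache.Insert, Cache.NoSlidingExpiration and Cache.NoAbsoluteExpiration come from System.Web.Caching.

```csharp
using System;
using System.Web.Caching;

// Hypothetical helper, for illustration only.
public static class ExpirationExamples
{
    // Absolute: the item is dropped a fixed time after it was inserted,
    // however often it is read in the meantime.
    public static void AddAbsolute(Cache cache, string key, object value, int minutes)
    {
        cache.Insert(key, value, null,
                     DateTime.UtcNow.AddMinutes(minutes),
                     Cache.NoSlidingExpiration);
    }

    // Sliding: every read pushes the expiry back by the full window,
    // so a steadily visited item may never be refreshed.
    public static void AddSliding(Cache cache, string key, object value, int minutes)
    {
        cache.Insert(key, value, null,
                     Cache.NoAbsoluteExpiration,
                     TimeSpan.FromMinutes(minutes));
    }
}
```

For data that must be re-fetched from a web service on a schedule, the absolute form is the one that matches fig. 1.0.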

Problems with Caching

However, even caching has its problems, particularly if traffic to your site is infrequent. Consider the following example:

  1. A user hits your site.
  2. The site requires Delicious, Twitter and Brightkite, but no one has been to the site since the last cache expired.
  3. All three are requested from the relevant services and stored in the cache for the next user. An expiration time is set for each so that they are kept up to date (for example 5 minutes for Twitter or maybe a day for Delicious).
  4. No user comes within the expiration time.
  5. A user comes after the expiration time. The cache is useless as it's non-existent, so all items are requested again, meaning the user has to wait for Delicious, Twitter and Brightkite before getting their page. Your caching strategy has given no real benefit.

Scheduled caching is the real answer

Ideally, then, what is needed is a scheduled job that refreshes the cache independently of page requests, so that when a page request comes in the cache is already there and the server can respond immediately...

At the point of considering some kind of cron job or Windows scheduled task/service we should probably stop and take a step back.

In my view it's best to keep things within ASP.NET if possible. Not necessarily for a specific technical reason, but why step outside a single project if you don't need to?

Cache Expiration

If we look again at the Cache.Add method we can see that the final parameter is null. This parameter lets you supply a CacheItemRemovedCallback: essentially an event handler that fires when the .NET runtime removes the item from the cache.

Below is a reorganised code sample. A CacheItemRemovedCallback delegate is created and passed to Cache.Add. When the cache item is removed, the Expired handler is called. If the item was invalidated by a dependency, or expired due to time or lack of use, it seems reasonable to recreate it. If it was removed manually you may not wish to recreate it (otherwise you leave yourself no way to get rid of it!). Recreating a cache item after manual removal can also cause stack overflow issues.

Essentially, if you re-insert the item when the reason was 'Removed', the re-insert overwrites the currently existing item (which is only removed once this event finishes) before re-adding it. That removal will of course fire the item removed callback, which will remove and re-add the item again, causing an infinite loop.

                  private static XPathDocument CachedXPath
                  {
                      get
                      {
                          XPathDocument xpathCache = WebCache.Cache[CacheKey] as XPathDocument;

                          if (xpathCache == null)
                          {
                              CacheItemRemovedCallback expired = new CacheItemRemovedCallback(Expired);
                              xpathCache = new RandomAPI(ConfigSettings.RandomAPIUsername,
                                                          ConfigSettings.RandomAPIPassword).ToXML();
                              WebCache.Cache.Add(CacheKey, xpathCache, null,
                                              DateTime.Now.AddMinutes(ConfigSettings.RandomAPITimeoutMinutes),
                                              System.Web.Caching.Cache.NoSlidingExpiration,
                                              System.Web.Caching.CacheItemPriority.Default, expired);
                          }
                          return xpathCache;
                      }
                  }
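                  // Runs when the item leaves the cache, possibly outside any page request.
                  private static void Expired(string key, object val, CacheItemRemovedReason reason)
                  {
                      // Rebuild for every reason except a manual removal (CacheItemRemovedReason.Removed).
                      if (reason == CacheItemRemovedReason.DependencyChanged
                          || reason == CacheItemRemovedReason.Expired
                          || reason == CacheItemRemovedReason.Underused)
                      {
                          // Reading the property repopulates the cache; the value itself is not needed here.
                          XPathDocument xp = CachedXPath;
                      }
                  }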
              
fig. 2.0

There is another difference to note. Both the cache lookup and the Cache.Add call in fig. 2.0 reference a WebCache static class instead of the HttpContext.Current.Cache used in the first example. This is for one reason only: the cache can expire at any point during the application's lifetime, i.e. outside of a page request. This is by design, and it lets us refresh the cache without forcing a user to wait. However, there is no HttpContext outside of a page request, so a call to CachedXPath that relied on HttpContext.Current would throw an exception (one that a site visitor would not see, but an exception nonetheless, causing the cache refresh to fail).

One way around this is to ensure that there is a static way of accessing the application cache. The ideal time to store it is before there is any possibility that the cache will be requested. By adding a few lines to Global.asax, a static variable can be set for use throughout the application.

                  protected void Application_BeginRequest(object sender, EventArgs e)
                  {
                      if(WebCache.Cache == null)
                          WebCache.Cache = Context.Cache;
                  }
              
fig. 2.1

[Edit] You can also use the HttpRuntime.Cache accessor, which is available outside of a page request.
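As a sketch of that alternative: the definition of WebCache is not shown in this article, so the version below is an assumption about its shape (a static class exposing a Cache property). Backed by HttpRuntime.Cache, it would need no wiring in Global.asax.

```csharp
using System.Web;
using System.Web.Caching;

// Assumed shape of the WebCache holder; the real class is not shown here.
public static class WebCache
{
    // HttpRuntime.Cache returns the application's cache and does not
    // depend on HttpContext.Current, so it can be used inside a removal callback.
    public static Cache Cache
    {
        get { return HttpRuntime.Cache; }
    }
}
```

With this version the property is read-only, so the assignment in fig. 2.1 would be dropped rather than kept alongside it.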

Conclusion

This article describes just one way to take advantage of the caching abilities of .NET. A regular refresh is valid if you need to keep data almost always in memory but regularly updated... a Twitter cache that never refreshes would render its output irrelevant.

The .NET implementation of expiration handling would allow you to run any kind of scheduled activity. Merely by calling the required function when the cache item expires, before refreshing the cache with the desired timeout, you create a pseudo timer-based process.
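A rough sketch of such a pseudo timer follows. CacheScheduler, its members and the "CacheScheduler.Timer" key are all hypothetical names made up for this illustration; the only framework calls used are HttpRuntime.Cache and the Cache.Insert overload that accepts a CacheItemRemovedCallback.

```csharp
using System;
using System.Web;
using System.Web.Caching;

// Hypothetical helper: runs a job each time a dummy cache item expires.
public static class CacheScheduler
{
    private const string TimerKey = "CacheScheduler.Timer"; // made-up key
    private static Action job;
    private static TimeSpan interval;

    public static void Start(Action work, TimeSpan every)
    {
        job = work;
        interval = every;
        Schedule();
    }

    // Insert a placeholder item whose only purpose is to expire.
    private static void Schedule()
    {
        HttpRuntime.Cache.Insert(TimerKey, DateTime.UtcNow, null,
                                 DateTime.UtcNow.Add(interval),
                                 Cache.NoSlidingExpiration,
                                 CacheItemPriority.NotRemovable,
                                 OnRemoved);
    }

    private static void OnRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        // Ignore manual removal or replacement so re-inserting cannot loop.
        if (reason != CacheItemRemovedReason.Expired)
            return;

        try
        {
            job();
        }
        catch (Exception)
        {
            // Swallowed in this sketch; real code would log the failure.
        }
        finally
        {
            Schedule(); // restart the countdown for the next run
        }
    }
}
```

Something like CacheScheduler.Start(RefreshEverything, TimeSpan.FromMinutes(5)) could then be called once at application start, where RefreshEverything stands for whatever method you want to run. Bear in mind that expired items are cleared by a periodic sweep, so the interval should be treated as approximate.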

However, using caching in this way is not always appropriate. A Windows service can run with higher local privileges, can be set to start automatically and can be started manually. The cache timeout only begins after the first request for that cache item, and a further request is required if the app pool recycles for any reason.

Please do catch me on Twitter if you have any thoughts/feedback before I get any proper commenting online.

All article content is licensed under a Creative Commons Attribution-Noncommercial Licence.