Painless Caching, Memoization for .NET
When searching for .NET discussions of I-don't-even-remember, I somehow came across Dustin Campbell's blog post about automatic memoization with C#. I instantly remembered first reading about this topic in Higher Order Perl, which at the time introduced me to this handy technique using the delegate pattern. After looking over Dustin's code, I liked the simplicity of wrapping any delegate with a single method call. This makes for an academic discussion, but I thought this could be much more practical if we abstracted the concept a bit more.
If you're stumped, you're probably wondering "what's memoization?" Well, you're in luck because Wikipedia has you covered:
- "In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously-processed inputs."
So, it's just a write-once read-many dictionary keyed off the input parameter. This makes memoization ideal for caching values from deterministic functions, like math functions or silly Fibonacci sequence generators. You could also relegate memoization for other resources, such as caching user input, file contents, network data, etc., just as long as you don't care that the underlying data might change later.
But it doesn't have to be just that. So far, there is no mention for expiring the data once it has been memoized. Like a web browser, you might want to cache data until such a time that it makes sense to reload it.
Implementation details
Here's the base memoize simplified to almost golf fashionability:
public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> function) { return Memoize(function, new Dictionary<TArg, TResult>()); } public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> function, IDictionary<TArg, TResult> cache) { return key => cache.ContainsKey(key) ? cache[key] : (cache[key] = function(key)); }
If you wanted to memoize the
sin()
function, you could use it as such:var Sin = Memoize<double, double>(Math.Sin); var x = Sin(Math.Pi);
Oh snap, you want some crazy factorial function? Yeah, whatever, just use this variation for memoizing recursive functions:
public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, Func<TArg, TResult>, TResult> function) { return Memoize(function, new Dictionary<TArg, TResult>()); } public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, Func<TArg, TResult>, TResult> function, IDictionary<TArg, TResult> cache) { Func<TArg, TResult> memoizeFunction = null; return memoizeFunction = key => cache.ContainsKey(key) ? cache[key] : (cache[key] = function(key, memoizeFunction)); }
The only difference is the delegate signature has one additional input parameter:
var factorial = Memoize<int, long>((x, self) => x > 1 || x <= -1 ? x * self(Math.Abs(x)-1) : 1); var x = factorial(10);
Mix in validation
This seems to work well. But, what if I wanted to extend to cache data from a costly network operation, like a web request for an RSS feed, then reload it after a while?
var timeoutDuration = TimeSpan.FromMinutes(10); var dict = new ValidationDictionary<Uri, IValidatable<string>>(pair => pair.Value.IsValid()); var httpGet = new Func<Uri, IValidatable<string>>(delegate(Uri requestUri) { var response = (HttpWebResponse)HttpWebRequest.Create(requestUri).GetResponse(); var content = new StreamReader(response.GetResponseStream()).ReadToEnd(); return new Timeoutable<string>(content, timeoutDuration); }); var memoizedHttpGet = Memoize(httpGet, dict); var content = memoizedHttpGet(new Uri("http://feeds.digg.com/digg/popular.rss"));
This is a fair bit more complicated, so here's the gist of the strategy. The
Memoize()
method has an alternative syntax that lets you provide your own dictionary for storage. We can leverage this by providing a ValidationDictionary
object. This container, included with the project, implements IDictionary
such that all accesses are predicated by a validator delegate. If an element is deemed invalidated, it will be automatically refreshed by the memoized function.
The validator is the hook to do a timeout check. But, in this example, that's only half the story. You can easily impose a global timeout by comparing DateTime.Now
to a static value. However, I want each element to have its own timeout. To implement, the dictionary value type must be IValidatable<T>
. This is a simple wrapper that allows you to implement an IsValid()
method to each element. So naturally, we have a Timeoutable<T>
class that tacks on a timeout, which is validated by IsValid()
.
When memoizedHttpGet
is first called, the HTTP request is made, tagged with a timeout of 10 minutes, and stored in ValidationDictionary
. If another request is made to the same Uri
, the validator predicate will check the timeout and Memoize()
will reload if invalidated.
Free to use under Creative Commons
The full Visual Studio 2008 project with source and unit tests is attached below and is available under Creative Commons Attribution-Share Alike license. In the future, any other releases I distribute I will be sure to tag with this license.
If you find this code helpful, I would be interested in hearing how you've used it.
Attachment | Size |
---|---|
Memoize Visual Studio 2008 project | 16.72 KB |
- Login to post comments
About Shawn Poulson / Exploding Coder
Software developer and designer
I'm a software developer in the Philadelphia/Baltimore metro area that loves web technology and creating compelling and useful applications. This blog was created to showcase some of my ideas and share my passion and experiences with others in the same field. Read more...
Comments
This is awesome!! Thanks a
This is awesome!! Thanks a lot.
I like this, particularly the timeouts
Memoization is pretty basic stuff (although this terse and elegant syntax in .Net is pretty nice), but I particularly like the arrangement to use a dictionary with individualized timeouts -- I don't think that I have seen that before, and I can see how it would be extremely useful. In fact, the place I think it might be MOST useful would be in Javascript, where rich web applications these days tend to use lots of AJAX calls, but the calls can be quite time consuming. Anyway, thanks: I learned something here.