Why JS needs native memoization

Nick Gard
5 min read · Jan 4, 2021


I have a proposal for a new feature for JavaScript: built-in memoization. (This proposal isn’t even at Stage 0 because no TC39 Champion has picked it up yet.)

Edit 9/28/2022: There is a Stage 1 proposal for Function Memoization. Check out their proposal page, add use cases and issues, and generally help and thank the champions, J.S. Choi and Hemanth HM.

const memoFn = Function.memoize(fn, {hash, compare, select})

To be clear, I am not advocating for automatic memoization of functions; that is not something a JS engine could or should do, because it cannot tell whether a function is pure. The author of a function does know whether it has side effects, though, so they should have the option of telling the JS engine that the function is memoizable.

What is memoization?

Memoization is a technique for speeding up a function that is called multiple times. The first time the function is called, it runs as usual and the result is stored in a cache, keyed by the arguments used (or some derivative of them). Subsequent calls first look through this cache for a matching key (arguments) and, on a hit, return the stored value rather than re-running the logic in the function itself.
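
In code, a naive memoizer is only a few lines. Here is a minimal sketch, keying the cache on JSON.stringify of the arguments, a choice that is itself one of the trade-offs discussed below:

// A minimal, deliberately naive memoizer. JSON.stringify is a lossy
// and slow key for real-world use; it is here only for illustration.
function memoize(fn) {
  const cache = new Map();
  return function (...args) {
    const key = JSON.stringify(args);
    if (cache.has(key)) return cache.get(key); // cache hit
    const result = fn.apply(this, args);       // cache miss: run fn
    cache.set(key, result);
    return result;
  };
}

const square = memoize((n) => n * n);
square(4); // computed
square(4); // returned from cache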

If the cache lookup is faster than running the function, this is a speed boost. If.

What could go wrong?

  • Cache lookup could be slower than running the function. This could happen if the function is particularly small or fast, or if the cache is particularly big or slow.
  • Cache lookups could miss much more often than they hit. The cache lookup happens on every call, so it is a (hopefully small) penalty to pay when the key is not found. If the cache misses too often, this penalty compounds and costs more than the savings of cache hits.
  • Cache uses up too much memory. Memoization is essentially trading space (memory) for time (computation speed). Memory is still finite, though, and an aggressive cache can consume too much, slowing down or even crashing the browser.
  • The function that is memoized could be impure, having side-effects or not returning the same value for the same arguments. Memoizing such a function could break the application in a hard-to-debug manner.

Key equivalence is one source of these problems; memory management is another. A naive memoization uses all of the arguments as the cache key, and each cache lookup compares the incoming arguments with the cached ones (the keys). This comparison is quick for primitives and referentially equal objects. If the author wants to treat deeply equal objects (or “equal enough” objects) as equal, the comparison gets slower.

Most memoization implementations in JS mitigate this slow comparison in some way. Lodash only uses the first argument by default, so memoAdd(1,2) would return the same as memoAdd(1,3). Reselect’s implementation only stores one value in cache, capping the lookup time but also limiting its usefulness. Almost all implementations allow the author to pass in a comparison function to determine if keys/arguments are equal, but I’ve not seen one that checks if doing this comparison is slower than running the function.
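
For example, lodash’s first-argument default is easy to trip over, and its resolver parameter is the escape hatch:

import _ from 'lodash';

// By default lodash keys its cache on the first argument only,
// so these two calls collide:
const memoAdd = _.memoize((a, b) => a + b);
memoAdd(1, 2); // 3, cached under key 1
memoAdd(1, 3); // 3 (!) because it hits the cached entry for key 1

// A resolver can derive a key from all of the arguments instead:
const memoAddAll = _.memoize((a, b) => a + b, (a, b) => `${a}|${b}`);
memoAddAll(1, 2); // 3
memoAddAll(1, 3); // 4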

Almost no memoization implementations store their cache keys or values weakly. The caches act as strong references for every object passed as an argument or returned as a value. Lodash allows the cache constructor to be swapped for a WeakMap, but this limits what kinds of functions can be memoized, since WeakMap keys must be objects, not primitives. An implementation I’ve seen in production uses a WeakMap and wraps primitives in objects, but has to unwrap them to compare keys correctly. My polyfill uses WeakRefs for storing non-primitive keys and values, but this comes with a speed penalty, as it has to detect and deref() those references during comparison.
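
To illustrate that deref() overhead, here is a simplified sketch of the weak-reference approach (hypothetical code, not my polyfill’s actual source):

// Primitives can't be weakly referenced, so only objects are wrapped.
function weakEntry(value) {
  return typeof value === 'object' && value !== null
    ? new WeakRef(value)
    : value;
}

// deref() returns undefined once the referent has been collected,
// so every stored key must be unwrapped on every lookup.
function readEntry(entry) {
  return entry instanceof WeakRef ? entry.deref() : entry;
}

function sameKey(storedArgs, incomingArgs) {
  return storedArgs.length === incomingArgs.length &&
    storedArgs.every((stored, i) => readEntry(stored) === incomingArgs[i]);
}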

No implementation that I found observes the system’s (browser’s) memory in order to manage its cache size. The common way to cap memory consumption is to limit the cache size without regard to the system’s resources, as sketched below. This technique limits the usefulness of memoization, puts the burden on the author to determine ahead of time how many cache entries they might need, and ultimately doesn’t eliminate the memory leak: a thousand 1-item-cache functions use as much memory as a single 1000-item-cache function.
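
A typical size-capped cache looks something like this sketch of least-recently-used eviction; note that the limit is an arbitrary number the author has to guess at:

// A capped memoizer using Map's insertion order for LRU eviction:
// re-inserting an entry on each hit moves it to the "newest" end.
function memoizeLRU(fn, limit = 100) {
  const cache = new Map();
  return function (...args) {
    const key = JSON.stringify(args);
    if (cache.has(key)) {
      const value = cache.get(key);
      cache.delete(key);      // refresh recency
      cache.set(key, value);
      return value;
    }
    const result = fn.apply(this, args);
    cache.set(key, result);
    if (cache.size > limit) {
      cache.delete(cache.keys().next().value); // evict oldest entry
    }
    return result;
  };
}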

Browsers know what resources are available and are best positioned to manage this. Chrome and a few other browsers expose their memory metrics to the author via the non-standard performance.memory object, which is what my polyfill relies on.
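
A rough sketch of what that check can look like (performance.memory and its usedJSHeapSize/jsHeapSizeLimit fields are Chrome-only and non-standard):

// Returns true when the JS heap is close to its limit.
function memoryPressureHigh(threshold = 0.9) {
  const mem = performance.memory; // undefined outside Chrome
  if (!mem) return false;         // no signal, assume we're fine
  return mem.usedJSHeapSize / mem.jsHeapSizeLimit > threshold;
}

// Before inserting a new cache entry:
// if (memoryPressureHigh()) cache.clear();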

What’s wrong with my polyfill?

It’s slow: much slower than most library implementations, and it doesn’t compare compute time with lookup time to mitigate the speed loss.

Its cache clearing is naive: when the system signals that memory is needed, it flushes the cache completely instead of preserving the last-used key-value pair or shrinking the cache to a more lookup-efficient size.

It’s dependent on non-standard browser APIs (performance.memory) and APIs with poor cross-browser compatibility (WeakRef, FinalizationRegistry).

What can you do?

If you are a TC39 representative, please consider championing this proposal. If you have opinions or ideas about a native JS memoization implementation, contribute to the conversation on the TC39 Discourse channel for it. Thank you for reading!

My wishlist of features for a native memoization utility:

  • Invisible to users. The function that the user wants to memoize should not have to be modified to be memoized (e.g. boxing primitives or currying functions).
  • Flexible cache. The user should not have to manage the cache. It should grow as needed without user intervention.
  • Compute-time-safe. The user should not have to worry if the cache-lookup is slower than naively running the memoized function. The implementation should opt to run the function un-memoized (or reduce the cache to a lookup-efficient size) if it detects the lookup is taking longer than the original call.
  • Memory-safe. The user should not have to worry about the memoization cache being a memory leak. No arguments, generated keys (whether produced by a hash function or otherwise), or function results should be strongly referenced. This extends to how the memoization utility handles excessively large caches: whether it completely flushes the cache periodically, removes the least-used key/values, or does some other memory management, this should be invisible to the users beyond a React-style disclaimer that memoization does not guarantee the function won’t be re-run.
  • User-defined equality between sets of arguments. The user may know when the function should return the same result based on arguments, so they may wish to define a way to determine this. This could be a hash function that reduces the set of arguments to a single primitive (probably a string). This could be a compare function that compares arguments pairwise for equality (e.g. given the argument sets a1, a2, a3 and b1, b2, b3, the compare function would be called on [a1, b1], [a2, b2], and [a3, b3]) and returns the cached value if they all return true. This could be a select function that takes the set of arguments as an input and outputs an iterable of values to use as cache keys (similar to how React.memo uses the props passed in to the component).
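
To make that last item concrete, here is how those three options might look with the proposed Function.memoize. This is hypothetical: the option semantics are my reading of the proposal, not settled behavior, and fetchUser, computeArea, and renderRow are stand-in functions.

// hash: reduce the arguments to a single primitive cache key
const getUser = Function.memoize(fetchUser, {
  hash: (id, opts) => `${id}:${opts.locale}`,
});

// compare: pairwise equality between cached and incoming arguments
const getArea = Function.memoize(computeArea, {
  compare: (cached, incoming) => cached.id === incoming.id,
});

// select: pick which values act as cache keys, similar to React.memo
const renderMemo = Function.memoize(renderRow, {
  select: (props) => [props.id, props.version],
});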
