React Snapshot Testing: The Bad Parts

Jest’s snapshot testing has been hailed as a quick and easy way to fully test the UI of React components, but my experience with it has exposed several severe shortcomings. I no longer use snapshot tests and cannot recommend them. Here are my reasons:

Inability to Follow TDD/BDD Guidelines

A key rule in Test-Driven Development and Behavior-Driven Development is to write tests first, essentially outlining a behavior/API contract. The tests should initially fail (because the thing being tested has not been written yet) and then pass once the code finally satisfies the contract.

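Snapshot tests initially fail, because there is no component yet. Good, check that off. Now write the component and watch the tests. They never pass on the strength of a contract you wrote in advance, because a snapshot is only a recording of whatever the component rendered. Either Jest stores the very first render as the snapshot (its default when no snapshot exists yet), or, when run with --ci, the tests keep failing until you run jest --updateSnapshot (jest -u for short). Both paths immortalize the component’s current markup as the True Markup™, regardless of whether or not the component is in a complete and valid state.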

“Fix” Tests Without Changing Code

Do you have failing snapshot tests? No problem. Run jest --updateSnapshot and now they’re passing. Your colleagues may scream at you for not reading the snapshot failure output and understanding what failed, but you only made a minor CSS change and were half expecting a few snapshot failures anyway. When tests are this easy to “fix,” why take the time to debug them?
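To make the problem concrete, here is a minimal sketch of the update mechanism. It is a hypothetical, heavily simplified model and not Jest’s implementation: the matchSnapshot function, the Map-based store, and the SaveButton markup are all assumptions made for illustration, and the model assumes snapshots are only rewritten on an explicit update.

```javascript
// Hypothetical, simplified model of a snapshot matcher (NOT Jest's real code).
// Assumption: the stored snapshot only changes when an update is requested.
function matchSnapshot(store, name, rendered, { update = false } = {}) {
  if (update) {
    // Record the current output as the new truth, without judging it.
    store.set(name, rendered);
    return true;
  }
  return store.get(name) === rendered;
}

// The stored snapshot holds the correct markup.
const correct = '<button class="btn">Save</button>';
const store = new Map([["SaveButton", correct]]);

// A regression: the button has lost its label.
const broken = '<button class="btn"></button>';

const beforeUpdate = matchSnapshot(store, "SaveButton", broken);
const duringUpdate = matchSnapshot(store, "SaveButton", broken, { update: true });
const afterUpdate = matchSnapshot(store, "SaveButton", broken);
const correctMarkupNow = matchSnapshot(store, "SaveButton", correct);

console.log(beforeUpdate);     // false: the regression is caught
console.log(duringUpdate);     // true: "fixed" without touching the code
console.log(afterUpdate);      // true: the broken markup is now the snapshot
console.log(correctMarkupNow); // false: the correct markup now fails
```

Nothing in the update path asks whether the new output is right, which is exactly why the shortcut is so tempting.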

Enigmatic Failures

OK, so you take the time to read the snapshot failure output. After all, it’s human-readable and should look like a diff between lines of regular HTML. However, say your change was to a CSS rule in your CSS-in-JS library du jour (say, Glamor). The diff then shows little more than one generated class name or attribute hash replaced by another.

Since snapshots check what markup JSX/React outputs and not what pixels a browser actually renders, any insignificant change to CSS — like changes to whitespace or property declaration order — that results in an attribute or class name hash change (which is how most CSS-in-JS libraries work) will cause a failing test. How sure are you that this hash change encompasses only the changes you intended to make?
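A rough sketch of why this happens, assuming a class-name generator that simply hashes the raw CSS text. The hashString, className, and render helpers are hypothetical; real CSS-in-JS libraries use their own hashing and may normalize the CSS first, so treat this only as an illustration of the principle.

```javascript
// Hypothetical class-name generator (an assumption, not any library's API):
// a djb2-style hash over the raw CSS text.
function hashString(text) {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    // Multiply by 33, mix in the character code, keep an unsigned 32-bit value.
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(36);
}

function className(cssText) {
  return "css-" + hashString(cssText);
}

function render(cssText) {
  return `<div class="${className(cssText)}">Hello</div>`;
}

// Same declarations; only the whitespace between them differs.
const before = render("color: red; margin: 0;");
const after = render("color: red;\nmargin: 0;");

console.log(before);
console.log(after);
console.log(before === after); // false: the snapshot comparison fails
```

The two outputs differ only in an opaque hash, so the diff says that something changed but gives no hint as to what.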

Vague assertions lead to vague failures which lead to vague fixes.

Poor Test Behavior

Tests are only as useful as the information they give us. A good test suite tells us that changes to code did not introduce regression bugs. This is why code coverage is important (it tells us how much of the application the tests actually exercise) and also why bug fixes should be accompanied by a test that captures the bug (so it will not be re-introduced by later changes).

Validity is one of the four key characteristics of good tests. (The others are reliability, objectivity, and usability.) A valid test not only measures what it purports to measure (think, “behavior, not implementation”) but also registers a failure if and only if the tested aspect fails. Taking a passing result as the “positive” outcome: if the tests fail when there are no bugs, that is a false negative; if the tests pass when bugs are present, that is a false positive. Snapshot testing fails both of these definitions of validity.

Snapshots are supposed to test UI, but they actually compare final markup (HTML) between the previous and current component renders. Markup is only part of what makes up UI. Even if you ignore behavior for snapshot testing, styling is still unaccounted for. To call snapshot tests “UI tests” is misleading at best.

By testing the output of a component’s render method, snapshot testing can trigger most, if not all, lifecycle methods. Every line in these methods then counts toward line, function, and branch coverage statistics, as it should. However, those methods may contain side effects that don’t contribute to the rendered output (such as registering a listener or fetching data), and those behaviors aren’t captured in the snapshot. Coverage reports will still claim that these lines are covered. This is a false positive.
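A small sketch of that coverage gap, using a hypothetical stand-in for a component rather than real React. The Clock class, the createBus helper, and the mount function are all assumptions made for illustration.

```javascript
// Hypothetical event bus that records which event names get a listener.
function createBus() {
  const events = [];
  return {
    events,
    on(name, handler) {
      events.push(name);
    },
  };
}

// Hypothetical stand-in for a component; no React involved.
class Clock {
  constructor(bus, eventName) {
    this.bus = bus;
    this.eventName = eventName;
  }

  componentDidMount() {
    // Side effect: executed (and counted as covered), but invisible in render().
    this.bus.on(this.eventName, () => {});
  }

  render() {
    return "<time>12:00</time>";
  }
}

// Minimal "mount": render, then run the lifecycle method.
function mount(component) {
  const markup = component.render();
  component.componentDidMount();
  return markup;
}

const goodBus = createBus();
const buggyBus = createBus();

const goodMarkup = mount(new Clock(goodBus, "tick"));
const buggyMarkup = mount(new Clock(buggyBus, "tock")); // bug: wrong event name

console.log(goodMarkup === buggyMarkup);       // true: a snapshot sees no difference
console.log(buggyBus.events.includes("tick")); // false: the listener is wrong
```

Both variants execute every line of componentDidMount, so both look fully covered, yet only an assertion about the listener itself tells them apart.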

As mentioned before, snapshot tests can easily fail for changes that don’t introduce bugs. This is a false negative. I’m sure that snapshot tests can be useful, but they are too often invalid.

UI Is Not Static

Snapshot tests are based on the premise that there is a True Markup™ for a component and it should not change. But changing UI is what we do as web developers. That header needs to be a different color for the weeks leading up to the World Cup. This spacing is too narrow according to the new designer. That markup isn’t very accessible as written, so it needs aria-* attributes. No, wait… that aria attribute isn’t very well supported — change it to this one.

When we check vaguely and generally that nothing has changed, rather than that certain things haven’t changed, we end up with fragile tests. After all, the only constant is change. This is especially true in Front-End Web Development.
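The difference between the two kinds of check can be sketched as follows. The markup strings and the rendersSaveButton helper are hypothetical, and plain string matching is only a stand-in for proper DOM queries.

```javascript
// Two renders of the same (hypothetical) button, before and after a styling
// change and an accessibility improvement.
const previous = '<button class="css-1a2b3c">Save</button>';
const current =
  '<button class="css-9x8y7z" aria-label="Save changes">Save</button>';

// Vague, general check: "nothing has changed".
const wholeMarkupUnchanged = previous === current;

// Specific check: "there is still a button labelled Save".
// Simplification: string matching instead of querying a real DOM.
function rendersSaveButton(markup) {
  return markup.startsWith("<button") && markup.includes(">Save</button>");
}

console.log(wholeMarkupUnchanged);        // false: fails on a harmless change
console.log(rendersSaveButton(previous)); // true
console.log(rendersSaveButton(current));  // true: the behavior we care about holds
```

The specific check survives the restyle and the new aria attribute, and it would still fail if the button or its label disappeared.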

Web Developer, Oregonian, husband
