Risk analysis for docs

When you have a large body of documentation, it’s impossible to keep every doc up to date. When deciding which docs to prioritize updating, it’s common to focus on the most-viewed docs: they usually correspond to your most popular features, account for the most traffic, and may also be the docs new users view most as they get started with your product.

Unfortunately, this often means that a small subset of the most popular docs continually receives all of the update priority, while the rest of your docs (the less-viewed, less-popular ones) are left to languish until they’re so out of date that someone on your team (or even worse, a customer) has to complain before anything changes.

In this post, I’ll share my process for conducting “risk analysis”, aka checking whether the content of a doc might hurt customer trust (hence the “risk” factor). This involves evaluating our docs based on factors beyond pageviews, which lets us balance higher- and lower-priority docs work and avoid neglecting certain docs until the damage to customer trust is already done.

Inspiration: docs metrics

The inspiration for this process came from reviewing several metrics separately. These metrics are:

  • Pageviews: A no-brainer: we look at the most-viewed docs as one of several factors in deciding which docs to prioritize updating. This is represented as a whole number.
  • Post modified date: The last day that the doc was updated by any team member.
  • Clicks on thumbs up/down buttons: Every Heap doc has thumbs up / thumbs down buttons to allow users to give feedback on any doc. We track clicks on these buttons to see which docs are receiving the most positive or negative feedback. These clicks are represented as whole numbers.
  • SEO score: We use Yoast SEO to track the effectiveness of our SEO efforts and see whether docs can be made more discoverable. This is represented as a score of 0, 30, 60, or 90 (highest).

As I slowly accrued these separate metrics on my docs dashboard over time, I realized it made more sense to bring them together so that I could audit the effectiveness of each doc by comparing these metrics. These all felt like ways to check the “risk” of a doc being bad in some way; thus, the idea for “risk analysis” as a process was born.

Creating the first docs risk analysis sheet

Since I get these metrics from several different sources (pageviews from Heap, last updated date from WordPress, thumbs up/down from Nicereply, and SEO score from Yoast), I opted to export these pieces of data (grouped by page URL, for the past 90 days) into a single spreadsheet.
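
For anyone who would rather script the consolidation than copy-paste exports, here’s a minimal sketch of that merge step in pandas. It assumes each tool can export a CSV keyed by page URL; the file names and column names are hypothetical stand-ins, so yours will differ.

```python
# Minimal sketch: combine per-tool CSV exports into one table keyed by URL.
# File names and column names are hypothetical stand-ins for the real exports.
import pandas as pd

pageviews = pd.read_csv("heap_pageviews_90d.csv")        # url, pageviews
modified  = pd.read_csv("wordpress_modified_dates.csv")  # url, post_modified_date
feedback  = pd.read_csv("nicereply_thumbs.csv")          # url, thumbs_up, thumbs_down
seo       = pd.read_csv("yoast_seo_scores.csv")          # url, seo_score

docs = (
    pageviews
    .merge(modified, on="url", how="left")
    .merge(feedback, on="url", how="left")
    .merge(seo, on="url", how="left")
)
docs.to_csv("risk_analysis.csv", index=False)
```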

Though it sounds straightforward, anyone who has tried to wrangle data from different places into one single source of truth knows how much of a headache it can be. Two of these sources had erroneous duplicate line items (translations into different languages, drafts, and docs that had been deleted or renamed), so I had to manually QA the accuracy of the data by reviewing it line by line against the sheet. I also had to learn how to use pivot tables and certain formula types on the fly to format some of the data correctly.
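
I did this cleanup by hand in the sheet, but the same filtering could be sketched in pandas along these lines. The “status” column and the language URL prefixes are assumptions for illustration; real exports flag drafts and translations differently.

```python
# Rough sketch of the duplicate cleanup, continuing from the merged CSV above.
# The "status" column and /fr/, /de/, /ja/ URL prefixes are assumed for illustration.
import pandas as pd

docs = pd.read_csv("risk_analysis.csv")

# Drop drafts and deleted/renamed pages, if the export flags them.
docs = docs[~docs["status"].isin(["draft", "trash"])]

# Drop translated variants, assuming they live under language-prefixed paths.
docs = docs[~docs["url"].str.contains(r"/(?:fr|de|ja)/", regex=True)]

# Collapse any remaining duplicate rows for the same URL.
docs = docs.drop_duplicates(subset="url", keep="first")
```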

My first-ever pivot table! I would be embarrassed to admit how long it took me to get this right.

It took hours to set up the very first risk analysis sheet, though I’ve since whittled the process down to only 15 minutes (I would have wholesale abandoned this project if it took that long every single time).

Once I made sure I had all of the right data, I applied some formulas to generate the most useful numbers. For freshness, I subtracted today’s date (the day I was running docs risk analysis) from the post modified date, which yields a negative number of days since the doc was last updated (0 means it was updated today).

I also added a new metric: the ratio of clicks on thumbs up vs. thumbs down. To calculate this metric, I divided the number of thumbs up clicks by the number of thumbs down clicks.
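
For reference, here’s a small sketch of both derived columns, assuming the merged table and hypothetical column names from the earlier snippets.

```python
# Sketch of the two derived metrics, using the hypothetical columns from above.
import pandas as pd

docs = pd.read_csv("risk_analysis.csv")
docs["post_modified_date"] = pd.to_datetime(docs["post_modified_date"])
today = pd.Timestamp.today().normalize()

# Freshness = post modified date minus today, so a doc last touched 90 days ago scores -90.
docs["freshness"] = (docs["post_modified_date"] - today).dt.days

# Ratio = thumbs up divided by thumbs down; avoid dividing by zero for docs with no thumbs down.
docs["ratio"] = docs["thumbs_up"] / docs["thumbs_down"].replace(0, float("nan"))
```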

After all was said and done, the end result was a sheet with the following columns:

  • Categories
  • Title
  • URL
  • Post Modified Date
  • Pageviews
  • Thumbs Up
  • Thumbs Down
  • Freshness (post modified date – today’s date)
  • Ratio (thumbs up ÷ thumbs down)
  • SEO Score
  • Review Notes

I color-coded several columns to make it easy to spot “good” vs. “bad” numbers. Here were the conditional rules I applied (a rough code equivalent follows the list):

  • Pageviews: A green range from the highest (most green) to lowest (white) numbers.
  • Thumbs Up: A green range from the highest (most green) to lowest (white) numbers.
  • Thumbs Down: A red range from the highest (most red) to lowest (white) numbers.
  • Freshness: A red-to-green range from the highest (most green) to lowest (most red) numbers. (Since these are negative numbers, “highest” is 0, and “lowest” is something like -1000).
  • Ratio: A red-to-green range from the highest (most green) to lowest (most red) numbers.
  • SEO Score: A red-to-green range from the highest (most green) to lowest (most red) numbers.
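
I set these up as conditional formatting rules in the spreadsheet itself, but for anyone scripting the sheet instead, a pandas Styler can approximate the same color scales. This assumes the columns have been renamed to match the headers listed earlier, and it needs matplotlib installed for the colormaps.

```python
# Approximation of the spreadsheet's color scales with a pandas Styler.
# Assumes columns renamed to the headers above; matplotlib provides the colormaps.
import pandas as pd

docs = pd.read_csv("risk_analysis.csv")

styled = (
    docs.style
    .background_gradient(subset=["Pageviews", "Thumbs Up"], cmap="Greens")
    .background_gradient(subset=["Thumbs Down"], cmap="Reds")
    .background_gradient(subset=["Freshness", "Ratio", "SEO Score"], cmap="RdYlGn")
)
styled.to_html("risk_analysis_colored.html")
```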

The end result was this (mostly) readable, color-coded sheet. I opted to sort the sheet as follows (a code version of the same sort follows the list):

  • SEO (Z-A) (fourth priority)
  • Ratio (Z-A) (third priority)
  • Freshness (A-Z) (second priority)
  • Pageviews (Z-A) (first priority)
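
In pandas terms, the same ordering could be expressed in a single sort_values call, listing the keys from highest to lowest priority (again assuming the hypothetical column names from the sketches above):

```python
# Same multi-level sort, keys listed from highest to lowest priority.
# Freshness sorts ascending because more-negative values mean a staler doc.
import pandas as pd

docs = pd.read_csv("risk_analysis.csv")
docs_sorted = docs.sort_values(
    by=["Pageviews", "Freshness", "Ratio", "SEO Score"],
    ascending=[False, True, False, False],
)
```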

Piloting the docs risk analysis process

I piloted docs risk analysis at the same time as I was piloting docs table setting week. I set aside one hour for my direct report and me to review these numbers, pick out individual rows in the sheet (by highlighting them) that we wanted to home in on, then spend time reading those docs to see if anything needed to be updated. We decided that if something was going to take more than 5 minutes to fix, we would file a doc request to fix it at another time.

The results of the very first session were eye-opening. Several important docs that weren’t about one specific feature (for example, best practices for using a handful of features together) turned out, based on their freshness scores, to be woefully out of date: those features (and their corresponding feature guides) had been updated, but these docs had not. For the guides with the worst ratio scores, we dug into the optional comments users can leave alongside their thumbs up/down clicks and spotted patterns that allowed us to prioritize improvements.

Last but not least, for docs that had very few pageviews AND low ratio, freshness, and/or SEO scores, we asked ourselves whether they were worth keeping from a bandwidth and maintenance standpoint. Some had been relevant when they were first published but had become less relevant over time, such as a guide on migrating from one feature to another.

In some cases, we kept the doc, as we could identify use cases where it would be valuable to someone. In others, we archived it (which gives us the option of bringing it back later) as a way to declutter content that was taking up space.

Overall, we were able to use this one hour of our time to:

  • Make quick fixes to 15 docs
  • File 6 doc requests to prioritize bigger updates
  • Archive 4 rarely-viewed, out-of-date docs to test whether they were actually necessary (we ended up unarchiving only 1 at a later date)

As part of piloting docs table setting week, at the end of each session I asked whether it felt like a good use of our time. We both enthusiastically agreed that this was valuable – not just for us as docs writers, but also for all of our docs readers.

Iteration and expansion

Since the pilot, I’ve conducted docs risk analysis 4 times, always at the start of the quarter as part of our table setting week. I’ve made a number of improvements to the process, including:

  • Removing the SEO Score, which proved to be too vague and difficult to interpret to have a meaningful impact on our process.
  • Hiding the Post Modified Date column, as it’s less useful than the actual freshness score (and exists mainly to generate that score).
  • Reorganizing the template to list the metrics in priority order, which is currently Pageviews > Freshness > Ratio > Thumbs Up/Down.
  • Tweaking the color coding to better reflect what counts as a “good” or “bad” score (ex. in some cases it made more sense to use green-to-white rather than green-to-red).

We also standardized the way that we leave comments in the docs to make it easy to remember what we said and did last time. In some cases, we would run across a doc and ask ourselves “didn’t one of us audit this last round?” So we got in the habit of leaving our name and notes on what we decided to do (even if the outcome was that we changed nothing). This also allows us to note changes which already have a ticket or ongoing conversation.

This is what our most recent docs risk analysis sheet (from a session conducted a week prior to writing this post) looked like.

Last but not least, to streamline setting up the sheet, I created a template that I can duplicate and wrote up a guide in our team Confluence on how to set up the risk analysis sheet. It wound up having many steps, as exporting data from multiple sources and then formatting it in one place is quite involved. Though the process now takes only 15 minutes, it takes the entirety of those 15 minutes.

In the future, I’d like to automate the creation of this sheet as much as possible, and consider adding additional metrics – perhaps from AI tools, such as a reading score, or a count of spelling, grammar, or other errors.
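
As one illustrative direction (not something we’ve built), a per-doc readability score could come from an off-the-shelf library like textstat; the harder part would be pulling each doc’s text out of WordPress in the first place.

```python
# Idea sketch only: score a doc's readability with the textstat library.
# Fetching each doc's text from WordPress is left out of this sketch.
import textstat

def readability(doc_text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier-to-read text (roughly 0-100)."""
    return textstat.flesch_reading_ease(doc_text)

print(readability("This guide walks you through creating your first report."))
```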

Any suggestions on how I can make risk analysis even better? Ideas for metrics to add? Feel free to share them in the comments.
