Updated on Jun 3rd, 2021 | 3 min read | javascript, markdown, react

Read Time Feature For React + Markdown Blog

I just added a feature to this blog that outputs the estimated read time for a post.

The first thing I did was a Google search.

How many words can a person read per minute?

I stumbled upon this Medium article, which outlines the basics and describes how Medium engineers implemented a read time feature for their site.

They acknowledge that read time is just an estimate, one that does not take into consideration the ability of the reader or the complexity of the article's topic. Nevertheless, this feature has become widely adopted.

This article outlines how I incorporated my own read time feature into this site, a Next.js and Markdown powered blog.

Site Structure

This blog uses Markdown files as the source material for blog content. A _posts folder contains one file for each post. The site is built ahead of time by Next.js, which reads each file and transforms my Markdown into React elements. I use plain JavaScript just before this transformation to calculate the read time for each post.

The name of each of my Markdown files becomes the slug for the webpage. The file responsible for this post is react-markdown-blog-read-time-feature.md. The following getPostBySlug function is called to read my Markdown files into memory.

lib/api.js
import fs from 'fs'
import { join } from 'path'
import matter from 'gray-matter'
import getReadTime from './read-time'

const postsDirectory = join(process.cwd(), '_posts')

export function getPostBySlug(slug, fields = []) {
  const realSlug = slug.replace(/\.md$/, '')
  const fullPath = join(postsDirectory, `${realSlug}.md`)
  const fileContents = fs.readFileSync(fullPath, 'utf8')
  const { data, content } = matter(fileContents)

  const readTime = getReadTime(content)

  const items = {}

  fields.forEach((field) => {
    if (field === 'slug') {
      items[field] = realSlug
    }
    if (field === 'content') {
      items[field] = content
    }

    if (field === 'readTime') {
      items[field] = readTime
    }

    if (data[field]) {
      items[field] = data[field]
    }
  })

  return items
}

For those interested in creating a blog like this one, I recommend the Gray Matter library. It lets me add YAML front matter to the top of a Markdown file to attach extra metadata, and it returns that metadata (data) separately from the body of the file (content). This content is simply the raw text of my post. It is this raw text that I pass to getReadTime.

I can then pass on the results of the read time feature as part of the blog post object, making it simple to compose into my blog page later.
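To see how readTime ends up on the post object, here is a minimal sketch of the field-selection loop on its own. The pickFields helper and the sample values are hypothetical; they stand in for what getPostBySlug has after reading and parsing a file, so the snippet runs without the file system or Gray Matter.

```javascript
// Hypothetical helper: mirrors the field-selection loop in getPostBySlug,
// but takes already-parsed values instead of reading a Markdown file.
function pickFields({ slug, content, readTime, data }, fields = []) {
  const items = {}

  fields.forEach((field) => {
    if (field === 'slug') items[field] = slug
    if (field === 'content') items[field] = content
    if (field === 'readTime') items[field] = readTime
    // Front matter is checked last, so a front matter value with the
    // same name would override a computed one.
    if (data[field]) items[field] = data[field]
  })

  return items
}

// Sample values (assumed for illustration only).
const post = pickFields(
  {
    slug: 'react-markdown-blog-read-time-feature',
    content: 'Raw Markdown body...',
    readTime: 3,
    data: { title: 'Read Time Feature For React + Markdown Blog' },
  },
  ['title', 'slug', 'readTime']
)

console.log(post)
```

Only the requested fields are returned, so a page that needs the read time simply adds 'readTime' to its field list.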

Markdown Content

For those unfamiliar with Markdown, it is basically text with a few special characters to help format the output. It is popular for blogs and other online text inputs because the content creator can easily add headings, links, images and more without having to develop a full-blown website. Programs called parsers process the Markdown and output HTML.

The following is a short excerpt to illustrate the syntax.

# Heading
## Sub Heading

**Bold Text**

- List Item
- List Item
- List Item

[A link to somewhere](https://somewhere.com)

Most text is just unformatted paragraph text.

Read Time Function

The end goal, or output, of the read time function is the number of minutes it will take to read the post.

I am using the average reader's words per minute (WPM) from the Medium article I cited earlier. This value is 275, which I divide by 60 to get words per second, since the extra time for images is measured in seconds.

lib/read-time.js
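export default function readTime(content) {
  // 275 words per minute, expressed as words per second
  const WPS = 275 / 60

  let images = 0
  const regex = /\w/

  // Split on spaces, keep only tokens that contain a word character,
  // and count <img tags along the way
  const words = content.split(' ').filter((word) => {
    if (word.includes('<img')) {
      images += 1
    }
    return regex.test(word)
  }).length

  // Discount 4 words per image for the tag and its attributes
  const imageAdjust = images * 4
  let imageSecs = 0
  let imageFactor = 12

  // 12 seconds for the first image, one second less for each image
  // after that, never dropping below 3 seconds
  while (images) {
    imageSecs += imageFactor
    if (imageFactor > 3) {
      imageFactor -= 1
    }
    images -= 1
  }

  const minutes = Math.ceil(((words - imageAdjust) / WPS + imageSecs) / 60)

  return minutes
}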

My basic approach was to split the raw text on each space and then filter out any piece that contains no letters or numbers. This prevents standalone Markdown syntax, empty strings left by repeated spaces, and other non-words from being counted. I accomplish this with JavaScript's built-in regular expressions.

The character class \w is equivalent to [a-zA-Z0-9_], and the RegExp.prototype.test() method is used to filter out any string that does not contain at least one of those characters.
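A quick way to see what the filter keeps is to run it on a short string. This is a small sketch rather than part of the site's code; countWords and the sample strings are names and values made up for the demonstration.

```javascript
const wordChar = /\w/

// Hypothetical helper: the same split-and-filter step used in the read time function.
const countWords = (text) =>
  text.split(' ').filter((token) => wordChar.test(token)).length

const sample = '## Sub Heading - **Bold** text !'
const kept = sample.split(' ').filter((token) => wordChar.test(token))

// '##', '-' and '!' contain no word characters, so they are dropped.
// '**Bold**' is kept because it contains letters.
console.log(kept) // [ 'Sub', 'Heading', '**Bold**', 'text' ]

// Repeated spaces produce empty strings, which the filter also removes.
console.log(countWords('a  b')) // 2

// Only spaces are separators here, so words joined by a newline
// count as a single token.
console.log(countWords('one\ntwo three')) // 2
```

The last case shows one source of undercounting: a line break with no surrounding space does not separate words. Splitting on /\s+/ instead of ' ' would be one way to change that, at the cost of slightly different totals.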

While the filtering is happening, I also count the img HTML tags. This feeds the while loop that follows, which adds up imageSecs, the number of extra seconds estimated for viewing images: 12 seconds for the first image, one second less for each image after that, and never less than 3 seconds per image. I also subtract 4 words for each image (imageAdjust in the code) to account for the src, alt, and img tag itself.
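The image allowance is easier to follow with a few concrete counts. The sketch below pulls the loop into a standalone function; imageSeconds is a hypothetical name used only for this illustration, while the starting value of 12 and the floor of 3 come from the code above.

```javascript
// Hypothetical helper: the same countdown as the while loop in lib/read-time.js.
function imageSeconds(imageCount) {
  let seconds = 0
  let factor = 12

  for (let i = 0; i < imageCount; i += 1) {
    seconds += factor
    // Each image is worth one second less than the previous one, down to 3.
    if (factor > 3) factor -= 1
  }

  return seconds
}

console.log(imageSeconds(1))  // 12
console.log(imageSeconds(2))  // 23  (12 + 11)
console.log(imageSeconds(10)) // 75  (12 + 11 + ... + 3)
console.log(imageSeconds(11)) // 78  (every image after the tenth adds 3)
```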

Finally, everything comes together to solve for minutes. This is the estimated read time which is attached to the post object and displayed at the top of each post.
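As a worked example of the final formula, take a post with 1000 counted words and 2 images. Both numbers are assumed for illustration, and the "min read" label format is just one way the result might be displayed.

```javascript
const WPS = 275 / 60       // words per second
const wordCount = 1000     // tokens that passed the \w filter (assumed)
const imageCount = 2       // <img tags found (assumed)

const imageSecs = 12 + 11                         // first image 12 s, second 11 s
const adjustedWords = wordCount - imageCount * 4  // 992 words after the image discount
const readingSecs = adjustedWords / WPS           // roughly 216 seconds of reading

// Total seconds, converted to minutes and rounded up.
const minutes = Math.ceil((readingSecs + imageSecs) / 60)

const label = `${minutes} min read`
console.log(label) // "4 min read"
```

Because of Math.ceil, any post with content rounds up to at least one minute, and a post that lands just over a whole minute is reported as the next minute.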

Final Thoughts

While nowhere near perfect, I do feel this formula provides a useful estimate and is relatively simple to implement. Moreover, it is helpful for readers to know roughly how much time an article will take before they start.

Benjamin Brooke

Hi, I'm Ben. I work as a full stack developer for an eCommerce company. My goal is to share knowledge through my blog and courses. In my free time I enjoy cycling and rock climbing.