In this article, we are going to build a cache layer with Redis, step by step, and add it to an existing Node application. If you have a Node web server that uses MongoDB via `mongoose` and you would like to add a cache layer to optimize your queries, I hope this article will help you reach your goal.
If you would rather go straight to the code, here is the link to the repository.

To run this application locally, you need Docker installed on your machine. If you are not familiar with Docker and docker-compose, you should visit the official Docker website.

Application Overview

The existing application we are going to work on is a simple Node app with Express, MongoDB and mongoose.

[Diagram: Node app and MongoDB]

The application has two endpoints:

  • GET /api/books: get a list of books from the DB.
  • POST /api/books: add a new book.

const mongoose = require("mongoose");
const Book = mongoose.model("Book");

module.exports = app => {
  app.get("/api/books", async (req, res) => {
    let books;
    if (req.query.author) {
      books = await Book.find({ author: req.query.author })
    } else {
      books = await Book.find()
    }

    res.send(books);
  });

  app.post("/api/books", async (req, res) => {
    const { title, content, author } = req.body;

    const book = new Book({
      title,
      content,
      author
    });

    try {
      await book.save();
      res.send(book);
    } catch (err) {
      // Set the status explicitly; res.send(status, body) is a deprecated signature.
      res.status(400).send(err);
    }
  });
};

Example of simple express routes (source)

All database operations are executed against MongoDB through mongoose, an object modeling package for MongoDB.

const mongoose = require("mongoose");
const { Schema } = mongoose;

const bookSchema = new Schema({
  title: String,
  content: String,
  createdAt: { type: Date, default: Date.now },
  author: String
});

mongoose.model("Book", bookSchema);

Example of a Mongoose model (source)

Everything works now, right? Now let’s imagine we have a huge number of books in our DB. Every time we query the list of books, we run into performance issues. We can tackle this problem from several different directions, but we are going to focus on a cache layer solution.

What’s a cache layer?

Cache is a high-speed data storage layer which stores a subset of data, so that future requests for that data are served up faster than is possible by accessing the data’s primary storage location. Caching allows you to efficiently reuse previously retrieved data.

Long story short, you can think of it as a tiny database that runs in the memory of your machine and allows you to read and write data quickly and efficiently. There are a couple of products that give you a cache layer, but we are going to focus on Redis. Redis is an in-memory data structure store that works as a distributed key-value database with optional durability.

Let’s see how the cache layer fits into our application. Instead of querying MongoDB for every request, we will query Redis first, and only if we don’t find what we are looking for will we query MongoDB.

[Diagram: cache layer with Redis and MongoDB]
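
The read path described above is often called cache-aside. Here is a minimal sketch of it, assuming a plain Map in place of Redis and a hypothetical `fetchFromDb` function in place of a MongoDB query. Neither name comes from the application itself.

```javascript
// Minimal cache-aside sketch. A Map stands in for Redis and
// fetchFromDb is a hypothetical stand-in for a MongoDB query.
const cache = new Map();
let dbCalls = 0;

async function fetchFromDb(author) {
  dbCalls += 1; // count how often the "database" is hit
  return [{ title: "Some book", author }];
}

async function getBooks(author) {
  const key = JSON.stringify({ author });

  // 1. Try the cache first.
  if (cache.has(key)) {
    return JSON.parse(cache.get(key));
  }

  // 2. On a miss, query the primary store and remember the result.
  const books = await fetchFromDb(author);
  cache.set(key, JSON.stringify(books));
  return books;
}

async function demo() {
  const first = await getBooks("Tolkien");  // cache miss, hits the "DB"
  const second = await getBooks("Tolkien"); // cache hit, no "DB" call
  return { first, second, dbCalls };
}
```

Calling `demo()` runs the same query twice, and only the first call reaches the stand-in database.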

App Implementation

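First, we add a cache configuration and connect our application to the Redis server. For this purpose we install the redis package from npm. The snippets in this article use the callback-style client API of the older versions of that package (v3 and earlier); newer major versions changed how the client is created and called.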

const mongoose = require("mongoose");
const redis = require("redis");
const keys = require("../config/keys");

const client = redis.createClient({
  host: keys.redisHost,
  port: keys.redisPort,
  retry_strategy: () => 1000
});

Example of Redis initialization (source)

Second, we need to add the logic that queries Redis first and falls back to MongoDB when Redis has no answer. Once we get fresh data from MongoDB, we store it in Redis. Because we want to reuse this logic for different queries, we will add a hook in Mongoose’s query generation and execution process.

const mongoose = require("mongoose");
const redis = require("redis");
const util = require("util");
const keys = require("../config/keys");

const client = redis.createClient({
  host: keys.redisHost,
  port: keys.redisPort,
  retry_strategy: () => 1000
});
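// Redis's get is callback-based; wrap it so that it returns a promise.
client.get = util.promisify(client.get);
// Keep a reference to the original exec before overriding it.
const exec = mongoose.Query.prototype.exec;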

mongoose.Query.prototype.exec = async function() {
  const key = JSON.stringify({
    ...this.getQuery()
  });
 
  const cacheValue = await client.get(key);

  if (cacheValue) {
    const doc = JSON.parse(cacheValue);

    console.log("Response from Redis");
    return Array.isArray(doc)
      ? doc.map(d => new this.model(d))
      : new this.model(doc);
  }

  const result = await exec.apply(this, arguments);
  client.set(key, JSON.stringify(result));

  console.log("Response from MongoDB");
  return result;
};

The cache file, in which exec is the function we override (source)

Every query in mongoose is built with the Query class and executed via its exec function. We use JavaScript's prototype mechanism to put our reusable cache logic inside exec. To create a unique key in our cache, we use the query itself via the getQuery() function of the Query class (it returns the query conditions, i.e. the “where clause”). A Redis key has to be a string and getQuery() returns an object, so we convert it to a string with JSON.stringify(). Also, note the util.promisify call right after the client is created: the Redis get function works with a callback, and we wrap it so that it returns a promise.

When we get the response back from Redis, it is a string. The caller expects mongoose documents, and we have to be consistent with this expectation, so we parse the string and pass the result to the constructor of the model we are querying. If Redis returns nothing, we call the original exec function that queries MongoDB, store the result in Redis and return it.
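
One thing worth knowing about this key scheme: JSON.stringify() serializes properties in insertion order, so two filters with the same conditions written in a different order produce different keys, and therefore separate cache entries. The sketch below illustrates this with plain objects standing in for what getQuery() returns. `stableKey` and `sortKeys` are hypothetical helpers and are not part of the cache file above.

```javascript
// Hypothetical helpers, not part of the article's cache file:
// build a cache key that does not depend on property order.
function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

function sortKeys(value) {
  if (Array.isArray(value)) {
    return value.map(sortKeys);
  }
  // Only plain objects are re-ordered; Dates and similar values are
  // left alone so JSON.stringify can serialize them as usual.
  if (isPlainObject(value)) {
    const sorted = {};
    for (const k of Object.keys(value).sort()) {
      sorted[k] = sortKeys(value[k]);
    }
    return sorted;
  }
  return value;
}

function stableKey(conditions) {
  return JSON.stringify(sortKeys(conditions));
}

// Plain objects standing in for the result of getQuery().
const a = { author: "Tolkien", title: "The Hobbit" };
const b = { title: "The Hobbit", author: "Tolkien" };

const naiveSame = JSON.stringify(a) === JSON.stringify(b); // false
const stableSame = stableKey(a) === stableKey(b);          // true
```

A related caveat: the key in the cache file is built from the query conditions only, so anything that is not part of the filter (for example a limit or a sort) is not reflected in it.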

Is everything perfect?

In this solution, we have a few more issues to solve.

  1. Do we want to store the results in Redis forever?
    No. Of course not.
    Imagine that you query MongoDB for all the books of a specific author, and store the results in Redis. The next time you query for the same author, you’ll go straight to Redis. But what happens if a new book by this author is added to the DB? You’ll never get it!
    So we have two ground rules in this case. First, add a TTL for each key in Redis. Second, clear the relevant keys in Redis each time the DB gets updated.
  2. Is the query key stored in Redis unique enough?
    No.
    For example, say we have another collection, authors, in our MongoDB. We query the books collection with an id of 1, so the key in Redis is built from that id alone. After that, we query the authors collection with the same id of 1. Because a key for id 1 is already stored in Redis, we get back the cached book instead of an author. To handle this problem, we need to store each query under its own collection. For this, we can use Redis hashes.
  3. Do we want our cache mechanism to work for each query?
    I don’t think so.
    We need to pass an argument that states whether we want to query Redis first or go directly to MongoDB. This argument will be used as a flag.

const mongoose = require("mongoose");
const redis = require("redis");
const util = require("util");
const keys = require("../config/keys");

const client = redis.createClient({
  host: keys.redisHost,
  port: keys.redisPort,
  retry_strategy: () => 1000
});
client.hget = util.promisify(client.hget);
const exec = mongoose.Query.prototype.exec;

mongoose.Query.prototype.cache = function(options = {}) {
  this.useCache = true;
  // Fall back to a 60-second TTL, also when only `key` is passed.
  this.time = options.time || 60;
  this.hashKey = JSON.stringify(options.key || this.mongooseCollection.name);

  return this;
};

mongoose.Query.prototype.exec = async function() {
  if (!this.useCache) {
    return await exec.apply(this, arguments);
  }

  const key = JSON.stringify({
    ...this.getQuery()
  });

  const cacheValue = await client.hget(this.hashKey, key);

  if (cacheValue) {
    const doc = JSON.parse(cacheValue);

    console.log("Response from Redis");
    return Array.isArray(doc)
      ? doc.map(d => new this.model(d))
      : new this.model(doc);
  }

  const result = await exec.apply(this, arguments);
  // Store the result under the collection's hash and (re)set the hash TTL.
  client.hset(this.hashKey, key, JSON.stringify(result));
  client.expire(this.hashKey, this.time);

  console.log("Response from MongoDB");
  return result;
};

module.exports = {
  clearKey(hashKey) {
    client.del(JSON.stringify(hashKey));
  }
};

A full example of our cache service (source)
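
To make the hash, TTL and clearing behavior easier to follow, here is a small in-memory model of the four Redis commands the service uses (HGET, HSET, EXPIRE, DEL). It is an illustration only: `FakeRedis` is a hypothetical class with an injectable clock, not a real client, and it models just the semantics relevant here.

```javascript
// In-memory stand-in for the Redis commands used by the cache service.
// FakeRedis is a hypothetical name used for illustration only.
class FakeRedis {
  constructor(now = () => Date.now()) {
    this.now = now;          // injectable clock, in milliseconds
    this.hashes = new Map(); // hashKey -> { fields: Map, expiresAt }
  }

  // Return the hash entry if it exists and has not expired yet.
  _live(hashKey) {
    const entry = this.hashes.get(hashKey);
    if (!entry) return null;
    if (entry.expiresAt !== null && this.now() >= entry.expiresAt) {
      this.hashes.delete(hashKey); // the whole hash expires at once
      return null;
    }
    return entry;
  }

  hset(hashKey, field, value) {
    let entry = this._live(hashKey);
    if (!entry) {
      entry = { fields: new Map(), expiresAt: null };
      this.hashes.set(hashKey, entry);
    }
    entry.fields.set(field, value);
  }

  hget(hashKey, field) {
    const entry = this._live(hashKey);
    return entry && entry.fields.has(field) ? entry.fields.get(field) : null;
  }

  expire(hashKey, seconds) {
    const entry = this._live(hashKey);
    if (entry) entry.expiresAt = this.now() + seconds * 1000;
  }

  del(hashKey) {
    this.hashes.delete(hashKey);
  }
}

let clock = 0;
const fake = new FakeRedis(() => clock);

// Same field in different hashes: books and authors no longer collide.
fake.hset("books", '{"_id":"1"}', "a book");
fake.hset("authors", '{"_id":"1"}', "an author");
fake.expire("books", 10);

const beforeExpiry = fake.hget("books", '{"_id":"1"}'); // "a book"
clock = 10000;                                          // 10 seconds later
const afterExpiry = fake.hget("books", '{"_id":"1"}');  // null
const otherHash = fake.hget("authors", '{"_id":"1"}');  // "an author"
```

Note that the TTL belongs to the whole hash rather than to a single field. In the service above, every cache miss therefore resets the expiry for all queries stored under that collection, and deleting the hash clears all of them at once.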

After the above changes, we need to update our routes file as follows:

const mongoose = require("mongoose");
const { clearKey } = require("../services/cache");
const Book = mongoose.model("Book");

module.exports = app => {
  app.get("/api/books", async (req, res) => {
    let books;
    if (req.query.author) {
      books = await Book.find({ author: req.query.author }).cache();
    } else {
      books = await Book.find().cache({
        time: 10
      });
    }

    res.send(books);
  });

  app.post("/api/books", async (req, res) => {
    const { title, content, author } = req.body;

    const book = new Book({
      title,
      content,
      author
    });

    try {
      await book.save();
      clearKey(Book.collection.collectionName);
      res.send(book);
    } catch (err) {
      // Set the status explicitly; res.send(status, body) is a deprecated signature.
      res.status(400).send(err);
    }
  });
};

A relevant change in our routes function with a different example of cache use (source)
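
The opt-in pattern used in these routes can be sketched without Mongoose or Redis. In the toy version below, `ToyQuery` is a hypothetical stand-in for mongoose.Query and a Map stands in for Redis. The point is only to show how a chainable `cache()` flag and a wrapped `exec()` work together.

```javascript
// Toy stand-in for mongoose.Query, to show the opt-in pattern only.
class ToyQuery {
  constructor(conditions) {
    this.conditions = conditions;
  }

  async exec() {
    return `db result for ${JSON.stringify(this.conditions)}`;
  }
}

const store = new Map(); // stands in for Redis
const originalExec = ToyQuery.prototype.exec;

// Chainable flag setter: returning `this` allows query.cache().exec().
ToyQuery.prototype.cache = function () {
  this.useCache = true;
  return this;
};

// Wrapper that only consults the cache when the flag is set.
ToyQuery.prototype.exec = async function (...args) {
  if (!this.useCache) {
    return originalExec.apply(this, args);
  }

  const key = JSON.stringify(this.conditions);
  if (store.has(key)) {
    return `cached: ${store.get(key)}`;
  }

  const result = await originalExec.apply(this, args);
  store.set(key, result);
  return result;
};

async function demo() {
  const conditions = { author: "Tolkien" };
  const plain = await new ToyQuery(conditions).exec();        // bypasses the cache
  const miss = await new ToyQuery(conditions).cache().exec(); // fills the cache
  const hit = await new ToyQuery(conditions).cache().exec();  // served from the cache
  return { plain, miss, hit };
}
```

In the real routes there is no explicit exec() call: Mongoose queries are thenable, so awaiting `Book.find().cache()` runs the overridden exec.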

Conclusion

We added a cache layer by extending the exec function of mongoose's Query class via the JavaScript prototype. We can now query our DB efficiently and enable the cache for any query we choose, with only minimal changes to our code.
