How We Seed & Test our Mongo-based App with Cypress

Written by Michael Shi • Published on Oct 28, 2022

When first setting up our Cypress test suite, we had to figure out how to seed test data into our DB. To our surprise, this wasn’t a well-documented topic, so I wanted to share what we’ve learned while building up our E2E test suite.

What we wanted

We wanted to optimize for a few things:

  1. Minimize test set up time
  2. Avoid test-specific code in our app
  3. Easily notice seed data drift when our app changes

We ruled out seeding test data through UI interactions, as that would make our tests incredibly slow. We also concluded that seeding through internal API endpoints could let the seed data silently drift from our API implementation, and that we’d need to build and guard CI-specific endpoints for test-specific actions (ex. wiping the DB between tests).

What we chose

One common solution is to directly use mongoose in your tests, but we realized we could actually do better than that by re-using our existing DB abstraction layer from our app server (built on mongoose) to seed test data.

We import our existing DB functions from our app directly as a Cypress task, and are able to call them easily within a test.

Here’s a simple example of what that might look like:

// cypress.config.ts
// Import from our existing DB abstraction layer, this already insulates us
// from being affected by low-level DB changes in the future.
import { findUserByEmail } from '../src/user';

// ... inside setupNodeEvents(on, config)
on('task', {
  // Create a Cypress task using our imported function
  // Since this function call is typed, any breaking changes to our DB will show
  // up as a TypeScript error here
  async assignUserToTeam({ email, team }: { email: string; team: string }) {
    const user = await findUserByEmail(email);
    if (user) {
      // Assumes the user document has a `team` field and a Mongoose-style save()
      user.team = team;
      await user.save();
    }
    return user;
  },
  // ...
});

// In a spec file
beforeEach(() => {
  // ...
});

it('denies login from team members with invalid auth method', () => {
  // We can seed and manipulate our Mongo db easily now inside a test!
  cy.task('assignUserToTeam', {
    email: '',
    // ...
  });
  // ...
});

This gives us full flexibility to do test-specific actions, without needing to write one-off endpoints just for tests. Clearing the DB or reassigning a user to a team is just a simple task/function call away.
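As an illustration of such a test-specific action, a DB-clearing task might look roughly like the sketch below. The names `Clearable`, `makeSeedTasks`, `clearDB` and `FakeModel` are made up for this example; the only thing assumed of a real model is a Mongoose-style `deleteMany(filter)` method.

```typescript
// Minimal shape this sketch relies on. Mongoose models expose a
// deleteMany(filter) method of this general form.
interface Clearable {
  deleteMany(filter: object): PromiseLike<unknown>;
}

// Hypothetical helper: builds a task map from whatever models the tests touch.
function makeSeedTasks(models: Record<string, Clearable>) {
  return {
    // Wipe every registered collection, e.g. between tests.
    async clearDB(): Promise<null> {
      await Promise.all(Object.values(models).map((m) => m.deleteMany({})));
      // Cypress fails a task that resolves to undefined, so return null explicitly.
      return null;
    },
  };
}

// In-memory stand-in for a model, only here so the sketch runs without a DB.
class FakeModel implements Clearable {
  docs: object[] = [];
  async deleteMany(_filter: object): Promise<{ deletedCount: number }> {
    const deletedCount = this.docs.length;
    this.docs = [];
    return { deletedCount };
  }
}
```

With real models, the returned object would be passed to on('task', ...) in the Cypress config, and a spec could then call cy.task('clearDB') in a beforeEach hook.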

Additionally, since all of our code and Mongoose models are typed with TypeScript, we can rely on the compiler to immediately tell us which tests we need to update when we change something in the DB layer (ex. update the email field). This ensures our test data won’t silently rot like static seed data might.
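To make the drift detection concrete, here is a small self-contained sketch. It assumes a hypothetical `User` type whose `email` field was renamed to `emailAddress`, and a hypothetical `createUser` function standing in for the DB layer; neither is taken from our actual code.

```typescript
// Hypothetical model type from the DB layer, after `email` was renamed.
interface User {
  id: string;
  emailAddress: string;
  team?: string;
}

// In-memory stand-in for the database.
const db = new Map<string, User>();

// Hypothetical DB-layer function that a seeding task would call.
function createUser(input: Omit<User, 'id'>): User {
  const user: User = { id: String(db.size + 1), ...input };
  db.set(user.id, user);
  return user;
}

// Seed code that tracks the current model compiles cleanly.
const seeded = createUser({ emailAddress: 'a@example.com' });

// Seed code still using the old field name is rejected by the type checker.
// The directive below only compiles because the next line is a type error.
// @ts-expect-error `email` is not a property of the renamed model
createUser({ email: 'a@example.com' });
```

A static JSON fixture with the old field name would pass through unnoticed, which is the kind of silent rot the typed approach avoids.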


Overall, we haven’t found many downsides to this approach. A few minor inconveniences we noticed were that:

  1. You do have to write a new task for every new DB action, and some tasks end up doing a bit more than just calling our existing abstraction layer (such as the assignUserToTeam example above). In practice, however, we only needed a handful of Cypress tasks for our test cases.
  2. Your Cypress test runner needs to import some of your application code and connect to the DB directly. This means our Cypress test runner has inherited some of our app dependencies, which complicates our test runner setup a bit.
  3. Our cy.task calls still aren’t typed, so there could be drift between the task definition inside the Node process and the task caller in the test script. It’s something we can add with time, and in the meantime it’s straightforward to make sure all cy.task calls are updated when a task definition changes.
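One way the typing gap in point 3 could be closed is to derive the task names, argument types and result types from the task definitions themselves. This is a rough sketch under stated assumptions, not a finished implementation: `tasks`, `TaskRunner` and `makeTypedTask` are hypothetical names, and an in-memory map stands in for the DB layer.

```typescript
type User = { email: string; team?: string };

// In-memory stand-in for the DB abstraction layer.
const users = new Map<string, User>();

// Hypothetical task definitions, shaped like the object passed to on('task', ...).
const tasks = {
  async assignUserToTeam({ email, team }: { email: string; team: string }): Promise<User | null> {
    const user = users.get(email) ?? null;
    if (user) user.team = team;
    return user;
  },
  async clearDB(_arg?: null): Promise<null> {
    users.clear();
    return null;
  },
};

// Derive the caller-side types from the definitions so the two cannot drift.
type Tasks = typeof tasks;
type TaskArg<K extends keyof Tasks> = Parameters<Tasks[K]>[0];
type TaskResult<K extends keyof Tasks> = Awaited<ReturnType<Tasks[K]>>;

// Anything that can run a task by name. In a spec file this would be a thin
// wrapper around cy.task (an assumption, not shown here).
type TaskRunner = (name: string, arg: unknown) => PromiseLike<unknown>;

function makeTypedTask(run: TaskRunner) {
  return <K extends keyof Tasks>(name: K, arg: TaskArg<K>) =>
    run(name, arg) as PromiseLike<TaskResult<K>>;
}

// With a typed wrapper, a misspelled task name or a missing `team` argument
// becomes a compile error instead of a runtime surprise:
// typedTask('assignUserToTeam', { email: 'a@example.com' }); // error: `team` is missing
```

The design choice here is to keep the task object as the single source of truth: renaming a task or changing its argument shape then breaks every stale caller at compile time.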

We’ve enjoyed using this pattern internally for seeding test data and hope you’ll find the same when using Cypress to test a Mongo-based application!

As a bonus, we also love the technique Metabase uses to generate DB snapshots dynamically from user actions. It may be a better option depending on the tradeoffs your team is looking to make.