Working Effectively with Data Factories Using FactoryBot

This tutorial has been updated by Thiago Araújo Silva on 20 April 2018.

Introduction

In test-driven development, data is one of the requirements for a successful and thorough test. In order to be able to test all use cases of a given method, object or feature, you need to be able to define multiple sets of data required for the test.

This is where the data factory pattern comes in. A data factory (or factory, for short) is a blueprint that allows us to create an object, or a collection of objects, with predefined sets of values.

You can have multiple sets of predefined values for a single factory. In fact, you should create them for each use case of the factory in order to be able to test all use cases of a given method, object or feature.

Let’s take a look at a common example — we have an Article factory and we need to have multiple sets of predefined values that represent its state and/or use cases. We might have the following variations:

Unpublished article
Published article
Article scheduled to be published in the future
Article published in the past

These are examples of predefined sets of values that need to be defined for an Article factory. The next section provides the implementation details.
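
To make the idea concrete before bringing in a library, here is a minimal sketch of the pattern in plain Ruby. The Article struct, the ArticleFactory module, and its trait names are hypothetical and exist only for illustration; this is not how FactoryBot is implemented.

```ruby
# A hand-rolled data factory: illustrative only, not FactoryBot's implementation.
Article = Struct.new(:title, :status, :published_at, keyword_init: true)

module ArticleFactory
  DAY = 24 * 60 * 60 # seconds in a day

  # Baseline values shared by every article the factory builds.
  DEFAULTS = { title: "Untitled", status: :unpublished, published_at: nil }.freeze

  # Each trait is a named set of predefined values. The lambdas receive the
  # reference time so time-based values are computed when the object is built.
  TRAITS = {
    unpublished:   ->(_now) { { status: :unpublished } },
    published:     ->(_now) { { status: :published } },
    in_the_future: ->(now)  { { published_at: now + 2 * DAY } },
    in_the_past:   ->(now)  { { published_at: now - 2 * DAY } }
  }.freeze

  # Builds an Article from the defaults, then the traits in order, then overrides.
  def self.build(*traits, now: Time.now, **overrides)
    attrs = traits.reduce(DEFAULTS.dup) do |acc, name|
      acc.merge(TRAITS.fetch(name).call(now))
    end
    Article.new(**attrs.merge(overrides))
  end
end

now = Time.utc(2018, 4, 13, 12, 0, 0)
scheduled = ArticleFactory.build(:published, :in_the_future, now: now, title: "Scheduled")
draft = ArticleFactory.build
```

Each trait maps to one of the variations listed above, and combining traits yields states such as a published article scheduled for the future.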

Introduction to FactoryBot

There are several tools you can use to create a factory. In this article, we are using FactoryBot.

FactoryBot is written in Ruby and works with several frameworks, such as Ruby on Rails, Sinatra, and Padrino. This article covers using FactoryBot in a Ruby on Rails (Rails) application.

Installation

The following installation steps are specific to Rails. For frameworks other than Rails, please consult the installation documentation.

Add the following gem to your Gemfile inside the proper group.

group :development, :test do
  gem "factory_bot_rails"
end

Assuming you are using the RSpec testing framework, add the following code to the spec/support/factory_bot.rb file:

# spec/support/factory_bot.rb
RSpec.configure do |config|
  config.include FactoryBot::Syntax::Methods
end

Enable the autoloading of the support directory by uncommenting the following line in your spec/rails_helper.rb:

Dir[Rails.root.join('spec/support/**/*.rb')].each { |f| require f }

Usage

Let’s take another look at the example of an Article model introduced above:

# app/models/article.rb
class Article < ApplicationRecord
  enum status: [:unpublished, :published]
end

You can create the following factory:

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    trait :published do
      status { :published }
    end

    trait :unpublished do
      status { :unpublished }
    end

    trait :in_the_future do
      published_at { 2.days.from_now }
    end

    trait :in_the_past do
      published_at { 2.days.ago }
    end
  end
end

The above factory assumes the Article has the following attributes:

status, with the value :published or :unpublished. For ActiveRecord enums, the underlying database column must be an integer.
published_at, of type DateTime

To use the factory, you can use any of the following statements inside your spec:

# build creates an Article object without saving
build :article, :unpublished

# build_stubbed creates an Article object and acts as an already saved Article
build_stubbed :article, :published

# create creates an Article object and saves it to the database
create :article, :published, :in_the_future
create :article, :published, :in_the_past

# create_list creates a collection of objects for a given factory
# you can also use build_list and build_stubbed_list
create_list :article, 2

For a more detailed explanation of FactoryBot usage, you can consult the Getting Started Guide.

Effective Patterns on Data Factory

There are several best practices for using data factories that will improve performance and ensure test consistency if applied properly. The patterns below are ordered based on their importance:

Factory linting
Just enough data
Build and build_stubbed over create
Explicit data testing
Fixed time-based testing

If you are already familiar with some of these practices, feel free to skip them.

Factory Linting

Linting is the process of analyzing code to detect potential errors, and factory linting is the process of detecting potential errors by validating attributes set in the factory.

Factory linting helps avoid unexpected bugs caused by false-positive test results, which occur when a factory that produces invalid data is used in a test that assumes the data is valid.
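
Conceptually, a linter builds one object from every factory and reports the ones that fail validation. The following plain-Ruby sketch illustrates only that idea; lint_factories, the FACTORIES registry, and the struct-based models with a valid? method are hypothetical stand-ins, not FactoryBot's API.

```ruby
# Illustrative stand-ins for models with presence validations.
Author = Struct.new(:name, keyword_init: true) do
  def valid?
    !name.to_s.strip.empty?
  end
end

Article = Struct.new(:title, :author, keyword_init: true) do
  # In this sketch an article needs both a title and an author.
  def valid?
    !title.to_s.strip.empty? && !author.nil?
  end
end

# Hypothetical registry: factory name => lambda that builds a default object.
FACTORIES = {
  author:  -> { Author.new(name: "The amazing author") },
  article: -> { Article.new(title: "The amazing article title") } # author is missing
}.freeze

# Builds one object per factory and returns the names of the invalid ones.
def lint_factories(factories)
  factories.reject { |_name, builder| builder.call.valid? }.keys
end

invalid = lint_factories(FACTORIES)
puts "Invalid factories: #{invalid.join(', ')}" unless invalid.empty?
```

Here the article factory is reported because it never sets an author, which is the kind of mistake a linting step is meant to surface before a test silently relies on it.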

Why Linting

To gain a good understanding of why factory linting is important, let’s take another look at our example. In the following example, we will enable caching in our test to simulate a common configuration in the production environment.

The following code will enable caching in the test environment:

# config/environments/test.rb
# change the following line to true if not already set
config.action_controller.perform_caching = true

Given the following models and factories:

# app/models/author.rb
class Author < ApplicationRecord
  has_many :articles
  validates :name, presence: true
end

# app/models/article.rb
class Article < ApplicationRecord
  belongs_to :author
  validates :title, presence: true
end

# spec/factories/authors.rb
FactoryBot.define do
  factory :author do
    name { 'The amazing author' }
  end
end

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    title { 'The amazing article title' }
  end
end

Now, let’s take a look at the following view and its accompanying test:

# app/views/articles/_article.html.erb
<% cache article do %>
  <article>
    <div class="title"><%= article.title %></div>
    <div class="author"><%= article.author.name %></div>
  </article>
<% end %>

# spec/views/articles/_article.html.erb_spec.rb
require 'rails_helper'

RSpec.describe "views/articles/_article.html.erb" do
  context "with author" do
    let(:author)  { build :author }
    let(:article) { build :article, title: 'article title', author: author }

    it "render title" do
      render article
      expect(rendered).to have_content 'article title'
    end
  end
end

The above test seems to be working, and we have a passing test result.

$ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
.

Finished in 0.02756 seconds (files took 0.99951 seconds to load)
1 example, 0 failures

Now, let’s add another test context for articles without an author:

# spec/views/articles/_article.html.erb_spec.rb
  ...

  context "without author" do
    let(:article) { build :article, title: 'article title' }

    it "render title" do
      render article
      expect(rendered).to have_content 'article title'
    end
  end

  ...

Surprisingly, we still have a positive, passing test result.

$ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
..

Finished in 0.04788 seconds (files took 0.98726 seconds to load)
2 examples, 0 failures

This is a false positive. The test should fail, because the view accesses the name attribute of a non-existent author. If we remove the cache from the article partial, we can clearly see the error.

Remove the cache from the article partial:

# app/views/articles/_article.html.erb
<% # you can either remove or comment the cache to disable it %>
<% # cache article do %>

  ...

<% # end %>

And then run the test again to see the error after disabling the cache:

$ bundle exec rspec spec/views/articles/_article.html.erb_spec.rb
.F

Failures:

  1) views/articles/_article.html.erb without author render title
     Failure/Error: <div class="author"><%= article.author.name %></div>

     ActionView::Template::Error:
       undefined method `name' for nil:NilClass
     # ./app/views/articles/_article.html.erb:4:in `block in _app_views_articles__article_html_erb___3498982763894348925_70101110542560'
     # ./app/views/articles/_article.html.erb:1:in `_app_views_articles__article_html_erb___3498982763894348925_70101110542560'
     # ./spec/views/articles/_article.html.erb_spec.rb:19:in `block (3 levels) in <top (required)>'

Finished in 0.07124 seconds (files took 1.76 seconds to load)
2 examples, 1 failure

Failed examples:

rspec ./spec/views/articles/_article.html.erb_spec.rb:18 # views/articles/_article.html.erb without author render title

Caching a partial is a common practice in real-life code, and removing it can cost performance. To fix this problem without sacrificing your application’s performance, set up factory linting as described in the next section.

Setting Up Factory Linting

When setting up factory linting, all required attributes need to be set in the factory. This can be done easily for a new project, but note that this is not an easy task when working on an existing project. You need to consider the cost vs. the benefit of linting the factory of an already running project, and make sure that linting the existing factories and fixing the related tests does not require too much time.

Before setting up factory linting, you need database_cleaner to clean up your database after the linting process.

Add the following to your Gemfile:

gem "database_cleaner", group: :test

Install the gem:

bundle install

Now we need to create a rake task to perform the linting. Add the following code to lib/tasks/factory_bot.rake:

namespace :factory_bot do
  desc "Verify that all FactoryBot factories are valid"
  task lint: :environment do
    if Rails.env.test?
      DatabaseCleaner.cleaning do
        FactoryBot.lint
      end
    else
      system("bundle exec rake factory_bot:lint RAILS_ENV='test'")
      fail if $?.exitstatus.nonzero?
    end
  end
end

Note that the FactoryBot.lint command can be set to run in the same process as RSpec, but that may negatively impact performance and feedback when running single tests.

Upon running the rake task, we should get an error that points to the invalid factory, prompting us to fix it before it causes problems in our tests:

$ bundle exec rake factory_bot:lint
rake aborted!
FactoryBot::InvalidFactoryError: The following factories are invalid:

* article - Validation failed: Author must exist (ActiveRecord::RecordInvalid)
/home/hu/sandbox/linting-example/lib/tasks/factory_bot.rake:6:in `block (3 levels) in <top (required)>'
/home/hu/sandbox/linting-example/lib/tasks/factory_bot.rake:5:in `block (2 levels) in <top (required)>'
/home/hu/.asdf/installs/ruby/2.5.1/bin/bundle:23:in `load'
/home/hu/.asdf/installs/ruby/2.5.1/bin/bundle:23:in `<main>'
Tasks: TOP => factory_bot:lint
(See full trace by running task with --trace)
rake aborted!

You might want to set up the rake task to run in Semaphore CI as a step before the full test suite, and get into the habit of running it frequently during development.

This is a simple demonstration of why we need factory linting. Strictly speaking, the object rendered in a cached view should be made with create rather than build, but the simplified example is enough to show how factory linting acts as the first guard against errors that might slip into our tests.

Just Enough Data

Leaving only required data inside your factory is key to having a reliable test. An unexpected bug could be introduced by putting unnecessary data inside your factory.

Consider the following example of assigning an optional article publish date in the factory.

# app/models/article.rb
class Article < ApplicationRecord
  validates :title, presence: true
end

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    title { "The amazing article title" }
    published_at { DateTime.now }
  end
end

The following seemingly innocent view will pass the test:

# app/views/articles/_article.html.erb
<article>
  <div class="title"><%= article.title %></div>
  <div class="publish-date"><%= article.published_at.to_date %></div>
</article>

# spec/views/articles/_article.html.erb_spec.rb
require 'rails_helper'

RSpec.describe "articles/_article.html.erb" do
  include ActiveSupport::Testing::TimeHelpers

  context "with publish date" do
    let(:article) { build :article }

    before { travel_to Time.current }
    after  { travel_back }

    it "render article title and publish date" do
      render article
      expect(rendered).to have_content article.title
      expect(rendered).to have_content article.published_at.to_date
    end
  end
end

However, the above test isn’t exercising the view correctly, since the article publish date is optional. An article without a publish date would make article.published_at.to_date raise an error in production, yet the test never covers that case because the factory always sets the date. Therefore, we need to restrict the factory to set only the required attributes.
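
The failure mode can be reproduced without Rails. In the rough sketch below, the Article struct and the render_article helper are hypothetical stand-ins for the model and the partial; only the standard date library is assumed.

```ruby
require "date" # adds Time#to_date

Article = Struct.new(:title, :published_at, keyword_init: true)

# Hypothetical stand-in for the partial: formats the title and the publish date.
def render_article(article)
  "#{article.title} (#{article.published_at.to_date})"
end

with_date    = Article.new(title: "The amazing article title", published_at: Time.utc(2018, 4, 13))
without_date = Article.new(title: "The amazing article title")

# Works as long as the publish date is always present, as with the factory above.
puts render_article(with_date)

# Fails as soon as the optional attribute is missing.
begin
  render_article(without_date)
rescue NoMethodError => e
  puts "Rendering failed: #{e.class}"
end
```

A factory that always fills in the optional date only ever exercises the first call, so the second path goes untested.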

If you need to add an optional attribute to the factory, you can add it as an alternate state of the factory using a trait. In the example used above, you can add a trait to create an alternate state of the article factory that includes the article publish date. You also need to add a test for the alternate state.

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    title { "The amazing article title" }

    trait :with_publish_date do
      published_at { DateTime.now }
    end
  end
end

You can use it as follows:

# app/views/articles/_article.html.erb
<article>
  <div class="title"><%= article.title %></div>
  <% if article.published_at %>
    <div class="publish-date"><%= article.published_at.to_date %></div>
  <% end %>
</article>

# spec/views/articles/_article.html.erb_spec.rb
require 'rails_helper'

RSpec.describe "articles/_article.html.erb" do
  include ActiveSupport::Testing::TimeHelpers

  describe "publish date" do
    before { travel_to Time.current }
    after  { travel_back }

    context "with publish date" do
      let(:article) { build :article, :with_publish_date }

      it "render article title and publish date" do
        render article
        expect(rendered).to have_content article.title
        expect(rendered).to have_content article.published_at.to_date
      end
    end

    context "without publish date" do
      let(:article) { build :article }

      it "render article title without publish date" do
        render article
        expect(rendered).to have_content article.title
        expect(rendered).not_to have_selector ".publish-date"
      end
    end
  end
end

Even better, if you need to set an additional attribute in the factory, it might be a sign that the attribute is in fact required. Therefore, we need to change the model to also validate the presence of the attribute that we are about to add to the factory.

Explicit Data Testing

Test expectations should rely on attributes that are set explicitly in the test, so that the test itself and its failure output carry useful information. This means that the data you want to test should be set in the test files and should reflect the state of the factory being tested. Therefore, you should not rely on factory defaults for data that is relevant to the test at hand.

Here is a simple demonstration of this:

# app/models/article.rb
class Article < ApplicationRecord
  enum status: [:unpublished, :published]

  def self.published_in_the_past
    # we expect this method to fail first
    where(nil)
  end
end

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    status { :unpublished }

    trait :published do
      status { :published }
    end

    trait :in_the_past do
      published_at { 2.days.ago }
    end

    trait :in_the_future do
      published_at { 2.days.from_now }
    end
  end
end

# spec/models/article_spec.rb
require 'rails_helper'

RSpec.describe Article do
  describe ".published_in_the_past" do
    let!(:unpublished_article)     { create :article }
    let!(:published_in_the_past)   { create :article, :published, :in_the_past }
    let!(:published_in_the_future) { create :article, :published, :in_the_future }

    it { expect(Article.published_in_the_past).to include published_in_the_past }
    it { expect(Article.published_in_the_past).not_to include unpublished_article }
    it { expect(Article.published_in_the_past).not_to include published_in_the_future }
  end
end

A failing result from article_spec.rb doesn’t help much, because it dumps all the attributes of each object, and we have to skim for specific attributes, such as status and published_at. Instead, you can give each article a title that reflects the state you’re testing.

# spec/models/article_spec.rb
require 'rails_helper'

RSpec.describe Article do
  describe ".published_in_the_past" do
    let!(:unpublished_article)     { create :article, title: 'unpublished article' }
    let!(:published_in_the_past)   { create :article, :published, :in_the_past, title: 'published in the past' }
    let!(:published_in_the_future) { create :article, :published, :in_the_future, title: 'published in the future' }

    it { expect(Article.published_in_the_past).to include published_in_the_past }
    it { expect(Article.published_in_the_past).not_to include unpublished_article }
    it { expect(Article.published_in_the_past).not_to include published_in_the_future }
  end
end

With this change, the error message can help you by showing the expected article according to its title:

  2) Article.published_in_the_past should not include #<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">
     Failure/Error: it { expect(Article.published_in_the_past).not_to include published_in_the_future }

       expected #<ActiveRecord::Relation [#<Article id: 44, title: "unpublished article", status: "unpublished", publ...5 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">]> not to include #<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">
       Diff:
       @@ -1,2 +1,25 @@
       -[#<Article id: 46, title: "published in the future", status: "published", published_at: "2018-04-15 14:45:55", author_id: nil, created_at: "2018-04-13 14:45:55", updated_at: "2018-04-13 14:45:55">]
       +[#<Article:0x00007fc927bf04a8
       +  id: 44,
       +  title: "unpublished article",
       +  status: "unpublished",
       +  published_at: nil,
       +  author_id: nil,
       +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
       +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>,
       + #<Article:0x00007fc927bf0318
       +  id: 45,
       +  title: "published in the past",
       +  status: "published",
       +  published_at: Wed, 11 Apr 2018 14:45:55 UTC +00:00,
       +  author_id: nil,
       +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
       +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>,
       + #<Article:0x00007fc927bf0188
       +  id: 46,
       +  title: "published in the future",
       +  status: "published",
       +  published_at: Sun, 15 Apr 2018 14:45:55 UTC +00:00,
       +  author_id: nil,
       +  created_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00,
       +  updated_at: Fri, 13 Apr 2018 14:45:55 UTC +00:00>]

     # ./spec/models/article_spec.rb:11:in `block (3 levels) in <top (required)>'

However, there’s still an improvement we can make. Our test is concerned with what articles are returned, not with their attributes. Let’s take a step forward and run our expectations directly against the titles with the help of Array#map:

require 'rails_helper'

RSpec.describe Article do
  describe ".published_in_the_past" do
    before do
      create :article, title: 'unpublished article'
      create :article, :published, :in_the_past, title: 'published in the past'
      create :article, :published, :in_the_future, title: 'published in the future'
    end

    subject(:article_titles) { Article.published_in_the_past.map(&:title) }

    it { expect(article_titles).to include 'published in the past' }
    it { expect(article_titles).not_to include 'unpublished article' }
    it { expect(article_titles).not_to include 'published in the future' }
  end
end

Note that the subject of our test is now article_titles, and we can use before instead of let! to create the articles, since the corresponding objects are no longer used to run the expectations.

And this change results in an even better error message, which increases the feedback quality of our test suite:

  1) Article.published_in_the_past should not include "unpublished article"
     Failure/Error: it { expect(article_titles).not_to include 'unpublished article' }
       expected ["unpublished article", "published in the past", "published in the future"] not to include "unpublished article"
     # ./spec/models/article_spec.rb:14:in `block (3 levels) in <top (required)>'

  2) Article.published_in_the_past should not include "published in the future"
     Failure/Error: it { expect(article_titles).not_to include 'published in the future' }
       expected ["unpublished article", "published in the past", "published in the future"] not to include "published in the future"
     # ./spec/models/article_spec.rb:15:in `block (3 levels) in <top (required)>'

Build and build_stubbed Over create

At times, you don’t need to use the create method. Since it saves to the database, it adds overhead to the test, and if you overuse it, the overhead becomes significant enough to slow down your test suite.

Use build or build_stubbed for tests that don’t need to be written to the database, tests that don’t do queries, or tests that use stubs for abstracting away the complexity of the queries.

Consider the following example that uses build over create and leverages stubs for abstracting the internal implementation of the method being tested.

Here’s an example of a naive implementation of Article.recent:

# app/models/article.rb
class Article < ApplicationRecord
  def self.recent
    promoted + latest
  end

  def self.promoted
    # Find promoted articles
  end

  def self.latest
    # Find latest articles
  end
end

Since the implementation uses promoted and latest, you can stub each method and return articles created using build instead of create as follows:

# spec/models/article_spec.rb
require 'rails_helper'

RSpec.describe Article do
  describe ".recent" do
    let(:latest)   { build :article, :published, title: :latest  }
    let(:promoted) { build :article, :published, title: :promoted }

    before do
      allow(Article).to receive(:latest).and_return([latest])
      allow(Article).to receive(:promoted).and_return([promoted])
    end

    it { expect(Article.recent).to include latest }
    it { expect(Article.recent).to include promoted }
  end
end

Keep in mind that if you are testing your view and it uses caching, you must use create for your factory, or else you can have an inconsistent test result due to the stubbing done by build_stubbed in FactoryBot.

However, there’s something important to keep in mind when using build: it will create any associations declared in your factory. Suppose your article factory has an author association:

# spec/factories/articles.rb
FactoryBot.define do
  factory :article do
    title { 'The amazing article' }
    author
  end
end

When running build(:article), the articles count won’t increase, but the authors count will, which means an author is created in the database. (This describes FactoryBot 4; since FactoryBot 5, use_parent_strategy defaults to true, so build also builds associations rather than creating them.) To avoid this surprise, it’s recommended to use build_stubbed over build. We could rewrite the above example to use build_stubbed:

# spec/models/article_spec.rb
require 'rails_helper'

RSpec.describe Article do
  describe ".recent" do
    let(:latest)   { build_stubbed :article, :published, title: :latest  }
    let(:promoted) { build_stubbed :article, :published, title: :promoted }

    before do
      allow(Article).to receive(:latest).and_return([latest])
      allow(Article).to receive(:promoted).and_return([promoted])
    end

    it { expect(Article.recent).to include latest }
    it { expect(Article.recent).to include promoted }
  end
end

Fixed Time-based Testing

Relative time helpers from Rails, such as 2.seconds.ago or 5.minutes.ago, can cause split-second test inconsistencies when used to assert time-related data. To avoid this, specify the time explicitly instead of using a relative time helper. Consider the following example:

create :article, published_at: "2015-04-04T17:30:05+0700"
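
As a rough illustration of the difference, the plain-Ruby sketch below compares a fixed timestamp with a relative one. It assumes only the standard time library; the variable names are illustrative, and the relative values are computed by hand rather than with the Rails helpers.

```ruby
require "time" # adds Time.iso8601 and Time#iso8601

# A fixed, explicit timestamp denotes the same instant on every run.
fixed = Time.iso8601("2015-04-04T17:30:05+07:00")

# A relative timestamp is recomputed on every call, so two seemingly
# identical expressions evaluated at different moments can differ.
two_days = 2 * 24 * 60 * 60
first  = Time.now - two_days
second = Time.now - two_days

puts fixed.getutc.iso8601 # the same value on every run
puts first == second      # usually false: the clock moved between the two calls
```

An assertion against the fixed value is deterministic, while an assertion that compares two relative values depends on how quickly the two lines happen to run.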

If you prefer to use the relative time helpers, consider using the ActiveSupport time helpers. You can freeze the time and run the helper without risking split-second test result inconsistencies.

You can include the ActiveSupport::Testing::TimeHelpers module globally in your RSpec configuration or directly in any tests using it. The first alternative can be set up as follows:

# spec/rails_helper.rb
RSpec.configure do |config|
  config.include ActiveSupport::Testing::TimeHelpers
end

To use it, add the following to your before/after test context:

before do
  travel_to Time.current
end

after do
  travel_back
end

Conclusion

FactoryBot is not the only tool you can use to create a data factory. There are other tools that you can choose as an alternative. You can use Fabrication, or you can check out the full list of alternatives in The Ruby Toolbox.

To gain a better understanding of how FactoryBot works, you can read its Getting Started guide, or, if you’re curious about how things are implemented, go straight to the source. As a bonus, if you are keen to gain a good understanding of why factory is good for your application compared to the other test data strategies, consider trying an alternative strategy using Rails Fixtures.

Using the pointers from this tutorial, you should be able to write a better factory that can be used to test your application more effectively.

