I love Rails. Most of it I truly appreciate as a developer, but there are some aspects that I do not like. The most heinous of these from my perspective is default_scope. I dislike it so much that I was even compelled to write about it!

In Rails, you use scopes to define commonly used queries, such as getting active users, archived posts, and so on.

class User < ActiveRecord::Base
  scope :active, -> { where(active: true) }
  scope :with_location, -> { where.not(location: nil) }
end
 
class Post < ActiveRecord::Base
  scope :archived, -> { where.not(archived_at: nil) }
end

This works really well, and I have used it on every project I have worked on. The neat thing about scopes is that you can chain them to construct more complex queries, while keeping the code readable:

# Get all active users with a location
User.active.with_location

# Limit to users who have archived posts
User.active.with_location.joins(:posts).merge(Post.archived)

Rails saw that this was good, and thought that it would be even better to have a “default” scope that is always applied to the model, so developers do not have to type User.active everywhere. Enter default_scope:

class User < ActiveRecord::Base
  scope :with_location, -> { where.not(location: nil) }

  default_scope { where(active: true) }
end

This simple declaration ensures that whenever we query through the User model, we always get active users and ignore the deactivated ones. Sounds good and everyone is happy!

Until years later, you scratch your head debugging an issue with code that looks like it should work.

Hidden Costs

Because developers are pleased that they no longer need to add a commonly used scope everywhere, the feature gets sprinkled across several models in the application. This can turn into a headache as the application grows.

Let’s say we have an Event model that handles records related to conferences, meetups, and the like. Since all public pages should only display upcoming events, it makes sense to add a default scope for this:

class Event < ActiveRecord::Base
  default_scope { where(start_date: Time.zone.today..) }
end

After some months, you add some post-processing of the event data in a background job:

event = Event.find(event_id)
event.some_processing!

One day, an error reports that an event could not be found. The Sidekiq server had died, which delayed the processing of the jobs. By the time the job finally ran, the event’s start date was already in the past, so the default scope filtered it out and find raised ActiveRecord::RecordNotFound!

On another page, we want to list event comments that have been previously reviewed and approved (so as not to show junk/spam):

class Comment < ActiveRecord::Base
  STATUSES = %w[pending approved].freeze

  default_scope { where(approved: true) }
end

If you declare the scope like this, you may not be aware that the default_scope condition is also applied to new records. We tend to think of scopes as queries on existing records, but a default_scope with a hash condition also sets that value on newly instantiated ones. So when you create a new comment:

comment = Comment.new
comment.approved?
=> true

If you are not careful, you will end up with auto-approved new comments!

In addition, querying denied comments no longer works if you go through the association directly, like:

user.comments.where(approved: false)
=> []

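This always returns an empty result, even though there are denied comments. This is because calling user.comments uses the default scope, and the extra where is combined with it using AND. The query ends up asking for comments where approved is both true and false, which matches nothing.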

pluck

The pluck method is one of my favorites in Rails. It gives an easy boost to query performance compared to using map when you only need a few columns, because it skips instantiating model objects. But sadly, this little tool can also get wrecked by default_scope.

Even if you’re just doing simple sorting with the default scope, you can get bitten. For example, let’s say your default scope is just order(:created_at), but then you need to do a join or an aggregate function on that table. Chances are, created_at can’t be in your results, so it fails.

https://www.reddit.com/r/ruby/comments/3llnlk/comment/cv7lhlr/

To give a simple example, let’s say we have this model:

class Post < ActiveRecord::Base
  default_scope { order(created_at: :desc) }
end

This seemingly simple scope now breaks queries that combine distinct and pluck on PostgreSQL, because the ORDER BY column (created_at) is not part of the SELECT DISTINCT list. These queries usually show up in code related to metrics or reporting.

Post.where(status: 'draft').distinct.pluck(:id)

> in `<main>': PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list (ActiveRecord::StatementInvalid)

Workarounds

Thankfully, there are ways to disable the behavior of default_scope:

unscoped

From the Rails documentation, unscoped returns the model without any previously applied scopes. So in our previous example where the Event cannot be found if the worker is delayed, we can change it so that it ignores the default_scope declaration:

event = Event.unscoped.find(event_id)
event.some_processing!

This fixes the problem with querying past events, but using unscoped is like choosing the nuclear option. It also wipes out every other scope, including the conditions that come from associations. So if you are using it like this:

user.posts.unscoped

This does not just remove the default_scope from Post; it also removes the Post.where(user: user) part of user.posts. In essence, this is the same as calling Post.unscoped, which returns every post in the table (not what you originally intended).

unscope

Another (less nuclear) option would be to use unscope. This removes only the part of the query that you name, so the rest of a chained relation stays intact, like the user condition in our user.posts example.

For example, if we want to get the user’s unapproved comments, and not just the approved ones (which was defined as a default_scope), we can do this instead:

user.comments.unscope(where: :approved).where(approved: false)

It can become problematic if the default scope uses multiple columns (like approved and created_at). For this to work, you need to list every column that the default_scope query uses. If the default_scope is later updated with more conditions and the unscope calls are not, then you are back at the original problem.

An ounce of prevention…

All the words above are just to drive home this simple point: think more than twice before deciding to use default_scope. Any advantages of using it now are likely not worth it once your application grows in code and complexity.

“But, I need to use this for soft deletions!” is a common argument for using this in Rails. In that case, use something like Discard instead. And yes, someone actually built another soft-deletion library because using default_scope is problematic.

I’ll finish rambling with one last quote that basically summarizes what I have been trying to explain:

The way default scopes interact with associations and relations is not intuitive. It will cause problems, you’ll have to hack around them, and by the time you realize you shouldn’t have used default_scope, your application has tons and tons of implicit dependencies on and workarounds for its behavior which are almost impossible to track down and refactor. I assure you that any speed or efficiency gains you feel like default_scope brings will be more than offset by the pain it causes developers months or years later.

SnarkyNinja
