I love Rails. Most of it I truly appreciate as a developer, but there are some aspects that I do not like. The most heinous of these from my perspective is default_scope. I dislike it so much that I was even compelled to write about it!
In Rails, you use scopes to define commonly used queries, such as getting active users, archived posts, and so on.
class User < ActiveRecord::Base
  scope :active, -> { where(active: true) }
  scope :with_location, -> { where.not(location: nil) }
end

class Post < ActiveRecord::Base
  scope :archived, -> { where.not(archived_at: nil) }
end
This works really well, and I have used it in every project I have worked on. The neat thing about scopes is that you can chain them to construct more complex queries while keeping the code readable:
# Get all active users with a location
User.active.with_location
# Limit to users who have archived posts
User.active.with_location.joins(:posts).merge(Post.archived)
Rails saw that this was good, and thought it would be even better if there were a “default” scope that is always applied to the model, so developers do not have to type User.active everywhere. Enter default_scope:
class User < ActiveRecord::Base
  scope :with_location, -> { where.not(location: nil) }
  default_scope { where(active: true) }
end
This simple declaration ensures that whenever we query the User model, we always get active users and ignore the deactivated ones. Sounds good and everyone is happy!
Until years later, you scratch your head debugging an issue with code that looks like it should work.
Hidden Costs
Because developers are pleased that they no longer need to add a commonly used scope everywhere, the feature gets sprinkled across several models in the application. That can turn into a headache as the application grows.
Let’s say we have an Event model that handles records related to conferences, meetups, and the like. Since all public pages should only display upcoming events, it makes sense to add a default scope for this:
class Event < ActiveRecord::Base
  default_scope { where(start_date: Time.zone.today..) }
end
After some months, you add some post-processing of the event data in a background job:
event = Event.find(event_id)
event.some_processing!
One day, an error reports that an event could not be found. The Sidekiq server had died, which delayed the processing of the jobs. By the time the job finally ran, the event's start date was already in the past, so the default scope filtered the record out and Event.find raised ActiveRecord::RecordNotFound!
On another page, we want to list only event comments that have been reviewed and approved (so as not to show junk or spam):
class Comment < ActiveRecord::Base
  STATUSES = %w[pending approved].freeze
  default_scope { where(approved: true) }
end
If you declare the scope like this, you may not be aware that the default_scope condition is also applied to new records. We usually think of scopes as queries on existing records, but when a default_scope uses a hash condition, those attributes are also used to initialize newly instantiated records. So when you build a new comment:
comment = Comment.new
comment.approved?
=> true
If you are not careful, you will end up with auto-approved new comments!
In addition, querying denied comments no longer works if you use the relation directly, like:
user.comments.where(approved: false)
=> []
This always returns an empty result, even though there are denied comments. This is because user.comments already applies the default scope, so the query ends up asking for comments where approved is both true and false.
pluck
The pluck method is one of my favorites in Rails. Because it selects only the requested columns and skips instantiating model objects, it is an easy boost to query performance compared to using map when transforming lists of records. But sadly, this little tool can also get wrecked by default_scope.
“Even if you’re just doing simple sorting with the default scope, you can get bitten. For example, let’s say your default scope is just order(:created_at), but then you need to do a join or an aggregate function on that table. Chances are, created_at can’t be in your results, so it fails.” (https://www.reddit.com/r/ruby/comments/3llnlk/comment/cv7lhlr/)
To give a simple example, let’s say we have this model:
class Post < ActiveRecord::Base
  default_scope { order(created_at: :desc) }
end
This seemingly simple scope will now prevent you from combining distinct and pluck on this model, at least on PostgreSQL, as the error below shows. This combination usually shows up in code related to metrics or reporting.
Post.where(status: 'draft').distinct.pluck(:id)
> in `<main>': PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list (ActiveRecord::StatementInvalid)
Workarounds
Thankfully, there are ways to disable the behavior of default_scope:
unscoped
According to the Rails documentation, unscoped returns a scope for the model without the previously set scopes. So in our previous example, where the Event cannot be found if the worker is delayed, we can change the lookup so that it ignores the default_scope declaration:
event = Event.unscoped.find(event_id)
event.some_processing!
This fixes the problem with querying past events, but using unscoped is like choosing the nuclear option. It also wipes out every other scope, including the conditions added by an association. So if you use it like this:
user.posts.unscoped
This does not just remove the default_scope from Post; it also removes the Post.where(user: user) condition that user.posts adds. In essence, it is the same as calling Post.unscoped, which returns every post in the table regardless of owner (not what you originally intended).
unscope
Another (less nuclear) option would be to use unscope. This removes only a specific part of the query, such as a single where condition or the ordering, so the rest of a chained relation (like the association condition in our user.posts example) is left intact.
For example, if we want to get the user’s unapproved events rather than only the approved ones (the condition defined in the default_scope), we can do this instead:
user.events.unscope(where: :approved).where(approved: false)
It can become problematic if the defined scope uses multiple columns (like approved and created_at). For this to work, we need to list every column that the default_scope query uses. If the default_scope is later updated with more conditions and the unscope calls are not updated to match, then you are back at the original problem.
An ounce of prevention…
All of the above is to drive home one simple point: think more than twice before deciding to use default_scope. Any advantage it offers now is unlikely to be worth it once your application grows in code and complexity.
“But, I need to use this for soft deletions!” is a common argument for using it in Rails. In that case, use something like Discard instead. And yes, someone actually built another soft-deletion library because using default_scope is problematic.
I’ll finish rambling with this one last anecdote that basically summarizes what I was trying to explain:
“The way default scopes interact with associations and relations is not intuitive. It will cause problems, you’ll have to hack around them, and by the time you realize you shouldn’t have used default_scope, your application has tons and tons of implicit dependencies on and workarounds for its behavior which are almost impossible to track down and refactor. I assure you that any speed or efficiency gains you feel like default_scope brings will be more than offset by the pain it causes developers months or years later.” (SnarkyNinja)