ActiveRecord exists? and blank? under the hood.
When should I use exists? or blank? to check records existence? Does this choice has an impact on our application?
Asking those questions is a good sign of curiosity! So we will see together how those methods work under the hood and without a long introduction let’s unpack ActiveRecord and dive deep into the source code.
Putting exists? under the scope:
Let’s try to call exists? on a Post model:
Post.all.exists?
We will notice this output line in the console:
Post Exists (0.4ms) SELECT 1 AS one FROM "posts" LIMIT ? [["LIMIT", 1]]
So let’s try to see the code behind it to understand how it works, I tried to share only the relevant snippets of code, for reference please visit this file for ActiveRecord 5.2
# lib/active_record/relation/finder_methods.rb
module ActiveRecord
module FinderMethods
ONE_AS_ONE = "1 AS one"
def exists?(conditions = :none)
# conditions verification
....
# in case of eager loading
....
relation = construct_relation_for_exists(conditions)
# name method returns the model name which is used to build
# the log output for the query.
# In our examlpe it will be "Post"
skip_query_cache_if_necessary { connection.select_value(relation.arel, "#{name} Exists") } ? true : false
rescue ::RangeError
false
end
# This method is responsible for constructing the Exist query that we can see in logs
# exp: SELECT 1 AS one FROM "posts" LIMIT ? [["LIMIT", 1]]
#
def construct_relation_for_exists(conditions)
relation = except(:select, :distinct, :order)._select!(ONE_AS_ONE).limit!(1)
case conditions
when Array, Hash
relation.where!(conditions)
else
relation.where!(primary_key => conditions) unless conditions == :none
end
relation
end
end
end
and we need to check select_value method to get the complete picture:
# lib/active_record/connection_adapters/abstract/database_statements.rb
module ActiveRecord
module ConnectionAdapters # :nodoc:
module DatabaseStatements
# Returns a single value from a record
def select_value(arel, name = nil, binds = [])
single_value_from_rows(select_rows(arel, name, binds))
end
def single_value_from_rows(rows)
row = rows.first
row && row.first
end
end
end
end
Now we can understand how ActiveRecord is building the exists? query. We can also see that the query check whether there is a row or not overlook columns by using:
SELECT 1 AS one FROM "posts" LIMIT ? [["LIMIT", 1]]
There are different discussions and opinions about this query, but for us, it’s important to know that this check doesn’t instantiate records and perform the check by confirming the existence of a row in our model table.
Putting blank? under the scope:
The reference source code for the simplified snippet below can be found here.
# lib/active_record/relation.rb
module ActiveRecord
class Relation
alias :loaded? :loaded
def initialize(args)
.....
@loaded = false
....
end
def blank?
records.blank?
end
def records # :nodoc:
load
@records
end
# Causes the records to be loaded from the database if they have not
# been loaded already. You can use this if for some reason you need
# to explicitly load some records before actually using them. The
# return value is the relation itself, not the records.
#
# Post.where(published: true).load # => #<ActiveRecord::Relation>
def load(&block)
exec_queries(&block) unless loaded?
self
end
def exec_queries(&block)
skip_query_cache_if_necessary do
# We care mostly about this part where records are built
@records =
if eager_loading?
apply_join_dependency do |relation, join_dependency|
if ActiveRecord::NullRelation === relation
[]
else
rows = connection.select_all(relation.arel, "SQL")
join_dependency.instantiate(rows, &block) # <== Records instantiation
end.freeze
end
else
klass.find_by_sql(arel, &block).freeze
end
# preload associations
....
@records.each(&:readonly!) if readonly_value
# Note the update of the variable loaded to true
@loaded = true
@records
end
end
end
end
and the code for find_by_sql method:
# lib/active_record/querying.rb
module ActiveRecord
module Querying
def find_by_sql(sql, binds = [], preparable: nil, &block)
result_set = connection.select_all(sanitize_sql(sql), "#{name} Load", binds, preparable: preparable)
column_types = result_set.column_types.dup
columns_hash.each_key { |k| column_types.delete k }
message_bus = ActiveSupport::Notifications.instrumenter
payload = {
record_count: result_set.length,
class_name: name
}
message_bus.instrument("instantiation.active_record", payload) do
# Instantiation of records
result_set.map { |record| instantiate(record, column_types, &block) }
end
end
end
end
We checked together how blank? method loads records to perform this check. But before saying that we should always use exists? let’s step back and analyze the shared code. I suggest checking the load method again, it loads the records from the database only if they haven’t been loaded already.
That comment is very important to grasp. For example, if we are performing an existence check on records that we need as part of the application logic then we should use blank? which results in hitting the database once for the select query. But if we opted for exists? we will end up by hitting the database twice, the first query to perform the check and the second one to fetch the data.
Conclusion
Inspecting the source code of those methods was a good exercise to improve our knowledge and understand how the framework is working under the hood.
We should remember these two points:
- Use blank? if you need to use the records in the application logic.
- Prefer exists? if you care only about the existence check.
That’s it, so have fun and keep coding 🙂