ActiveRecord exists? and blank? under the hood.

When should I use exists? or blank? to check records existence? Does this choice has an impact on our application?

Asking those questions is a good sign of curiosity! So we will see together how those methods work under the hood and without a long introduction let’s unpack ActiveRecord and dive deep into the source code.

Putting exists? under the scope:

Let’s try to call exists? on a Post model:

Post.all.exists?

We will notice this output line in the console:

Post Exists (0.4ms) SELECT 1 AS one FROM "posts" LIMIT ? [["LIMIT", 1]]

So let’s try to see the code behind it to understand how it works, I tried to share only the relevant snippets of code, for reference please visit this file for ActiveRecord 5.2

# lib/active_record/relation/finder_methods.rb

module ActiveRecord
  module FinderMethods
    ONE_AS_ONE = "1 AS one"
 
    def exists?(conditions = :none)
      # conditions verification
      ....

      # in case of eager loading
      ....

      relation = construct_relation_for_exists(conditions)
      
      # name method returns the model name which is used to build
      # the log output for the query.
      # In our examlpe it will be "Post"
      skip_query_cache_if_necessary { connection.select_value(relation.arel, "#{name} Exists") } ? true : false
    rescue ::RangeError
      false
    end

    # This method is responsible for constructing the Exist query that we can see in logs
    # exp: SELECT 1 AS one FROM "posts" LIMIT ? [["LIMIT", 1]]
    #
    def construct_relation_for_exists(conditions)
      relation = except(:select, :distinct, :order)._select!(ONE_AS_ONE).limit!(1)

      case conditions
      when Array, Hash
        relation.where!(conditions)
      else
        relation.where!(primary_key => conditions) unless conditions == :none
      end

      relation
    end
  end
end

and we need to check select_value method to get the complete picture:

# lib/active_record/connection_adapters/abstract/database_statements.rb

module ActiveRecord
  module ConnectionAdapters # :nodoc:
    module DatabaseStatements

      # Returns a single value from a record
      def select_value(arel, name = nil, binds = [])
        single_value_from_rows(select_rows(arel, name, binds))
      end

      def single_value_from_rows(rows)
        row = rows.first
        row && row.first
      end

    end
  end
end

Now we can understand how ActiveRecord is building the exists? query. We can also see that the query check whether there is a row or not  overlook columns by using:

SELECT  1 AS one FROM "posts" LIMIT ?  [["LIMIT", 1]]

There are different discussions and opinions about this query, but for us, it’s important to know that this check doesn’t instantiate records and perform the check by confirming the existence of a row in our model table.

Putting blank? under the scope:

The reference source code for the simplified snippet below can be found here.

# lib/active_record/relation.rb

module ActiveRecord
  class Relation

    alias :loaded? :loaded

    def initialize(args)
      .....
      @loaded = false
      ....
    end

    def blank?
      records.blank?
    end

    def records # :nodoc:
      load
      @records
    end

    # Causes the records to be loaded from the database if they have not
    # been loaded already. You can use this if for some reason you need
    # to explicitly load some records before actually using them. The
    # return value is the relation itself, not the records.
    #
    #   Post.where(published: true).load # => #<ActiveRecord::Relation>
    def load(&block)
      exec_queries(&block) unless loaded?

      self
    end

    def exec_queries(&block)

        skip_query_cache_if_necessary do
          # We care mostly about this part where records are built
          @records =
            if eager_loading?
              apply_join_dependency do |relation, join_dependency|
                if ActiveRecord::NullRelation === relation
                  []
                else
                  rows = connection.select_all(relation.arel, "SQL")
                  join_dependency.instantiate(rows, &block) # <== Records instantiation
                end.freeze
              end
            else
              klass.find_by_sql(arel, &block).freeze
            end
          
          # preload associations
          ....

          @records.each(&:readonly!) if readonly_value

          # Note the update of the variable loaded to true
          @loaded = true
          @records
        end
      end


   end
end

and the code for find_by_sql method:

# lib/active_record/querying.rb

module ActiveRecord
  module Querying
    def find_by_sql(sql, binds = [], preparable: nil, &block)
      result_set = connection.select_all(sanitize_sql(sql), "#{name} Load", binds, preparable: preparable)
      column_types = result_set.column_types.dup
      columns_hash.each_key { |k| column_types.delete k }
      message_bus = ActiveSupport::Notifications.instrumenter

      payload = {
        record_count: result_set.length,
        class_name: name
      }

      message_bus.instrument("instantiation.active_record", payload) do
        # Instantiation of records
        result_set.map { |record| instantiate(record, column_types, &block) }
      end
    end
  end
end

We checked together how blank? method loads records to perform this check. But before saying that we should always use exists? let’s step back and analyze the shared code. I suggest checking the load method again, it loads the records from the database only if they haven’t been loaded already.

That comment is very important to grasp. For example, if we are performing an existence check on records that we need as part of the application logic then we should use blank? which results in hitting the database once for the select query. But if we opted for exists? we will end up by hitting the database twice, the first query to perform the check and the second one to fetch the data.

Conclusion

Inspecting the source code of those methods was a good exercise to improve our knowledge and understand how the framework is working under the hood.

We should remember these two points:

  • Use blank? if you need to use the records in the application logic.
  • Prefer exists? if you care only about the existence check.

That’s it, so have fun and keep coding 🙂