April 29, 2016

Automatic Eager Loading for Mongoid

Even in MongoDB, your models may sometimes still have referenced relations. Like ActiveRecord, Mongoid (the popular Ruby ODM for MongoDB) provides an #includes method to perform eager loading.

Problems

However, it is a pain to specify eager loads manually in code. In most cases, you only need the eager loading at the moment you want to access the relation.

As described in Joel Turkel's post about Goldiloader, eager loading should be the duty of neither controllers (why should they care how the data is fetched?) nor views (they just present your data!).

Eager loading is more likely a job that the model itself should handle.

Better Practice

Just like Goldiloader for ActiveRecord, a plugin can make Mongoid do the eager loading automatically.

Say you have the following models for your blog site:

class User
  include Mongoid::Document
  has_one :device
  has_many :posts
end

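class Post
  include Mongoid::Document
  field :title, type: String # read as post.title below
  belongs_to :user
  has_and_belongs_to_many :tags
end

class Tag
  include Mongoid::Document
  field :name, type: String # read as tag.name below
  has_and_belongs_to_many :posts, fully_load: true
end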

When you call

@user = User.find(id)
@posts = @user.posts.limit(10)

# access the tags
@posts.each do |post|
  puts post.title
  puts post.tags.map(&:name).join(', ')
end

Mongoid will perform 12 queries: 1 on users, 1 on posts and 10 on tags:

db.users.find({'_id' : id});
db.posts.find({'user_id' : user_id});
db.tags.find({'post_ids' : post0_id});
db.tags.find({'post_ids' : post1_id});
...
db.tags.find({'post_ids' : post9_id});
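To make the cost of the per-post queries concrete, here is a minimal plain-Ruby sketch that needs neither Mongoid nor a database. FakeTagStore and its two finder methods are hypothetical stand-ins invented for this illustration; they only log each "query" so that one lookup per post can be compared with a single batched lookup.

```ruby
# In-memory stand-in for the tags collection that logs every "query".
# FakeTagStore is a hypothetical helper, not part of Mongoid.
class FakeTagStore
  attr_reader :query_log

  def initialize(tags)
    @tags = tags # array of hashes: { name:, post_ids: [...] }
    @query_log = []
  end

  # One query per post: the N+1 pattern.
  def find_by_post_id(post_id)
    @query_log << { post_ids: post_id }
    @tags.select { |tag| tag[:post_ids].include?(post_id) }
  end

  # One query for all posts at once: what eager loading does.
  def find_by_post_ids(post_ids)
    @query_log << { post_ids: { in: post_ids } }
    @tags.select { |tag| (tag[:post_ids] & post_ids).any? }
  end
end

post_ids = (0...10).to_a
tags = [
  { name: 'ruby',    post_ids: [0, 1, 2] },
  { name: 'mongodb', post_ids: [2, 3, 9] }
]

# Lazy access: every post asks the store on its own.
lazy_store = FakeTagStore.new(tags)
lazy_result = post_ids.map do |id|
  [id, lazy_store.find_by_post_id(id).map { |tag| tag[:name] }]
end.to_h

# Eager access: fetch once, then distribute the tags in memory.
eager_store = FakeTagStore.new(tags)
loaded = eager_store.find_by_post_ids(post_ids)
eager_result = post_ids.map do |id|
  matching = loaded.select { |tag| tag[:post_ids].include?(id) }
  [id, matching.map { |tag| tag[:name] }]
end.to_h

puts "lazy tag queries:  #{lazy_store.query_log.size}"
puts "eager tag queries: #{eager_store.query_log.size}"
puts "same result: #{lazy_result == eager_result}"
```

Both approaches yield the same tags for every post; only the number of round trips differs.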

But as an experienced developer, you can easily see that the :tags relation should be eager loaded. In practice, adding #includes to the criteria is something developers already do by default; they just do it manually.

Why not make it the default in Mongoid?

When you access the tags relation and Mongoid is about to query it in your database, perform an eager load on :tags just before the query:

# Hook that runs before a relation is queried.
# The method that loads a relation's documents is
# Mongoid::Relations::Targets::Enumerable#unloaded_documents

def before_unloaded_documents
  owner = relation.base # get the relation owner
  metadata = relation.relation_metadata # get the relation metadata

  # DO THE EAGER LOADING
  #
  #   This requires a trick to reach the owners' collection through the owner:
  #   add a reference to the collection to each model.
end
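The "reference to the collection" trick can be sketched in plain Ruby without Mongoid. This is only an illustration of the idea under stated assumptions, not the actual implementation: TagStore, this stripped-down Post class and the siblings accessor are hypothetical names invented here. Each post remembers the posts it was fetched with, and the first access to an unloaded tags relation loads the tags for all of them in one query.

```ruby
# In-memory stand-in for the tags collection; counts "queries".
# TagStore and this Post class are hypothetical, not Mongoid classes.
class TagStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # array of hashes: { name:, post_ids: [...] }
    @query_count = 0
  end

  # One round trip, however many post ids are requested.
  def where_post_ids_in(post_ids)
    @query_count += 1
    @rows.select { |row| (row[:post_ids] & post_ids).any? }
  end
end

class Post
  attr_reader :id
  attr_accessor :siblings # the posts that were fetched together with this one
  attr_writer :tags

  def initialize(id, store)
    @id = id
    @store = store
    @siblings = [self]
  end

  def tags_loaded?
    !@tags.nil?
  end

  # Hook point: the eager load runs only if the relation is still unloaded.
  def tags
    eager_load_tags_for_siblings unless tags_loaded?
    @tags
  end

  private

  # Load tags for every sibling that has not loaded them yet, in one query.
  def eager_load_tags_for_siblings
    pending = siblings.reject(&:tags_loaded?)
    rows = @store.where_post_ids_in(pending.map(&:id))
    pending.each do |post|
      post.tags = rows.select { |row| row[:post_ids].include?(post.id) }
    end
  end
end

store = TagStore.new([
  { name: 'ruby',    post_ids: [0, 1] },
  { name: 'mongodb', post_ids: [1, 9] }
])
posts = (0...10).map { |id| Post.new(id, store) }
posts.each { |post| post.siblings = posts } # the "reference to the collection"

posts.each do |post|
  puts "#{post.id}: #{post.tags.map { |tag| tag[:name] }.join(', ')}"
end
puts "tag queries: #{store.query_count}"
```

In this sketch the first call to tags fills in the relation for all ten posts, so the loop issues a single tag query. A real plugin would also have to handle details this sketch ignores, such as different relation types and documents loaded outside a criteria.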

Then, when you run the previous @posts.each..., it results in only 3 queries, one per collection:

db.users.find({'_id' : id});
db.posts.find({'user_id' : user_id});
db.tags.find({'post_ids' : {'$in' : post_ids}});

Wow, you do not need to specify any eager loads. (The modified) Mongoid will do it automatically for you!

Conclusion

Eager loads are performance boosters and should be the default behaviour in most situations. Developers should be aware of them, but should not have to repeatedly write .includes(:...) just because it is needed everywhere.

I've written a gem, mongoload, that makes Mongoid eager load automatically. Check it out on GitHub.