Stop including Enumerable, return Enumerator instead

I have often seen people include the Enumerable module in their classes. But I cannot stop thinking that in many cases providing methods such as each_with_index, take_while, minmax and the many others available in Enumerable is not a core responsibility of the class that includes it.

In such cases I prefer to go the Java way and provide an external Enumerator for those who need to call one of the many useful Enumerable methods on the collection. I think we need to ask ourselves a question: is that class a collection? If it really is, then it absolutely makes sense to include Enumerable. If, however, it is not a collection but rather a class which happens to contain something else, or to provide a collection, then maybe an external Enumerator is your solution.
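To make the distinction concrete, here is a minimal sketch of a class that is not a collection but happens to hold one. The Playlist class and its track names are made up for illustration. It relies on the fact that Array#each returns an Enumerator when called without a block.

```ruby
# Hypothetical example: a Playlist is not a collection itself,
# it only happens to contain some tracks.
class Playlist
  attr_reader :name

  def initialize(name, tracks)
    @name   = name
    @tracks = tracks
  end

  # Hand out an external Enumerator instead of including Enumerable.
  # Array#each without a block returns an Enumerator.
  def tracks
    @tracks.each
  end
end

playlist = Playlist.new("Friday", ["Intro", "Verse", "Outro"])

playlist.tracks.each_with_index.to_a
# => [["Intro", 0], ["Verse", 1], ["Outro", 2]]

playlist.respond_to?(:each_with_index)
# => false
```

The caller still gets every Enumerable method through the enumerator, while the public interface of Playlist stays small.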

Standard library

If you call the most famous Array#each method without a block, you will see that you get an enumerator in the response.


e = [1,2,3].each
# => #<Enumerator: [1, 2, 3]:each> 

You can manually fetch new elements:

e.next
# => 1 

e.next
# => 2 

e.next
# => 3

e.next
# StopIteration: iteration reached an end
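If you drive the enumerator by hand like this, you do not have to rescue StopIteration yourself. Kernel#loop rescues it and simply ends the loop. A small sketch (the variable names are mine):

```ruby
e = [1, 2, 3].each
seen = []

# Kernel#loop rescues StopIteration, so the loop ends cleanly
# once e.next runs past the last element.
loop do
  seen << e.next
end

seen
# => [1, 2, 3]
```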

Or use one of the Enumerable methods that Enumerator gladly provides for you:


e = [1,2,3].each

e.partition{|x| x % 2 == 0}
# => [[2], [1, 3]] 

Create Enumerator

There are 3 ways to create your own Enumerator:

  • Kernel#to_enum
  • Kernel#enum_for
  • Enumerator.new

But if you look into the MRI implementation you will notice that both #to_enum and #enum_for are implemented in the same way:

rb_define_method(rb_mKernel, "to_enum", obj_to_enum, -1);
rb_define_method(rb_mKernel, "enum_for", obj_to_enum, -1);

rb_define_method(rb_cLazy, "to_enum", lazy_to_enum, -1);
rb_define_method(rb_cLazy, "enum_for", lazy_to_enum, -1);

You can check it out here:

And if you look into rubyspec you will also notice that they are supposed to have identical behavior, so I guess there is currently no real difference between them.

Therefore, whenever you see an example using one of them, you can just substitute it with the other.
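A quick way to convince yourself is to build the same enumerator both ways and compare the results. This is only a sanity check on an Array, not a proof:

```ruby
a = [1, 2, 3]

# With no arguments both methods default to :each.
default_same = a.to_enum.to_a == a.enum_for.to_a

# With an explicit method name they still agree.
explicit_same = a.to_enum(:each_with_index).to_a == a.enum_for(:each_with_index).to_a

default_same   # => true
explicit_same  # => true
```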

#to_enum & #enum_for

What can #to_enum & #enum_for do for you? Well, they can create an Enumerator based on any method that yields values. The usual convention is to base the Enumerator on the #each method (no surprise here).

a = [1,2,3]
enumerator = a.to_enum(:each)

We will see it in action later in the post.
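The method does not have to be #each, and extra arguments given to #to_enum are passed along to it. Here is a small sketch with a made-up Countdown class whose yielding method is called from:

```ruby
# Hypothetical class with a yielding method that is not called each.
class Countdown
  def from(start)
    start.downto(1) { |n| yield n }
  end
end

# The extra argument (3) is forwarded to Countdown#from.
enum = Countdown.new.to_enum(:from, 3)

enum.to_a
# => [3, 2, 1]

enum.map { |n| n * 10 }
# => [30, 20, 10]
```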

Enumerator.new

This way (unlike the previous ones) is nicely documented in the Ruby docs, which I am just going to paste here:

Iteration is defined by the given block, in which a “yielder” object, given as block parameter, can be used to yield a value:

fib = Enumerator.new do |y|
  a = b = 1
  loop do
    y << a
    a, b = b, a + b
  end
end

fib.take(10) # => [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

The optional parameter can be used to specify how to calculate the size in a lazy fashion. It can either be a value or a callable object.

Here is my example:

polish_postal_codes = Enumerator.new(100_000) do |y|
  100_000.times do |number|
    code    = sprintf("%05d", number)
    code[1] = code[1] + "-"
    y.yield(code)
  end
end

polish_postal_codes.size    # => 100000 
                            # returned without computing
                            # all elements

polish_postal_codes.take(3) # => ["00-000", "00-001", "00-002"]
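The example above passes the size as a plain value. The callable form might look like the sketch below, where the words array and the computed log are invented for the demonstration. The lambda is only called when someone asks for the size, and asking does not run the enumerator block.

```ruby
words    = %w[alpha beta gamma]
computed = [] # records which elements the block has actually produced

shouting = Enumerator.new(-> { words.size }) do |y|
  words.each do |word|
    computed << word
    y << word.upcase
  end
end

size_before     = shouting.size  # calls the lambda, not the block
computed_before = computed.dup   # still empty at this point
first_word      = shouting.first # runs the block up to the first element

size_before      # => 3
computed_before  # => []
first_word       # => "ALPHA"
```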

Why?

Of course, returning an Enumerator makes the most sense when returning a collection (such as an Array) would be inconvenient or impossible for performance reasons, as with IO#each_byte or IO#each_char.
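To try this without touching a real file, the sketch below uses StringIO as a stand-in for an IO object (an assumption made to keep the example self-contained). Its each_char also returns an Enumerator when no block is given.

```ruby
require 'stringio'

# StringIO stands in for a real IO here.
io = StringIO.new("hello world")

chars = io.each_char # no block given, so we get an Enumerator back

# Only as many characters are read as take_while needs.
word = chars.take_while { |c| c != " " }.join

word
# => "hello"
```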

What do you need to remember?

Not much, actually. Whenever your method yields values, just use #to_enum (or #enum_for; as you already know, they are identical) to create an Enumerator based on the method itself when no block is provided. Sounds complicated? It is not. Have a look at the example.

require 'digest/md5'
require 'net/http'
require 'uri'

class UsersWithGravatar
  def each
    return enum_for(:each) unless block_given? # Sparkling magic!

    User.find_each do |user|
      hash  = Digest::MD5.hexdigest(user.email)
      image = "http://www.gravatar.com/avatar/#{hash}"
      # Yield only users whose avatar differs from the placeholder image.
      yield user unless Net::HTTP.get_response(URI.parse(image)).body == missing_avatar
    end
  end

  private

  # Body of the image served for a fake hash, fetched once and memoized.
  def missing_avatar
    @missing_avatar ||= begin
      image_url = "http://www.gravatar.com/avatar/fake"
      Net::HTTP.get_response(URI.parse(image_url)).body
    end
  end
end

We are working in a super startup with millions of users, and thousands of them may have a gravatar. We would prefer not to return them all in an array, right? No problem. Thanks to our magic one-liner return enum_for(:each) unless block_given? we can share the collection without computing all the data.

This might be really useful, especially when the caller does not need to have it all:

class PutUsersWithAvatarsOnFrontPage
  def users
    @users ||= UsersWithGravatar.new.each.take(20)
  end
end
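You can watch the early exit happen without Rails or network access. The sketch below uses a made-up EvenNumbers class with the same enum_for(:each) unless block_given? guard, plus a counter that records how many candidates were inspected:

```ruby
# Hypothetical stand-in for UsersWithGravatar: filters a large range
# and counts how many candidates it had to look at.
class EvenNumbers
  attr_reader :checked

  def initialize(limit)
    @limit   = limit
    @checked = 0
  end

  def each
    return enum_for(:each) unless block_given?

    1.upto(@limit) do |n|
      @checked += 1
      yield n if n.even?
    end
  end
end

numbers     = EvenNumbers.new(1_000_000)
first_three = numbers.each.take(3)

first_three      # => [2, 4, 6]
numbers.checked  # => 6
```

Only six of the million candidates were inspected, because take stops the iteration as soon as it has collected three elements.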

Or when the caller wants to be a bit #lazy:

UsersWithGravatar.
  new.
  each.
  lazy.
  select{|user| FacebookFriends.new(user).has_more_than?(10) }.
  and_what_not # ...
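For a version you can run as-is, here is a small sketch of the same idea using only an Enumerator and invented numbers. The calls counter shows how few source elements the lazy chain actually pulls:

```ruby
calls = 0

# A source that counts how many elements it has produced.
source = Enumerator.new do |y|
  1.upto(1_000) do |n|
    calls += 1
    y << n
  end
end

# Nothing is computed until first(2) asks for results.
result = source.lazy.select { |n| n % 3 == 0 }.map { |n| n * n }.first(2)

result  # => [9, 36]
calls   # => 6
```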

Did I just say lazy? I think I should stop here now, because that is a completely different story.

TLDR

To be consistent with Ruby Standard Library behavior, please return an Enumerator from your yielding methods when no block is provided. Use this code

return enum_for(:your_method_name_which_is_usually_each) unless block_given?

to do just that.

Your class does not always need to be Enumerable. It is OK if it just returns an Enumerator.
