Prototypes in Ruby and the strange story of dup

Today I was working on a feature where I had to create a few similar Active Record objects. I was creating a read model for some financial data. Most of the attributes of the created objects were the same, but a few were different. In one refactoring step I removed the duplication in passing attributes by using a prototype, although at that moment I hadn’t thought of the object that way.

Before

The code before refactoring looked similar to this:

Entry.create!(
  fact_id: fact.id,
  time: fact.metadata.fetch(:timestamp),
  level: order.level,
  order_id: order.id,
  column_a: something,
  column_b: something_else,
  column_c: another_computation,
  column_d: one_more,
  column_e: not_yet_finished,

  entry_number: 1,
  entry_type: "REVENUE",
  gross_value: BigDecimal.new("100.00"),
  vat: BigDecimal.new("13.05"),
)

Entry.create!(
  fact_id: fact.id,
  time: fact.metadata.fetch(:timestamp),
  level: order.level,
  order_id: order.id,
  column_a: something,
  column_b: something_else,
  column_c: another_computation,
  column_d: one_more,
  column_e: not_yet_finished,

  entry_number: 2,
  entry_type: "FEE_TYPE_1",
  gross_value: BigDecimal.new("-10.00"),
  vat: BigDecimal.new("-1.30"),
)

There were more columns and more entries (between 2 and 5) being created for the financial ledger.

I could have extracted the common attributes into a Hash, but I decided to go in a slightly different direction.
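For comparison, the Hash-based alternative might look roughly like the sketch below. The attribute values are hypothetical stand-ins (in the real code they come from fact and order), and the records are only printed here instead of being persisted.

```ruby
# Hypothetical stand-in values; in the real code these come from `fact` and `order`.
common_attributes = {
  fact_id: 42,
  level: "A",
  order_id: 7,
}

# Each entry is the shared Hash plus the few attributes that differ.
entries = [
  common_attributes.merge(entry_number: 1, entry_type: "REVENUE"),
  common_attributes.merge(entry_number: 2, entry_type: "FEE_TYPE_1"),
]

# With Active Record this would be something like:
#   entries.each { |attributes| Entry.create!(attributes) }
entries.each { |attributes| puts attributes.inspect }
```

Hash#merge returns a new Hash, so common_attributes itself is never modified between entries.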

After

base_entry = Entry.new(
  fact_id: fact.id,
  time: fact.metadata.fetch(:timestamp),
  level: order.level,
  order_id: order.id,
  column_a: something,
  column_b: something_else,
  column_c: another_computation,
  column_d: one_more,
  column_e: not_yet_finished,
)

base_entry.dup.update_attributes!(
  entry_number: 1,
  entry_type: "REVENUE",
  gross_value: BigDecimal.new("100.00"),
  vat: BigDecimal.new("13.05"),
)

base_entry.dup.update_attributes!(
  entry_number: 2,
  entry_type: "FEE_TYPE_1",
  gross_value: BigDecimal.new("-10.00"),
  vat: BigDecimal.new("-1.30"),
)

I used the dup method, which is available on every Object in Ruby, including ActiveRecord models. When using dup, be aware of how it differs from clone, especially in the ActiveRecord case. Those semantics changed a few years ago, in Rails 3.1.

The most important difference for me turns out to be the record identity.

User.last.dup.id
# => nil

User.last.clone.id
# => 4

#dup says “I want a similar record (in terms of attributes), but a new one”, while #clone says “I want a copy pointing to the same DB record”.
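The same distinction can be imitated in plain Ruby. The sketch below is not ActiveRecord’s implementation; Record is a hypothetical class that uses Ruby’s initialize_dup hook to drop the identity on dup while leaving clone untouched.

```ruby
# Hypothetical plain-Ruby class, used only to illustrate the idea.
class Record
  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end

  # Ruby calls this hook on the copy created by #dup (but not by #clone).
  def initialize_dup(source)
    super
    @id = nil # a dup is a "new record": same attributes, no identity
  end
end

original = Record.new(id: 4, name: "Alice")

p original.dup.id    # => nil
p original.dup.name  # => "Alice"
p original.clone.id  # => 4
```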

Can I really, really clone/duplicate it?

It is true that every Object has dup implemented, so you might be tempted to believe you can actually duplicate every object.

a = BigDecimal.new("123")
# => #<BigDecimal:282cf60,'0.123E3',9(18)> 

b = BigDecimal.new("123").dup
# => #<BigDecimal:2834148,'0.123E3',9(18)> 

a.object_id
# => 21063600
b.object_id
# => 21078180

a = "text"
# => "text" 

b = a.dup
# => "text" 

a.object_id
# => 21085300 
b.object_id
# => 2106552

And so on, and so on… Unfortunately, the truth is a bit more complicated. There are so-called immediate objects (or immediate values) in Ruby which cannot be duplicated/cloned.

nil.clone
# TypeError: can't clone NilClass
nil.dup
# TypeError: can't dup NilClass

1.clone
# TypeError: can't clone Fixnum
1.dup
# TypeError: can't dup Fixnum

1.0.clone
# TypeError: can't clone Float
1.0.dup
# TypeError: can't dup Float

false.clone
# TypeError: can't clone FalseClass
false.dup
# TypeError: can't dup FalseClass

Unless… you are on the recently released Ruby 2.4…

nil.clone
# => nil
nil.object_id
# => 8
nil.clone.object_id
# => 8

1.clone
# => 1
1.object_id
# => 3
1.clone.object_id
# => 3

On Ruby 2.4 you can call those methods, but instead of returning actual duplicates they return the same instance, because there is only one instance of nil, false, true, 1, 1.0, etc. in your Ruby app.

ActiveSupport Object#duplicable?

Rails extends every Object with a duplicable? method, which tells you whether you can safely call dup without getting an exception. It’s interesting how duplicable? is implemented.

First, they start by saying you can dup an Object.

class Object
  # Can you safely dup this object?
  #
  # False for method objects;
  # true otherwise.
  def duplicable?
    true
  end
end

And then it is dynamically checked whether that’s actually true for some known exceptions, such as nil.

class NilClass
  begin
    nil.dup
  rescue TypeError

    # +nil+ is not duplicable:
    #
    #   nil.duplicable? # => false
    #   nil.dup         # => TypeError: can't dup NilClass
    def duplicable?
      false
    end
  end
end

As you can see, the return value of nil.duplicable? actually depends on the Ruby version you are running. true or false is not hardcoded (which is what I expected) but dynamically probed. If dup raises a TypeError, the method is overridden in that specific class.
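The “probe once when the code is loaded, answer cheaply afterwards” idea can be sketched without touching core classes. DupProbe and its sample list are hypothetical names made up for this illustration, not part of ActiveSupport.

```ruby
# Hypothetical module illustrating load-time probing; not ActiveSupport code.
module DupProbe
  SAMPLES = [nil, false, true, 1, 1.0, :symbol].freeze

  # Evaluated once, when this module body runs: keep only the classes
  # whose sample instance raises TypeError on #dup.
  NON_DUPLICABLE = SAMPLES.reject do |sample|
    begin
      sample.dup
      true  # dup worked, so drop this sample from the list
    rescue TypeError
      false # dup raised, so keep it
    end
  end.map(&:class).freeze

  # Cheap predicate: a lookup instead of raising and rescuing.
  def self.duplicable?(object)
    !NON_DUPLICABLE.include?(object.class)
  end
end

p DupProbe.duplicable?("text") # => true
p DupProbe.duplicable?(nil)    # depends on the Ruby version in use
```

As in the ActiveSupport code above, the answer for nil is never written down anywhere; it falls out of what nil.dup does on the running interpreter.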

Mindblown

However, for a few classes a different strategy is used for some reason: true or false is returned explicitly, without such a check.

class BigDecimal
  def duplicable?
    true
  end
end

class Method
  def duplicable?
    false
  end
end

class Complex
  def duplicable?
    false
  end
end

But the most interesting part is around Symbol.

class Symbol
  begin
    :symbol.dup # Ruby 2.4.x.
    "symbol_from_string".to_sym.dup # Some symbols can't `dup` in Ruby 2.4.0.
  rescue TypeError

    # Symbols are not duplicable:
    #
    #   :my_symbol.duplicable? # => false
    #   :my_symbol.dup         # => TypeError: can't dup Symbol
    def duplicable?
      false
    end
  end
end

That is because Ruby 2.4 is a bit weird: a literal symbol can be duplicated but a dynamic one cannot. Unless the dynamic one is the same as a literal one created before it… yep…


:literal_symbol.dup
# => :literal_symbol

"dynamic_symbol".to_sym.dup
# => TypeError: allocator undefined for Symbol

:dynamic_preceeded_with_literal
# => :dynamic_preceeded_with_literal 
"dynamic_preceeded_with_literal".to_sym.dup
# => :dynamic_preceeded_with_literal

Frankly, I am not sure if that’s a bug or expected behavior. I see no reason why the dynamic symbol could not return itself as well. Maybe dup is supposed to preserve some constraints that I am not aware of…

I enjoyed reading the explanation of why ActiveSupport even attempts to implement those methods. Quoting the documentation itself:

Most objects are cloneable, but not all. For example, you can’t dup methods:

method(:puts).dup
# => TypeError: allocator undefined for Method

Classes may signal their instances are not duplicable removing dup/clone or raising exceptions from them. So, to dup an arbitrary object you normally use an optimistic approach and are ready to catch an exception, say:

arbitrary_object.dup rescue object

Rails dups objects in a few critical spots where they are not that arbitrary. That rescue is very expensive (like 40 times slower than a predicate), and it is often triggered.

That’s why we hardcode the following cases and check duplicable? instead of using that rescue idiom.

So it is a performance optimization inside the framework’s critical paths. Remember that optimizing exceptions in your web application most likely won’t have any meaningful impact.
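If you are curious how the two approaches compare on your own machine, a rough timing sketch could look like this. Undupable and measure are hypothetical helpers written for this example, and the numbers will vary with the Ruby version and hardware, so treat them as an indication rather than a benchmark.

```ruby
# Hypothetical class whose instances refuse to be duplicated.
class Undupable
  def dup
    raise TypeError, "can't dup #{self.class}"
  end

  def duplicable?
    false
  end
end

# Runs the given block `iterations` times and returns the elapsed seconds.
def measure(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

object = Undupable.new
iterations = 100_000

# Optimistic approach: always try, pay for the exception when it fails.
rescue_time = measure(iterations) { object.dup rescue object }

# Predicate approach: ask first, never raise.
predicate_time = measure(iterations) { object.duplicable? ? object.dup : object }

puts format("rescue idiom:    %.4fs", rescue_time)
puts format("predicate check: %.4fs", predicate_time)
```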
