Could we drop Symbols from Ruby?


I don’t know about you, but I personally have been hit at least a dozen times by bugs caused by the string vs. symbol distinction. That happened in my own code, and it happened when using other libraries as well. I like how symbols look in code, but I don’t like the specific distinction that is made between them and strings. In my (perhaps controversial) opinion, they introduce more problems than they solve.

So I was thinking… Maybe we could drop them? Sounds radical, right? But I don’t think rewriting thousands of Ruby libraries to remove every :symbol is a viable strategy. So maybe there is a different option? Maybe symbol literals could become frozen, immutable strings. How could that work?

Imagine a world in which…

It’s hard for me to describe a solution very well in long paragraphs. So I thought I would rather try to demonstrate the properties that I imagine and let the code speak for itself. So… Imagine a world in which…

:foo == :foo  # true
:foo == "foo" # true

This is what I started with. My goal. I don’t want to care anymore if I have a string or symbol. Of course, nothing is that easy. We need more properties (test cases) to fully imagine how that could work.

Usually my use case is about taking something out of a hash or putting something into one. Let’s express it.

{"foo" => 1}[:foo] == 1 # true
{foo: 1}["foo"]    == 1 # true

That would make my life easier :)
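
In today’s Ruby both lookups above return nil. Here is a minimal sketch of how the wished-for behaviour can be approximated right now, assuming a hypothetical IndifferentHash wrapper (the class and its name are made up for illustration) that converts every key with to_s:

```ruby
# Hypothetical wrapper, not part of Ruby: normalizes every key to a String.
class IndifferentHash
  def initialize(pairs = {})
    @store = {}
    pairs.each { |key, value| self[key] = value }
  end

  def []=(key, value)
    @store[key.to_s] = value
  end

  def [](key)
    @store[key.to_s]
  end
end

plain = { "foo" => 1 }
missing = plain[:foo] # nil in current Ruby: :foo and "foo" are different keys

wrapped = IndifferentHash.new({ "foo" => 1, bar: 2 })
from_symbol = wrapped[:foo]  # found via the String key "foo"
from_string = wrapped["bar"] # found although it was stored under :bar
```

A wrapper like this only hides the distinction in one place. For the built-in Hash to behave this way natively, more is required.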

For that we need:

:foo.hash == "foo".hash # true

Whenever you put something into a Hash (or Set) or get something out of it, Ruby calls the key’s hash method to decide where to look. If two objects are equal, they should return the same hash. Otherwise Ruby won’t properly find objects in a Hash. Let me show you an example:

class Something
  def initialize(val)
    @val = val
  end
  attr_reader :val

  def ==(another)
    val == another.val
  end
  alias eql? == # Hash compares keys with eql?, so keep it in sync with ==
end

a = Something.new(1)
b = Something.new(1)

hash = {a => "text"}
hash[a] # => "text"
hash[b] # => nil

We defined a Value Object: a class that is defined by its attributes (one or many) and which uses them for comparison. But because we haven’t implemented the hash method, Ruby doesn’t know that its instances can be used interchangeably as Hash keys. (Hash also compares keys with eql?, so that method has to agree with == as well.)

a.hash
# => 2172544926875462254

b.hash
# => 2882203462531734027

On the other hand, if two objects return the same hash, that does not mean they are equal. There is a limited number of hash values available, so collisions can occasionally occur. But if two objects are equal, they should return the same hash.

class Something
  BIG_VALUE = 0b111111000100000010010010110011101011000100010101001100100110000
  def hash
    [@val].hash ^ BIG_VALUE
  end
end

Usually, you compute the hash as the hash of an array of all attributes, XORed with a big random value so that it does not collide with the hash of that exact array. In other words, we want:

Something.new(1).hash != 1.hash
Something.new(1).hash != [1].hash
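
Putting the pieces together, here is a minimal sketch of a value object that works as a Hash key. The class name Celsius and the SALT constant are made up for this example. Note that it defines eql? alongside hash, because Hash uses both when looking up keys:

```ruby
# Hypothetical value object; the class name and salt are illustrative only.
class Celsius
  SALT = 0x5bd1e995 # arbitrary constant mixed into the hash

  attr_reader :degrees

  def initialize(degrees)
    @degrees = degrees
  end

  def ==(other)
    other.is_a?(Celsius) && degrees == other.degrees
  end
  alias eql? == # Hash compares keys with eql?, not ==

  def hash
    [degrees].hash ^ SALT
  end
end

a = Celsius.new(1)
b = Celsius.new(1)

lookup = { a => "text" }
found_with_a = lookup[a] # "text"
found_with_b = lookup[b] # "text", because a.eql?(b) and a.hash == b.hash
```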

But that was a digression. Let’s get back to the main point.

I would love:

{"foo" => 1}[:foo] == 1 # true
{foo: 1}["foo"]    == 1 # true

And for that we would need:

:foo.hash == "foo".hash # true

But here is the thing. It might be that computing a Symbol’s hash is 2-3 times faster than a String’s hash right now. I don’t know why. Maybe Symbols, which are immutable, have a pre-computed hash, or can memoize the hash value because it won’t change. I am not sure. But if that’s the reason, I can imagine that frozen, immutable Strings could have a lazily computed, memoized hash value as well.
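
If you want to check that claim on your own machine, a rough timing sketch could look like this. The helper name time_hash_calls and the iteration count are my own choices, and the numbers will vary with Ruby version and hardware, so treat the result as a hint rather than a benchmark:

```ruby
# Rough timing sketch; results depend on Ruby version and machine.
def time_hash_calls(object, iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { object.hash }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

ITERATIONS = 1_000_000

symbol_time = time_hash_calls(:some_reasonably_long_key, ITERATIONS)
string_time = time_hash_calls("some_reasonably_long_key", ITERATIONS)

puts format("Symbol#hash: %.3fs", symbol_time)
puts format("String#hash: %.3fs", string_time)
```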

I believe there are a lot of libraries and apps out there that rely on this fact:

:foo.object_id == :foo.object_id

So obviously that should be preserved. But I believe that if symbols were strings and Ruby internally kept a unique list of them, just like it does for us today, everything would work without a problem.

After all, the fact that you always get the same symbol is just a mapping somewhere in the Ruby implementation, something like:

{"foo" => Symbol.new("foo")}

Historically, it was not even garbage-collected. Now it is.

So with:

{"foo" => "foo".freeze}

somewhere in Ruby’s internals, we could still get the same object whenever we ask for :foo:

:foo.object_id == :foo.object_id # true
:foo.equal?(:foo)                # true
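
To make the idea concrete, here is a minimal sketch of such a table in plain Ruby. The INTERNED constant and the symbol_for helper are hypothetical names; Ruby keeps its real symbol table in C, and this only mimics the behaviour:

```ruby
# Hypothetical intern table: one frozen String per distinct name.
INTERNED = Hash.new do |table, name|
  table[name] = name.dup.freeze
end

def symbol_for(name)
  INTERNED[name]
end

first  = symbol_for("foo")
second = symbol_for("foo")

same_object = first.equal?(second) # true: the table hands back one object
is_frozen   = first.frozen?        # true
equal_value = (first == "foo")     # true: it is still just a String
```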

Let’s continue this journey. Here is a problematic area:

foo = "foo"
foo.equal?(foo.to_s) # true

String#to_s basically returns self in Ruby. So if Symbols were frozen strings this would break:

foo = :foo
bar = foo.to_s
bar << " baz"

because bar would be the same object as foo instead of a new string (like it is right now for Symbols).
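
You can see the same failure today with an ordinary frozen string. A small sketch (the variable names are only for illustration):

```ruby
foo = "foo".freeze
bar = foo.to_s # String#to_s returns self

same_object = bar.equal?(foo) # true: bar is the very same frozen object

begin
  bar << " baz"
  outcome = :appended
rescue FrozenError
  outcome = :frozen_error # appending to a frozen string raises
end

copy = String.new(foo) # an explicit, unfrozen copy
copy << " baz"
```

So whatever to_s returns for a symbol in this imagined world would have to be a copy, not the frozen object itself.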

Here is another potential issue. There might be libraries out there checking if an object is a symbol.

if var.is_a?(Symbol)
  # do something
else
  # do it differently or not at all
end

I was thinking about how to solve it… How could we distinguish :foo from "foo" if we really needed to?

I see two options. The first: make Symbol work like String without making it a String (either by adding all the methods or by making it an alias, Symbol = String). The second: make Symbol inherit from String, so Symbol < String.

With that

:foo.is_a?(Symbol)

would be true. But…

:foo.is_a?(String)

would also be true.

The difference could be that Symbol#to_s would be redefined to return a new, identical, unfrozen String instead of the same one.

So maybe something like this:

class Symbol < String
  def initialize(val)
    super
    freeze
  end

  def to_s
    "#{self}"
  end

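  def hash
    # Memoized. A real implementation would have to cache this internally,
    # because setting @hash on an already frozen object raises FrozenError.
    @hash ||= super
  end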
end
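
The real Symbol class can’t be reopened with String as its superclass, so to play with the idea I would use a stand-in. Below is a runnable sketch with a hypothetical class named Sym. It computes the hash before freezing (instance variables can’t be set afterwards) and uses String.new for the unfrozen copy; both are my own choices for the experiment, not how Ruby would necessarily do it:

```ruby
# Hypothetical stand-in for the imagined Symbol < String.
class Sym < String
  alias_method :string_hash, :hash # keep String#hash reachable

  def initialize(val)
    super
    @hash = string_hash # memoize before freezing
    freeze
  end

  def to_s
    String.new(self) # a new, unfrozen, plain String
  end

  def hash
    @hash
  end
end

foo = Sym.new("foo")

equal_to_string = (foo == "foo")           # true
same_hash       = (foo.hash == "foo".hash) # true

by_string = { "foo" => 1 }
by_sym    = { foo => 1 }
found_by_sym    = by_string[foo] # 1
found_by_string = by_sym["foo"]  # 1

bar = foo.to_s
bar << " baz" # works, because to_s returned an unfrozen copy
```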

I doubt that’s gonna happen. Probably too many corner cases right now to introduce such a change. But if we could drop Fixnum and Bignum, maybe we can drop Symbol?

Would we even want to? What’s your opinion? Do you need the Symbol class in your code? Or do you just like the symbol notation?

I will leave you with a quote by Matz:

Symbols are taken from Lisp symbols, and they has been totally different beast from strings. They are not nicer (and faster) representation of strings. But as Ruby stepped forward its own way, the difference between symbols and strings has been less recognized by users.

And if you think that would be a bad idea, let me tell you that he tried but failed.

I guess too many libraries out there rely on checking if an object is Symbol or not.

And in Smalltalk, Symbols inherit from Strings.