Relative Testing vs Absolute Testing

Recently I had one of those small aha! moments when a concept suddenly clarifies in your head. This time it was about testing. I am not even sure whether this is new or obvious. I do know that I had not been paying conscious attention to it, and that made testing unnecessarily harder for me. I noticed the difference between the two approaches purely thanks to mutation testing. When you try hard to make sure every part of the code is properly tested, it sometimes makes you think twice.

In the simplest terms, absolute testing is when you use an equality assertion, assert_equal(...) / expect(...).to eq(...), with a specific value.

Relative testing is when you use greater-than, less-than or not-equal assertions. It also covers equality assertions against non-specific values: comparing with other values generated by your code rather than with an exact expected value provided by your test.
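A minimal sketch of the two styles side by side, written as plain Ruby comparisons rather than RSpec expectations so that it runs on its own. The slugify helper is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper, used only to have something to test.
def slugify(title)
  title.downcase.strip.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Absolute: the expected value is a literal provided by the test.
absolute_ok = slugify("Hello, World!") == "hello-world"

# Relative: both sides are produced by the code under test.
relative_same = slugify("Hello, World!") == slugify("  hello world ")
relative_diff = slugify("Hello, World!") != slugify("Goodbye")
```

The relative checks would keep passing if slugify switched from "-" to "_" as a separator. Only the absolute check pins that choice down.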

But that’s just a superficial explanation. It’s sometimes much more subtle than that.

Here is what I realized recently. When we decide how to test a part of the code based on the requirements, we often lean towards either relative or absolute testing. We are primed by our own thoughts, expectations and first tests. Then we reach an obstacle in testing and find it hard to move on. What can help is switching the mindset from relative to absolute testing, or the other way around.

Example 1

When you want objects to be equal by value instead of by reference, you need to override the == method. You should also implement the hash method, which Hash (and Set) use together with eql? when storing and looking up the object.
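To see why == alone is not enough, here is a small sketch with two hypothetical classes (OnlyEquals and FullValue are names made up for this illustration). Hash lookups go through hash and eql?, so overriding only == leaves lookups working by identity.

```ruby
# Hypothetical class that overrides only ==.
class OnlyEquals
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def ==(other)
    other.instance_of?(self.class) && other.id == id
  end
end

# Hypothetical class that also provides eql? and hash.
class FullValue < OnlyEquals
  alias_method :eql?, :==

  def hash
    [self.class, id].hash
  end
end

equal_by_value      = OnlyEquals.new(1) == OnlyEquals.new(1)           # => true
lookup_without_hash = { OnlyEquals.new(1) => :found }[OnlyEquals.new(1)] # => nil
lookup_with_hash    = { FullValue.new(1) => :found }[FullValue.new(1)]   # => :found
```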

This is part of a class that I have implemented.

require "securerandom"

module RubyEventStore
  class Event
    def initialize(
      event_id: SecureRandom.uuid,
      data: nil
    )
      @event_id = event_id.to_s
      @data     = data.to_h
    end

    attr_reader :event_id, :data

    def to_h
      {
        event_id: event_id,
        data:     data
      }
    end

    def ==(other_event)
      other_event.instance_of?(self.class) &&
        other_event.event_id.eql?(event_id) &&
        other_event.data.eql?(data)
    end

    BIG_VALUE = 0b111111100100000010010010110011101011000101010101001100100110000

    def hash
      [
        self.class,
        event_id,
        data
      ].hash ^ BIG_VALUE
    end

    alias_method :eql?, :==
  end
end

Let me show you some of the tests.

The hash depends on event_id:

  expect(
    Event.new(event_id: "doh").hash
  ).to eq(
    Event.new(event_id: "doh").hash
  )

  expect(
    Event.new(event_id: "doh").hash
  ).not_to eq(
    Event.new(event_id: "bye").hash
  )

The hash depends on event’s data:

  expect(
    Event.new(event_id: "doh", data: {}).hash
  ).to eq(
    Event.new(event_id: "doh", data: {}).hash
  )

  expect(
    Event.new(event_id: "doh", data: {}).hash
  ).not_to eq(
    Event.new(event_id: "doh", data: {a: 1}).hash
  )

The hash depends on event’s class:

  klass = Class.new(Event)

  expect(
    klass.new(event_id: "doh").hash
  ).not_to eq(
    Event.new(event_id: "doh").hash
  )

  expect(
    klass.new(event_id: "doh").hash
  ).to eq(
    klass.new(event_id: "doh").hash
  )

The event’s hash does not collide with the hash of an Array holding the event’s class, event_id and data:

  expect(klass.new(event_id: "doh").hash).not_to eq([
    klass,
    "doh",
    {}
  ].hash)

Thanks to a proper implementation, I can put events into a Set or a Hash and they are recognized even when I use a new instance with identical values rather than the same instance.

  expect({
    klass.new(event_id: "doh") => :YAY
  }[ klass.new(event_id: "doh") ]).to eq(:YAY)

  expect(Set.new([
    klass.new(event_id: "doh")
  ])).to eq(Set.new([klass.new(event_id: "doh")]))

Now, if you look at all the tests carefully, you will realize that nowhere have I specified the exact value that the hash method should return.

Every test case here is relative. The hash value must be equal to another hash value also generated by my code, or not equal to another hash value generated by my code. But no test ever says in plain sight that the hash in a certain situation should be equal to 8061336739304082551.

Every part of the implementation code is there for a reason. Take the XOR, for example.

BIG_VALUE = 0b111111100100000010010010110011101011000101010101001100100110000

def hash
  [
    self.class,
    event_id,
    data
  ].hash ^ BIG_VALUE
end

The XOR (^) operator is there so that we can avoid this collision:

expect(klass.new(event_id: "doh").hash).not_to eq([
  klass,
  "doh",
  {}
].hash)

It’s just a random, big value that I generated. You could change a random bit in this number and the code would still work.

If you have experience with mutation testing, you know that it routinely checks for off-by-one errors by mutating numbers in your code to a bigger or smaller number and checking whether the tests fail.

But if you mutate BIG_VALUE to BIG_VALUE - 1 or BIG_VALUE + 1, the code still works and no test fails. All of those values are as good as any other. What does that mean? My tests give no explanation of why I want this number over any other.
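The effect can be reproduced with a simplified sketch. SaltedEvent is a hypothetical stand-in for the event class, with the XOR constant injected as a salt argument so that the "mutated" values can be tried side by side. The relative checks are the same kind as in the tests above, written as a plain predicate instead of RSpec expectations.

```ruby
BIG_VALUE = 0b111111100100000010010010110011101011000101010101001100100110000

# Hypothetical, simplified stand-in for the event class.
class SaltedEvent
  attr_reader :event_id, :data

  def initialize(event_id:, salt:, data: {})
    @event_id = event_id
    @data     = data
    @salt     = salt
  end

  def hash
    [self.class, event_id, data].hash ^ @salt
  end
end

# Relative checks only: every expected value is produced by the code itself.
def relative_checks_pass?(salt)
  doh       = SaltedEvent.new(event_id: "doh", salt: salt)
  doh_again = SaltedEvent.new(event_id: "doh", salt: salt)
  bye       = SaltedEvent.new(event_id: "bye", salt: salt)

  doh.hash == doh_again.hash &&
    doh.hash != bye.hash &&
    doh.hash != [SaltedEvent, "doh", {}].hash
end

# The original constant and its two off-by-one mutations.
results = [BIG_VALUE - 1, BIG_VALUE, BIG_VALUE + 1].map do |salt|
  relative_checks_pass?(salt)
end
```

All three salts should satisfy the relative checks, which is exactly why a mutation of the constant survives.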

What am I missing? One absolute test.

expect(
  Event.new(event_id: "doh", data: {a: 1}).hash
).to eq(8061336739304082551)

You might scream 🙀🙀 that this test has no value 😱😱. Run it on your computer, on CI, on Ruby 1.9 through 2.5, JRuby or Rubinius, and the value might differ and your test can fail. What could that tell you? Perhaps whether your hashing function is stable in a distributed, heterogeneous environment.

Example 2

Imagine that you write data in a certain order:

event0 = OrderPlaced.new(data: {
  order_data: "sample",
  festival_id: "b2d506fd-409d-4ec7-b02f-c6d2295c7edd"
})
client.publish_event(event0, stream_name: "Order-dd00859a")

event1 = OrderVerified.new(data: {
  verified_by: {
    name: "Lana D",
    id: "46561fb7-07ba-4bab-9ea7-7a648236f2ec",
  }
})
client.publish_event(event1, stream_name: "Order-dd00859a")

and you expect it to come back in the same order.

events = client.read_stream_events_forward("Order-dd00859a")
expect(events).to eq([event0, event1])

Now imagine that somewhere in the implementation of the #publish_event method there are lines such as:

position = read_last_position || -1
position += 1
Event.create!(
  position: position,
  data: {...},
  stream_name: ...,
)

That makes the events in a stream indexed from 0. So the first event’s position is 0, the next one is 1, and then there is 2…

Here is the thing. From the point of view of our API, it does not matter how the events are internally numbered and stored in the database.

events = client.read_stream_events_forward("Order-dd00859a")
expect(events).to eq([event0, event1])

If we went with:

position = read_last_position || -8
position += 1

it would work equally well and the tests would still pass. Our data would be indexed from -7, which sounds silly, but for computers it works. So what’s the difference 😉?

Why does the test still work? Because your tests are relative. They compare the order of written events and the order of read events and make sure they are identical. Write A, B to X. Read X and get A, B back.
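A small in-memory sketch makes this concrete. InMemoryStream is a hypothetical stand-in for the persistence layer (it is not the real client), and empty_stream_position plays the role of the -1 or -8 literal from the snippet above.

```ruby
# Hypothetical in-memory stand-in for the stream storage.
class InMemoryStream
  Row = Struct.new(:position, :event_id)

  attr_reader :rows

  def initialize(empty_stream_position:)
    @empty_stream_position = empty_stream_position
    @rows = []
  end

  # Mirrors `position = read_last_position || -1; position += 1`.
  def publish(event_id)
    last_position = @rows.map(&:position).max
    position = (last_position || @empty_stream_position) + 1
    @rows << Row.new(position, event_id)
  end

  def read_forward
    @rows.sort_by(&:position).map(&:event_id)
  end
end

from_zero        = InMemoryStream.new(empty_stream_position: -1)
from_minus_seven = InMemoryStream.new(empty_stream_position: -8)

[from_zero, from_minus_seven].each do |stream|
  stream.publish("b3b2f9f0")
  stream.publish("46561fb7")
end

# Relative view: write A, B and read A, B back. Identical for both variants.
reads = [from_zero, from_minus_seven].map(&:read_forward)

# Stored positions differ, but nothing in the relative view reveals it.
stored_positions = [from_zero, from_minus_seven].map { |s| s.rows.map(&:position) }
# => [[0, 1], [-7, -6]]
```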

Why did I choose to have the numbers indexed from 0? Because I am used to it? Because that is usually the default (in most languages)? Because of aesthetics, I guess…

So how can I justify this -1 in the code?

position = read_last_position || -1
position += 1

I need one test that is not relative but absolute.

event0 = OrderPlaced.new(event_id: "b3b2f9f0")
client.publish_event(event0, stream_name: "Order-dd00859a")

expect(Event.find_by(position: 0).id).to eq(event0.event_id)

Granted, it’s not important for the public API of the client that I am testing:

client.publish_event(event, stream_name: "Order-dd00859a")
client.read_stream_events_forward("Order-dd00859a")

And you might scream 🙀🙀 that this test has no value 😱😱. But this kind of thing matters when you want your code to follow the same convention across multiple releases. If some data is written by a previous version of the code and some by the next version, the two might be inconsistent. So even though this implementation detail (the exact position number) is not exposed to the clients of this API, you might still want to pin it to a single specific number and keep it consistent between releases.

I don’t recommend doing a lot of those tests. I usually try to design my tests so that they operate on the same layer for setup/preparation and for verification. But checking the implementation detail of one layer below, in this case, can be beneficial because these values get persisted forever.

Example 3

This example is similar to the previous one. Imagine that inside read_stream_events_forward you have code similar to:

def read_stream_events_forward(stream_name)
  events = Event.
    where(stream_name: stream_name).
    order("position ASC")
  # ...
end

but mutation testing tells you that if you remove order("position ASC"), the code still works and returns the rows in the proper order. That’s usually the case because, with a small amount of data, the DB will often return records in the order they were inserted. But you don’t want to rely on that (P.S. even messing with the auto-incremented IDs saved in the DB might not help you, because DBs often use internal row IDs).

You want to be explicit. You know you are doing the right thing by explicitly specifying the order but it might be hard for you to create a test setup presenting the situation in which not providing the order fails the tests. Damn DBs.

But I realized one thing. I don’t need to use relative testing which checks that writing A, B, C in that order leads to reading A, B, C in that order. I can capture the SQL statement generated by Active Record and verify it. Instead of checking if I got the right results I can check if I generated the right SQL query. Absolute instead of relative.
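Here is a rough sketch of what verifying the query instead of the results can look like. With Active Record, calling to_sql on the relation is one way to obtain the statement. To keep the sketch runnable without Rails or a database, FakeRelation below is a hypothetical, heavily simplified stand-in that only concatenates SQL fragments (no escaping, no real query building), and stream_query is a made-up name for the query part of read_stream_events_forward.

```ruby
# Hypothetical stand-in for an Active Record relation. It only records clauses.
class FakeRelation
  def initialize(table, clauses = [])
    @table   = table
    @clauses = clauses
  end

  def where(conditions)
    sql = conditions.map { |column, value| "#{column} = '#{value}'" }.join(" AND ")
    FakeRelation.new(@table, @clauses + ["WHERE #{sql}"])
  end

  def order(expression)
    FakeRelation.new(@table, @clauses + ["ORDER BY #{expression}"])
  end

  def to_sql
    (["SELECT * FROM #{@table}"] + @clauses).join(" ")
  end
end

# Assumed shape of the query built inside read_stream_events_forward.
def stream_query(stream_name)
  FakeRelation.new("events").
    where(stream_name: stream_name).
    order("position ASC")
end

sql = stream_query("Order-dd00859a").to_sql

# Absolute check: the statement itself must contain the explicit ordering.
explicitly_ordered = sql.end_with?("ORDER BY position ASC")
```

Removing the order call from stream_query makes the absolute check fail immediately, regardless of the order in which a database happens to return rows.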

Summary

When you get stuck in testing (especially if you want to make sure the last 5% keeps working as expected as well) it might be a sign that you hit the wall with your current approach. You might have been testing only using Relative Testing or only using Absolute Testing. Stop for a moment and consider if the other approach makes it easier to achieve your goal.

I usually try not to cross the boundaries and not to test too many implementation details, because that makes refactoring harder. I prefer testing units over classes (remember: there is no rule that there should be one test class per class). But sometimes, when all you have is relative, it might be good to introduce more specificity. And vice versa.
