First class json(b) handling in Rails Event Store
Recently, in Rails Event Store v2.8.0, the PreserveTypes transformation was introduced. The v2.9.0 release brought RailsEventStore::JSONClient. Together they are a set of great improvements for RES users who plan to use, or already use, PostgreSQL with the jsonb data type for keeping events' data and metadata.
Back to the primitive
According to RFC 4627, JSON can represent four primitive types:
- strings
- numbers
- booleans
- null
and two structured types:
- objects
- arrays
When data is serialised to JSON with JSON.dump or ActiveSupport::JSON.encode, which happens implicitly when persisting an event, it needs to be converted to primitives or structured types.
Given event data:
{
boolean: true,
nothing: nil,
string: "hello",
symbol: :baz,
int: 123,
float: 1.23,
big_decimal: BigDecimal("1.23"),
array: [1, 2, 3],
hash: { foo: :bar },
date: Date.current,
time: Time.now,
active_support_time_with_zone: Time.current,
}
becomes in the database:
{
"boolean": true,
"nothing": null,
"string": "hello",
"symbol": "baz",
"int": 123,
"float": 1.23,
"big_decimal": "1.23",
"array": [1,2,3],
"hash": {"foo":"bar"},
"date": "2023-01-27",
"time": "2023-01-27 16:11:04 +0100",
"active_support_time_with_zone": "2023-01-27 15:11:04 UTC"
}
As expected, everything was converted to JSON's primitives. There is a slight difference between JSON.dump and ActiveSupport::JSON.encode when it comes to time serialization (the latter is more precise), but it doesn't matter for this example.
We will get our data types back… Right?
What happens when we want to read the event and access its data?
{
"boolean" => true,
"nothing" => nil,
"string" => "hello",
"symbol" => "baz",
"int" => 123,
"float" => 1.23,
"big_decimal" => "1.23",
"array"=> [1, 2, 3],
"hash"=> {"foo"=>"bar"},
"date"=> "2023-01-27",
"time"=> "2023-01-27 16:17:13 +0100",
"active_support_time_with_zone"=> "2023-01-27 15:17:13 UTC"
}
A short list of problems that will occur:
- Keys in the hash are no longer symbols. If your code used to access the event's data values by symbol, you'll get nil, or KeyError when using the .fetch method.
- The same goes for symbols in values: they become strings.
- BigDecimal remained a string.
- Date is a string.
- Time and ActiveSupport::TimeWithZone are strings too.
- Information about time precision (fractions of a second) is also lost.
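The first two problems can be reproduced in plain Ruby, without Rails Event Store. This is only a minimal sketch using the standard json library. The variable names are made up for illustration, and the exact string formats differ slightly from what ActiveSupport::JSON.encode produces.

```ruby
require "json"
require "date"

data = { symbol: :baz, date: Date.new(2023, 1, 27) }

# Simulate writing to and reading from a JSON column.
restored = JSON.parse(JSON.dump(data))

by_symbol = restored[:symbol]   # nil, because the key is now the string "symbol"
by_string = restored["symbol"]  # "baz", a String rather than a Symbol
date_class = restored["date"].class # String, not Date

# .fetch with the old symbol key raises instead of returning nil.
fetch_error =
  begin
    restored.fetch(:symbol)
    nil
  rescue KeyError => e
    e
  end
```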
How did we resolve those problems in Rails Event Store, and why do we even face them?
For a long time, our recommended setup was a binary column type for storing events' data and metadata, combined with the YAML serializer.
The reason is that YAML is available out of the box and can serialize and deserialize data types which are not easily handled in other formats.
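To illustrate the point about YAML, here is a small round trip using only Ruby's standard library. It is a sketch rather than the Rails Event Store setup itself. The list of permitted classes is chosen for this example only.

```ruby
require "yaml"
require "date"

data = { symbol: :baz, date: Date.new(2023, 1, 27), int: 123 }

yaml = YAML.dump(data)

# safe_load only restores non-primitive types that are explicitly permitted.
restored = YAML.safe_load(yaml, permitted_classes: [Symbol, Date])

# Unlike the JSON round trip, symbol keys, symbol values and dates come back
# as the types they were before serialization.
same_after_round_trip = (restored == data)
```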
However, we found out that if you're on PostgreSQL, it's good to use its native jsonb type for storing data and metadata. Using JSON types in PostgreSQL has a lot of benefits, e.g. you can query event data or metadata using SQL and find events of interest based on their payload. We would love to expose an API for that in Rails Event Store one day.
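As a rough idea of what such a query could look like, the sketch below keeps a hand-written SQL statement in a Ruby constant. It assumes the event_store_events table with a jsonb data column. The constant name and the filtered key are hypothetical, and this is not an API that Rails Event Store provides.

```ruby
# Hypothetical query: find events whose payload has "string" equal to "hello".
# The ->> operator is PostgreSQL's "get JSON field as text" operator.
EVENTS_BY_PAYLOAD_SQL = <<~SQL
  SELECT event_id, event_type, data
  FROM event_store_events
  WHERE data ->> 'string' = 'hello'
SQL

# In a Rails app backed by PostgreSQL this could be run with, for example:
#   ActiveRecord::Base.connection.select_all(EVENTS_BY_PAYLOAD_SQL)
```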
There are numerous ways to solve the problems mentioned above for your Rails Event Store events. Implementing a custom event class with a schema is one solution; another is to implement a custom transformation and a mapper.
JSONClient and PreserveTypes transformation to the rescue
We decided to provide sane defaults so that users are not bothered with advanced configuration. The newly introduced RailsEventStore::JSONClient incorporates a mapper containing the PreserveTypes transformation.
PreserveTypes allows registering a serializer and a deserializer for any type of data you wish to put in your event's data or metadata. This means that an event published with data:
{
boolean: true,
nothing: nil,
string: "hello",
symbol: :baz,
int: 123,
float: 1.23,
big_decimal: BigDecimal("1.23"),
array: [1, 2, 3],
hash: { foo: :bar },
date: Date.current,
time: Time.now,
active_support_time_with_zone: Time.current,
}
will, after persisting and reading it again, carry the originally intended data types:
{
boolean: true,
nothing: nil,
string: "hello",
symbol: :baz,
int: 123,
float: 1.23,
big_decimal: 0.123e1,
array: [1, 2, 3],
hash: { foo: :bar },
date: Fri, 27 Jan 2023,
time: 2023-01-27 18:06:32.647146 +0100,
active_support_time_with_zone: Fri, 27 Jan 2023 17:06:46.914852000 UTC +00:00,
}
Here's the PreserveTypes configuration extracted from RailsEventStore::JSONClient:
RubyEventStore::Mappers::Transformation::PreserveTypes
.new
.register(
Symbol,
serializer: ->(v) { v.to_s },
deserializer: ->(v) { v.to_sym }
)
.register(
Time,
serializer: ->(v) { v.iso8601(RubyEventStore::TIMESTAMP_PRECISION) },
deserializer: ->(v) { Time.iso8601(v) }
)
.register(
ActiveSupport::TimeWithZone,
serializer: ->(v) { v.iso8601(RubyEventStore::TIMESTAMP_PRECISION) },
deserializer: ->(v) { Time.iso8601(v).in_time_zone },
stored_type: ->(*) { "ActiveSupport::TimeWithZone" }
)
.register(
Date,
serializer: ->(v) { v.iso8601 },
deserializer: ->(v) { Date.iso8601(v) }
)
.register(
DateTime,
serializer: ->(v) { v.iso8601 },
deserializer: ->(v) { DateTime.iso8601(v) }
)
.register(
BigDecimal,
serializer: ->(v) { v.to_s },
deserializer: ->(v) { BigDecimal(v) }
)
As you may have noticed, the configuration is pretty simple: we expect both the serializer and the deserializer to respond to call and accept a single argument with the value.
Primitive types like String or Integer require no serialization, so the original value is passed through. It also won't be deserialized on read.
If you're curious how the PreserveTypes transformation is implemented, feel free to look at the source code.
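To make the idea of registered serializers and deserializers more tangible, here is a toy, plain-Ruby sketch of the same mechanism. TinyTypeRegistry is a hypothetical class invented for this example. It handles only flat hashes and is not how PreserveTypes is actually implemented.

```ruby
require "json"
require "date"
require "bigdecimal"

# Hypothetical, simplified illustration of type-preserving serialization.
class TinyTypeRegistry
  PASS_THROUGH = { serializer: ->(v) { v }, deserializer: ->(v) { v } }.freeze

  def initialize
    @handlers = {}
  end

  # Both callables must respond to call and accept a single value.
  def register(type, serializer:, deserializer:)
    @handlers[type.name] = { serializer: serializer, deserializer: deserializer }
    self
  end

  # Returns the JSON-friendly payload and a map of key => [key type, value type].
  def dump(data)
    types = {}
    payload = data.each_with_object({}) do |(key, value), out|
      types[key.to_s] = [key.class.name, value.class.name]
      out[key.to_s] = handler_for(value.class.name)[:serializer].call(value)
    end
    [payload, types]
  end

  # Uses the stored type names to restore keys and values.
  def load(payload, types)
    payload.each_with_object({}) do |(key, value), out|
      key_type, value_type = types.fetch(key)
      restored_key = key_type == "Symbol" ? key.to_sym : key
      out[restored_key] = handler_for(value_type)[:deserializer].call(value)
    end
  end

  private

  # Unregistered types (String, Integer, ...) pass through unchanged.
  def handler_for(type_name)
    @handlers.fetch(type_name, PASS_THROUGH)
  end
end

registry = TinyTypeRegistry.new
  .register(Symbol, serializer: ->(v) { v.to_s }, deserializer: ->(v) { v.to_sym })
  .register(Date, serializer: ->(v) { v.iso8601 }, deserializer: ->(v) { Date.iso8601(v) })
  .register(BigDecimal, serializer: ->(v) { v.to_s }, deserializer: ->(v) { BigDecimal(v) })

original = { symbol: :baz, date: Date.new(2023, 1, 27), big_decimal: BigDecimal("1.23"), int: 123 }

payload, types = registry.dump(original)

# Simulate the trip through a JSON column for both payload and type information.
stored_payload = JSON.parse(JSON.dump(payload))
stored_types = JSON.parse(JSON.dump(types))

restored = registry.load(stored_payload, stored_types)
```

The type names travel next to the payload, which is what lets the reader pick the right deserializer for each value.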
Quirks
Look at me, I'm Time now
During implementation, we figured out that there's a quirk around ActiveSupport::TimeWithZone. We rely on Module#name to recognize an object's class name. It turned out that ActiveSupport::TimeWithZone.name, for an unknown reason, returns Time. Obviously this broke deserialization of this particular type, since the wrong one was picked by the transformation. For this reason we introduced another parameter called stored_type, which expects an object responding to call and returning a string containing the class name.
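The effect of this quirk can be shown with a plain-Ruby stand-in. ZonedTime and the deserializer labels below are hypothetical, and the argument passed to stored_type here is an assumption made for illustration.

```ruby
# Hypothetical class that misreports its own name, mimicking the behaviour
# described above for ActiveSupport::TimeWithZone.
class ZonedTime
  def self.name
    "Time"
  end
end

# Deserializers looked up by the class name stored next to the value.
deserializers = {
  "Time" => :plain_time_deserializer,
  "ZonedTime" => :zoned_time_deserializer
}

# Relying on .name alone records the wrong type name.
type_from_name = ZonedTime.name

# An explicit stored_type callable records the intended one instead.
stored_type = ->(*) { "ZonedTime" }
type_from_stored_type = stored_type.call(ZonedTime)

wrong_pick = deserializers.fetch(type_from_name)
right_pick = deserializers.fetch(type_from_stored_type)
```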
Additional metadata is persisted
Yes, it will take additional space in your database, but it's a sacrifice we're ready to make. Along with the standard information, the event's metadata will store information about types:
{
"types": {
"data": {
"boolean": ["Symbol","TrueClass"],
"nothing": ["Symbol","NilClass"],
"string": ["Symbol","String"],
"symbol": ["Symbol","Symbol"],
"int": ["Symbol","Integer"],
"float": ["Symbol","Float"],
"big_decimal": ["Symbol","BigDecimal"],
"array": ["Symbol",["Integer","Integer","Integer"]],
"hash": ["Symbol",{"foo":["Symbol","Symbol"]}],
"date": ["Symbol","Date"],
"time" :["Symbol","Time"],
"active_support_time_with_zone": ["Symbol","ActiveSupport::TimeWithZone"]
},
"metadata":{
"correlation_id":["Symbol","String"]
}
}
}
The first element in each array represents the type of the key, the second one the type of the value.
This is a completely transparent operation. You won't see this data when reading an event in your console:
irb(main):021:0> event_store.read.last
RubyEventStore::ActiveRecord::Event Load (1.3ms) SELECT "event_store_events".* FROM "event_store_events" ORDER BY "event_store_events"."id" DESC LIMIT $1 [["LIMIT", 1]]
=>
#<OopsIDidItAgain:0x000000011265f7d0
@data=
{:int=>123,
:date=>Fri, 27 Jan 2023,
:hash=>{:foo=>:bar},
:time=>2023-01-27 18:23:58.607724 +0100,
:array=>[1, 2, 3],
:float=>1.23,
:string=>"hello",
:symbol=>:baz,
:boolean=>true,
:nothing=>nil,
:big_decimal=>0.123e1,
:active_support_time_with_zone=>Fri, 27 Jan 2023 17:23:58.607731000 UTC +00:00},
@event_id="df6c5c48-06da-47ff-90ae-1b76eb6ceeaf",
@metadata=
#<RubyEventStore::Metadata:0x000000011265f780
@h=
{:correlation_id=>"d55c851d-97e8-4dc4-880c-cafcea3e9c49",
:timestamp=>2023-01-27 17:24:29.831713 UTC,
:valid_at=>2023-01-27 17:24:29.831713 UTC}>>
nor in the Browser.
Summary
Running on JSONClient with PreserveTypes provides seamless type handling. It removes the disadvantages that JSON had compared to YAML serialization within a binary column. It opens new possibilities, like the ability to query events' data and metadata via the SQL interface and, in the future, via the RES interface. It's also a nice alternative to event schemas used solely for type casting, which can sometimes be slow.