Rewrite with Confidence: Validating Business Rules Through Isolated Testing
A few months back, our team at Arkency faced a challenge that many Rails developers might recognize. We needed to implement a new flow at Lemonade that would eventually replace a legacy process, but with three major constraints that couldn’t be compromised: user experience, cost efficiency, and avoiding technical debt.
The stakes were high. Any discrepancies between systems would impact customers and potentially create legal issues in the insurance domain. We had just three months to understand, replicate, and improve a complex flow that had evolved organically over years. And we needed to break free from obsolete data structures while preserving essential business rules embedded in a codebase with over 1 million lines of code.
Traditional approaches wouldn’t work. Full test coverage would take months we didn’t have. What we needed was a methodology to systematically identify, isolate, and verify each business rule independently of its implementation.
We needed a way to rewrite with confidence.
The Context: Insurtech at Scale
If you had asked me three years ago if insurtech could be exciting, I would have probably laughed. But it can be.
Lemonade is an innovative insurance company that hit $1 billion in premiums just 10 years after founding. It took other well-established insurance brands 40 to 60 years to reach that milestone. Even companies like Microsoft, Netflix, Salesforce, and Tesla needed more time to achieve that.
What a thrill for @lemonade_inc to be in the Tres Commas Club! I’m not particularly moved by a car “with doors that open like this”, but I’m definitely exhilarated by the ride so far, and can’t wait for our #Next10X! https://t.co/HKpgfyFO7Y
— Daniel Schreiber (@daschreiber) March 25, 2025
When we joined Lemonade three years ago, their Director of Engineering shared a story that perfectly illustrated the stakes of our work. They once had an issue with roof coverage in one of their product lines and had to hire a legal team for a six-month sprint to fix things. The legal costs exceeded the entire IT budget.
We couldn’t break things. We had to be 100% sure that the new flow provided the same outcome.
The Architecture: A Rails Monolith Under Transformation
Lemonade used a Rails monolith as their foundation, which is my favorite architecture. It’s no coincidence they became successful. Over the past few years, they’ve been moving to a microservices architecture, with new product lines released using their internal framework. But all home and renters insurance is still handled within the Rails monolith.
Our scope was clear: implement a new quote flow for HO4 (renters insurance) in the US that would produce identical underwriting results to the legacy system.
Understanding the Problem
The God Model
Like many mature Rails applications, the system had a Quote model that accumulated responsibilities over time:
class Quote < ApplicationRecord
  serialize :data
  serialize :answers
  enum :status,
       {
         pending: 'pending',
         stubbed: 'stubbed',
         bindable: 'bindable',
         uw_declined: 'uw_declined',
       },
       default: 'pending'
end
The business raised a valid point: “We don’t want pending and stubbed Quotes in the system.” Pending represented abandoned quotes with no value. Stubbed meant the system couldn’t make a risk assessment, usually due to third-party issues. This data model pollution required filtering at different levels:
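class Quote < ApplicationRecord
  # A stabby lambda avoids the trap where `lambda do ... end` hands the
  # block to `scope` instead of to `lambda`.
  scope :not_pending, -> { where.not(status: :pending) }
end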
We all have such excluding scopes in our apps; don’t pretend you don’t.
This was especially problematic for the Data Science team. Without filtering these quotes, their models would be far from accurate.
The Data Complexity
The Quote model contained two serialized columns with deeply nested data. Here’s just a glimpse of what we were dealing with:
data:
  :locale:
    :region: US
    :language: en
  :client_uuid: 30a1377e-5f06-4e6f-878b-41564c2e1221
  :user_logged_in: false
  :flags:
    - :send_pdf_sample_docs
    - :tenant_pet_damage_activated
  # ... and dozens more attributes
The answers hash was even more complex, containing everything from address components to user preferences to tracking data.
Why Traditional Approaches Failed
We initially tried static analysis to figure out which Quote attributes were necessary for the underwriting process. We quickly realized this was impossible: there were too many branches in the code. Imagine: every US state has its own regulations affecting insurance products. Multiply this by product editions that change over time due to legal concerns or business needs. We also share the data model and flow with home insurance.
Then we tried using Module#prepend to instrument Quote accessors and track which data was involved. This gave us a better overview, but the result was still overwhelming.
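To make the instrumentation idea concrete, here is a minimal sketch. It uses a plain stand-in Quote class rather than the real ActiveRecord model, and AccessTracker is a hypothetical name, not the module we actually used:

```ruby
require "set"

# Hypothetical stand-in for the real Quote model; only the shape matters here.
class Quote
  def initialize(data:, answers:)
    @data = data
    @answers = answers
  end

  attr_reader :data, :answers
end

# Prepended module: records which accessors are read, then defers to the
# original implementation via `super`.
module AccessTracker
  def self.reads
    @reads ||= Set.new
  end

  def data
    AccessTracker.reads << :data
    super
  end

  def answers
    AccessTracker.reads << :answers
    super
  end
end

Quote.prepend(AccessTracker)

quote = Quote.new(data: { locale: "en" }, answers: { zip: "10001" })
quote.data[:locale]
p AccessTracker.reads.to_a # only the accessors touched so far
```

Tracking at the accessor level only tells you that a serialized column was read, not which nested key inside it mattered, which is one reason this kind of output gets overwhelming quickly.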
And we hadn’t even touched the HTTP communication part: all the first-party and third-party calls required for underwriting, coverage selection, deductible calculation, and premium determination.
What About Other Approaches?
We considered several alternatives before settling on our solution.
Shadow traffic was an interesting option. This technique involves routing live production traffic to both the existing backend and a new shadow backend simultaneously. The shadow backend processes requests without affecting users, while comparison mechanisms validate behavior. Tools like nginx plugins or Zalando’s Skipper can handle this elegantly.
However, shadow traffic came with significant drawbacks for our use case:
- Substantial infrastructure work and ongoing costs
- Potential compliance issues using production data in non-production environments
- Need to implement complex comparison mechanisms
- Difficulty avoiding side effects when dealing with stateful operations and third-party APIs
The infrastructure overhead alone would have consumed a significant portion of our three-month deadline.
The Solution: Testing on Production
Here’s where we took an unconventional approach. Instead of trying to replicate production conditions in a test environment, we decided to test directly on production, but safely.
The Brave New Flow
The key architectural change was simple but profound: instead of creating a Quote at the beginning of the flow and updating it on every step, we’d receive all the data gathered by the frontend client and perform our task at the very end.
This meant:
- No more pending quotes
- No more stubbed quotes
- Only bindable or uw_declined as final states
- Much cleaner data model
Implementing the Sampling Mechanism
We built a sampling system using Ruby’s prepend to non-invasively inject our verification code:
class RentersUsQuoteSampler
  module AroundFilter
    def run_prepare_for_preview(quote)
      if RentersUsQuoteSampler.conditions_met_for_sampling?(quote)
        RentersUsQuoteSampler.sampled(quote) { super }
      else
        super
      end
    end
  end
end
This allowed us to intercept the underwriting process for specific quotes without affecting the normal flow.
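The snippet above does not show how the filter gets attached. Since run_prepare_for_preview is called on the class itself later in this post (Chat::Quote.run_prepare_for_preview), one plausible wiring is to prepend the module to the singleton class. The sketch below is an assumption about that wiring, with a stand-in Chat::Quote and a simplified filter:

```ruby
# Hypothetical stand-in for the legacy entry point.
module Chat
  class Quote
    def self.run_prepare_for_preview(quote)
      "underwritten #{quote}"
    end
  end
end

# Collects the inputs that the filter decided to sample.
SAMPLED = []

module AroundFilter
  def run_prepare_for_preview(quote)
    # Simplified sampling condition, invented for this example.
    SAMPLED << quote if quote.start_with?("sample")
    super
  end
end

# Class-level methods live on the singleton class, so that is where the
# filter has to be prepended for `super` to reach the original method.
Chat::Quote.singleton_class.prepend(AroundFilter)

sampled_result = Chat::Quote.run_prepare_for_preview("sample-1")
other_result   = Chat::Quote.run_prepare_for_preview("other-2")
```

Callers keep calling the same method and get the same return value; the only observable difference is that qualifying calls leave a trace.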
What We Sampled
For each qualifying quote, we captured:
- Quote state before underwriting - The raw quote data as it entered the process
- Quote state after underwriting - The complete quote with pricing, deductible, and coverage
- Address data - All location information
- HTTP interactions - Every external API call made during the process
class RentersUsQuoteSampler
  def self.sampled(quote)
    address = Address.lemonade.find_by(quote_id: quote.id)
    before_quote = to_sample(quote)
    TyphoeusRecorder.start_recording
    begin
      result = yield
      typhoeus_requests = TyphoeusRecorder.recorded_requests
    ensure
      TyphoeusRecorder.stop_recording
    end
    Record.create!(
      quote_before: before_quote,
      address: to_sample(address),
      quote_after: to_sample(quote),
      typhoeus_requests: typhoeus_requests,
    )
    result
  rescue => e
    ::Sentry.capture_exception(e, hint: { ignore_exclusions: true })
  end
end
Recording HTTP Interactions
Lemonade used Typhoeus as the HTTP client for microservices and third-party communication. Fortunately, Typhoeus provides a callback system:
class RentersUsQuoteSampler
  module TyphoeusRecorder
    RECORD_PROC = ->(response) do
      @@typhoeus_requests.merge!(serialize_request(response))
    end

    def self.start_recording
      @@typhoeus_requests = {}
      ::Typhoeus.on_complete(&RECORD_PROC)
    end

    def self.stop_recording
      ::Typhoeus.on_complete.delete(RECORD_PROC)
      @@typhoeus_requests = nil
    end

    def self.serialize_request(response)
      {
        {
          base_url: response.request.base_url,
          params: response.request.options[:params],
          method: response.request.options[:method],
          body: response.request.options[:body],
        } => {
          code: response.code,
          body: response.body,
        },
      }
    end
  end
end
This gave us perfect request-response pairs to use as stubs during verification.
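To illustrate why this shape is convenient, here is a small sketch of looking up a recorded response by its request description. The URL, payloads, and the replay helper are made up for the example; only the key and value layout mirrors serialize_request:

```ruby
# Recorded interactions keyed by a request description. Ruby hashes compare
# by content, so an identical request description finds its response.
recorded = {
  { base_url: "https://example.org/score", params: { zip: "10001" },
    method: :get, body: nil } => { code: 200, body: '{"score":7}' }
}

# Hypothetical replay helper: return the recorded response or fail loudly,
# so an unrecorded call cannot slip through unnoticed.
def replay(recorded, request)
  recorded.fetch(request) do
    raise KeyError, "no recorded response for #{request[:base_url]}"
  end
end

request = { base_url: "https://example.org/score", params: { zip: "10001" },
            method: :get, body: nil }
response = replay(recorded, request)
puts response[:code]
```

One consequence of merging into a hash keyed by request: if the same request is made twice during a sample and gets different responses, only the last response is kept.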
The Verification Process
Sampling and verification were separate processes, allowing us to:
- Collect samples from production continuously
- Run verification asynchronously
- Re-run verification after code fixes
- Iterate until we achieved parity
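The loop this separation enables can be sketched in a few lines. Everything here is hypothetical (the Sample struct, the lambdas standing in for the new flow, and the toy rule they implement); the point is only that stored samples can be replayed against each new attempt:

```ruby
# Hypothetical stored sample: the inputs needed to replay a quote, plus
# what the legacy flow produced for them.
Sample = Struct.new(:id, :input, :legacy_output, keyword_init: true)

# Re-runs a candidate implementation against every stored sample and
# returns the ids that still disagree with the legacy output.
def failing_samples(samples, new_flow)
  samples.reject { |s| new_flow.call(s.input) == s.legacy_output }.map(&:id)
end

samples = [
  Sample.new(id: 1, input: { state: "NY" }, legacy_output: { status: "bindable" }),
  Sample.new(id: 2, input: { state: "CA" }, legacy_output: { status: "uw_declined" })
]

# First attempt ignores the input entirely.
attempt_one = ->(_input) { { status: "bindable" } }
# Second attempt, after a fix (the rule is invented for the example).
attempt_two = ->(input) { { status: input[:state] == "CA" ? "uw_declined" : "bindable" } }

p failing_samples(samples, attempt_one)
p failing_samples(samples, attempt_two)
```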
Leaving No Trace
The critical requirement was not polluting production with duplicate quotes:
class RentersUsQuoteDtoVerifier
  def with_rollback
    ActiveRecord::Base.transaction do
      yield
      raise ActiveRecord::Rollback
    end
  end
end
But there was a gotcha: background jobs. We needed to ensure no jobs were scheduled within our rolled-back transaction.
After Commit Handling
This feature was built into Rails 7.2, but we weren’t there yet. Fortunately, one of the best things about working at Arkency is that if you need a solution, there’s a good chance we’ve solved it before, as in RailsEventStore or in our blog posts from 9 years before Rails introduced it.
module AfterCommitRunner
  def self.call(&schedule_proc)
    transaction = ActiveRecord::Base.connection.current_transaction
    if transaction.joinable?
      transaction.add_record(async_record(schedule_proc))
    else
      schedule_proc.call
    end
  end

  def self.async_record(schedule_proc)
    AsyncRecord.new(schedule_proc)
  end

  class AsyncRecord
    def initialize(schedule_proc)
      @schedule_proc = schedule_proc
    end

    def committed!(*) = schedule_proc.call
    def rolledback!(*) = nil
    def before_committed!() = nil

    attr_reader :schedule_proc
  end
end
This allowed us to queue jobs only after successful commits, not within rolled-back transactions.
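The behavior is easier to see without Rails in the picture. The sketch below replaces the ActiveRecord transaction with a deliberately tiny FakeTransaction (a hypothetical class, not a Rails API) that calls committed! or rolledback! on the records it was given, which is the contract AsyncRecord relies on:

```ruby
# Same idea as AsyncRecord above: a fake "record" whose only job is to
# react to transaction callbacks.
class AsyncRecord
  def initialize(schedule_proc)
    @schedule_proc = schedule_proc
  end

  def committed!(*) = @schedule_proc.call
  def rolledback!(*) = nil
end

# Hypothetical, heavily simplified stand-in for a database transaction.
class FakeTransaction
  Rollback = Class.new(StandardError)

  def initialize
    @records = []
  end

  def add_record(record)
    @records << record
  end

  def run
    yield self
    @records.each(&:committed!)
  rescue Rollback
    @records.each(&:rolledback!)
  end
end

enqueued = []

# Committed transaction: the deferred job runs after the block finishes.
FakeTransaction.new.run do |tx|
  tx.add_record(AsyncRecord.new(-> { enqueued << :welcome_email }))
end

# Rolled-back transaction: the deferred job is silently dropped.
FakeTransaction.new.run do |tx|
  tx.add_record(AsyncRecord.new(-> { enqueued << :never_sent }))
  raise FakeTransaction::Rollback
end

p enqueued
```

In the verification flow every transaction ends in a rollback, so any job routed through this mechanism is never enqueued.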
HTTP Stubbing Strategy
We needed to stub all external HTTP calls to avoid:
- Mutating state in other microservices
- Making expensive third-party API calls
- Affecting external systems (like credit scores)
- Rate limiting issues
First, we blocked all Typhoeus requests:
def with_http_stubs_mechanism
  callback = ->(req) do
    req.block_connection = true
    req
  end
  Typhoeus.before.prepend(callback)
  yield
ensure
  Typhoeus.before.delete(callback)
  Typhoeus::Expectation.clear
end
Then we used our recorded requests as stubs:
def with_common_http_stubs
  http_stubs.each do |req, res|
    Typhoeus
      .stub(req[:base_url], req[:params])
      .and_return(Typhoeus::Response.new(**res))
  end
  yield
end
Handling Edge Cases
Some libraries used net/http directly, which wasn’t easy to stub. For AWS S3 clients, we used Ruby’s metaprogramming capabilities:
def with_no_verisk_persistence
  old_const = Storage::IamS3Resource
  no_writes_iam_resource = Class.new do
    extend old_const
    def self.put(*) = 'http://example.org'
    def self.presigned_url(*) = 'http://example.org'
  end
  Storage.send(:remove_const, :IamS3Resource)
  Storage.const_set(:IamS3Resource, no_writes_iam_resource)
  yield
ensure
  Storage.send(:remove_const, :IamS3Resource)
  Storage.const_set(:IamS3Resource, old_const)
end
This allowed us to override behavior while still downloading resources from S3 (assuming GETs don’t mutate state).
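The constant-swapping trick generalizes to a small helper. This is a sketch under assumptions: Storage::S3Client below is a made-up stand-in for the real client, and with_const_swapped is a hypothetical helper name:

```ruby
# Hypothetical namespace and client, standing in for the real S3 wrapper.
module Storage
  module S3Client
    def self.put(key) = "uploaded #{key}"
    def self.get(key) = "contents of #{key}"
  end
end

# Swaps a constant for the duration of the block and always restores it.
def with_const_swapped(namespace, name, replacement)
  original = namespace.const_get(name)
  namespace.send(:remove_const, name)
  namespace.const_set(name, replacement)
  yield
ensure
  # `original` is nil only if const_get raised, so nothing was swapped.
  if original
    namespace.send(:remove_const, name) if namespace.const_defined?(name, false)
    namespace.const_set(name, original)
  end
end

# Capture the real client first: inside the block the constant points at
# the replacement, so looking it up by name there would recurse.
real_client = Storage::S3Client
read_only = Module.new do
  define_singleton_method(:get) { |key| real_client.get(key) }
  define_singleton_method(:put) { |_key| "skipped" }
end

inside = with_const_swapped(Storage, :S3Client, read_only) do
  [Storage::S3Client.put("a.pdf"), Storage::S3Client.get("a.pdf")]
end
after = Storage::S3Client.put("a.pdf")
```

Keep in mind that constants are global to the process: while the block runs, every thread sees the replacement, so this fits a dedicated verification process better than a web worker serving live traffic.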
The Complete Verification Flow
Putting it all together:
def sample_remake
  sample_remake = nil
  with_rollback do
    with_http_stubs_mechanism do
      with_common_http_stubs do
        with_bouncer_stubs do
          with_census_block_stubs do
            with_no_segment do
              with_no_verisk_persistence do
                with_no_promises do
                  with_no_impressions do
                    remake = mk_quote
                    Chat::Quote.run_prepare_for_preview(remake)
                    sample_remake = RentersUsQuoteSampler.to_sample(remake)
                  end
                end
              end
            end
          end
        end
      end
    end
  end
  sample_remake
end
Yes, the nesting looks deep, but each wrapper handled a specific concern. We could experiment safely as many times as needed.
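If the pyramid bothers you, the wrappers can be composed from a list instead. This is an alternative sketch, not what we shipped; with_all and the two logging wrappers are hypothetical:

```ruby
# Composes "around" wrappers so that the first one listed ends up outermost.
# Each wrapper is a callable that takes a block and calls it exactly once.
def with_all(wrappers, &body)
  wrappers.reverse.reduce(body) do |inner, wrapper|
    -> { wrapper.call(&inner) }
  end.call
end

log = []

# Hypothetical wrappers that only log; real ones would open a transaction,
# install stubs, and so on.
with_rollback = ->(&blk) { log << :begin; blk.call.tap { log << :rollback } }
with_stubs    = ->(&blk) { log << :stub;  blk.call.tap { log << :unstub } }

result = with_all([with_rollback, with_stubs]) do
  log << :verify
  :done
end

p log
```

Because Method objects also accept a block on call, existing with_* methods could be passed as method(:with_rollback) and so on without rewriting them.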
Comparing Results
We used the super_diff gem to identify discrepancies:
def verify
  tuple_to_compare.reduce(:==)
end

def diff
  SuperDiff.diff(*tuple_to_compare)
end
Example output when things didn’t match:
  {
-   "status" => "bindable",
+   "status" => "pending",
    "product" => "iso",
    "form" => "ho4",
-   "edition" => "E240716",
+   "edition" => "E240618",
    # ...
  }
This worked beautifully for nested structures, which was crucial for our case.
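To show what comparing nested structures involves, here is a dependency-free stand-in. It is not how super_diff works internally, just a minimal recursive diff over hashes; the sample values are invented:

```ruby
# Returns { "dotted.path" => [expected, actual] } for every leaf that differs.
def deep_diff(expected, actual, path = [])
  if expected.is_a?(Hash) && actual.is_a?(Hash)
    (expected.keys | actual.keys).each_with_object({}) do |key, diffs|
      diffs.merge!(deep_diff(expected[key], actual[key], path + [key]))
    end
  elsif expected == actual
    {}
  else
    { path.join(".") => [expected, actual] }
  end
end

legacy = { "status" => "bindable", "form" => "ho4",
           "coverage" => { "deductible" => 500 } }
remake = { "status" => "pending", "form" => "ho4",
           "coverage" => { "deductible" => 250 } }

p deep_diff(legacy, remake)
```

A key missing on one side shows up as nil on that side, which was usually enough to point us at the responsible branch of the legacy code.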
The Results
After implementing this methodology, we achieved:
- Fewer questions asked - Simplified the customer flow
- Cleaner data model - Eliminated obsolete quote states
- Identical outcomes - 100% parity with legacy underwriting
- Confidence to ship - No surprises in production
The project leader shared across the organization:
“This is part of one of the best releases I have ever experienced.”
There was even a panic moment when he reached out on a Friday evening before both our ski vacations, but it was just to thank the team for the exceptional release quality.
Key Takeaways
This approach worked because we:
- Separated collection from verification - Continuous sampling with async verification
- Treated production as the specification - No need to replicate complex environments
- Isolated tests from side effects - Transaction rollbacks and HTTP stubbing
- Iterated until perfect - Fixed issues and re-verified until parity achieved
- Leveraged Ruby’s strengths - Metaprogramming made complex stubbing manageable
The methodology is applicable beyond insurance or quote systems. Anytime you need to rewrite complex business logic while ensuring behavioral parity, consider:
- Can you sample real production behavior?
- Can you replay it safely in isolation?
- Can you compare results programmatically?
- Can you iterate until perfect?
When refactoring mission-critical business logic, traditional testing might not be enough. Sometimes the best test suite is production itself, as long as you can verify without breaking things.
Prefer watching?
This post is based on the author’s conference talk delivered at wroclove.rb 2025 in Wrocław, Poland and EuRuKo 2025 in Viana do Castelo, Portugal.
This methodology emerged from real-world necessity at Lemonade. We’re grateful for their trust in letting us solve this challenge and share the solution with the Ruby community.