Using a Custom RSpec Matcher to Check Hash Contents

Photo by David Rangel on Unsplash

Recently, I was minding my own business, discussing some upcoming work with someone on my project team, when, before I knew it, bam! I was getting messages from our support team that one of our applications (one that I’m responsible for) was down. This particular Rails application has been incredibly stable, so hearing that it was down was quite a surprise. It’s an application I wrote several years ago, and it has seen a lot of bugfixes, new features, and upgrades over the years. The team that supported it after I left the project did a great job keeping the lights on and making improvements. So, what happened?

Before we can get into that, let’s talk about what this application does. While I can’t get into the specifics, it’s basically a data transformation and process handling API. Let’s call it “Core API.” Core API receives a bunch of data, breaks it up into smaller chunks, and runs it through another API (let’s call this “External API”), calling several of External API’s endpoints. In the end, External API’s services generate some data, which Core API retrieves and stores in our internal database for use by our other systems.

During the construction and maintenance of Core API, a critical detail had been overlooked: the constraints in External API’s specifications. Specifically, one of their data fields has a constraint that the ‘number’ for that object type may not exceed 9,999,999. While this number seems high, we were simply using an automatically incremented ‘id’ field from a table in our database to populate this field, which meant that we would hit the limit sooner or later. As fate would have it, it took us about seven and a half years to go through, you guessed it, 9,999,999 records in our database.

Which meant that on a Friday afternoon, at 3:50 PM, all of the requests to Core API started failing. Yay.

If you don’t read the API specification, you’re gonna have a bad time

Fortunately, External API only uses that number as a reference within a single dataset, so these objects didn’t need a number that is unique across our organization. That meant we could simply renumber the objects, starting with “1” each time we sent a dataset to External API. Great! Now we had a solution, but with one minor hurdle: we couldn’t modify the data structure as it passed through the controller to the service classes that manipulated it and called External API. Here’s roughly how this data looked as it came through:

[
  {
    "parent_id": 9999999,
    "child_id": 12234,
    "name": "Timmy"
  },
  {
    "parent_id": 9999999,
    "child_id": 2345,
    "name": "Suzie"
  },
  {
    "parent_id": 10000000,
    "child_id": 4321,
    "name": "Lauren"
  },
  {
    "parent_id": 10000000,
    "child_id": 9876,
    "name": "John"
  }
]

Each “child” has to be grouped under a parent further down in the process. In order to keep our data structure intact, we (shout out to Jesse, who pair programmed with me to help resolve this quickly!) decided to find the parent_ids and put them into an array (only keeping ones that weren’t there already). So, in this case, we would end up with:

  [9999999, 10000000]

Then, we would loop back over the data and use the index + 1 of the parent_id in that array to renumber the parent_id of each child. There’s probably a more efficient way to do this, but we decided not to optimize the algorithm since Core API works in the background and no user is just sitting around, waiting on the response. Now, after the renumbering, our data looks like this:

[
  {
    "parent_id": 1,
    "child_id": 12234,
    "name": "Timmy"
  },
  {
    "parent_id": 1,
    "child_id": 2345,
    "name": "Suzie"
  },
  {
    "parent_id": 2,
    "child_id": 4321,
    "name": "Lauren"
  },
  {
    "parent_id": 2,
    "child_id": 9876,
    "name": "John"
  }
]
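The renumbering steps above can be sketched in a few lines of Ruby. This is a minimal illustration, not the actual Core API code: the helper name `renumber_parent_ids` and the choice to return new hashes instead of modifying the originals are assumptions made for the example.

```ruby
# Hypothetical helper; the name and signature are illustrative only.
def renumber_parent_ids(records)
  # Collect each distinct parent_id once, in order of first appearance.
  parent_ids = records.map { |record| record['parent_id'] }.uniq

  # Replace each parent_id with its 1-based position in that list.
  records.map do |record|
    record.merge('parent_id' => parent_ids.index(record['parent_id']) + 1)
  end
end

records = [
  { 'parent_id' => 9_999_999,  'child_id' => 12_234, 'name' => 'Timmy' },
  { 'parent_id' => 9_999_999,  'child_id' => 2345,   'name' => 'Suzie' },
  { 'parent_id' => 10_000_000, 'child_id' => 4321,   'name' => 'Lauren' },
  { 'parent_id' => 10_000_000, 'child_id' => 9876,   'name' => 'John' }
]

renumbered = renumber_parent_ids(records)
renumbered.map { |record| record['parent_id'] } # => [1, 1, 2, 2]
```

`Array#index` is a linear scan, so this does more work than a hash lookup would. That is the kind of optimization we chose to skip for a background job.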

Hooray! Problem solved! But that’s only the beginning. How do you test this? Especially with RSpec, in a concise manner, making sure that if the test data is changed, the test isn’t brittle and passes if the code still satisfies the requirements? I mean, obviously, we wrote the tests first…

What we decided to do is check that each parent_id is no higher than the length of the array of unique parent ids. We know we will never send 10 million parent objects, because that’s just not reasonable for this data. And besides, we added some validation elsewhere to handle that scenario.
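As a plain-Ruby sketch of that check, here is the idea applied to trimmed-down versions of the data from earlier. The method name `parent_ids_renumbered?` is hypothetical, and the sample records keep only the `parent_id` key for brevity.

```ruby
# Hypothetical predicate: true when every parent_id lies in
# 1..(number of distinct parent_ids in the dataset).
def parent_ids_renumbered?(records)
  parent_count = records.map { |record| record['parent_id'] }.uniq.length
  records.all? { |record| record['parent_id'].between?(1, parent_count) }
end

original = [
  { 'parent_id' => 9_999_999 },
  { 'parent_id' => 9_999_999 },
  { 'parent_id' => 10_000_000 }
]
renumbered = [
  { 'parent_id' => 1 },
  { 'parent_id' => 1 },
  { 'parent_id' => 2 }
]

parent_ids_renumbered?(original)   # => false
parent_ids_renumbered?(renumbered) # => true
```

Because the bound comes from the data itself and not from hard-coded ids, the check keeps passing if someone adds or removes parents in the test data.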

RSpec has the ability to use custom matchers, but it wasn’t really clear to me from the documentation how to write a matcher for this test. Ultimately, here’s what I came up with:

# my_class_spec.rb
RSpec.describe MyClass do
  # test setup items here, along with some `let` blocks
  def check_parent_id_renumbering
    satisfy do |json_data|
      # The External API has a constraint that we not exceed 9,999,999 for our 'parent_id'
      # So we renumber the parent_id to start from 1
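      # After renumbering, every parent_id must fall within 1..(number of distinct parents).
      # (Comparing against the largest parent_id in the data would always pass.)
      parent_count = json_data.map { |obj| obj['parent_id'] }.uniq.length
      json_data.all? { |obj| obj['parent_id'].between?(1, parent_count) }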
    end
  end
  # ... other tests
  it 'uses location numbers that have been renumbered' do
    expect(described_class).to receive(:perform_async).with(check_parent_id_renumbering, anything, anything)
    subject.generate_data
  end
end

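Let’s break this down a bit. What I did was define a helper method in my test file that returns RSpec’s built-in satisfy matcher. The block given to satisfy receives the actual value, which here is the first argument passed to perform_async, and the expectation passes when the block returns something truthy. Inside the block, you can do any data “acrobatics” you need to make sure the condition is satisfied! Cool!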
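Why can a matcher sit where a literal argument would normally go? As far as I can tell, `with` compares each expected argument to the actual one using case equality (`===`), and RSpec matchers respond to `===` by running their own check. The sketch below imitates that idea in plain Ruby. `BlockMatcher` and `small_parent_ids` are hypothetical stand-ins, not RSpec classes or RSpec’s actual implementation.

```ruby
# Hypothetical stand-in for a block-based matcher; not RSpec's real code.
class BlockMatcher
  def initialize(&block)
    @block = block
  end

  # Case equality: run the block against the actual value.
  def ===(actual)
    @block.call(actual) ? true : false
  end
end

# Helper method that returns a matcher, like the helper in the spec.
def small_parent_ids
  BlockMatcher.new { |records| records.all? { |record| record['parent_id'] <= 2 } }
end

# Compare expected and actual arguments pairwise with ===.
def args_match?(expected_args, actual_args)
  expected_args.zip(actual_args).all? { |expected, actual| expected === actual }
end

good_call = [[{ 'parent_id' => 1 }], 'queue']
bad_call  = [[{ 'parent_id' => 9_999_999 }], 'queue']

args_match?([small_parent_ids, 'queue'], good_call) # => true
args_match?([small_parent_ids, 'queue'], bad_call)  # => false
```

Plain values such as the string `'queue'` also work here because `String#===` is ordinary equality, which is why matchers and literal values can be mixed in one argument list.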

Perhaps there are “better” ways to write the code or the test. But this is what worked for me: our tests passed and the bug was fixed. Ultimately, we had only a few hours of downtime, thanks to the team that worked together to report, diagnose, fix, test, and deploy this fix through our various test environments and into production.

What do you think? Have you ever had a data issue like this or needed to write a custom matcher because you needed something beyond it { expect(subject.value).to eql(2.1) }? Is there a better way to write this helper? I’m working on adding comments to this blog, so in the meantime, feel free to use my contact form to get in touch. I’d love to hear from you.

I hope you learned something through this process. I know I did! Until next time, thanks for joining me on this journey!

Published February 20, 2025 by Toby Chin