The first time the prototype read a real Calgary shelf tag correctly, it felt like a card trick I was performing on myself. Then I scanned the next one. And the next. By about the tenth tag I had the quiet thought that has shaped every week since: it works most of the time. About seven times out of ten. That was the feasibility proof, and it was also the trap.
The trap of seventy percent
I learned what diminishing returns meant from video games, long before I knew the phrase. The first hour of grinding gets you levels two through eight. The next hour gets you level nine. The hour after that gets you halfway to ten. You can see the curve flatten in real time.
This is where I've been living since the prototype worked. The first seventy percent came from the obvious cases: clean tags, good lighting, standard layouts. Getting to eighty took weeks. Getting to eighty-five was a separate campaign. Every percent after that is its own fight, sometimes one that costs me ground somewhere else. I'd close one failure mode and watch two others wake up, ones the first had been quietly masking. Some days I went backwards.
Seventy percent is good enough to convince you the technology is real. It is nowhere near good enough to ship. A tool that's wrong three times in ten is a tool that gets opened twice and deleted on the third try.
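A rough way to put a number on that, as a sketch only: assume each scan succeeds or fails independently of the others. That is a simplification, since real failures cluster by store and tag type, and the function name here is made up for illustration.

```python
def chance_all_correct(per_scan_accuracy: float, scans: int) -> float:
    """Probability that every one of `scans` scans is read correctly.

    Assumes scans are independent, which is a simplification.
    """
    return per_scan_accuracy ** scans


# How likely is a new user to get three clean scans in a row?
for accuracy in (0.70, 0.85, 0.95):
    clean_run = chance_all_correct(accuracy, 3)
    print(f"{accuracy:.0%} per scan -> {clean_run:.1%} chance of three clean scans")
```

Under that assumption, seventy percent per scan leaves roughly a one-in-three chance that someone's first three scans all come back right.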
Then the stores started multiplying
Accuracy was the first mountain. The second was the stores.
I started with one chain. Once it was working — for a generous definition of "working" — I added another. That's when I learned that every grocery store does its tags differently, and the differences are not small. Lucky Supermarket's tags don't look like Save-On's. Calgary Co-op's produce labels are their own dialect. T&T has Chinese characters mixed with English and multibuy formats no one else uses. Safeway runs club-price overlays that change the meaning of the number underneath.
Adding a store isn't just more data. It's a new set of edge cases stacked on top of the ones I still hadn't solved for the previous store. And it's not only an engineering problem: every store I onboard adds administrative weight too. Brand records, store records, rule sets, category mappings, and the small ongoing tasks of keeping all of that current. Every tester I bring in multiplies the friction surface, because their scans find the corners I would never have found on my own.
The fragmentation isn't a side effect of how grocery retail works. It is how grocery retail works. There is no standard. There was never going to be one.
The setbacks
What started getting under my skin wasn't the failures themselves. It was the pattern of them. Fix one thing, two more show up. Add a store, the rules for the previous store start behaving differently. Close a bug on Tuesday, find on Friday that the fix had a quiet cost somewhere else in the pipeline.
After enough of these I started suspecting the problem wasn't the individual failures. It was something underneath them. Something about how the system was put together that meant fixes had a shorter half-life than they should.
That's the thread I want to pull on next. This post is about the grind. The next one is about what the grind taught me about the foundation.
— Elmer