DEV Community 2h ago

My landing page passed every CI check and was still broken on my customer's phone

A customer texted me a screenshot last month. It was my own landing page, open on their Pixel. The headline - "Financial infrastructure to grow your revenue" - was clipped at "...grow your reven". The signup button below it was gray-on-slightly-lighter-gray, basically unreadable. And the hero image? A broken-image icon.

Here's the part that stung: every check I had was green. Lighthouse: 98. My Playwright tests: passing. CI: all checkmarks. I had shipped that page an hour earlier feeling good about it. None of my tooling caught any of it.

I want to walk through why, because I think a lot of us have this blind spot, and then I'll tell you what I did about it.

The core issue

CI tests the DOM. It does not test what a human sees. My tests asserted things like "the signup button exists" and "the form has an email input." All true. The button was in the DOM. It just rendered unreadable on a 412px-wide screen with the system in light mode.

Lighthouse runs one viewport (by default a throttled emulation of a single mid-range Moto G phone) and scores performance/SEO/a11y heuristically. It does not look at your page across the actual range of devices your visitors use and say "this headline is physically clipped on a Pixel 8."

And my "responsive testing"? I was dragging the Chrome DevTools responsive bar to two breakpoints - 375 and 1440 - eyeballing it, and moving on. That's not testing. That's hoping.

The three bugs that slipped through

Let me get specific, because the category of each bug is instructive.

1. The clipped headline - a measurable, deterministic bug

.hero-title {
  white-space: nowrap; /* the culprit */
  width: 100%;
  overflow: hidden;
}

On desktop, the headline fit. On a narrow viewport, white-space: nowrap refused to wrap, overflow: hidden clipped the overflow, and the last word vanished.

The brutal thing: this is trivially detectable in code. The element's scrollWidth was greater than its clientWidth. That's a one-line check:

const clipped = el.scrollWidth > el.clientWidth;

No AI needed. No human eyeball needed. The browser already knows the text doesn't fit. I just was never asking it, on the viewport where it mattered.
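Here's a minimal sketch of how that one-liner could be applied to a whole set of elements. The helper name `findClippedElements` and its `getStyle` parameter are hypothetical, not part of any library. Two choices in it are assumptions on my part: it only flags elements whose horizontal overflow is `hidden` or `clip` (so intentionally scrollable containers are ignored), and it allows 1px of slack because both widths are rounded.

```javascript
// Hypothetical helper: returns the elements whose content is wider than
// their box AND whose horizontal overflow is hidden, so the excess is invisible.
// `getStyle` is injectable so the function can be exercised outside a browser.
function findClippedElements(elements, getStyle = (el) => getComputedStyle(el)) {
  const clipped = [];
  for (const el of elements) {
    const overflowX = getStyle(el).overflowX;
    const hidesOverflow = overflowX === "hidden" || overflowX === "clip";
    // Assumption: 1px tolerance, since scrollWidth and clientWidth are rounded.
    if (hidesOverflow && el.scrollWidth > el.clientWidth + 1) {
      clipped.push(el);
    }
  }
  return clipped;
}
```

In a page context you might call `findClippedElements(document.querySelectorAll("h1, h2, a, button"))` at every viewport you care about, not just the desktop one.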

2. The unreadable CTA - failing WCAG by math

The button was #9aa0a6 text on a #b0b4b8 background. Run the WCAG contrast formula on that and you get a ratio of about 1.27:1. The WCAG AA minimum for normal text is 4.5:1. It's not subjective - the text was, by definition, hard to read.

Again: deterministic. You can compute the contrast ratio from getComputedStyle without a single judgment call. I just wasn't computing it on a real render.
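For reference, this is roughly what that computation looks like, following the WCAG 2.x relative luminance and contrast ratio definitions. The function names are mine, and the sketch assumes 6-digit hex input for brevity; `getComputedStyle` actually returns `rgb(...)` strings, so a real check would need to parse those (and account for alpha and background images) first.

```javascript
// Relative luminance of an sRGB color given as "#rrggbb" (WCAG 2.x definition).
function relativeLuminance(hex) {
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  const [r, g, b] = channels.map((c) =>
    // Threshold is 0.04045 in the sRGB standard (0.03928 in older WCAG text);
    // no 8-bit channel value falls between the two, so results are identical.
    c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4)
  );
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA threshold for normal-size text.
function passesAANormalText(foreground, background) {
  return contrastRatio(foreground, background) >= 4.5;
}
```

Calling `passesAANormalText("#9aa0a6", "#b0b4b8")` on the two button colors returns `false`, with the ratio landing far below the 4.5:1 minimum.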

3. The 404 hero image - only on mobile

This one was sneaky:

<img
  srcset="/hero-desktop.jpg 1440w, /hero-mobile.jpg 768w"
  sizes="(max-width: 768px) 100vw, 1440px"
  src="/hero-desktop.jpg"
/>

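hero-mobile.jpg didn't exist - a deploy had dropped it. On desktop the browser picked hero-desktop.jpg and everything looked fine. On mobile it picked the 768w candidate, got a 404, and rendered a broken-image box. (Which candidate wins depends on the slot width from sizes and the device pixel ratio, so the bad file is only requested on some profiles.) My desktop DevTools never requested the broken file.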

This shows up in the network tab as a plain 404. A runtime signal, sitting right there, that I wasn't watching on a mobile profile.
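A minimal sketch of watching for that signal, assuming a Playwright `Page` object is passed in (so the snippet has no imports of its own). The function name `collectBrokenAssets` and the shape of the returned records are hypothetical; `page.on("response")`, `response.status()`, `response.url()`, `response.request().resourceType()` and `page.goto()` are the Playwright calls it relies on.

```javascript
// Loads `url` in the given Playwright page and returns every response
// that came back with an HTTP status of 400 or higher.
async function collectBrokenAssets(page, url) {
  const broken = [];
  page.on("response", (response) => {
    if (response.status() >= 400) {
      broken.push({
        url: response.url(),
        status: response.status(),
        type: response.request().resourceType(), // e.g. "image", "stylesheet"
      });
    }
  });
  // Assumption: waiting for network idle is enough for this page.
  // Lazily loaded images further down may need scrolling first.
  await page.goto(url, { waitUntil: "networkidle" });
  return broken;
}
```

The important part is running it once per device profile: the same URL can request different files on different profiles, so a clean result on desktop says nothing about mobile.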

The pattern: these bugs are visible, not logical

Notice what all three have in common. They're not logic bugs. The JavaScript ran fine. The data was correct. The bug lived entirely in what the page looked like on a specific screen - and my entire test suite was built to verify behavior, not appearance.

That's the gap. CI is great at "does the function return the right value." It is structurally blind to "is the headline physically cut off on the third-most-common phone my visitors use."

What actually catches this

The only reliable way I found is embarrassingly low-tech in concept: render the page in a real browser, at the actual device viewports, and look at each one - but do the looking programmatically so it scales past two breakpoints.

Concretely, the approach that works:

1. Real browser, multiple device profiles. Playwright with device descriptors (iPhone 15 Pro, Pixel 8, iPad, etc.) gives you a realistic viewport, DPR, and user agent. Take a full-page screenshot of each one.
2. Run the deterministic checks first. Walk the DOM and flag the math-provable stuff: scrollWidth > clientWidth for clipping, WCAG ratios for contrast, 404s from the network log for broken assets. These are facts, not opinions - high confidence, zero hallucination risk.
3. Then, and only then, use vision for the fuzzy stuff. Layout overlap, "this section looks empty," visual misalignment - things you genuinely can't measure with a selector. A vision model is good at this if you constrain it.
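The first step can be sketched roughly like this. It is an illustration under assumptions, not the implementation behind any particular tool: `captureAcrossDevices` is a hypothetical name, the `playwright` module is passed in as a parameter rather than imported, and the device names you pass must exist in the registry of your installed Playwright version (check `Object.keys(playwright.devices)`).

```javascript
// Sketch only: `playwright` is the object you get from require("playwright").
// Takes one full-page screenshot of `url` per named device descriptor.
async function captureAcrossDevices(playwright, url, deviceNames, outDir = ".") {
  const results = [];
  for (const name of deviceNames) {
    const descriptor = playwright.devices[name];
    if (!descriptor) {
      // The registry differs between Playwright versions, so skip unknown names.
      results.push({ device: name, skipped: true });
      continue;
    }
    // Each descriptor names the engine it is meant for (chromium, webkit, firefox).
    const browser = await playwright[descriptor.defaultBrowserType].launch();
    try {
      // Spreading the descriptor applies its viewport, DPR and user agent.
      const context = await browser.newContext({ ...descriptor });
      const page = await context.newPage();
      await page.goto(url, { waitUntil: "networkidle" });
      const path = `${outDir}/${name.replace(/\W+/g, "-")}.png`;
      await page.screenshot({ path, fullPage: true });
      results.push({ device: name, skipped: false, screenshot: path });
    } finally {
      await browser.close(); // closing the browser also closes its contexts
    }
  }
  return results;
}
```

A call might look like `captureAcrossDevices(require("playwright"), "https://example.com", ["iPhone 13", "Pixel 5"], "shots")`. Launching a browser per device is slow but keeps the sketch short; the deterministic checks would run against `page` before the browser is closed.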

The trap I fell into: my first version confidently reported bugs that weren't there. The fix was a corroboration step - if a finding only showed up on one device out of many, downgrade it. Real visual bugs tend to appear across multiple similar viewports; one-off "findings" are usually capture artifacts or model noise. I'd rather under-report than cry wolf.
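One way that corroboration step could look, as a sketch. The finding shape (`check`, `selector`, `device`, `source`) and the function name `corroborate` are hypothetical. It also assumes that only vision-derived findings get downgraded and that "same finding" means same check on same selector; both are my choices for illustration, and the two-device threshold is arbitrary.

```javascript
// Marks each finding with how many distinct devices reported it, and
// downgrades vision findings that were seen on fewer than `minDevices`.
function corroborate(findings, minDevices = 2) {
  const keyOf = (f) => `${f.check}|${f.selector}`;

  // Collect the set of devices on which each (check, selector) pair appeared.
  const devicesByKey = new Map();
  for (const f of findings) {
    if (!devicesByKey.has(keyOf(f))) devicesByKey.set(keyOf(f), new Set());
    devicesByKey.get(keyOf(f)).add(f.device);
  }

  return findings.map((f) => {
    const seenOn = devicesByKey.get(keyOf(f)).size;
    // Assumption: measured facts (clipping, contrast, 404s) keep full confidence
    // even on a single device; only inferred findings need a second witness.
    const uncorroborated = f.source === "vision" && seenOn < minDevices;
    return { ...f, seenOn, confidence: uncorroborated ? "low" : "high" };
  });
}
```

Keeping the low-confidence findings in the output, rather than dropping them, lets the report say which items are provable and which are guesses.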

The ordering matters. Lead with what's provable, fall back to what's inferred, and be honest in the output about which is which.

The honest limits

This approach is emulation, not a physical-device lab. It will not catch hardware-specific GPU rendering quirks, vendor-modified browser builds, or the exact way one OLED panel handles a near-black gradient. If your bug only reproduces on a real Galaxy A14 running Samsung Internet, you need an actual device or a real-device cloud.

But the vast majority of "broken on the customer's phone" bugs I ship - and I suspect you ship - are not those. They're clipped text, contrast, overflow, broken responsive images, CTAs below the fold. The boring, embarrassing, visible stuff that CI waves through.

I built this into a tool

I got tired of doing this by hand, so I packaged it: you paste a URL you own, it runs the page across up to 17 device profiles, runs the deterministic checks + the constrained vision pass, and gives you a severity-ranked report. It's called Canaryflux. There's a live demo report you can poke at without signing up: canaryflux.com/demo. Free tier is 3 devices / 3 scans a month if you want to point it at your own page.

But honestly - even if you never touch it, the lesson stands on its own and you can build a scrappy version in an afternoon with Playwright + a DOM walker:

Your CI tests what your page does. Add something that tests what your page looks like - on the screens your visitors actually use.

The customer screenshot was a gut-punch, but it taught me the most useful QA lesson I've learned in a while.

What's the worst "passed CI, broken on the actual device" bug you've shipped? I collect these - drop yours in the comments.
