Diagnosing a slow Rails page, layer by layer
DEV Community

Diagnosing a slow Rails page, layer by layer

The layers of a request

Middleware → Controller → SQL → View → external calls → browser

Most slowness lives in one of these:

Layer Typical problem How it shows up
Controller heavy logic in the request large "Executing" time
SQL (count) N+1 - many tiny repeated queries query count explodes
SQL (single) a slow query, usually a missing index one query dominates
View rendering logic, or N+1 hiding in the template large "Rendering" time
External a synchronous API/email call a gap that isn't SQL
Browser large images, heavy JS backend fast, page still slow

The point is not to guess. It's to read the numbers and let them point at the layer.

Step 1 - read the numbers (development)

In development I lean on two tools:

  • rack-mini-profiler - a badge in the corner that breaks a request into controller / view / SQL time, and counts every query.
  • bullet - watches for N+1 and tells you exactly which association to preload.

On a storefront page that lists a page of 48 products, rack-mini-profiler showed:

Executing: stores#show 2 sql
Rendering: show.html.erb 49 sql ← 49 queries just to render?
SQL Summary: 51 sql total

Fifty-one queries to render one page of products is a red flag, and the fact that 49 of them happen during rendering points straight at the view. bullet confirmed it:

USE eager loading detected
Product => [:image_attachment]
Add to your query: .includes([:image_attachment])

Step 2 - understand the N+1

The products use Active Storage for their images:

class Product < ApplicationRecord
  has_one_attached :image
end

An attached image isn't a column on products. Active Storage spreads it across three tables: active_storage_attachments (which record owns which file), active_storage_blobs (the file's metadata + a storage key), and active_storage_variant_records (generated thumbnails). The file bytes themselves live in a storage service - disk locally, object storage in production.

So every time the view touches product.image, Rails walks those tables. In a loop over N products, that's N extra round-trips: a textbook N+1.

Step 3 - fix and re-measure

The fix is to preload the attachment once, up front. Active Storage generates a scope for exactly this:

# before
@products = Product.where("stock > 0")

# after
@products = Product.where("stock > 0").with_attached_image

with_attached_image is just an Active Storage flavoured includes.

Re-measured on the same page:

Before After
SQL queries 51 3
ActiveRecord time ~210 ms ~12 ms
bullet warnings yes none

The query count is now flat regardless of how many products are on the page - O(1) instead of O(n). That's the real win: N+1 isn't scary because of its cost on any single request, it's scary because it grows with your catalog and your traffic. The same page under a few hundred requests a minute turns a handful of extra queries into thousands of extra round-trips against the database.

The other kind of slow: a single heavy query

N+1 is about query count. The other common case is one query that is slow on its own - usually a missing index. Here EXPLAIN is the tool: it shows how Postgres plans to run a query without running it.

Looking up orders by a column with no index, over an orders table with ~800k rows:

Seq Scan on orders (cost=0.00..21450.00 rows=1 width=93)
  Filter: (customer_email = 'someone@example.com')

Seq Scan means Postgres reads the whole table row by row to find one order - wasteful when there are hundreds of thousands of them.

After adding an index on that column:

Index Scan using index_orders_on_customer_email on orders (cost=0.42..8.44 rows=1 ...)
  Index Cond: (customer_email = 'someone@example.com')

Seq ScanIndex Scan, and the planner's cost estimate drops from ~21,000 to ~8 - the index turns "scan everything" into "jump straight to the row."

Two gotchas worth knowing:

  • Data volume matters. On a small table Postgres picks Seq Scan even when an index exists - scanning a few rows is cheaper than an index lookup. The index only earns its keep once the table is large, so test against production-scale data, not a dev seed.
  • Selectivity matters. An index only helps when the query matches a small slice. A query that returns most of the table will be a Seq Scan regardless.

How this plays out in production

Development tools (rack-mini-profiler, bullet) catch problems before they ship. But dev never fully mirrors production - you don't hit every page, your data is small, and some N+1s only appear with real data shapes. So production needs an APM (I use Scout) watching real traffic to catch what slipped through.

The end-to-end flow when something is slow in production:

APM flags a slow endpoint → reproduce locally with realistic data → EXPLAIN the suspect query → add the index / preload / cache → deploy, confirm in the APM that it actually got faster

bullet is the prevention, the APM is the evidence. They're not redundant - they're defense in depth, because dev can never be a perfect copy of prod.

Takeaways

  • Don't guess where a page is slow - read it layer by layer.
  • Two distinct DB problems: many queries (N+1, fix with includes/preload) vs one slow query (missing index, find with EXPLAIN).
  • N+1 matters because it scales with data, not because it's slow today.
  • EXPLAIN results depend on data volume and selectivity - test with realistic data.
  • Prevent in development, verify in production.

Comments

No comments yet. Start the discussion.