How Be Recommended by Inithouse Scores AI Visibility from 0 to 100 Across ChatGPT, Perplexity, Claude, and Gemini
Your product might rank on page one of Google and still be invisible to AI. When someone asks ChatGPT "what's the best project management tool for small teams," does your product show up? For most SaaS companies under 50 employees, the answer is no.
At Inithouse, we built Be Recommended to answer that question with a number: a single AI visibility score from 0 to 100 that tells you exactly where you stand across four major AI engines. Here is how the scoring works under the hood.
What the score measures
The Be Recommended score captures how often, how prominently, and how positively AI engines mention your product when users ask category-relevant questions. A score of 0 means no AI engine mentions you at all. A score of 100 means every tested prompt across all four engines names your product as a top recommendation.
The four engines we test against: ChatGPT (OpenAI), Perplexity, Claude (Anthropic), and Gemini (Google).
Step 1: Prompt generation
We start by building a bank of 50+ real prompts that a potential customer would actually type into an AI assistant. These are not keyword-stuffed test queries. They mirror how real people ask for recommendations. For a CRM product, that looks like:
- "What CRM should a 10-person startup use?"
- "Best alternatives to Salesforce for small businesses"
- "Compare CRM tools with good API integration"
- "Which CRM has the best free tier in 2026?"
We group prompts into three categories: direct (user names the product category), comparative (user asks for alternatives or comparisons), and situational (user describes a problem without naming a category). Each category tests a different signal: brand recognition, competitive positioning, and contextual relevance.
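To make the three categories concrete, here is a minimal sketch of how a prompt bank could be represented. The `Prompt` and `PromptCategory` names are hypothetical, and the category assigned to each example prompt is our own reading of the list above, not a label taken from the product.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class PromptCategory(Enum):
    DIRECT = "direct"            # user names the product category
    COMPARATIVE = "comparative"  # user asks for alternatives or comparisons
    SITUATIONAL = "situational"  # user describes a problem, no category named


@dataclass(frozen=True)
class Prompt:
    text: str
    category: PromptCategory


# Assumption: these labels are illustrative guesses for the CRM examples.
PROMPT_BANK = [
    Prompt("What CRM should a 10-person startup use?", PromptCategory.DIRECT),
    Prompt("Best alternatives to Salesforce for small businesses", PromptCategory.COMPARATIVE),
    Prompt("Compare CRM tools with good API integration", PromptCategory.COMPARATIVE),
    Prompt("Which CRM has the best free tier in 2026?", PromptCategory.DIRECT),
]


def count_by_category(prompts):
    """Return how many prompts fall into each category."""
    return Counter(p.category for p in prompts)
```

Counting by category is a quick way to check that a bank is not lopsided. In this small sample, none of the prompts is situational, so contextual relevance would go untested.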
Step 2: Multi-engine querying
Each prompt gets sent to all four AI engines through their APIs. We capture the full response text, not just a yes/no for whether your product appeared. The raw responses go into a structured analysis pipeline.
We run queries from neutral accounts with no conversation history, no custom instructions, and no plugins. This gives us the closest approximation to what a first-time user would see.
Timing matters too. AI responses change as models update and as the web content they were trained on shifts. We timestamp every query and track score movement over time so you can see whether your visibility is trending up or down.
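The querying loop can be sketched roughly as follows. This is not the production pipeline: each engine is represented by a plain callable that takes a prompt and returns text, and `fake_engine` is a stub standing in for a real API client. `QueryRecord` and `run_prompts` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class QueryRecord:
    engine: str
    prompt: str
    response_text: str   # full response text, not just a yes/no flag
    queried_at: datetime  # UTC timestamp, used to track movement over time


def run_prompts(
    prompts: List[str],
    engines: Dict[str, Callable[[str], str]],
) -> List[QueryRecord]:
    """Send every prompt to every engine and record the raw responses."""
    records = []
    for prompt in prompts:
        for name, ask in engines.items():
            # Each call is independent: no conversation history is carried over.
            text = ask(prompt)
            records.append(
                QueryRecord(name, prompt, text, datetime.now(timezone.utc))
            )
    return records


def fake_engine(prompt: str) -> str:
    """Stub standing in for a real API client (assumption for this sketch)."""
    return f"Stub answer to: {prompt}"


ENGINES = {name: fake_engine for name in ("chatgpt", "perplexity", "claude", "gemini")}
```

Storing the timestamp on every record means later runs of the same prompt can be compared against earlier ones without any extra bookkeeping.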
Step 3: Response analysis
For each response, we extract three signals:
- Presence: Did the AI mention your product at all? Binary yes/no per response, aggregated across all prompts as a percentage. If you show up in 12 out of 50 prompts, your presence rate is 24%.
- Position: Where in the response did your product appear? First recommendation carries more weight than a mention buried in the fifth paragraph. We assign position scores on a 1-to-5 scale: first mention = 5, second = 4, listed among several = 3, mentioned in passing = 2, footnote or caveat = 1.
- Sentiment: How did the AI frame your product? "X is the best option for small teams" scores higher than "X exists but has limited features." We classify sentiment into positive, neutral, and negative buckets using structured extraction.
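As a rough illustration of the presence and position signals, the sketch below computes a presence rate and encodes the 1-to-5 position scale. The case-insensitive substring match is a simplifying assumption; the article does not describe how mentions are actually detected, and the label names in `POSITION_SCORES` are hypothetical.

```python
from typing import List

# Position scale from the article; the key names are invented for this sketch.
POSITION_SCORES = {
    "first_mention": 5,
    "second_mention": 4,
    "listed_among_several": 3,
    "mentioned_in_passing": 2,
    "footnote_or_caveat": 1,
}


def mentions_product(response_text: str, product: str) -> bool:
    """Assumption: a plain case-insensitive substring match counts as a mention."""
    return product.lower() in response_text.lower()


def presence_rate(responses: List[str], product: str) -> float:
    """Percentage of responses that mention the product at least once."""
    if not responses:
        return 0.0
    hits = sum(mentions_product(text, product) for text in responses)
    return 100.0 * hits / len(responses)
```

With 12 mentions across 50 responses, `presence_rate` returns 24.0, which matches the example in the presence bullet. A real pipeline would also need to handle product name variants and false positives, which a substring match does not.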
Step 4: Normalization and weighting
Raw scores from each engine get normalized to a 0-100 scale per engine, then combined into the final composite score. The weighting is not equal across engines. We weight based on market share and user behavior data:
- ChatGPT carries the highest weight (it handles the most recommendation queries by volume)
- Perplexity gets elevated weight relative to its market share because its users skew toward research and purchase decisions
- Claude and Gemini carry balanced weights
The exact weights adjust quarterly as market share data updates.
The formula: composite = w1*chatgpt + w2*perplexity + w3*claude + w4*gemini, where weights sum to 1.0.
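A minimal sketch of the composite formula is shown below. The weights in `EXAMPLE_WEIGHTS` are made up for illustration: they follow the ordering described above (ChatGPT highest, Perplexity elevated, Claude and Gemini equal), but the real values are not published in this article.

```python
import math
from typing import Dict

# Illustrative weights only (assumption); the real weights adjust quarterly.
EXAMPLE_WEIGHTS = {
    "chatgpt": 0.40,
    "perplexity": 0.30,
    "claude": 0.15,
    "gemini": 0.15,
}


def composite_score(
    engine_scores: Dict[str, float],
    weights: Dict[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Weighted sum of per-engine scores, each already on a 0-100 scale."""
    if set(engine_scores) != set(weights):
        raise ValueError("engine_scores and weights must cover the same engines")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    return sum(weights[name] * engine_scores[name] for name in weights)
```

Because the weights sum to 1.0 and every engine score sits between 0 and 100, the composite stays on the same 0-100 scale without any further normalization.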
Within each engine score, the three signals combine as: engine_score = 0.4*presence + 0.35*position + 0.25*sentiment. Presence dominates because if you are not mentioned, nothing else matters.
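The per-engine formula can be sketched like this. It assumes all three signals are first brought onto a 0-100 scale; the article does not say how position is rescaled, so `scale_position` uses a simple linear mapping as a guess, with 0 meaning the product was never mentioned.

```python
def scale_position(avg_position: float) -> float:
    """Map an average 1-to-5 position score onto 0-100.

    Assumption: linear scaling, where 0 means the product never appeared.
    """
    if not 0.0 <= avg_position <= 5.0:
        raise ValueError("avg_position must be between 0 and 5")
    return 100.0 * avg_position / 5.0


def engine_score(presence: float, position: float, sentiment: float) -> float:
    """Combine three 0-100 signals using the weights given in the article."""
    for value in (presence, position, sentiment):
        if not 0.0 <= value <= 100.0:
            raise ValueError("signals must be on a 0-100 scale")
    return 0.4 * presence + 0.35 * position + 0.25 * sentiment
```

For example, a presence rate of 24, an average position of 3 (scaled to 60), and a sentiment value of 50 give roughly 0.4*24 + 0.35*60 + 0.25*50 = 43.1. The sentiment value of 50 is a placeholder, since the mapping from positive, neutral, and negative buckets to a number is not specified here.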
Step 5: Actionable output
A number alone is useless. For every score, Be Recommended generates a breakdown showing:
- Which engines mention you and which do not
- Which prompt categories you score highest and lowest on
- What competitors appear where you do not
- Specific recommendations: structured data markup, content gaps, entity-building opportunities, citation-worthy pages to create
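The first two breakdown items amount to grouping results by engine or by prompt category. A minimal sketch, assuming each result is a dict with hypothetical `engine`, `category`, and `mentioned` keys:

```python
from collections import defaultdict
from typing import Dict, List


def presence_breakdown(results: List[dict], key: str) -> Dict[str, float]:
    """Presence rate (0-100) grouped by the given key, e.g. 'engine' or 'category'."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in results:
        totals[row[key]] += 1
        hits[row[key]] += int(row["mentioned"])
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


# Made-up sample rows for illustration only.
SAMPLE_RESULTS = [
    {"engine": "chatgpt", "category": "direct", "mentioned": True},
    {"engine": "chatgpt", "category": "comparative", "mentioned": False},
    {"engine": "claude", "category": "direct", "mentioned": False},
    {"engine": "claude", "category": "comparative", "mentioned": False},
]
```

Calling `presence_breakdown(SAMPLE_RESULTS, "engine")` shows which engines never mention the product, and passing `"category"` instead shows which prompt types are weakest.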
The recommendations lean technical on purpose. We built this for developers and technical founders who want to know what to change in their codebase and content, not a vague "improve your online presence" suggestion.
Why this matters now
Traditional SEO tells you whether Google's crawler can find your pages. AI visibility tells you whether language models recommend your product when a human asks for help choosing one. These are increasingly different things.
A product can have perfect technical SEO, rank for hundreds of keywords, and still score 0 on AI visibility because no authoritative source describes it in the structured, fact-rich way that AI models pick up on. The signals that make AI recommend you (entity clarity, comparison presence, first-party benchmarks, structured claims) overlap with but are not identical to classic SEO signals.
Our team at Inithouse runs a portfolio of products and we noticed this gap firsthand. Some of our products ranked well on Google but got zero AI mentions. Be Recommended started as an internal tool to diagnose why, and the scoring methodology above is what we landed on after testing against real recommendation queries for months.
Try it
Run a free scan at berecommended.com and see your score. The report takes about two minutes to generate and covers all four engines. If your score is under 30, you are probably invisible to AI recommendations in your category.