How to audit your brand's AI search visibility

An AI visibility audit answers four questions: where you appear, how often, how you're described, and which sources are carrying the citations. Here is how to run one.

Updated May 9, 2026 · 8 min read

Originally published January 9, 2025

Traditional SEO audits ask whether your pages rank. AI visibility audits ask whether generative systems can find, understand, and recommend your brand when buyers ask the questions that matter — and the answer is increasingly "no, even if your SEO looks fine."

Soar is a community marketing agency that has run 4,200+ community campaigns across 280+ brands since 2017. The audit framework below is the same one we run as the diagnostic on day one of every AI visibility engagement.

Build a prompt list first

Start with a focused set of category, comparison, and problem-solving prompts. These should mirror actual customer language — not internal branding terms, not what the SEO team thinks people search.

A working prompt list for a B2B brand typically covers five intent buckets:

  • Category prompts ("best [category] for [segment]").

  • Comparison prompts ("[your brand] vs [primary competitor]").

  • Problem-diagnosis prompts ("how to [solve specific pain]").

  • Buyer-evaluation prompts ("how to choose [category]" / "what to look for in [category]").

  • Pricing and budgeting prompts ("how much does [category] cost" / "[category] pricing").

Aim for 30-50 prompts as a starting baseline. Fewer than 30 produces noisy month-over-month results; more than 50 becomes operationally heavy without adding signal. This prompt list is your test bench. Without it, AI visibility becomes anecdotal and the conversation defaults to "ChatGPT mentioned us once last week."
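One way to keep the prompt list consistent from month to month is to generate it from templates rather than retyping it. The sketch below is a minimal illustration, not a prescribed tool: the template strings follow the five buckets above, while `TEMPLATES`, `build_prompts`, and every brand, competitor, and segment value are hypothetical placeholders to swap for real customer language.

```python
from itertools import product

# Hypothetical templates, one list per intent bucket described above.
TEMPLATES = {
    "category": ["best {category} for {segment}"],
    "comparison": ["{brand} vs {competitor}"],
    "problem": ["how to {pain}"],
    "evaluation": ["how to choose {category}", "what to look for in {category}"],
    "pricing": ["how much does {category} cost", "{category} pricing"],
}


def build_prompts(templates, values):
    """Expand each template with every combination of the values it uses."""
    prompts = []
    for bucket, patterns in templates.items():
        for pattern in patterns:
            # Only iterate over placeholders this pattern actually contains.
            keys = [k for k in values if "{" + k + "}" in pattern]
            for combo in product(*(values[k] for k in keys)):
                filled = pattern.format(**dict(zip(keys, combo)))
                prompts.append((bucket, filled))
    return prompts


# Placeholder values; replace with the phrasing your buyers really use.
values = {
    "category": ["community marketing agency"],
    "segment": ["B2B SaaS", "fintech"],
    "brand": ["ExampleCo"],
    "competitor": ["RivalOne", "RivalTwo"],
    "pain": ["get cited in AI answers"],
}

prompts = build_prompts(TEMPLATES, values)
for bucket, text in prompts:
    print(f"{bucket:11} {text}")
```

Adding segments, competitors, or pain points to `values` grows the list toward the 30-50 range while keeping every prompt tagged with its bucket.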

Test across multiple systems

Run the same prompts across the AI tools your buyers are likely to use: ChatGPT, Perplexity, Google AI Mode, and Claude at minimum. Do not assume visibility transfers automatically from one model to another — Ahrefs' 78.6M-URL analysis found that only 11% of domains are cited by both ChatGPT and Perplexity, and only 7 websites appear in the top 50 across all three major platforms.

For each prompt, record:

  • Whether your brand appears at all.

  • Whether it is cited (a clickable source link) or merely mentioned in the prose.

  • How it is described — the actual phrasing the model uses.

  • What sources the model appears to rely on, including any visible citations.

  • Position relative to competitors when the response includes comparisons.
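The fields above map naturally onto one record per prompt per platform. The following is a sketch under assumptions: the `Observation` class, its field names, and the sample values are illustrative and not part of any tool named in this article.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Observation:
    """One prompt run on one platform (field names are illustrative)."""
    platform: str
    prompt: str
    brand_present: bool
    cited: bool                     # clickable source link, not just a prose mention
    description: str = ""           # the model's actual phrasing about the brand
    sources: list[str] = field(default_factory=list)  # visible citations
    position: Optional[int] = None  # rank among competitors, if compared

    def __post_init__(self):
        # A citation without the brand appearing would be a logging mistake.
        if self.cited and not self.brand_present:
            raise ValueError("cited=True requires brand_present=True")


# Hypothetical example row.
obs = Observation(
    platform="Perplexity",
    prompt="best community marketing agency for B2B SaaS",
    brand_present=True,
    cited=False,
    description="listed as one of several agencies",
    sources=["https://www.reddit.com/r/example/comments/1"],
    position=3,
)
row = asdict(obs)  # plain dict, ready for csv.DictWriter or json.dumps
print(row["platform"], row["cited"], row["position"])
```

Keeping "present" and "cited" as separate fields preserves the distinction between a prose mention and a linked citation, which the scoring step depends on.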

Run the audit logged out, in a private window, with location and personalization minimized. Models personalize, and an audit run from the founder's heavily-customized account will systematically misread the public answer.

Score visibility simply

A lightweight scoring model works well, because the goal is direction, not statistical purity:

  • Strong cited mention (your brand appears with a clickable source link).

  • Weak presence (mentioned in prose without citation).

  • No presence (brand absent from response).

  • Negative presence (brand appears alongside a competitor recommendation or in a critical context).

When you track this scoring across a consistent set of prompts month over month, you get a practical baseline and a way to measure change. Search Engine Land's 2025 ChatGPT citation study found that 89.7% of cited pages had been updated within the year, so the audit doubles as a freshness test on your own content: pages that were once cited and have since dropped out often turn out to be stale.
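The four tiers can be tallied per month and compared directly. This is a minimal sketch, assuming each response has already been reduced to three booleans (present, cited, negative). The `Tier` enum, the `classify` function, the rule that a negative context outranks a citation, and the sample months are all assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class Tier(Enum):
    STRONG = "strong cited mention"
    WEAK = "weak presence"
    NONE = "no presence"
    NEGATIVE = "negative presence"


def classify(present, cited, negative):
    """Map one response to a tier. Precedence here is an assumption."""
    if not present:
        return Tier.NONE
    if negative:
        return Tier.NEGATIVE  # assumed to outrank a citation
    return Tier.STRONG if cited else Tier.WEAK


def tally(rows):
    """Count how many responses fall into each tier."""
    return Counter(classify(*row) for row in rows)


# Hypothetical results for the same four prompts in two months:
# (present, cited, negative)
january = [(True, True, False), (True, False, False),
           (False, False, False), (True, False, True)]
february = [(True, True, False), (True, True, False),
            (False, False, False), (True, False, True)]

jan, feb = tally(january), tally(february)
change = {tier: feb[tier] - jan[tier] for tier in Tier}
for tier, delta in change.items():
    print(f"{tier.value:22} {delta:+d}")
```

Because the goal is direction, the output is a per-tier delta rather than a single composite score.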

Inspect the sources behind the answers

This is where most of the real diagnosis happens. The model will tell you, often explicitly, where it learned what it just said. If the response describes your brand using a review site, a Reddit thread, a news mention, or your own product page, that tells you which surfaces are actually influencing visibility — and which surfaces aren't even in the room.

It also reveals gaps. If competitors are appearing through sources you do not control or do not occupy — Reddit threads in subreddits you've never engaged in, Quora answers you didn't write, G2 reviews you don't have — that becomes a clear action item. Profound's 2025 analysis is unambiguous on this: 47% of Perplexity's top-10 cited sources are Reddit, and brands that have no Reddit footprint are systematically absent from those answers regardless of how well their site is optimized. The same Ahrefs analysis we cite in our backlinks vs brand mentions breakdown shows that unlinked brand mentions correlate 0.664 with AI citations versus backlinks at 0.218.

When you finish a source-level pass, the audit usually outputs a one-page diagnosis: "we are absent from Reddit, present but undercited on G2, mentioned on Quora through one stale answer, and Wikipedia describes us using outdated 2023 product information." Each of those is a different fix.
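A source-level pass can be roughed out by grouping cited URLs by domain and listing the domains where competitors are cited but your brand is not. The sketch below assumes the citations have been logged as (brand, URL) pairs. The `domain` and `source_gaps` helpers, the brand names, and the URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domain(url):
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def source_gaps(citations, brand):
    """Domains cited for at least one other brand but never for `brand`."""
    brands_by_domain = defaultdict(set)
    for name, url in citations:
        brands_by_domain[domain(url)].add(name)
    return sorted(d for d, names in brands_by_domain.items() if brand not in names)


# Hypothetical log of (brand named in the answer, URL the model cited).
citations = [
    ("RivalOne", "https://www.reddit.com/r/example/comments/1"),
    ("RivalOne", "https://www.g2.com/products/rivalone/reviews"),
    ("ExampleCo", "https://www.g2.com/products/exampleco/reviews"),
    ("RivalTwo", "https://www.quora.com/example-question"),
    ("ExampleCo", "https://exampleco.example/product"),
]

gaps = source_gaps(citations, "ExampleCo")
print(gaps)
```

Each domain in `gaps` is a surface where a competitor is being sourced and your brand is not, which is the raw material for the one-page diagnosis.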

Look for platform-specific blind spots

Different AI tools emphasize different retrieval patterns and source sets. If your brand is visible in ChatGPT and absent in Perplexity, the issue is rarely general brand weakness — it is source mix. ChatGPT's reliance on Wikipedia, Forbes, and G2 means a brand with a strong Wikipedia entry and review profiles can show up well there even with a weak community footprint. Perplexity's reliance on Reddit threads inverts that.

The Semrush 26K-URL study on Google AI Mode found Quora as the #4 most-cited domain (7.25% of responses), which is why our Quora B2B AI visibility playbook treats Quora answers as a citation channel, not an awareness one. Each platform is its own retrieval system. Audit them as separate channels and do not average results into a single "AI visibility" number — the average hides exactly the gaps you need to act on.
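To see why a blended number misleads, compare a per-platform presence rate with its average. This is an illustrative sketch with made-up results. `presence_by_platform` is a hypothetical helper, not a metric defined by any of the studies cited here.

```python
from collections import defaultdict


def presence_by_platform(rows):
    """Share of prompts where the brand appeared, computed per platform."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for platform, present in rows:
        totals[platform] += 1
        hits[platform] += int(present)
    return {platform: hits[platform] / totals[platform] for platform in totals}


# Hypothetical results: (platform, brand appeared in the response)
rows = [
    ("ChatGPT", True), ("ChatGPT", True), ("ChatGPT", True), ("ChatGPT", False),
    ("Perplexity", False), ("Perplexity", False),
    ("Perplexity", False), ("Perplexity", False),
]

rates = presence_by_platform(rows)
blended = sum(rates.values()) / len(rates)

for platform, rate in rates.items():
    print(f"{platform:11} {rate:.0%}")
print(f"{'blended':11} {blended:.1%}  (hides the Perplexity gap)")
```

In this made-up case the blended figure sits between the two platforms and describes neither, while the per-platform view points straight at the channel that needs work.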

Audit monthly, not once

AI visibility shifts with model updates, new content, and competitor movement — and 40-60% of cited sources change month-to-month across Google AI Mode and ChatGPT. A one-time audit gives you a snapshot of noise. Repeated audits give you direction.
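Source churn between two monthly runs can be estimated with a set comparison. The definition used here, the share of last month's cited domains that no longer appear this month, is an assumption for illustration; the figures quoted above may be measured differently. The function name and the domain lists are hypothetical.

```python
def source_churn(previous, current):
    """Share of last month's cited domains that are missing this month."""
    previous, current = set(previous), set(current)
    if not previous:
        raise ValueError("previous month has no cited sources to compare")
    return len(previous - current) / len(previous)


# Hypothetical cited domains for the same prompt set in two months.
april = ["reddit.com", "g2.com", "quora.com", "forbes.com", "wikipedia.org"]
may = ["reddit.com", "g2.com", "wikipedia.org", "example-news.example"]

churn = source_churn(april, may)
dropped = sorted(set(april) - set(may))
print(f"churn: {churn:.0%}, dropped: {dropped}")
```

Tracking the dropped domains alongside the rate shows whether the change hit sources that mention your brand or only sources that never did.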

The cadence we run for clients:

  • Monthly prompt-set run across all four major platforms.

  • Quarterly review of the prompt list itself (add new commercial queries, retire ones that no longer match buyer language).

  • Annual deeper analysis tying AI citation lift to influenced pipeline and CRM data.

Anything less frequent than monthly, and you cannot tell the difference between a real shift and platform noise. Anything more frequent, and you are spending operations time without earning new signal.

How is an AI visibility audit different from a traditional SEO audit?

A traditional SEO audit asks whether your pages rank in Google. An AI visibility audit asks whether generative systems cite, mention, or recommend your brand when buyers ask commercial questions. The two correlate weakly — only ~12% overlap between Google's top 10 and AI citations. A brand can rank #1 in Google and be invisible in ChatGPT. The audits answer different questions and require different fixes.

How many prompts do we need for a useful audit?

Thirty to fifty prompts as a baseline, segmented across category, comparison, problem-diagnosis, buyer-evaluation, and pricing buckets. Below thirty, month-over-month variance is too high to read trends. Above fifty, the operational cost outpaces the signal gained. The prompt list itself should be reviewed quarterly to keep it aligned with how buyers are actually phrasing questions in the AI surface.

Should we run audits manually or use a tool?

Both. Manual runs catch nuance: how the brand is described, what sources surfaced, what tone the model used. Tools (Profound, Parse, AthenaHQ, and others) automate the repeatable scoring across the full prompt set and four platforms, which is impractical to do manually every month. The pragmatic stack is automated scoring plus a monthly manual deep-dive on a 10-prompt subset.

What's a realistic timeline before audit results change?

Content fixes (rewriting an underperforming page, adding schema, restructuring for retrieval) show up in audit results in 30-60 days. Source-mix fixes (building Reddit presence, earning Quora citations, fixing G2 listings) take 4-6 months minimum, because models retrain on new conversational data on that horizon. Audits run more frequently than the change cadence will look flat for the first quarter even when the underlying work is on track.

Further reading