The 90-day GEO program: from audit to first citations
Every GEO program we run lives or dies on the first 90 days. Fewer than 90 and you have not collected enough week-over-week data to tell signal from noise. More than 90 and the board asks why the retainer has not produced a metric. Ninety is the cadence where Generative Engine Optimization stops being an experiment and starts being a measurable channel. This post lays out the phased structure we ship to clients: days 1 to 30 for audit and baseline, days 31 to 60 for source intervention, and days 61 to 90 for iteration and expansion.
Why 90 days, not 30
Most agencies pitch a 30-day audit and call it a program. It is a snapshot. Weekly mention rate bounces 10 to 20 percent on its own from retrieval changes you do not control, so you need four or five weeks of post-intervention data before you can call direction. Ninety days is the shortest window where a clean baseline, interventions across all four engines, and enough post-intervention data can happen in sequence. Shorter is a diagnostic. Longer is where compounding starts.
Days 1 to 30: Audit and baseline
The first month is the hardest to sell and the most expensive to skip. Every brand we inherit from an engagement that launched without a baseline loses a month rebuilding it.
Week 1: Prompt set. Build 50 to 200 prompts across four categories: brand, category, comparison, and problem. B2B SaaS usually lands at 40 percent category, 30 percent comparison, 20 percent problem, 10 percent brand. Consumer brands tilt toward brand and problem. Our default is 80 prompts.
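A minimal sketch of what that split looks like as data, assuming the B2B SaaS ratios above. The quota table and prompt structure are illustrative, not a client template:

```python
from dataclasses import dataclass

# Category quotas for an 80-prompt B2B SaaS set (40/30/20/10 split).
QUOTAS = {"category": 0.40, "comparison": 0.30, "problem": 0.20, "brand": 0.10}
TOTAL_PROMPTS = 80

@dataclass
class Prompt:
    text: str
    category: str  # one of QUOTAS' keys

def quota_counts(total: int, quotas: dict[str, float]) -> dict[str, int]:
    """Turn fractional quotas into whole prompt counts."""
    counts = {cat: round(total * share) for cat, share in quotas.items()}
    assert sum(counts.values()) == total, "quotas must sum to the total"
    return counts

print(quota_counts(TOTAL_PROMPTS, QUOTAS))
# {'category': 32, 'comparison': 24, 'problem': 16, 'brand': 8}
```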
Week 2: Baseline audit. Run every prompt through ChatGPT, Claude, Perplexity, and Google AI Overviews. Record mention rate, citation rate, share of voice against your top three competitors, and sentiment on each named answer. Then record the top five sources each engine uses. Without those source lists, source intervention is guesswork.
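The three headline metrics are simple ratios over the recorded answers. A sketch of the arithmetic, with an invented record shape (field names are ours, not any tool's export format):

```python
from collections import Counter

# One record per (prompt, engine) run from the week-2 baseline.
# Brand names and fields below are illustrative.
answers = [
    {"engine": "perplexity", "brands_mentioned": ["Acme", "Rival"], "cited_domains": ["acme.com"]},
    {"engine": "chatgpt", "brands_mentioned": ["Rival"], "cited_domains": ["rival.com"]},
]

def mention_rate(records, brand):
    """Share of answers that name the brand at all."""
    return sum(brand in r["brands_mentioned"] for r in records) / len(records)

def citation_rate(records, domain):
    """Share of answers that cite the brand's own domain as a source."""
    return sum(domain in r["cited_domains"] for r in records) / len(records)

def share_of_voice(records, brand, competitors):
    """Brand mentions as a fraction of all tracked-brand mentions."""
    counts = Counter(b for r in records for b in r["brands_mentioned"]
                     if b == brand or b in competitors)
    total = sum(counts.values())
    return counts[brand] / total if total else 0.0

print(mention_rate(answers, "Acme"), share_of_voice(answers, "Acme", ["Rival"]))
```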
Week 3: Engine-by-engine diagnosis. The four engines work differently enough that one baseline produces four problem statements. ChatGPT Search launched on October 31, 2024 and uses Bing as a core source. Claude runs three crawlers (ClaudeBot, Claude-User, Claude-SearchBot) and newer versions have a web search tool. Perplexity uses a three-layer retrieval pipeline with an XGBoost reranker and manually boosts Reddit, Wikipedia, GitHub, Amazon, and LinkedIn. Google AI Overviews weights E-E-A-T, and Google's docs are explicit that no special schema is required. Treat the baseline as four parallel audits.
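Part of that diagnosis is confirming the engines can fetch your site at all. A rough sketch of a server-log scan for the crawler user agents named above plus OpenAI's; the log path and format are assumptions, and the exact user-agent strings in your logs may differ:

```python
from collections import Counter

# User-agent substrings for AI crawlers; a starting list, not exhaustive.
AI_CRAWLERS = ["ClaudeBot", "Claude-User", "Claude-SearchBot",
               "GPTBot", "OAI-SearchBot", "PerplexityBot", "bingbot"]

hits = Counter()
with open("access.log") as log:  # assumed combined-format access log
    for line in log:
        for bot in AI_CRAWLERS:
            if bot in line:
                hits[bot] += 1

# Zero hits for an engine's crawler usually means robots.txt or a WAF
# rule is blocking it, which caps that engine's retrieval of your site.
for bot, n in hits.most_common():
    print(f"{bot}: {n} requests")
```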
Week 4: Tracking dashboard. One anchor tool, one cross-check. We use Parse (Soar's own tool, free tier, daily refresh) as the anchor and Semrush AI Toolkit or Profound ($499+ per month) as the cross-check. The full comparison is in the free tools to track AI visibility in 2026.
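One reason for the cross-check: the two tools will not agree exactly, and a persistent gap is itself a signal. A sketch of the sanity check, with made-up numbers and a divergence threshold that is a judgment call rather than a standard:

```python
# Weekly mention rates per engine: anchor tool vs cross-check tool.
# Values are illustrative.
anchor = {"chatgpt": 0.24, "claude": 0.18, "perplexity": 0.31, "aio": 0.12}
crosscheck = {"chatgpt": 0.27, "claude": 0.11, "perplexity": 0.29, "aio": 0.13}

THRESHOLD = 0.05  # flag gaps above 5 points; tune to your prompt-set size

for engine in anchor:
    gap = abs(anchor[engine] - crosscheck[engine])
    if gap > THRESHOLD:
        # A stable gap usually means the tools sample different prompts
        # or count mentions differently; reconcile before trusting trends.
        print(f"{engine}: {gap:.0%} divergence, investigate before reporting")
```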
The output of day 30 is one document: baseline mention rate, citation rate, top sources per engine, top 10 losing prompts, and a ranked intervention list. Without it, the program has failed before it started.
Days 31 to 60: Source intervention
You cannot rewrite the training data. You can change what the engine finds at retrieval and what the next training run will learn from. The work splits four ways.
For ChatGPT: Bing visibility. ChatGPT Search draws on Bing as a core source. If you do not show up in Bing for target prompts, you do not show up in ChatGPT Search for them either. Submit the sitemap to Bing Webmaster Tools, confirm indexing, fix the gaps. Most brands skip this because they assume Google coverage translates. It does not.
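One concrete piece of that work is pushing changed URLs to Bing via IndexNow, which Bing supports for indexing notifications. A sketch, assuming you have already hosted an IndexNow key file at your site root; the host, key, and URLs are placeholders, and you should verify the endpoint against the current IndexNow docs:

```python
import json
import urllib.request

# IndexNow push: notifies Bing that these URLs changed.
# Prerequisite: host the key as a text file at https://example.com/<key>.txt
payload = {
    "host": "example.com",
    "key": "your-indexnow-key",
    "keyLocation": "https://example.com/your-indexnow-key.txt",
    "urlList": [
        "https://example.com/pricing",
        "https://example.com/vs-competitor",
    ],
}

req = urllib.request.Request(
    "https://www.bing.com/indexnow",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json; charset=utf-8"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # 200 or 202 means Bing accepted the submission
```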
For Claude and Perplexity: Reddit, Wikipedia, editorial. Semrush analyzed 150,000 LLM citations in June 2025 and found Reddit at 40.1 percent, Wikipedia at 26.3 percent, and YouTube at 23.5 percent of cited sources across the major engines. Claude and Perplexity retrieval rewards long-form human text with topic authority. The full case is in how Reddit became the biggest single source of LLM citations. A branded subreddit plus 5 to 10 seeded threads per month hits all four engines from one pipeline.
For Google AI Overviews: E-E-A-T. Google's documentation is explicit that no special schema is required for AIO or AI Mode. The ranking signals are the same ones that drive traditional Search. The intervention is author bylines with real credentials, updated citations, and consistent E-E-A-T site-wide. Seer Interactive's 2025 study found that when you are cited inside the Overview, you get 35 percent more organic clicks and 91 percent more paid clicks than non-cited results on the same query. Being cited inside the Overview is worth substantially more than ranking just outside it.
For everything: Content for the gaps. The baseline surfaces prompts where no source answers well. Those are the highest-leverage targets because you can create the answer and watch the engine find it. A standard week ships two to four new pieces against specific gaps. For the content pattern that gets cited, read how to repurpose content for AI citations.
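A sketch of how the gap list falls out of the baseline data. The record shape extends the hypothetical baseline structure above, and the boolean "good source" flag is a simplification of what an analyst actually judges:

```python
# Baseline rows: one per (prompt, engine), with a quality flag an
# analyst sets during the week-2 audit. Shape is illustrative.
baseline = [
    {"prompt": "best crm for startups", "engine": "perplexity",
     "brand_mentioned": False, "any_good_source": False},
    {"prompt": "acme vs rival pricing", "engine": "chatgpt",
     "brand_mentioned": False, "any_good_source": True},
]

def gap_prompts(rows):
    """Prompts where no engine surfaces a good answer from anyone.

    These are the highest-leverage content targets: the answer does
    not exist yet, so a new piece has no incumbent to displace.
    """
    by_prompt = {}
    for r in rows:
        by_prompt.setdefault(r["prompt"], []).append(r)
    return [p for p, rs in by_prompt.items()
            if not any(r["any_good_source"] for r in rs)]

print(gap_prompts(baseline))  # ['best crm for startups']
```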
Days 31 to 60 are where most brands stall. Hold the line.
Days 61 to 90: Iteration and expansion
Month three is where the re-audit runs, the metric tells a story, and the program moves from launch to operating mode.
Re-audit and compare. Same prompt set, same engines. The Princeton/Georgia Tech/Allen Institute paper on GEO (arXiv 2311.09735, KDD 2024) reported up to a 40 percent visibility boost on tested methods. Real engagements land lower. A 15 percent lift after one 60-day intervention cycle is a strong result.
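The comparison itself is simple arithmetic once both audits use the same prompt set. A sketch with illustrative numbers:

```python
def lift(baseline_rate: float, reaudit_rate: float) -> float:
    """Relative change in a rate between the day-14 and day-75 audits."""
    return (reaudit_rate - baseline_rate) / baseline_rate

# Illustrative: mention rate per engine, (baseline, re-audit).
results = {
    "chatgpt":    (0.20, 0.24),
    "claude":     (0.15, 0.16),
    "perplexity": (0.30, 0.36),
    "aio":        (0.10, 0.11),
}

for engine, (before, after) in results.items():
    print(f"{engine}: {lift(before, after):+.0%}")

# Weekly noise runs 10 to 20 percent, so judge the lift on the
# four-to-five-week post-intervention trend, not a single week.
```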
Double down on what moved. Kill the interventions that did not move the metric. Scale the ones that did. Internal teams skip this because they have fallen in love with the tactics they committed to on day one. A 90-day program should drop three or four tactics and scale up two or three.
Expand the prompt set. Add 20 to 50 new prompts on competitive queries you were not ready to target on day one.
The output of day 90 is not a report. It is an operating rhythm: monthly reporting, weekly interventions, quarterly prompt-set refresh. Compounding kicks in from month four.
The common mistakes
Launching without a baseline. The team skips the first month and starts writing. Three months later nobody can tell whether anything moved. No interventions ship before the baseline lands.
Optimizing for one engine only. Almost always ChatGPT. When retrieval weights shift (as they did in August 2025, when Reddit's ChatGPT citation share fell from 60 percent to 10 percent in six weeks) the program is exposed. Measure all four engines from day one.
Stopping after 30 days. A one-month audit is not a program.
Treating GEO as a content project. GEO is a content project, a technical project, a Reddit project, and a measurement project at once. Teams that ship content and declare themselves done miss the source intervention, the Bing work, and the crawlability audit. The technical checklist is in how to audit your brand's AI search visibility.
What the retainer ships
A 90-day Soar GEO program ships the prompt set by day 7, baseline by day 14, first intervention batch by day 30, midpoint review by day 45, second batch by day 60, re-audit by day 75, and the expanded operating plan by day 90. That covers two full audits (baseline and re-audit), four to eight content deliverables, one Reddit seeding program, one Bing visibility audit, one E-E-A-T refresh across the top 20 pages, and weekly reporting. The per-engine playbook is in ChatGPT vs Claude vs Perplexity vs Gemini.
Track actual AI-referred traffic inside your analytics as a bonus metric. Vercel reports that roughly 10 percent of its signups come from ChatGPT, per Mintlify's attribution write-up. For the deeper measurement discussion, read how to measure AI visibility for your brand.
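A sketch of the referrer classification behind that bonus metric. The hostnames are the commonly observed ones at the time of writing, not an exhaustive list, and your analytics export shape will differ:

```python
from urllib.parse import urlparse

# Referrer hosts commonly observed for AI-engine traffic. Engines
# change referrer behavior, so review this mapping quarterly.
AI_REFERRERS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "www.perplexity.ai": "perplexity",
    "claude.ai": "claude",
    "gemini.google.com": "gemini",
}

def classify(referrer_url: str) -> str:
    """Map a session's referrer URL to an AI engine, or 'other'."""
    host = urlparse(referrer_url).netloc.lower()
    return AI_REFERRERS.get(host, "other")

print(classify("https://chatgpt.com/"))     # chatgpt
print(classify("https://www.google.com/"))  # other
```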
Conclusion
A 90-day GEO program is the shortest honest timeline that turns AI visibility into a real channel. Days 1 to 30 build the baseline. Days 31 to 60 ship the interventions. Days 61 to 90 iterate and expand. Skip any phase and the program collapses. The brands that stick to the structure have defensible share of voice by month six. The ones that compress the timeline end up with a pile of content and no idea whether any of it worked.
How Soar saves you time and money
An internal 90-day program usually takes four to six months because the ramp alone eats the first two. Building the prompt set, integrating a tracking tool, running a clean baseline, and learning the per-engine mechanics is a full quarter of work before the first intervention ships. We compress that into 30 days with a prompt-set template, Parse as the default tool, and a library of interventions refined across hundreds of engagements.
Senior GEO specialists barely exist as a hiring pool in 2026. Brands either promote an SEO lead who learns the discipline from scratch or hire a generalist and hope. Either path takes six months. A Soar 90-day program runs at roughly a quarter of the fully-loaded cost of an equivalent internal hire over the same window, and the metric improvements compound from month four onward.
If you want to see what your own 90-day program looks like, request a proposal. We will run your top 20 prompts through Parse, tell you which engine is the biggest opportunity, and scope from there.
Related reading
- The 2026 guide to Generative Engine Optimization
- What is Generative Engine Optimization (GEO)? A category definition
- Free AI visibility tracking tools in 2026
- How to audit your brand's AI search visibility
- How to measure AI visibility for your brand
- ChatGPT vs Claude vs Perplexity vs Gemini: how brand visibility differs