
How AI Agents Decide Which Brand to Recommend (And Why Yours Isn't on the List)

The 9 signals AI shopping agents weight when ranking brands in 2026, sorted by influence, with how to audit each one on your store. Operator's playbook, not speculation.

11 min read · Strategy

Open ChatGPT. Ask it for the best of whatever you sell, under your price ceiling, for your buyer's most common use case. Read the brands it surfaces. Now answer one question: do you know why those brands ranked above yours? If the answer is no, you are operating in the agent era the same way brands operated in 1998. You are assuming the surface is a black box, hoping for visibility, watching competitors with no obvious advantage rank above you.

The agent's ranking logic is not a black box. It is a set of signals. Some are technical, some are editorial, some are reputational. They are knowable, observable, and influenceable. This post lists the nine that move the needle in 2026, sorted by influence, with the audit you can run on your own store for each one. Twenty minutes per signal. Most operators will find at least three they're failing on today.

How the ranking actually works

Every modern shopping agent (ChatGPT, Claude, Gemini, Perplexity, Amazon Rufus) runs the same three-stage pipeline behind the conversational surface. Stage one, retrieval: the agent assembles a candidate set of brands and products that could plausibly answer the buyer's query, pulling from its training corpus, live web search, and commerce APIs like ACP. Stage two, scoring: the agent ranks candidates against the buyer's stated and inferred constraints (price, use case, preferences, history). Stage three, trust: the agent decides which of the top-ranked candidates it is confident enough to actually recommend, with reasoning the buyer can act on.

Each stage has its own signals. Retrieval signals decide whether you are even in the candidate set; if you fail here, none of the downstream work matters. Scoring signals decide your position within the set once you are in. Trust signals decide whether the agent surfaces you confidently or buries you behind hedges. The nine signals below map to one of the three buckets, and the bucket determines what kind of work moves the signal: retrieval is mostly technical work, scoring is mostly reputational, trust is mostly operational.
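The three stages can be sketched as a toy pipeline. Everything in it is a hypothetical illustration: the `Candidate` fields, the sort-by-relevance scoring rule, and the 0.7 confidence threshold are assumptions chosen to show the shape of the logic, not how any particular agent is implemented.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    brand: str
    price: float
    retrievable: bool   # retrieval: can the agent see and parse this brand at all?
    relevance: float    # scoring: fit to the buyer's constraints, 0..1
    confidence: float   # trust: how sure the agent is that the data is accurate, 0..1

def recommend(candidates, max_price, min_confidence=0.7):
    # Stage 1, retrieval: brands the agent cannot see never enter the set.
    pool = [c for c in candidates if c.retrievable]
    # Stage 2, scoring: rank what is left against the buyer's constraints.
    ranked = sorted((c for c in pool if c.price <= max_price),
                    key=lambda c: c.relevance, reverse=True)
    # Stage 3, trust: only surface candidates that clear the confidence gate.
    return [c.brand for c in ranked if c.confidence >= min_confidence]

brands = [
    Candidate("A", 80.0, True, 0.90, 0.8),
    Candidate("B", 60.0, False, 0.95, 0.9),  # fails retrieval, so its score is moot
    Candidate("C", 70.0, True, 0.70, 0.5),   # ranks, but fails the trust gate
    Candidate("D", 90.0, True, 0.60, 0.9),
]
print(recommend(brands, max_price=100.0))  # ['A', 'D']
```

Brand B has the best relevance score in the sample and still never appears, which is the point of the paragraph above: a retrieval failure makes everything downstream irrelevant.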

Three buckets, nine signals

Each stage in the agent's pipeline has its own signal set. Fail at retrieval and nothing downstream matters.

Retrieval: Do you get included in the candidate set?
Signals: Structured product data, Third-party citations, Server-side rendering, Brand mention frequency

Scoring: Do you rank highly within the set?
Signals: Review density and recency

Trust: Does the agent recommend you with confidence?
Signals: Real-time inventory and pricing, Return policy precision, Cross-platform consistency, Response time and uptime

The 9 signals, ranked by influence

Sorted by how much each one moves agent rankings against a typical 2026 baseline, weighted from published research, vendor documentation, and observation across hundreds of buyer-side prompts. Each signal includes the audit you can run, the work that moves it, and the benchmark that counts as good. Where the public evidence is thin, the section flags it.

The 9 signals, at a glance

Sorted by influence on agent rankings in 2026. Bucket chip shows where each signal acts in the retrieval / scoring / trust pipeline.

01. Review density and recency (Scoring): How many reviews per SKU, and how recent.
02. Structured product data (Retrieval): JSON-LD completeness on every product page.
03. Third-party citations (Retrieval): Editorial mentions in authoritative publishers.
04. Real-time inventory and pricing (Trust): How stale the feed the agent reads is.
05. Server-side rendering (Retrieval): What first-paint HTML actually contains.
06. Brand mention frequency (Retrieval): How often you appear in the open web.
07. Return policy precision (Trust): How specific your return language is.
08. Cross-platform consistency (Trust): Same product data across every surface.
09. Response time and uptime (Trust): How fast and reliable your storefront is.

01. Review density and recency

The single highest-weighted scoring signal. Research referenced in Bain's analysis of agentic commerce shows agents weight review aggregate and review count as primary confidence inputs when scoring within a candidate set. The same research also finds recency dampens older reviews materially: a 2019 review carries less weight than a 2025 review, even when both are positive. The brand with more reviews from the last ninety days wins the disambiguation tiebreaker against the brand with more total reviews but a stale recency profile.

Audit: Count reviews on your top five product pages versus the same SKUs on Amazon and on your top two DTC competitors. Note the date of the most recent review on each surface.

Improve: Ship a post-purchase review flow that fires at the moment the product is most fresh in the buyer's mind (typically 7 to 14 days after delivery for consumables, 30 to 60 for durables). Layer in review-syndication that publishes to your JSON-LD so the count surfaces structurally.

What good looks like: If you sell on Amazon, aim for 30 to 40% of the Amazon listing review count on your DTC product page. If DTC-only, 100 or more reviews per top SKU with 20% or more from the last ninety days. Recency matters more than total count once you're past a baseline of roughly 50 reviews.
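The DTC-only benchmark is simple enough to check with a few lines of arithmetic. This is a minimal sketch that assumes you can export review dates per SKU; the function names and the sample data are hypothetical.

```python
from datetime import date, timedelta

def review_profile(review_dates, today, window_days=90):
    """Return (total reviews, share of reviews inside the recency window)."""
    cutoff = today - timedelta(days=window_days)
    recent = sum(1 for d in review_dates if d >= cutoff)
    total = len(review_dates)
    return total, (recent / total if total else 0.0)

def meets_dtc_benchmark(review_dates, today):
    # Benchmark from the text: 100+ reviews per top SKU, 20%+ from the last 90 days.
    total, recent_share = review_profile(review_dates, today)
    return total >= 100 and recent_share >= 0.20

today = date(2026, 3, 1)
# Hypothetical SKU: 90 older reviews plus 30 from the last ten days.
dates = [date(2024, 6, 1)] * 90 + [today - timedelta(days=10)] * 30
print(review_profile(dates, today))       # (120, 0.25)
print(meets_dtc_benchmark(dates, today))  # True
```

Running the same function against a competitor's review dates gives a like-for-like comparison of recency profiles rather than raw totals.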

02. Structured product data completeness

The retrieval signal that gates everything else. Agents read structured product data through JSON-LD blocks embedded on the page (and, for ACP-supporting merchants, through the ACP-format feed Stripe handles). The fields they care about: product name, brand, GTIN, price, availability, image variants, dimensions, materials, shipping window, return policy, review aggregate. Missing fields do not block indexing outright, but they collapse the agent's confidence in your candidacy. The technical baseline for this is covered in detail in the OAI-SearchBot and robots.txt post; this section is the ranking-layer summary of why it matters.

Audit: Run your top ten product URLs through schema.org's structured-data validator. Note completeness per page.

Improve: Implement complete JSON-LD Product schema with every field above. For Stripe merchants, enable ACP-format feed support (configuration, not engineering).

What good looks like: Schema validator returns zero warnings on every product URL in your top decile of revenue, GTINs present where applicable, real-time inventory hook rather than a daily snapshot.
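As a rough complement to the validator, a completeness check can be scripted against the JSON-LD itself. The sketch below uses real schema.org property names (`gtin13`, `offers`, `aggregateRating`, `shippingDetails`, `hasMerchantReturnPolicy`), but the product values are made up and the `REQUIRED` checklist is a hypothetical starting point, not an official list of what any agent requires.

```python
import json

# Illustrative product marked up with schema.org Product. All values are made up.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Trail Shoe",
    "brand": {"@type": "Brand", "name": "ExampleCo"},
    "gtin13": "0000000000000",
    "image": ["https://example.com/shoe-front.jpg"],
    "material": "Recycled mesh",
    "offers": {
        "@type": "Offer",
        "price": "129.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {"@type": "AggregateRating",
                        "ratingValue": "4.6", "reviewCount": "212"},
}

# Hypothetical checklist: dotted paths that should be present and non-empty.
REQUIRED = ["name", "brand.name", "gtin13", "image", "material",
            "offers.price", "offers.priceCurrency", "offers.availability",
            "offers.shippingDetails", "offers.hasMerchantReturnPolicy",
            "aggregateRating.ratingValue", "aggregateRating.reviewCount"]

def missing_fields(doc, required=REQUIRED):
    missing = []
    for path in required:
        node = doc
        for key in path.split("."):
            node = node.get(key) if isinstance(node, dict) else None
        if not node:
            missing.append(path)
    return missing

print(missing_fields(product))
# ['offers.shippingDetails', 'offers.hasMerchantReturnPolicy']

# The markup is embedded in the page inside a JSON-LD script tag.
snippet = '<script type="application/ld+json">' + json.dumps(product) + "</script>"
```

The sample product looks complete at a glance and still fails on shipping and return policy, which are the two fields most often left out of otherwise valid markup.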

03. Third-party citations from authoritative sources

Agents weight editorial citations on tier-one publishers as canonical evidence that a brand belongs in the candidate set. A Wirecutter recommendation, a New York Times round-up mention, a high-engagement Reddit thread, a category-publication review: these are the citations that show up in the open-web corpus the agent was trained on and the live-web search the agent runs at retrieval time. Self-published content does not substitute for them; the agent's retrieval logic discounts content on your own domain relative to content about you on someone else's.

Audit: Search "best [your category]" on Google. Read the top fifteen organic results. Count how many cite your brand by name. Repeat for "[your brand name] review" and note which publishers come back.

Improve: PR cycles into category publications, expert positioning for founder bylines, Reddit/community presence on the actual subreddits buyers in your category use.

What good looks like: Three or more named citations from tier-one publishers per top-three SKU, with at least one citation per quarter to keep the recency signal alive.

04. Real-time inventory and pricing accuracy

The trust signal that catches operators by surprise. Agents penalize stale data heavily because recommending an out-of-stock or mispriced product damages buyer trust on the agent's surface, which is a cost the agent's reward model attributes back to the merchant. Deloitte's agentic commerce guide flags inventory accuracy as a top-five operational driver of agent recommendation share. A static daily snapshot is no longer good enough; agents reading your feed three hours after a stockout will route around you for the rest of the day.

Audit: Check the cache-control headers on your product feed and sitemap. Check the time delta between your inventory management system updating and the public feed reflecting the change. Check CDN cache TTLs.

Improve: Real-time inventory hook (sub-15-minute refresh on stockouts), ETag and Last-Modified headers on the feed, CDN cache TTL capped at one hour for inventory-sensitive surfaces, with cache purges on stock and price changes so a cached copy never outlives the targets below.

What good looks like: Stockout reflected in the public feed within fifteen minutes, price changes reflected within five minutes, no out-of-stock items surfaced anywhere a crawler can see them.
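The header part of this audit can be scripted once you have the response headers in hand. This is a sketch under stated assumptions: the function names are hypothetical, the 15-minute limit is borrowed from the stockout target above, and the headers are passed as a plain dict (real HTTP header names are case-insensitive, so normalize them first).

```python
def max_age_seconds(cache_control):
    """Freshness lifetime from a Cache-Control value, preferring s-maxage
    (which shared caches such as CDNs use) over max-age."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    # no-store and no-cache both mean a cache cannot serve a copy unchecked.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for key in ("s-maxage", "max-age"):
        if directives.get(key, "").isdigit():
            return int(directives[key])
    return None  # no explicit freshness lifetime declared

def feed_is_fresh_enough(headers, limit_seconds=15 * 60):
    age = max_age_seconds(headers.get("Cache-Control", ""))
    has_validator = "ETag" in headers or "Last-Modified" in headers
    return age is not None and age <= limit_seconds and has_validator

daily = {"Cache-Control": "public, max-age=86400", "ETag": '"abc"'}
tight = {"Cache-Control": "public, max-age=3600, s-maxage=300", "ETag": '"abc"'}
print(feed_is_fresh_enough(daily))  # False: a cached copy may be a day old
print(feed_is_fresh_enough(tight))  # True: the CDN holds copies for 5 minutes
```

A passing header check only covers the cache layer. The delay between the inventory system and the feed itself still has to be measured separately.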

05. Server-side rendering and crawler accessibility

Agent crawlers read first-paint HTML and do not execute the client-side JavaScript that hydrates a single-page application. If your product name, price, reviews, or return policy only render after JavaScript runs, the crawler sees an empty shell and walks away with nothing usable. This is a retrieval signal because it gates whether the agent can extract structured data from your page at all. Common on older Shopify themes that overuse client-side personalization, and on custom React or Vue stacks that did not adopt SSR.

Audit: Open your top product page in Chrome, disable JavaScript in DevTools, reload. Note what is visible. Repeat with the OpenAI user-agent string set to confirm crawler-eye behavior.

Improve: Server-side rendering for product pages, hydration after first paint, no critical content gated behind client-side JS. Next.js App Router gives you this by default; older stacks need explicit SSR work.

What good looks like: Product name, price, brand, reviews, return policy, and JSON-LD block all visible in first-paint HTML with JavaScript disabled.
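The same check can be run against raw HTML without a browser. The sketch below uses Python's standard `html.parser` to separate visible text from script content and to detect a JSON-LD block. The `ssr_gaps` helper and both sample pages are hypothetical; fetching the HTML (for example with a crawler user-agent string) is left out so the example stays self-contained.

```python
from html.parser import HTMLParser

class _TextAndJsonLd(HTMLParser):
    """Collects visible text and JSON-LD script bodies from raw HTML."""
    def __init__(self):
        super().__init__()
        self.text, self.json_ld = [], []
        self._skip, self._is_json_ld = False, False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True
            self._is_json_ld = dict(attrs).get("type") == "application/ld+json"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = self._is_json_ld = False

    def handle_data(self, data):
        if self._is_json_ld:
            self.json_ld.append(data)
        elif not self._skip:
            self.text.append(data)

def ssr_gaps(raw_html, must_contain):
    """List the required strings (and JSON-LD) missing from un-hydrated HTML."""
    parser = _TextAndJsonLd()
    parser.feed(raw_html)
    visible = " ".join(parser.text)
    gaps = [s for s in must_contain if s not in visible]
    if not parser.json_ld:
        gaps.append("JSON-LD block")
    return gaps

shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
rendered = ('<html><body><h1>Example Trail Shoe</h1><span>$129.00</span>'
            '<script type="application/ld+json">{"@type": "Product"}</script></body></html>')
print(ssr_gaps(shell, ["Example Trail Shoe", "$129.00"]))
# ['Example Trail Shoe', '$129.00', 'JSON-LD block']
print(ssr_gaps(rendered, ["Example Trail Shoe", "$129.00"]))  # []
```

The `shell` sample is what a client-rendered single-page application typically returns before hydration: a root element and a script reference, with nothing a crawler can extract.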

06. Brand mention frequency across the open web

The training-data signal. Agents are trained on the open web at scale, and brands mentioned more frequently across more sources get stronger priors when the retrieval logic runs. This is the slowest signal to move and the most compounding once it does, because it operates on the corpus the next generation of models is trained on rather than on the corpus this generation already has. The Stripe and OpenAI ACP rollout and the agent commerce protocols ecosystem provide the structured layer for the agent to transact, but the brand-recognition layer that makes you a candidate in the first place lives in the open-web corpus the agent has internalized.

Audit: Run three searches. (1) Search "[your brand]" plus 'reddit', count threads where you're mentioned in the last twelve months. (2) Open ChatGPT in a fresh session and ask "what are some [your category] brands worth knowing about?", note whether you appear unprompted. (3) Search "[your brand] review" on Google, count distinct authoritative domains in the top twenty results. This is your baseline. The benchmark below is where you want to be in twelve months, not today.

Improve: Brand-led editorial calendars, PR campaigns into category publications, podcast appearances, expert positioning on platforms the agent's training corpus weights heavily.

What good looks like: Hundreds of mentions across at least fifteen distinct authoritative sources, with growth tracked quarterly rather than monthly.
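Counting distinct domains by hand gets error-prone past a dozen results. Here is a small sketch, assuming you have pasted the result URLs into a list; the `distinct_domains` helper and the sample URLs are hypothetical, and deciding which domains count as authoritative remains a manual judgment.

```python
from urllib.parse import urlsplit

def distinct_domains(result_urls, own_domain):
    """Distinct third-party hostnames in a list of search-result URLs."""
    hosts = set()
    for url in result_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip your own domain and its subdomains: self-published pages
        # are not third-party mentions.
        if host and host != own_domain and not host.endswith("." + own_domain):
            hosts.add(host)
    return sorted(hosts)

urls = [
    "https://www.reddit.com/r/running/x",
    "https://reddit.com/r/trailrunning/y",
    "https://example.com/shoe",
    "https://shop.example.com/shoe",
    "https://www.nytimes.com/wirecutter/reviews/z",
]
print(distinct_domains(urls, "example.com"))  # ['nytimes.com', 'reddit.com']
```

Logging the count each quarter gives you the growth trend the benchmark asks for.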

07. Return policy precision and clarity

Vague return policies create ambiguity that the agent passes on to the buyer as hedging. "Returns may apply" or "30-day return policy on most items" reads to the agent as low-confidence information, which lowers the confidence the agent has in recommending you over a competitor whose policy is precise. Precise policies ("free returns within 30 days, prepaid label included, no questions asked") read as high-confidence information and earn the agent's stronger recommendation. This is one of the cheapest signals to fix and one of the highest-leverage on per-recommendation conversion.

Audit: Read your current return policy. Count how many fields are vague ("may," "usually," "on most items") versus specific (numbers, named conditions, named exceptions).

Improve: Rewrite to a six-to-eight-sentence policy that answers the buyer's most likely questions without ambiguity: timeframe, who pays return shipping, what condition the item must be in, what categories are exempt. Mirror the precise policy fields in your JSON-LD schema and (if applicable) your ACP feed.

What good looks like: Zero ambiguous qualifiers in the policy text, full field coverage in structured data, alignment between the storefront page and the feed.
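The qualifier count can be automated with a simple pattern match. The `VAGUE` list below is a hypothetical starter set seeded with the three qualifiers named above; extend it for your own policy language. It is a blunt instrument: it would also flag the month "May", so read the matches rather than trusting the count alone.

```python
import re

# Hypothetical starter list, seeded with the qualifiers named in the audit.
VAGUE = ["may", "usually", "on most items", "typically", "in some cases"]
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(v) for v in VAGUE) + r")\b", re.IGNORECASE
)

def vague_qualifiers(policy_text):
    """Return every vague qualifier found, in order of appearance."""
    return [m.group(0).lower() for m in _PATTERN.finditer(policy_text)]

vague = "Returns may apply. 30-day return policy on most items."
precise = "Free returns within 30 days, prepaid label included, no questions asked."
print(vague_qualifiers(vague))    # ['may', 'on most items']
print(vague_qualifiers(precise))  # []
```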

08. Cross-platform consistency

Agents cross-check signals across surfaces. If your Amazon listing says one thing and your DTC product page says another (different title, different price beyond a reasonable delta, different return policy, different shipping window), the agent's confidence in either source drops. This shows up especially in the Amazon dynamic covered in the Amazon problem post: when the agent has to disambiguate between routing the buyer to your DTC site or your Amazon listing, inconsistency between the two pushes the agent toward Amazon's listing as the more authoritative source.

Audit: Pick your top three SKUs. Pull product title, price, shipping window, and return policy from your DTC site, your Amazon listing, your Google Shopping listing, and your Walmart Marketplace listing (where applicable). Note discrepancies.

Improve: A single source of truth for product data (a PIM, or even a spreadsheet rigorously enforced) with automated propagation to every surface. Most discrepancies are not deliberate; they are stale updates on one surface that never made it to another.

What good looks like: Identical product titles across all surfaces, prices within 5% across all surfaces, identical return policy language, same shipping window claim.
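Once the four fields are pulled into one place, the comparison itself is mechanical. A minimal sketch, assuming the first listing in the dict is your source of truth; the field names and sample listings are hypothetical, and the 5% tolerance comes from the benchmark above.

```python
def discrepancies(listings, price_tolerance=0.05):
    """Compare every surface against the first one, treated as source of truth."""
    (_, base), *others = listings.items()
    issues = []
    for surface, item in others:
        # Text fields must match exactly.
        for field in ("title", "return_policy", "shipping_window"):
            if item[field] != base[field]:
                issues.append((surface, field))
        # Prices may differ, but only within the tolerance.
        if abs(item["price"] - base["price"]) / base["price"] > price_tolerance:
            issues.append((surface, "price"))
    return issues

listings = {
    "dtc": {"title": "Example Trail Shoe", "price": 129.00,
            "return_policy": "30 days, free", "shipping_window": "2-4 days"},
    "amazon": {"title": "Example Trail Shoe", "price": 134.00,
               "return_policy": "30 days, free", "shipping_window": "2-4 days"},
    "google": {"title": "ExampleCo Trail Shoe (Men's)", "price": 149.00,
               "return_policy": "30 days, free", "shipping_window": "2-4 days"},
}
print(discrepancies(listings))  # [('google', 'title'), ('google', 'price')]
```

The Amazon price in the sample differs by under 4%, so it passes. The Google Shopping listing fails on both title and price, the kind of stale update the paragraph above describes.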

09. Response time and uptime

The signal we have the least direct public evidence on, but the reasoning holds. Agents are responsible to the buyer for the quality of the merchant they recommend, and unreliable merchants damage agent trust in the same way unreliable hotels damaged Expedia's trust in early hotel-aggregator scoring. A slow storefront produces buyer frustration the agent absorbs as recommendation-quality cost. A storefront that goes down during a recommendation impression produces worse frustration. Retail Bulletin's reporting on agent ranking flags merchant reliability as one of the operational inputs the largest agent platforms now factor into recommendation eligibility, though specific thresholds are not yet public.

Audit: Test product-page time-to-first-paint on a throttled mid-tier mobile connection. Pull uptime data for the last ninety days from your monitoring tool (or set one up if you do not have one).

Improve: CDN coverage for static assets, edge rendering for product pages, uptime monitoring with at least a 99.9% target.

What good looks like: Sub-2-second first paint on a 4G mobile connection, 99.95% rolling 90-day uptime, no incidents during high-traffic moments traceable to under-provisioned infrastructure.
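Uptime percentages are easier to reason about as a downtime budget. The arithmetic below converts the two figures above into minutes over a 90-day window; the function name is hypothetical.

```python
def downtime_budget_minutes(uptime_pct, days=90):
    """Minutes of downtime allowed over the window at a given uptime percentage."""
    total_minutes = days * 24 * 60  # 129,600 minutes in 90 days
    return total_minutes * (100 - uptime_pct) / 100

for target in (99.9, 99.95):
    print(target, round(downtime_budget_minutes(target), 1))
```

At 99.9% the budget is roughly 130 minutes across the quarter; at 99.95% it halves to about 65 minutes, so a single hour-long outage consumes nearly all of it.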

What to audit this week

If you skip every other piece of this post, do the 30-minute audit below. It surfaces where your brand stands across the nine signals without requiring engineering work, tooling, or budget.

The 30-minute audit

No engineering required. Five checks, ranked by leverage per minute.

01. Run 10 agent queries, note your rank (10 min)
Pick ten queries your buyer would ask. Run them through ChatGPT, Claude, and Perplexity. Note which brands surface and where you sit relative to them.

02. Validate JSON-LD on your top 3 product URLs (5 min)
Run each through schema.org's structured-data validator. Note which fields are missing or warning.

03. Search 'best [your category]' and count brand citations (5 min)
Read the top fifteen organic results on Google. Count named citations of your brand on tier-one publishers.

04. Audit return policy precision (5 min)
Read your current policy. Count vague qualifiers ("may," "usually," "on most items") versus specific fields (timeframes, conditions, exceptions).

05. Disable JavaScript, reload a product page (5 min)
Note what is visible without JS. If product name, price, reviews, or return policy are blank, you have a server-side rendering gap.
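None of these checks needs code, but if you repeat check 01 over time, a small log makes the results comparable between runs. The sketch below is optional and hypothetical throughout: it assumes you record, by hand, the ordered list of brands each agent surfaced for each query.

```python
from statistics import mean

def brand_rank(surfaced_brands, brand):
    """1-based position of the brand in an agent's answer, or None if absent."""
    lowered = [b.lower() for b in surfaced_brands]
    return lowered.index(brand.lower()) + 1 if brand.lower() in lowered else None

def summarize(run, brand):
    """run maps (agent, query) -> ordered list of brands the agent surfaced."""
    ranks = [brand_rank(brands, brand) for brands in run.values()]
    present = [r for r in ranks if r is not None]
    return {
        "appearance_rate": len(present) / len(ranks),
        "mean_rank_when_present": mean(present) if present else None,
    }

run = {
    ("chatgpt", "best trail shoe under $150"): ["Alpha", "ExampleCo", "Beta"],
    ("claude", "best trail shoe under $150"): ["ExampleCo", "Alpha"],
    ("perplexity", "best trail shoe under $150"): ["Alpha", "Beta"],
}
print(summarize(run, "ExampleCo"))
```

In the sample, the brand appears in two of three answers with a mean rank of 1.5 when present. Tracking appearance rate separately from rank keeps a retrieval problem (not surfaced at all) distinct from a scoring problem (surfaced but low).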

What this changes about marketing in 2026

Marketing in the agent era is closer to SEO in 2010 than to social ads in 2020. Both are optimization games against a ranking algorithm. The difference is volume: agent-driven shopping queries grew 4,700% through 2025 per Retail Bulletin's reporting, while the number of brands actively optimizing for agent rank is still measured in the low thousands. The competitive-density-to-volume ratio in 2026 looks like Google in 2005. The 2026 brands-stop-advertising-start-answering post covered this as a thesis; this post is the operator playbook.

The window for asymmetric advantage is twelve to twenty-four months. Brands that ship the nine-signal work now compound across the next ten years the way brands that figured out backlinks in 2008 to 2010 compounded across the decade after. The signals consolidate slowly, and the lead they build compounds. The brand that audits this week and ships against the audit by next quarter has a structural lead that the brand starting in 2028 cannot close without paying a meaningfully larger acquisition tax.

Cresva tracks your agent rank against the nine-signal map continuously, not as a quarterly audit. Per-query ranking across ChatGPT, Claude, Gemini, and Perplexity; signal decomposition so a rank drop points at the cause; alerts on structured-data regressions, inventory drift, and citation gain or loss.

Frequently asked questions

Is this just SEO with a different name?
Adjacent, not identical. Some signals overlap (structured data, third-party citations, server-side rendering) because agents and search engines both crawl the open web. Others are new (return policy precision, cross-platform consistency, real-time inventory) because agents take responsibility for recommendation quality in a way search engines never did. The right frame is that agent ranking is what SEO would have been if Google had been responsible to users for the quality of every site it ranked.
Do different agents (ChatGPT vs Claude vs Gemini) weight these signals differently?
Yes, in detail but not in kind. The nine signals matter to every agent because each one solves a real problem in the retrieval-scoring-trust pipeline that every modern agent runs. The relative weights vary: ChatGPT's commerce surface weights ACP-format structured data more heavily than Claude's; Gemini's commerce surface inherits more from Google Shopping's traditional ranking weights. The operator implication is to optimize against the nine-signal map without worrying about per-agent tuning, because the variance between agents is smaller than the variance between brands that have done the work and brands that have not.
How long does it take to move my ranking once I fix a signal?
Variable by signal. Technical signals (structured data, SSR, real-time inventory) move within a recrawl cycle, typically one to three weeks. Trust signals (return policy precision, cross-platform consistency, uptime) move on the agent's next confidence reassessment, typically two to six weeks. Reputational signals (third-party citations, brand mention frequency) move on the timescale the training corpus updates, typically months to a year for the underlying model and weeks to a month for the live-web-search layer. The 30-minute audit catches the technical signals first because they pay back fastest.
Can I pay to override these signals (like paid search)?
Paid placement on agent surfaces (OpenAI Ads, paid placement inside Perplexity Pages, sponsored slots inside Amazon Rufus) is real and growing. It does not override the nine signals; it bids alongside them. A paid placement on a query where your brand is missing structured data, weak on reviews, and inconsistent across platforms will still convert poorly because the buyer arrives at a product page that does not match the recommendation context. Paid placement amplifies signal strength; it does not substitute for it.
What if my category has heavy Amazon dominance? Do these signals still matter?
They matter more, not less. Heavy Amazon dominance means the agent's default destination for category queries is Amazon, which means the nine signals on your DTC site are what move the agent away from that default. The Amazon problem post covers the strategic decision (fight to keep the click on DTC versus accept Amazon as the conversion channel); the nine-signal work is the technical and reputational baseline that makes the fight winnable in the first place.
How do I track my agent ranking over time?
Two practical paths. Manual: run the same set of ten buyer-side queries through ChatGPT, Claude, Gemini, and Perplexity monthly, log which brands surface and where you sit. The manual path is reliable but operator-intensive. Automated: there are emerging tools (Cresva among them) that run agent-rank tracking continuously, alert on rank changes, and decompose changes against the nine signals to point you at the underlying cause. Either path beats no measurement, because the alternative is shipping signal work without knowing whether it moved your rank.

Written by the Cresva Team
