Returns in Agent Commerce: The Operator's Guide When the Buyer Is an AI

Returns are 20-30% of DTC P&L and almost nobody has written about how agent-mediated buying changes them. The new return triangle, what shifts, what stays, and the operator playbook for 2026.

11 min read · Strategy

Open your returns dashboard. Look at the last 90 days. The line is probably trending up, somewhere between 4 and 8 points worse than this time last year. You have probably blamed it on Gen Z, or on sizing, or on the cheap freight option you added last spring.

It might be those things. It is more likely that the buyer interface changed underneath you.

When a human buyer browses your store, the return is a transaction between you and them. When an AI agent buys on behalf of a human, the return is also a transaction between you and the agent's memory of you. The first transaction puts one item back on your stockroom shelf. The second one shapes the next thousand recommendations.

Returns in agent commerce are where an AI agent's recommendation gets tested against the buyer's actual experience, and where the verdict gets written back to the agent's memory of your brand. This post is the diagnostic and the operator playbook, built on the Agent Return Triangle: FIT, EXPECTATION, FRICTION. What each lever costs, what each lever earns.

In a human funnel a bad return loses a customer. In agent commerce a bad return loses a thousand.

Why returns are different now

Returns have always been the quiet drag on DTC economics, running far higher online than in store. The National Retail Federation puts online return rates in the high-teens to mid-twenties percent range, several times the in-store rate. That part is not new. What is new is what a single return event does after it happens.

In a human funnel, a return is a closed event: the item comes back, you refund, the relationship with that one buyer takes a small hit. In agent commerce, the same return also writes to the agent's memory of your brand, at the post-purchase memory stage of the agent checkout. That memory is not scoped to one buyer. It feeds the agent's next recommendation to the next buyer who asks for something in your category. A return is no longer a transaction that ends; it is a signal that propagates. That single structural change is why your returns line can climb even when your product, sizing, and fulfillment have not changed: the buyer interface changed, and the new interface keeps score differently.

The Agent Return Triangle, vertex by vertex

Three forces determine return rate when the buyer is an agent, and they form a triangle because they trade against each other. FIT is whether the agent recommended the right product in the first place. EXPECTATION is whether the agent's pre-purchase description matched what arrived. FRICTION is how cleanly the return resolves when one is needed. Each vertex has a specific operator lever, and the costs are real: industry benchmarks put the processing cost of a single return at roughly $10 to $20 before the cost of the lost sale (Loop Returns), which is why moving the triangle is a margin lever, not a soft one.

FIT is the vertex operators most undervalue. A return driven by a FIT failure (the agent recommended a product that was never right for the buyer's stated need) looks on the dashboard exactly like a sizing problem, so it gets blamed on the product or the supply chain. It is actually a recommendation problem, and the lever is schema accuracy and taxonomy depth, not the factory. EXPECTATION failures come from the agent reading a description or structured field that overstated reality. FRICTION failures are the familiar ones: a return window too short, a restocking fee, a slow refund. The triangle's value is that it tells you which vertex you are actually bleeding from before you spend on the wrong fix.

The Agent Return Triangle

Three vertices, three operator levers. The edges are schema accuracy (FIT to EXPECTATION), return policy depth (FIT to FRICTION), and post-purchase memory (EXPECTATION to FRICTION).

FIT

Did the agent recommend the right product for the buyer's actual need?

Operator lever: schema accuracy and product-taxonomy depth. Failure mode: looks like a sizing problem, is a recommendation problem.

EXPECTATION

Did the agent's pre-purchase description match what arrived?

Operator lever: description fact-density, image accuracy, shipping data. Failure mode: schema overstates reality.

FRICTION

Did the return process resolve cleanly when one was needed?

Operator lever: return window, restocking, refund speed. Failure mode: short window, slow refund, a memory entry.

FIT is the most-undervalued vertex because schema-accuracy failures look like sizing problems and get blamed on supply chain.
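
To see which vertex you are bleeding from, one option is to tag each return with a vertex and look at the split. The sketch below is illustrative only: the reason codes and the `REASON_TO_VERTEX` mapping are hypothetical, not a standard taxonomy, and you would substitute whatever codes your returns tool actually exports.

```python
from collections import Counter

# Hypothetical reason codes mapped to triangle vertices (assumption, not a standard).
REASON_TO_VERTEX = {
    "wrong_product_for_need": "FIT",
    "not_compatible": "FIT",
    "not_as_described": "EXPECTATION",
    "arrived_later_than_stated": "EXPECTATION",
    "refund_too_slow": "FRICTION",
    "restocking_fee_dispute": "FRICTION",
}


def vertex_breakdown(return_reasons):
    """Return (vertex, share_of_returns) pairs, largest share first."""
    counts = Counter(
        REASON_TO_VERTEX.get(reason, "UNCLASSIFIED") for reason in return_reasons
    )
    total = sum(counts.values())
    return [(vertex, n / total) for vertex, n in counts.most_common()]


reasons = [
    "wrong_product_for_need",
    "not_compatible",
    "not_as_described",
    "refund_too_slow",
]
for vertex, share in vertex_breakdown(reasons):
    print(f"{vertex}: {share:.0%}")
```

A large UNCLASSIFIED share is itself a finding: it means the reason codes you collect today cannot separate a recommendation problem from a sizing problem.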

What changes at each vertex when the buyer is an agent

The triangle keeps its shape when the buyer changes from a human to an agent, but every lever moves. At FIT, a human compares five to ten products and picks one, absorbing a lot of recommendation imprecision through their own browsing. An agent shortlists one to three, so your recommendation precision matters several times more: a wrong shortlist is a near-certain return. At EXPECTATION, a human reads your marketing copy and your reviews; an agent reads your schema, so the accuracy of your priceValidUntil and shippingDetails fields literally is your expectation accuracy. That is why the work on the 12-field Agent SKU, which decides expectation accuracy, is returns work, not just visibility work.

FRICTION changes most subtly. A human absorbs friction: they call support, they navigate your returns portal, they grumble but they stay. An agent does not call. It writes a negative memory entry and routes the next recommendation elsewhere. The friction you used to be able to push onto a patient human now converts directly into a propagating signal. This matters more every quarter because the agent-mediated share of commerce is rising fast: Bain projects agentic AI will account for 25% of U.S. ecommerce sales by 2030. A return-handling process tuned for patient humans is mistuned for the buyer that is taking over.

The buyer interface changed at every vertex. The triangle shape is the same; the levers moved.

Vertex | Human buyer | Agent buyer (2026)
FIT | Compares 5 to 10 products, picks one. | Shortlists 1 to 3. Recommendation precision matters several times more.
EXPECTATION | Reads marketing copy and reviews. | Reads schema: priceValidUntil, shippingDetails, hasMerchantReturnPolicy.
FRICTION | Absorbs friction. Calls support, navigates the portal. | Writes a negative memory entry. Never calls.
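
Since the agent reads those three schema.org fields, a quick presence check on your product markup is a reasonable first audit. This is a minimal sketch, assuming your Product JSON-LD is already parsed into a Python dict; the function name and the sample product are made up for illustration. It only checks that the fields exist, not that their values match reality, which is the harder half of EXPECTATION.

```python
# schema.org Offer properties named in the table above.
REQUIRED_OFFER_FIELDS = ("priceValidUntil", "shippingDetails", "hasMerchantReturnPolicy")


def missing_offer_fields(product):
    """List the expectation-critical Offer fields absent from a Product dict."""
    offers = product.get("offers") or {}
    if isinstance(offers, list):  # "offers" may be a single Offer or a list
        offers = offers[0] if offers else {}
    return [field for field in REQUIRED_OFFER_FIELDS if not offers.get(field)]


# Hypothetical product markup, for illustration only.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example trail shoe",
    "offers": {
        "@type": "Offer",
        "price": "129.00",
        "priceCurrency": "USD",
        "priceValidUntil": "2026-12-31",
    },
}

print(missing_offer_fields(product))
# prints ['shippingDetails', 'hasMerchantReturnPolicy']
```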

The memory cost: what one return is actually worth in 2026

Price a single return honestly and the agent era adds a line the old model did not have. In 2020, a return cost you the processing (roughly $10 to $20 by Loop Returns benchmarks), a small restocking cost, and some lifetime-value erosion from that one disappointed customer. Call it a modeled $70 all in. In 2026, the first three lines are unchanged. The new line is the memory-propagation cost: the negative entry the agent writes, which lowers your recommendation odds for future buyers in the category.

How large is that propagation cost? Honestly, nobody has a clean public number yet, so treat what follows as a modeled order-of-magnitude, not a precise figure. For a high-velocity agent surfacing thousands of category recommendations, the propagated cost of one badly-handled return is plausibly 10 to 100 times the direct cost, because the signal touches every downstream recommendation rather than one customer. The multiplier is modeled; the direction is not. And the volume that makes it bite is real and climbing: Adobe reported AI-driven traffic to U.S. retail sites grew 393% year over year in Q1 2026. The more demand routes through agents, the more a single memory entry is worth.

What one return costs in 2020 vs. 2026

The processing cost is a public benchmark. LTV erosion and the memory-propagation line are modeled assumptions, and the memory line is the large one: it propagates across the recommendation network.

2020 · human buyer (modeled)

Processing cost: $15
Restocking: $5
LTV erosion (one customer): ~$50
Total: ~$70

2026 · agent buyer (modeled)

Processing cost: $15
Restocking: $5
LTV erosion (one customer): ~$50
Memory propagation (modeled): 10-100x
Total: $70 + memory

Processing cost is the Loop Returns benchmark. LTV erosion and the 10-100x memory multiplier are modeled. The order of magnitude is the point, not the precision.
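
To make the arithmetic concrete, here is a small sketch of the cost model. It assumes the 10-100x multiplier applies to the full ~$70 direct cost (processing plus restocking plus LTV erosion), which is one reading of the text above; the function and parameter names are illustrative, and every input except the processing benchmark is a modeled number.

```python
def return_cost(processing=15.0, restocking=5.0, ltv_erosion=50.0, memory_multiplier=0.0):
    """Return (direct, memory, total) cost of one return, in dollars.

    memory_multiplier is the modeled 10-100x factor applied to the direct
    cost; leave it at 0 for a human buyer (the 2020 column).
    """
    direct = processing + restocking + ltv_erosion
    memory = direct * memory_multiplier
    return direct, memory, direct + memory


print(return_cost())                        # 2020, human buyer: (70.0, 0.0, 70.0)
print(return_cost(memory_multiplier=10))    # 2026, low end:     (70.0, 700.0, 770.0)
print(return_cost(memory_multiplier=100))   # 2026, high end:    (70.0, 7000.0, 7070.0)
```

Under these assumptions a single badly-handled return lands somewhere between roughly $770 and $7,070. The spread is the honest part: the multiplier is a guess at the order of magnitude, so plan against the range rather than a point estimate.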

The operator playbook: 5 levers to compress agent-era return rate

Five moves compress the triangle, and they sort by leverage rather than by effort. The first two look like marketing work and get skipped; they are returns work. Tighten FIT by auditing your structured data so the agent recommends the right product. Calibrate EXPECTATION by rewriting product descriptions for fact density rather than marketing voice, because the agent extracts facts. Smooth FRICTION by extending the return window: most DTC brands run 30 days, and the 2026 math favors 60 to 90, because a generous window lowers the memory-damaging friction more than it raises abuse. Surface the policy as hasMerchantReturnPolicy structured data, not a footer link. And close the memory loop with a structured post-return confirmation the agent can index, not just an email. This is also why brands that get the triangle right compound on OpenAI Ads: paid placement amplifies a clean memory signal, and amplifies a dirty one just as efficiently.

The reason to do this now rather than next year is that the agent share of your buyers is already material. ChatGPT alone drives a fifth of some major retailers' referral traffic, and that share writes to the memory that decides your next recommendation. Compressing the triangle is how you keep that memory positive, which is the same compounding engine described in the Recommendation Loop that this all feeds into.

The 5 levers, ordered by leverage

Five levers, one operating quarter. Levers 01 and 02 get skipped because they look like marketing work. They are returns work.

01 · Tighten FIT (audit the 12-field Agent SKU) · 1-2 weeks · FIT vertex

02 · Calibrate EXPECTATION (rewrite descriptions for fact density) · 2-4 weeks · EXPECTATION vertex

03 · Smooth FRICTION (extend return window to 60-90 days) · 1 day · FRICTION vertex

04 · Surface the policy (hasMerchantReturnPolicy as structured data) · 1 week · FRICTION vertex

05 · Close the memory loop (structured post-return confirmation) · 3-6 weeks · All three vertices
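
For levers 03 and 04, surfacing the policy as structured data might look roughly like the sketch below, which builds a schema.org MerchantReturnPolicy and attaches it to an Offer. The property names and enumeration URLs are schema.org vocabulary; the 90-day window, free returns by mail, the US-only scope, and the price are assumptions chosen for illustration, so replace them with your real policy.

```python
import json


def return_policy_jsonld(days, country="US"):
    """Build a schema.org MerchantReturnPolicy for a finite return window."""
    return {
        "@type": "MerchantReturnPolicy",
        "applicableCountry": country,
        "returnPolicyCategory": "https://schema.org/MerchantReturnFiniteReturnWindow",
        "merchantReturnDays": days,
        # Assumed for illustration: returns by mail, at no cost to the buyer.
        "returnMethod": "https://schema.org/ReturnByMail",
        "returnFees": "https://schema.org/FreeReturn",
    }


# Hypothetical offer; price and currency are placeholders.
offer = {
    "@context": "https://schema.org",
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "USD",
    "hasMerchantReturnPolicy": return_policy_jsonld(90),
}

print(json.dumps(offer, indent=2))
```

The printed JSON is what would go inside a script tag of type application/ld+json on the product page. Whatever you publish here has to match the policy you actually honor, or you have traded a FRICTION problem for an EXPECTATION one.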

What does not change

The triangle is not a replacement for the fundamentals; it sits on top of them. Returns optimization always rested on supply-chain quality, sizing accuracy at the manufacturing level, and fulfillment accuracy, and it still does. Best-in-class fulfillment runs around 99.5% order accuracy (ShipBob); the gap between that and your actual rate is a return-driver no schema field can fix. Agent commerce amplifies these levers because a fulfillment error now writes a memory entry instead of just a refund, but it does not replace them. If the product is wrong, the box is wrong, or the size runs two sizes off, the cleanest Agent Return Triangle in the world will still bleed. The triangle tells you which of your returns are interface problems and which are fundamentals problems, which is the same discipline as what makes an agent recommend you in the first place: get the substance right, then get the signals right.

By 2028 return rate stops being only a P&L line and becomes a brand-level signal the agents read, the same way reviews and citations are signals today. A brand that compresses the triangle keeps its agent memory positive and compounds its recommendation share; a brand that treats returns as a back-office cost watches the memory turn against it one entry at a time. The fundamentals still decide whether the product is right. The triangle decides whether the agent remembers it that way. Cresva instruments the triangle on your live store: schema audit, expectation calibration, memory-loop closure. Request early access to see which vertex is bleeding.

Returns are no longer only a P&L line. They are a GEO (generative engine optimization) signal. Cresva audits the Agent Return Triangle on your live store and tells you which vertex is bleeding. Request early access.

Frequently asked questions

How do returns work when an AI agent buys on behalf of a customer?
Mechanically the return is the same physical process, but it now has a second consequence. Beyond the refund and the restocked item, the outcome is written to the agent's memory of your brand at the post-purchase stage. That memory feeds future recommendations to other buyers, so a badly-handled return is no longer scoped to one customer; it lowers your recommendation odds across the agent's network.

What is the Agent Return Triangle?
The Agent Return Triangle is the three forces that determine return rate when an AI agent is the buyer: FIT (did the agent recommend the right product), EXPECTATION (did the description match what arrived), and FRICTION (how cleanly the return resolves). Each vertex has a specific operator lever: schema accuracy for FIT, description fact-density for EXPECTATION, and return-policy depth for FRICTION. FIT is the most undervalued.

Are return rates higher or lower with AI-agent purchasing?
It depends on the vertex you manage. Agents can lower returns by recommending better-fitting products when your schema is accurate, because the recommendation is more precise than a human's self-directed browse. But agents raise returns when your structured data overstates the product, because they take the data literally. The net effect is a function of how well you manage FIT and EXPECTATION, not a fixed direction.

How should a DTC brand calculate the cost of a return in 2026?
Start with the direct cost: processing (roughly $10 to $20 per Loop Returns benchmarks), restocking, and lifetime-value erosion from the disappointed customer. Then add the new line, the memory-propagation cost: the negative entry written to the agent's memory that lowers future recommendation odds. The propagation cost is hard to price precisely and should be treated as a modeled multiplier, but for high-velocity agents it is large because it touches many future recommendations.

What is the single highest-leverage lever to compress return rate in the agent era?
Tightening FIT by auditing your structured product data. A wrong recommendation is a near-certain return, and because an agent shortlists only one to three products, recommendation precision matters several times more than it did for a human comparing ten. Most operators skip this because a FIT-driven return looks like a sizing problem on the dashboard, so they fix the factory instead of the schema.

Does extending my return window from 30 to 90 days hurt margin?
Usually less than operators fear, and in the agent era often the opposite. A longer window lowers the friction that produces memory-damaging negative entries, and the data on return windows consistently shows that longer windows reduce panic returns more than they invite abuse. The processing cost per return is unchanged; what changes is the volume of friction-driven returns and the quality of the memory entry each return leaves behind.

Written by the Cresva Team
