The CEO's Guide to Willingness-to-Pay Research Across Commercial Archetypes
Willingness-to-pay research is often treated as a single methodology, but the operators who extract real revenue from it run multi-instrument programs matched to the channel they are measuring. This guide walks through the four-channel WTP program Antler & Forge Beverage built across DTC, specialty retail, restaurant wholesale, and duty-free. It shows how instrument choice, triangulation, and pricing discipline convert $148K of research spend into $1.2M of first-year lift.
The Operator's Guide to Willingness-to-Pay Research Across Commercial Archetypes
Most willingness-to-pay research fails not because the math is wrong but because the operator asked one channel's question and applied the answer to four channels. It is the single most expensive habit in commercial pricing, and it is still the default in companies that have otherwise professionalised everything else.
TL;DR
- Channel dictates instrument. There is no universal WTP method; DTC, specialty retail, restaurant wholesale, and duty-free each answer a different question and require a different tool.
- Triangulation beats precision. A triangulated estimate across two imperfect methods is reliably better than a single-method study that claims high confidence.
- Pricing decisions come from the narrowest overlap, not the widest range. The discipline is in what you refuse to translate.
- A four-channel program at Antler & Forge Beverage cost $148K over 14 weeks and produced $1.2M in first-year revenue lift, because the operator, Mikhail, refused to run a single-method study.
Exhibit: Four-channel WTP instrument matrix
Exhibit: Van Westendorp composite for flagship SKU
The core problem
Antler & Forge Beverage is a 55-person craft spirits house doing $24M in revenue across four channels: 38% DTC, 32% specialty liquor retail, 22% restaurant and bar wholesale, and 8% duty-free. Their portfolio is 180 SKUs across three price tiers: a $38 everyday expression, a $72 flagship, and a $145 limited release. They serve 820 restaurant accounts, 240 specialty retailers, and maintain duty-free presence in 18 countries.
Mikhail, the Head of Commercial at Antler & Forge, inherited a pricing architecture built on intuition and importer feedback. Every channel was being priced against a different unspoken assumption, and nobody on the team could articulate why. When he pushed on the flagship's $68 price point, the answer came back as "that's what the market will bear." Nobody could say which market, which buyer, or which purchase occasion.
Confusion is the enemy of willingness to pay. Before a number can be defended, the question underneath it has to be sharpened until it has exactly one interpretation. Mikhail's starting move was not to commission research; it was to list, channel by channel, the specific pricing decision he needed to make. That list forced the instrument choice that followed.
Part one: channel-specific WTP method choice
A DTC buyer browsing a Sunday email is not the same decision-maker as a sommelier building a by-the-glass programme, and neither resembles the duty-free traveller choosing a gift in a twenty-minute layover. Each has a different reference set, substitutes, decision horizon, and social signal. Pricing is a signal before it is a number, and the signal is read differently in each aisle.
Antler & Forge ran four instruments in parallel, each matched to the channel's decision structure:
- DTC flagship: Van Westendorp PSM paired with a choice-based conjoint, n=840 from the house list and a matched lookalike panel.
- Specialty retail: a 14-person retailer panel (store owners and category buyers) combined with a shelf scan-data regression across 240 accounts.
- Restaurant and bar wholesale: 22 sommelier and beverage-director ride-alongs, paired with a menu-placement A/B across 48 accounts.
- Duty-free: a traveller-intercept survey at four airports, n=380, stratified across inbound, outbound, and transit travellers.
Each instrument answered a question the others could not. The retailer panel surfaced shelf economics no consumer survey can see. The ride-alongs surfaced the rebottling-story narrative no conjoint attribute list could have anticipated. The intercept caught impulse, gift, and ritual triggers that only emerge in the physical environment. Together, the DTC Van Westendorp and conjoint triangulated a defensible flagship optimal.
Part two: the instrument library
An operator running a serious WTP programme should know five instruments cold, and know which question each is designed to answer.
- Van Westendorp PSM: four questions about what the buyer considers too cheap, a bargain, expensive, and too expensive. Best for an acceptable-range read on a known product. Weak on trade-offs.
- Choice-based conjoint: buyers choose between bundles at varying prices. Best for isolating WTP for a specific feature. Weak when the real purchase driver sits outside the attribute list.
- Shelf scan-data regression: uses point-of-sale data to estimate elasticity. Best for specialty and mass retail. Weak for new products and limited releases.
- Qualitative ride-along: sits with the buyer in their own environment. Best where the purchase is a relationship. Weak on projectability.
- Traveller intercept: captures the buyer at the moment of consideration. Best for duty-free and airport. Weak on sample control.
The best operators compete on discipline, not instinct, and discipline starts with refusing to use an instrument outside its design envelope.
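The Van Westendorp read described above can be sketched in a few lines. This is a minimal illustration, not the Antler & Forge analysis: the function names, the five toy respondents, and the price grid are all assumptions. It uses a common simplified convention in which the range endpoints come from crossing "too cheap" with "expensive" and "bargain" with "too expensive"; some practitioners intersect inverted "not cheap" and "not expensive" curves instead.

```python
from bisect import bisect_left, bisect_right


def share_at_or_below(sorted_answers, price):
    """Share of respondents whose stated threshold is <= price (a rising curve)."""
    return bisect_right(sorted_answers, price) / len(sorted_answers)


def share_at_or_above(sorted_answers, price):
    """Share of respondents whose stated threshold is >= price (a falling curve)."""
    return (len(sorted_answers) - bisect_left(sorted_answers, price)) / len(sorted_answers)


def first_crossing(grid, falling, rising):
    """First grid price at which the rising curve meets or passes the falling one."""
    for price in grid:
        if rising(price) >= falling(price):
            return price
    return None


def van_westendorp(too_cheap, bargain, expensive, too_expensive, grid):
    """Return the four classic crossing points on the supplied price grid."""
    tc, bg, ex, te = (sorted(a) for a in (too_cheap, bargain, expensive, too_expensive))

    def cross(fall, rise):
        return first_crossing(
            grid,
            lambda p: share_at_or_above(fall, p),
            lambda p: share_at_or_below(rise, p),
        )

    return {
        "range_low": cross(tc, ex),      # too cheap vs expensive
        "optimal": cross(tc, te),        # too cheap vs too expensive
        "indifference": cross(bg, ex),   # bargain vs expensive
        "range_high": cross(bg, te),     # bargain vs too expensive
    }


# Toy answers from five hypothetical respondents (not Antler & Forge data).
result = van_westendorp(
    too_cheap=[30, 40, 45, 50, 55],
    bargain=[45, 50, 55, 60, 65],
    expensive=[55, 60, 70, 75, 80],
    too_expensive=[65, 75, 85, 90, 100],
    grid=range(20, 121),
)
print(result)
```

With a real sample the curves are smoother and the crossings less sensitive to grid spacing. The output is still only a range-finder, which is why it gets paired with a second instrument.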
Part three: triangulation and why single-method studies fail
Antler & Forge's most expensive prior mistake was a single-method study. Two years before Mikhail joined, a conjoint-only DTC study returned an optimal flagship price of $85. The team lifted the price. DTC held. Restaurant accounts did not: forty-one delisted within the quarter because the wholesale price implied by $85 retail pushed the by-the-glass pour past the $18 ceiling operators in that tier defended.
The study was not wrong. The question was wrong. The conjoint measured DTC willingness-to-pay and was asked to stand in for a four-channel pricing decision. A parallel failure: a pure Van Westendorp study on the limited release that said $160 and missed the duty-free ceiling at $180. An under-priced product is as expensive as an over-priced one, just quieter.
Triangulation is the correction. The DTC Van Westendorp composite identified a flagship range of $66–$78 with an optimal around $74. The conjoint returned an independent optimum of $70. The retailer panel converged on $72. The sommelier work confirmed $72 retail translated to a defensible by-the-glass pour. Four instruments, one overlap: $72. The new price went live at $72, a $4 (roughly 6%) lift from $68, and held across every channel.
Part four: pricing decisions from WTP findings
Research produces findings. Findings do not produce prices. The translation is where most programmes quietly waste their budget.
Antler & Forge's four-channel programme produced three decisions:
- Flagship lifted from $68 to $72, a $4 (roughly 6%) move validated by the DTC instruments, the retailer panel, and the sommelier work. First-year DTC contribution: $640K.
- Restaurant accounts were offered a rebottling-story SKU at $88, built from the sommelier finding that restaurants would pay a premium for a pour with a narrative staff could tell tableside. First-year restaurant lift: $310K.
- Duty-free limited release priced at $175, five dollars under the $180 ceiling the intercept survey identified. First-year duty-free lift: $250K.
Total first-year lift: $1.2M against $148K over 14 weeks. The ROI is not the headline. The headline is that the same operator, asking a different question, would have produced a different number. The programme's discipline came from refusing to take any single instrument's answer as the final one.
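The arithmetic behind the total can be checked directly. The sketch below uses only the three lift figures and the research spend quoted above; the variable names are illustrative, and the multiple compares revenue lift to spend, so it is not a profit-based ROI.

```python
# First-year lift by decision, in dollars, as quoted in the three decisions above.
lifts = {
    "flagship (DTC contribution)": 640_000,
    "restaurant rebottling-story SKU": 310_000,
    "duty-free limited release": 250_000,
}
research_spend = 148_000

total_lift = sum(lifts.values())
return_multiple = total_lift / research_spend  # revenue lift per research dollar

print(f"Total first-year lift: ${total_lift:,}")
print(f"Lift per research dollar: {return_multiple:.1f}x")
```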
Pricing maturity is measured by what you stop doing. Mikhail stopped three things: translating DTC findings directly into wholesale list prices, treating conjoint output as a decision, and running studies without pre-registering the decision they were meant to inform.
Three failure modes
- Method monoculture. Running one instrument across all channels. Every channel outside the instrument's design envelope ends up priced by analogy, and analogy is the most expensive form of pricing.
- Sample of loyalists. Surveying the house list and calling it a WTP read. Loyalists reveal the ceiling of the existing customer, not the ceiling of the next customer. Antler & Forge's DTC study matched a lookalike panel to the house list; the lookalike WTP was $4 lower, and the lookalike number is the one they used.
- WTP-to-list-price straight translation. Treating the survey output as the price. WTP is an input to a decision that also has to absorb channel margin, promotional cadence, competitive reference sets, and psychological thresholds ($18 pour, $100 gift, $200 luxury) that survey instruments often miss.
A 30-60-90 sprint
Days 1–30: scope and pre-register. List every pricing decision the programme must inform, channel by channel. For each, write the question the instrument must answer and the decision rule that will convert the output into a price. If the rule cannot be written in advance, the study is not ready.
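One way to make the pre-registration step concrete is to hold each decision in a small record and refuse to field a study whose rule is blank. This is a hypothetical sketch: the class, its field names, and the example wording are assumptions, not a template from the Antler & Forge programme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreRegisteredDecision:
    channel: str
    decision: str        # the pricing decision the study must inform
    question: str        # the question the instrument must answer
    decision_rule: str   # how the output converts into a price

    def is_ready(self) -> bool:
        """A study is ready only when every field, including the rule, is written."""
        fields = (self.channel, self.decision, self.question, self.decision_rule)
        return all(text.strip() for text in fields)


flagship = PreRegisteredDecision(
    channel="DTC",
    decision="Set the flagship list price",
    question="What price range do house-list and lookalike buyers accept?",
    decision_rule="Price inside the overlap of all instrument ranges",  # illustrative
)
duty_free = PreRegisteredDecision(
    channel="Duty-free",
    decision="Set the limited-release price",
    question="Where is the traveller's price ceiling?",
    decision_rule="",  # rule not yet written, so the study is not ready
)

not_ready = [d.channel for d in (flagship, duty_free) if not d.is_ready()]
print(not_ready)
```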
Days 31–60: run instruments in parallel. Sequential studies leak budget and let early findings bias later ones. Antler & Forge ran all four instruments across weeks 4–10. Parallel fielding forces triangulation into the analysis plan.
Days 61–90: triangulate, decide, document. Build the overlap chart: every instrument's range on one axis, the proposed price on the other. The price lives in the narrowest overlap. Document what was decided, what was rejected, and what the next study will test.
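The overlap chart reduces to an interval intersection. In the sketch below, only the Van Westendorp range of $66–$78 comes from the programme as reported. The conjoint and retailer-panel bands are hypothetical placeholders drawn around the $70 and $72 point estimates, because the article gives those instruments as single numbers rather than ranges.

```python
def narrowest_overlap(ranges):
    """Intersect {instrument: (low, high)} ranges; return (low, high) or None."""
    low = max(lo for lo, _ in ranges.values())
    high = min(hi for _, hi in ranges.values())
    return (low, high) if low <= high else None


ranges = {
    "van_westendorp": (66, 78),  # range reported in the article
    "conjoint": (68, 72),        # hypothetical band around the $70 optimum
    "retailer_panel": (70, 74),  # hypothetical band around the $72 read
}

overlap = narrowest_overlap(ranges)
proposed_price = 72
inside = overlap is not None and overlap[0] <= proposed_price <= overlap[1]
print(overlap, inside)
```

A result of `None` is itself a finding: the instruments disagree, and the next study should test why before any price is set.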
FAQ
How much should an operator spend on WTP research? A reasonable benchmark is 0.5–1% of the revenue the pricing decision touches. Antler & Forge spent $148K against $24M of channel revenue: 0.6%. The spend is not the variable to optimise; decision clarity is. A $40K single-method study that produces the wrong answer is more expensive than a $150K multi-instrument programme, because the first failure costs delisted accounts and the second costs only a budget line.
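As a quick check on that benchmark, the sketch below computes the 0.5–1% band for $24M of channel revenue and where the $148K spend falls inside it. The function name and the basis-point parameters are illustrative assumptions.

```python
def research_budget_band(revenue_touched, low_bps=50, high_bps=100):
    """Return the (low, high) spend band; 50 to 100 basis points is 0.5% to 1%."""
    return revenue_touched * low_bps / 10_000, revenue_touched * high_bps / 10_000


revenue = 24_000_000
spend = 148_000

low, high = research_budget_band(revenue)
share_pct = spend / revenue * 100

print(f"Benchmark band: ${low:,.0f} to ${high:,.0f}")
print(f"Actual spend: ${spend:,} ({share_pct:.1f}% of revenue touched)")
```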
Is Van Westendorp still credible as a standalone method? It is credible as a range-finder and as one leg of a triangulated estimate, not as a standalone basis for a pricing decision in any channel where trade-offs, substitutes, or shelf economics are material. Designed in 1976 for CPG with a clear reference set, it performs best when triangulated against at least one other instrument.
When is conjoint the right instrument, and when is it wrong? Conjoint is right when the decision is about the relative value of specific attributes (pack size, feature bundle, service tier) and the buyer's real-world decision resembles the choice task. It is wrong when the dominant purchase driver sits outside the attribute list, as is common in category-driven, ritual-driven, or narrative-driven purchases. The earlier conjoint-only DTC study missed the ritual-purchase driver, which is why the $85 number collapsed when restaurants priced it.
How do you handle duty-free pricing without a large sample? Duty-free is the hardest channel to sample because travellers are transient and the context cannot be replicated online. Antler & Forge used a four-airport intercept with n=380, stratified across inbound, outbound, and transit travellers, and triangulated against the DTC Van Westendorp composite. The overlap, not the intercept number alone, informed the $175 price. Passive data (category sell-through, price-band density) can supplement a modest primary sample.
What is the single most common mistake in WTP research? Treating the survey output as the price. WTP is an input to a decision that also has to absorb margin architecture, channel conflict, competitive positioning, and psychological thresholds buyers defend without articulating. The translation from finding to price is the craft, and it is the step most programmes shortcut. The second most common mistake is surveying only loyalists and calling the result the market.
How do you prevent channel conflict when pricing varies across channels? Pricing varies across channels by design, not by accident. Antler & Forge publishes a single SRP but differentiates through channel-specific promotional cadence, pack configurations, and limited releases. The WTP programme measures each channel's ceiling separately and the commercial team reconciles through product architecture, not list price. Conflict emerges when the same SKU is priced inconsistently, not when channels carry different propositions.
How often should a WTP programme be refreshed? Major programmes every 24–36 months, with targeted instrument refreshes annually on products and channels that move. Van Westendorp drifts quickly when the competitive reference set shifts. Sommelier ride-alongs get refreshed yearly because the by-the-glass environment is the most volatile variable in the restaurant channel.
How does this translate to non-beverage archetypes? The channel-and-instrument logic transfers directly. B2B software operators have buyer personas instead of channels; each persona has a different reference set, and conjoint alone is as dangerous there as for the flagship spirits study. Marketplaces need both sides measured separately. Hardware-plus-consumables operators must separate acquisition WTP from consumable WTP. Professional services require ride-alongs because the decision is narrative-driven.
Dual CTA
If you are an operator: before commissioning the next WTP study, write the exact pricing decision you need to make, channel by channel. If you cannot write it in one sentence per channel, the study will waste its budget. Start with the decision, not the instrument.
If you are building a pricing function from scratch: run the 30-60-90 sprint in parallel, not in sequence. Pre-register decision rules, triangulate across at least two instruments per channel, and document rejections as carefully as conclusions. The asset is the discipline; the revenue lift is the receipt.
Questions, answered
Why does willingness-to-pay research so often fail to produce pricing lift?
Because the operator asks one channel's question and applies the answer to four channels. A DTC buyer browsing a Sunday email is not the same decision-maker as a sommelier building a by-the-glass program, and neither resembles a duty-free traveller choosing a gift in a twenty-minute layover. Each has a different reference set, substitutes, decision horizon, and social signal. Instrument choice follows the channel's decision structure, not the researcher's preference.
When should we use choice-based conjoint versus Van Westendorp?
Choice-based conjoint is right when the product has multiple features that interact. It gives part-worth utilities and lets you simulate demand for specific bundles. Van Westendorp is a price sensitivity tool for a known product at varying price levels. Use conjoint to design the offer. Use Van Westendorp to pressure-test the price band. A third option, Gabor-Granger, ladders a single product across price points to find acceptance curves.
What sample size do we need for credible willingness-to-pay work?
For qualitative WTP interviews, eight to fifteen conversations per segment usually surface the decision logic. For choice-based conjoint among consumers, aim for 200 to 400 respondents per segment to stabilize part-worths. For B2B buyer conjoint, 60 to 120 qualified respondents is often enough if the segment is tight. Van Westendorp needs 150 to 300 respondents per segment to produce clean intersection points.
About the Author(s)
Emily Ellis is the Founder of FintastIQ and the Growth Operating System. She has 20 years of experience leading pricing, value creation, and commercial transformation initiatives for PE portfolio companies and high-growth businesses, and previously held leadership roles at McKinsey and BCG.
