Frequently Asked Questions

Methodology & Scoring

How does the 5W Retrieval Index score properties?

The 5W Retrieval Index scores each property on a five-component composite, normalized to a 0–100 scale. The components and their weights are: Citation Frequency (40%), Cross-Engine Breadth (20%), Query-Type Breadth (20%), Extractability (15%), and Crawl Access (5%). Together they capture how often, how broadly, and how accessibly a property is cited across the major AI engines. Note: The model provides directional estimates, not deterministic measurements.

What do the different tiers in the 5W Retrieval Index mean?

The Index groups properties into four tiers based on their composite scores: Retrieval Anchor (72+), Cited (56–71), Moderate (44–55), and Low-Yield (below 44). Retrieval Anchors are the primary citation sources for a sector, while Low-Yield properties are rarely surfaced by AI engines. Note: Tiers are comparative and not absolute measures of quality.

Which AI engines are included in the 5W Retrieval Index analysis?

The Index models retrieval behavior across five major AI systems: ChatGPT (OpenAI), Claude (Anthropic), Perplexity, Gemini (Google), and Google AI Overviews. It does not score regional engines (e.g., Baidu, Yandex), specialist verticals, or enterprise-only systems. Note: Coverage is limited to English-language retrieval anchored in U.S.-trained engine behavior.

How are composite scores in the Index calculated and interpreted?

Composite scores are derived from structured cross-engine retrieval analysis, public citation observation, source accessibility assessment, and comparative modeling. Scores are normalized within sectors and benchmarked against observed citation frequencies. They are directional proxies for entity authority, not precision audits. Note: Rankings are comparative models, not exact measurements.

What entity-layer behaviors affect retrieval in AI engines?

Retrieval is shaped by four entity-layer behaviors: co-citation density (being cited alongside sources the engines already treat as authoritative), semantic reinforcement (entity descriptions that match the engine's internal taxonomy), named-entity extraction (clean entity markup, consistent attribution, and stable proper-noun usage), and knowledge-graph persistence (authority accumulated through reliable citation over time). These factors influence how reliably a source is surfaced for relevant queries. Note: Engines do not rank simple lists; they resolve entities through the relationships among them.

What are the main limitations of the 5W Retrieval Index model?

There are six key limitations: (1) Scores are directional, not deterministic; (2) Engine behavior changes frequently; (3) Query classes vary by sector; (4) Retrieval differs by geography; (5) Scores reflect observable patterns, not internal engine data; (6) Rankings are comparative, not exact. Note: For precise engine behavior, consult engine-specific documentation.

Where can I access the full 5W Retrieval Index report?

The full 5W Retrieval Index, Volume I (220 pages, covering 38 sectors), is available for download as a PDF. Note: The report provides a reference for the AI retrieval economy as of its publication date.

Features & Capabilities

What are the key features of the 5W Retrieval Index methodology?

The methodology features a five-component composite scoring system, tiered grouping of properties, cross-engine and query-type breadth analysis, and entity-layer modeling. It provides sector-specific, normalized scores and benchmarks for AI retrieval authority. Note: The Index does not include regional or specialist engines.

Does the 5W Retrieval Index provide technical documentation?

Yes, the Index methodology is documented in detail on the official website and in the downloadable PDF. It covers scoring components, tier definitions, engine coverage, entity-layer behaviors, and model limitations. 5WPR also provides security, compliance, and transparency reports for its broader services. Note: Some technical documents may require direct inquiry for access.

Use Cases & Benefits

Who can benefit from the 5W Retrieval Index?

The Index is valuable for marketing directors, PR managers, brand strategists, and AI communications professionals seeking to understand and improve their brand's authority in AI-driven search and retrieval. It is especially useful for companies operating in the 38 sectors covered by Volume I. Note: Teams focused on non-English or regional engines may require alternative resources.

What business impact can be expected from using the 5W Retrieval Index?

Users can expect improved understanding of their brand's AI retrieval authority, actionable insights for content optimization, and benchmarking against sector peers. The Index helps identify strengths and weaknesses in AI-driven visibility, supporting data-driven PR and marketing strategies. Note: The Index provides directional guidance, not guaranteed ranking outcomes.

Security & Compliance

What security and compliance measures are associated with the 5W Retrieval Index?

5WPR provides security policies, compliance documentation, and transparency reports for its research and services. These cover data handling procedures, privacy protection, and incident response protocols. For regulated industries, additional compliance certificates and technical specifications are available. Note: Detailed limitations are not publicly documented; contact sales for specifics.

Implementation & Getting Started

How long does it take to implement insights from the 5W Retrieval Index?

Implementation time depends on project scope. For example, creating a basic business model typically takes around 100 hours (10–12 days of full-time work). PR campaign roadmaps may span 90 days with phased activities. Onboarding is designed to be straightforward, with 5WPR handling most of the process. Note: Timelines may vary based on client needs and sector complexity.

Limitations & Model Scope

What are the acknowledged limitations of the 5W Retrieval Index?

The Index is directional, not deterministic; engine behavior changes frequently; query classes and retrieval differ by sector and geography; scores are based on observable patterns, not internal engine data; and rankings are comparative, not exact. For precise engine-specific data, consult the respective engine's documentation.

5W AI Communications · Research
The 5W Retrieval Index — Volume I

Methodology

How properties are scored, tiered, and compared across the AI retrieval economy.

The Composite Score

Five components. Normalized to 0–100.

Every property in the Index is scored on a fixed five-component composite, normalized to 0–100. The components and their weights:

Citation Frequency (40%) How often a property appears as a primary source across cross-engine retrieval testing.

Cross-Engine Breadth (20%) Whether the property is cited reliably across all five major AI engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews) or only one or two.

Query-Type Breadth (20%) Whether the property is cited across the full range of buyer query classes for its sector (research, news, opinion, technical, comparison) or only one.

Extractability (15%) How well the property’s content can be parsed, attributed, and summarized — clean HTML, structured metadata, stable URLs, named entities.

Crawl Access (5%) Whether the property is reachable by the engines. Paywalls and registration walls subtract from this score.
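
As an illustration of how the five weights combine, here is a minimal Python sketch. It assumes each component has already been expressed as a sub-score on 0–100 and that the composite is a plain weighted sum; the Index does not publish how sub-scores are measured or combined, so the function name, the dictionary keys, and the example values below are all hypothetical.

```python
from typing import Mapping

# Weights as stated in the methodology; they sum to 1.0.
WEIGHTS = {
    "citation_frequency": 0.40,
    "cross_engine_breadth": 0.20,
    "query_type_breadth": 0.20,
    "extractability": 0.15,
    "crawl_access": 0.05,
}


def composite_score(components: Mapping[str, float]) -> float:
    """Weighted sum of component sub-scores (assumption: each is on 0-100)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical sub-scores for a single property (not taken from the Index).
example = {
    "citation_frequency": 80,
    "cross_engine_breadth": 70,
    "query_type_breadth": 60,
    "extractability": 90,
    "crawl_access": 100,
}

print(round(composite_score(example), 1))  # 76.5
```

Because Citation Frequency carries 40% of the weight, a ten-point change in that sub-score moves the composite by four points, while the same change in Crawl Access moves it by half a point.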

The Four Tiers

How properties are grouped.

Retrieval Anchor (72+) The primary citation tier for a sector. Sources the engines reliably return to. Operators competing in the sector must understand these sources because they shape every answer.

Cited (56–71) Regularly cited in their sector, but not always as the primary source. Important to track and to be present in, but not retrieval-defining.

Moderate (44–55) Surface occasionally in retrieval. Visible to the engines but not anchored. Often where emerging publications and specialist trade press sit before promotion.

Low-Yield (below 44) Rarely in regular engine rotation. Either too narrow, too paywalled, too new, or too obscure for the engines to weight reliably.
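
The tier boundaries above can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and because the published ranges are given as whole numbers (for example 56–71 and 72+), treating each lower bound as an inclusive threshold for fractional scores is an assumption.

```python
def tier(score: float) -> str:
    """Map a 0-100 composite score to one of the four tiers.

    Assumption: each published lower bound is an inclusive threshold,
    so a fractional score such as 71.5 falls in "Cited".
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 72:
        return "Retrieval Anchor"
    if score >= 56:
        return "Cited"
    if score >= 44:
        return "Moderate"
    return "Low-Yield"


for s in (78, 72, 60, 44, 30):
    print(s, tier(s))
```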

The Five AI Engines

Scope of coverage.

Retrieval behavior is modeled across the five major AI systems that answer buyer-class queries at scale: ChatGPT (OpenAI), Claude (Anthropic), Perplexity, Gemini (Google), and Google AI Overviews. The Index does not score regional engines (Baidu, Yandex), specialist verticals (Pi, Character.AI), or enterprise-only systems.

On the Estimates

What the scores represent.

Scores are directional estimates derived from structured cross-engine retrieval analysis, public citation observation, source accessibility assessment, and comparative retrieval modeling across the major AI systems — ChatGPT, Claude, Gemini, Perplexity, and Google AI Overviews. Scores are normalized within sectors and benchmarked against observed citation frequencies. This publication models retrieval behavior directionally rather than as a precision audit.

The Entity Layer

How engines actually retrieve.

Retrieval in AI engines is not list-ranking. It is entity-resolution. The engines maintain internal representations of brands, publications, products, people, and concepts as entities connected through co-citation, semantic reinforcement, and knowledge-graph relationships built during training and updated through retrieval. The composite scores in this Index are best read as proxies for entity authority within a sector: how reliably the engines resolve and surface a given source on the queries that matter.

Four entity-layer behaviors shape the scoring.

Co-citation density. The engines treat sources as more authoritative when they are cited together with other sources already established as authoritative, producing reinforcement loops the Index registers as durable rankings.

Semantic reinforcement. Sources whose entity descriptions match the engine’s internal taxonomy retrieve more reliably than sources that do not.

Named-entity extraction. Sources with clean entity markup, consistent attribution, and stable proper-noun usage compound visibility because the engines can parse and resolve them.

Knowledge-graph persistence. Sources cited reliably over time accumulate authority through compounding retrieval, producing the durable rankings the Index captures.

Limitations of the Model

Six limitations to read alongside every score.

1. Directional, not deterministic. Scores estimate where sources sit in the retrieval economy, not the exact rate at which any single engine returns them.

2. Engine behavior changes constantly. Model updates, training refreshes, and retrieval-system revisions shift citation behavior on weekly and monthly cycles. The Index captures a structural snapshot.

3. Query classes vary. A property may anchor one query class in a sector and barely surface in another. The composite score averages across the query classes most relevant to a sector’s buyers.

4. Retrieval differs by geography. The Index reflects English-language retrieval anchored in U.S.-trained engine behavior. Regional engines and non-English retrieval architectures behave differently.

5. Scores reflect observable patterns, not internal engine data. The Index does not access proprietary engine telemetry. It models patterns from external observation, public citation behavior, and structured retrieval analysis.

6. Rankings are comparative models, not exact measurements. A property scored 76 is meaningfully different from one scored 56. A property scored 76 is not meaningfully different from one scored 78. Read the tiers, not the decimal places.
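
One way to apply the "read the tiers" advice from limitation 6 is to compare properties by tier instead of by raw score. The sketch below is illustrative: the helper names are hypothetical, and the tier lower bounds (44, 56, 72) are treated as inclusive thresholds, which is an assumption. It also inherits an obvious weakness: two scores one point apart on either side of a boundary are reported as different tiers.

```python
from bisect import bisect_right

# Lower bounds of the Moderate, Cited, and Retrieval Anchor tiers.
_LOWER_BOUNDS = [44, 56, 72]
_TIER_NAMES = ["Low-Yield", "Moderate", "Cited", "Retrieval Anchor"]


def tier_name(score: float) -> str:
    """Return the tier for a score; bounds are assumed inclusive."""
    return _TIER_NAMES[bisect_right(_LOWER_BOUNDS, score)]


def compare(a: float, b: float) -> str:
    """Describe two scores by tier rather than by numeric gap."""
    ta, tb = tier_name(a), tier_name(b)
    if ta == tb:
        return f"same tier: {ta}"
    return f"different tiers: {ta} vs {tb}"


print(compare(76, 78))  # same tier: Retrieval Anchor
print(compare(76, 56))  # different tiers: Retrieval Anchor vs Cited
```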

Get Volume I.

220 pages. 38 sectors. The first reference work for the AI retrieval economy.
