Methodology
How we score a Level 1 GTM Due Diligence report
A Level 1 report is generated in ~1-2 minutes from external signals only; no company cooperation is required. This page documents the scoring formulas, grade cutoffs, data sources, and limitations so every grade and finding is defensible.
The three scored dimensions
Each dimension is scored 0-100, then mapped to a letter grade. The overall grade is the equal-weighted average of the three dimension scores.
AI Visibility
0.4 × SoV + 0.3 × (100 − 10 × VisibilityGaps) + 0.3 × CitationAuthority
- Share of Voice (SoV): Percentage of AI-generated answers about your category that mention your company by name. Measured across Perplexity answer-market queries.
- Visibility Gaps: Count of high-intent category queries where no authoritative source cites your company. Each gap subtracts 10 points.
- Citation Authority: Share of citations in AI answers that originate from your own domain (docs, blog, case studies). Reflects content depth, not just mentions.
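The AI Visibility formula above can be sketched directly; the clamp on the gap term is an assumption (without it, more than ten visibility gaps would push that component negative):

```python
def ai_visibility_score(sov, visibility_gaps, citation_authority):
    """Illustrative AI Visibility score (0-100).

    sov, citation_authority: percentages, 0-100.
    visibility_gaps: count of high-intent queries with no citation.
    The max(0, ...) clamp is an assumption, not a documented constant.
    """
    gap_component = max(0, 100 - 10 * visibility_gaps)
    return 0.4 * sov + 0.3 * gap_component + 0.3 * citation_authority

# 35% SoV, 4 gaps, 20% citation authority:
# 0.4*35 + 0.3*60 + 0.3*20 = 14 + 18 + 6 = 38
```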
Market Perception
0.35 × AverageReviewRating_normalised + 0.25 × ReviewVolume_log + 0.2 × StrengthCount − 0.2 × WeaknessSeverity
- Review signals: Aggregated ratings and sentiment from G2, Capterra, TrustRadius. Extracted via structured LLM analysis with verified quote citations.
- Strength/weakness themes: Recurring themes across reviews, clustered and frequency-weighted. Thin signals (fewer than two strengths and more than one weakness, extracted from zero underlying reviews) are rejected as unreliable.
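A sketch of the Market Perception formula. The normalisation constants here are assumptions, not the production values: ratings are rescaled from a 5-star scale, review volume is log-scaled and saturates at 1,000 reviews, and strength count is capped at ten themes.

```python
import math

def market_perception_score(avg_rating, review_count,
                            strength_count, weakness_severity):
    """Illustrative Market Perception score (0-100).

    avg_rating: 1-5 stars; weakness_severity: 0-100.
    All scaling choices below are assumptions for illustration.
    """
    rating_norm = (avg_rating / 5.0) * 100          # 5 stars -> 100
    volume_log = min(100, (math.log10(1 + review_count) / 3.0) * 100)
    strengths = min(strength_count, 10) * 10        # cap at 10 themes
    score = (0.35 * rating_norm + 0.25 * volume_log
             + 0.2 * strengths - 0.2 * weakness_severity)
    return max(0.0, min(100.0, score))              # clamp to 0-100
```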
Competitive Position
0.3 × MarketShareProxy + 0.25 × FundingDeltaSignal + 0.25 × ProductVelocitySignal + 0.2 × WinLossPattern
- Competitor activity: For each named competitor, we track funding events, product launches, hiring velocity, and strategic moves (acquisitions, partnerships), and assign a threat level from 0 to 100.
- Win/loss pattern extraction: Where the target wins vs. loses in the category, extracted from review switch-signals, G2 comparison pages, and category analyst commentary.
Grade rubric
80-100 · Dominant — best-in-category signals
65-79 · Strong — defensible position with clear strengths
45-64 · Mid-pack — visible gaps, requires investigation
25-44 · Weak — material concerns in this dimension
0-24 · Critical — broken or invisible channel
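The rubric cutoffs and the equal-weight averaging can be expressed as a small sketch:

```python
def grade_label(score):
    """Map a 0-100 dimension score to its rubric band."""
    bands = [(80, "Dominant"), (65, "Strong"), (45, "Mid-pack"),
             (25, "Weak"), (0, "Critical")]
    for cutoff, label in bands:
        if score >= cutoff:
            return label
    return "Critical"

def overall_score(ai_visibility, market_perception, competitive_position):
    """Overall grade input: equal-weighted average of the three dimensions."""
    return (ai_visibility + market_perception + competitive_position) / 3
```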
Confidence derivation
Each dimension carries a confidence badge (high / medium / low) based on signal density: the count of independent data points the grade rests on.
- High: 30+ weighted signals (reviews, citations, competitor data points, successful queries)
- Medium: 10-29 signals. Grade is directionally correct; specific figures may have ±30% variance
- Low: Fewer than 10 signals. Treat the grade as a directional hypothesis to investigate, not a conclusion
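The badge derivation is a straight threshold on signal count, per the bands above:

```python
def confidence_badge(signal_count):
    """Confidence badge from signal density (count of independent data points)."""
    if signal_count >= 30:
        return "high"
    if signal_count >= 10:
        return "medium"
    return "low"
```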
Data sources
Perplexity Sonar
AI answer market queries across 40-70 targeted prompts per report
Grok (xAI) X Search
Real-time X/Twitter signal extraction: brand mentions, competitor activity, thought leadership, buyer discussions
G2 / Capterra / TrustRadius
Structured review aggregation with theme clustering + verified quote citations
Crunchbase / LinkedIn
Funding timeline, headcount trajectory, exec turnover signals (fallback search when primary extraction is thin)
Synthesis (why the engine isn't an AI wrapper)
Data extraction runs on Claude Haiku 4.5 (volume-optimised). The three killer synthesis layers run on Claude Opus 4.5 (reasoning-optimised) with a judge-and-retry loop:
- Constraint Hypothesis: the one-paragraph diagnosis above the fold
- Investor Questions: 5-7 sharp questions to ask the founder, each citing specific evidence
- Peer Benchmark: side-by-side grades for 3-4 closest competitors on the same dimensions
A Haiku judge scores every Opus output 0-10 against a per-analyzer rubric (cites specific evidence? partner-level phrasing? banned template phrases absent?). Outputs below 7 trigger regeneration with injected feedback, up to 3 attempts.
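The judge-and-retry loop can be sketched as follows. `generate` and `judge` are hypothetical stand-ins for the Opus generator and the Haiku judge; the fallback to the best-scoring draft after three failed attempts is an assumption about how exhaustion is handled.

```python
def generate_with_judge(generate, judge, max_attempts=3, threshold=7):
    """Judge-and-retry loop (sketch).

    generate(feedback) -> draft text (feedback is None on the first pass).
    judge(draft) -> (score 0-10, feedback string injected into the retry).
    Returns the first draft at or above threshold; otherwise the
    best-scoring draft seen across max_attempts (assumed fallback).
    """
    best_draft, best_score, feedback = None, -1, None
    for _ in range(max_attempts):
        draft = generate(feedback)
        score, feedback = judge(draft)
        if score >= threshold:
            return draft
        if score > best_score:
            best_draft, best_score = draft, score
    return best_draft
```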
What a Level 1 report does NOT do
Level 1 is deliberately scoped to external signals. It will not quantify:
- Gross or net revenue retention, magic number, CAC payback, or any unit-economics metric
- Actual pipeline health, win rates, sales cycle length, or deal-size distribution
- Organisational capacity, team skill gaps, or internal process friction
- Customer-specific churn predictors or segment-level NRR
These require internal data. They're covered by the Level 2 GTM Intelligence Report.
The GRIP Framework
GRIP is the diagnostic framework behind this DD tool. It is built on the Theory of Constraints: every revenue system has exactly one binding constraint. Fix that, and the whole system improves; fix anything else, and nothing changes. GRIP maps that constraint across four dimensions, twelve modules, and seventy-two pillars.
G
Guidance
Is direction clear?
R
Resources
Is the system equipped?
I
Implementation
Does execution convert?
P
Performance
Do outcomes compound?
Each dimension contains 3 deep-dive modules, each with 6 pillars and ~60 diagnostic questions. The 12 modules below:
G · GSL
GTM Strategy & Leadership
Strategic clarity, leadership alignment, resource allocation, operating rhythm, cross-functional alignment, adaptability.
G · MI
Market Intelligence
Market research, competitive intelligence, customer insights, segmentation, trend analysis.
G · PMM
Product Marketing
Positioning, messaging, launch execution, competitive differentiation, content strategy, sales enablement.
R · PP
Pricing & Packaging
Pricing strategy, packaging design, value metrics, discount governance, pricing experimentation.
R · PR
Product Readiness
Launch readiness, beta programs, documentation, training, feedback loops, product-GTM alignment.
R · EN
Enablement
Onboarding, ongoing training, content delivery, coaching, certification, enablement measurement.
I · DG
Demand Generation
Pipeline generation, channel mix, campaigns, conversion optimization, ABM maturity, marketing-sales alignment.
I · SE
Sales Execution
Sales process maturity, pipeline management, forecasting, deal execution, territory, methodology.
I · RevOps
Revenue Operations
Tech stack, data architecture, process automation, reporting, planning systems, operational efficiency.
P · CSE
Customer Success & Expansion
Onboarding, adoption, retention, expansion, health scoring, customer advocacy programs.
P · DI
Data & Insights
Data infrastructure, analytics maturity, reporting cadence, insight generation, attribution, governance.
P · AG
Alignment & Governance
Cross-functional governance, decision-making frameworks, change management, communication, accountability.
The Level 1 DD tool above touches 3 of these 12 modules (MI, PMM, and the GTM-execution slice of GSL) from external signals only. The Level 2 GTM Intelligence Report quantifies all 12 with internal operational data.
Edge cases
Why is AI Visibility an A when my company barely spends on marketing?
AI Visibility measures presence in AI-generated answers, not marketing spend. A category-defining product with strong docs, an active community, and G2 leadership can earn an A with zero paid marketing: the LLMs pick up your mentions organically.
Why do two runs of the same company produce slightly different scores?
The Perplexity + Grok search engines evolve constantly: new articles, new competitor moves, new reviews. Scores drift ±3-5 points run-to-run. For a stable baseline, run on the 1st of each month and compare to the prior month.
How accurate are the ARR / headcount estimates?
Public figures (disclosed in press, Crunchbase, LinkedIn) are used verbatim. When specific numbers are absent, ARR estimates fall back to valuation divided by a typical SaaS revenue multiple, and headcount estimates to LinkedIn employee buckets. The confidence badge on the grade card signals how tight the estimate is.
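The fallback chain reads naturally as a sketch. The 10x revenue multiple is illustrative, not the tool's constant, and the function names are hypothetical:

```python
def estimate_arr(disclosed_arr=None, valuation=None, saas_multiple=10.0):
    """ARR estimation fallback chain (sketch).

    Disclosed figures are used verbatim; otherwise ARR is inferred
    from valuation / revenue multiple (10x is an assumed default).
    Returns (estimate, confidence_label); (None, "low") with no signal.
    """
    if disclosed_arr is not None:
        return disclosed_arr, "high"
    if valuation is not None:
        return valuation / saas_multiple, "medium"
    return None, "low"
```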
