Free tool — runs entirely in your browser

Backlink Relevance
Cosine Scorer

Measure how semantically aligned your backlinks really are. Powered by the same AI model we use in our audits — deterministic, transparent, free.

Three signals. One score.

Google doesn't just count links: it evaluates whether the linking page is contextually relevant to your content. We model this with three semantic signals.

1. Context ↔ Target Body

We compare the paragraph surrounding your backlink on the referring page against the full body text of your target page. This is the strongest relevance signal, worth 50% of the score, and it models Google's context2 signal.

2. Context ↔ Keywords

The surrounding paragraph is also compared to your keyword cluster, capturing topical alignment even when anchors are generic. This contributes 30% — context signals together dominate at 80%.

3. Anchor ↔ Keywords

The anchor text is matched against your target keywords. Important but less dominant than context — worth 20%. A branded anchor on an irrelevant page won't inflate the score.

0.50 × cos(context, body) + 0.30 × cos(context, keywords) + 0.20 × cos(anchor, keywords)
Strong ≥ 0.45
Moderate 0.25 – 0.44
Weak < 0.25

Model: all-MiniLM-L6-v2 (384-dim) via ONNX Runtime. Deterministic — same inputs always produce the same score.
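
The formula and tiers above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the names `weighted_score` and `tier` are hypothetical, the input similarities are made up, and treating scores between 0.44 and 0.45 as Moderate is an assumption, since the published tiers are stated to two decimals.

```python
def weighted_score(ctx_body: float, ctx_kw: float, anchor_kw: float) -> float:
    """Combine the three cosine similarities with the published 0.50/0.30/0.20 weights."""
    return 0.50 * ctx_body + 0.30 * ctx_kw + 0.20 * anchor_kw


def tier(score: float) -> str:
    """Map a score to a tier. Assumption: Moderate covers [0.25, 0.45)."""
    if score >= 0.45:
        return "Strong"
    if score >= 0.25:
        return "Moderate"
    return "Weak"


# Hypothetical similarities for a single backlink.
score = weighted_score(ctx_body=0.60, ctx_kw=0.50, anchor_kw=0.30)
print(round(score, 2), tier(score))
```

With these example inputs the context terms contribute 0.30 + 0.15 and the anchor term 0.06, so the link lands in the Strong tier.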

Score your backlinks

Enter URLs and we'll extract the content automatically, or paste text directly. Use bulk mode for scale.

The page that links to you: we'll find the link and extract the anchor and its surrounding paragraph.
Your page: we'll extract the body content.
Comma-separated keywords you're targeting, needed for the full weighted formula (context↔keywords 30% + anchor↔keywords 20%).
The surrounding paragraph - the most important signal (50%)
Your page's main content - what the link points to
The clickable link text (20%)
Comma-separated keyword cluster — context↔keywords is 30%
Upload a CSV with URLs or text. Columns: url_a, url_b (fetches & extracts) or text_a, text_b (direct text).

Accepts .csv - max 200 rows for URL mode, 500 for text mode
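
If you prepare bulk files programmatically, a pre-flight check along these lines may catch problems before upload. It is only a sketch of the rules stated above (column pairs and row limits); `detect_mode` and `MAX_ROWS` are hypothetical names, and the tool's own validation may differ.

```python
import csv
import io

# Row limits as stated on this page; the names here are hypothetical.
MAX_ROWS = {"url": 200, "text": 500}


def detect_mode(csv_text: str) -> tuple[str, list[dict]]:
    """Return ("url" or "text", rows) for a bulk file, or raise ValueError."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = set(reader.fieldnames or [])
    if {"url_a", "url_b"} <= columns:
        mode = "url"
    elif {"text_a", "text_b"} <= columns:
        mode = "text"
    else:
        raise ValueError("expected columns url_a,url_b or text_a,text_b")
    rows = list(reader)
    if len(rows) > MAX_ROWS[mode]:
        raise ValueError(f"{mode} mode accepts at most {MAX_ROWS[mode]} rows")
    return mode, rows


sample = "text_a,text_b\nbest dog food,premium dog food guide\n"
mode, rows = detect_mode(sample)
print(mode, len(rows))
```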

Why cosine similarity matters for backlinks

The Reasonable Surfer Model

Google's Reasonable Surfer patent (US 7,716,225, filed 2004 and granted 2010) assigns different weights to links based on their probability of being clicked. A link in a contextually relevant paragraph carries more weight than one in a footer or sidebar. Our formula models this: context signals = 80%, anchor text = 20%.

Confirmed by the 2024 API Leak

The 2024 Google Content Warehouse API leak (documented by iPullRank, SparkToro) exposed real production ranking fields:

context2 — hash of terms near the anchor (paragraph-level context, NOT full page body)
fullLeftContext / fullRightContext — extended text window around the link
anchorMismatchDemotion — penalty when anchor topic doesn't match destination
sourceType — quality tier of the linking page (HIGH/MEDIUM/LOW)
siteFocusScore — how topically focused the target site is
siteRadius — how far individual pages deviate from the site's topic centroid

Critically, Google evaluates the paragraph around the link, not the full referring page body. On a 2,000-word page about marketing with one paragraph about SEO tools, only that paragraph matters for a link to an SEO tool site.
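
To make "the paragraph around the link" concrete, here is a rough sketch of paragraph-level extraction using only Python's standard `html.parser`. It is an assumption about how such extraction could work, not a description of this tool's extractor or of Google's: the class name `LinkContext` is hypothetical, only `<p>` elements are considered, and the link is matched by exact `href`.

```python
from html.parser import HTMLParser


class LinkContext(HTMLParser):
    """Collect the text of each <p> that contains a link to the target URL."""

    def __init__(self, target_url: str):
        super().__init__()
        self.target_url = target_url
        self.in_paragraph = False
        self.has_target_link = False
        self.buffer = []    # text pieces of the current paragraph
        self.contexts = []  # finished paragraphs that link to the target

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph, self.has_target_link, self.buffer = True, False, []
        elif tag == "a" and self.in_paragraph:
            if dict(attrs).get("href") == self.target_url:
                self.has_target_link = True

    def handle_data(self, data):
        if self.in_paragraph:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            if self.has_target_link:
                # Join the pieces and collapse runs of whitespace.
                self.contexts.append(" ".join("".join(self.buffer).split()))
            self.in_paragraph = False


html = (
    "<p>General marketing advice for small teams.</p>"
    "<p>For audits we like <a href='https://example.com/seo'>this SEO tool</a>.</p>"
)
parser = LinkContext("https://example.com/seo")
parser.feed(html)
print(parser.contexts)
```

Only the second paragraph is kept, because it is the only one containing the target link; the marketing paragraph is ignored regardless of its length.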

Why 80/20 Context vs Anchor?

The leak shows context2 is a primary signal while anchor text is secondary and subject to anchorMismatchDemotion. A branded anchor ("Pet Circle") on an irrelevant page (fashion site) would score high on anchor-match alone — but Google demotes this. Our 0.50/0.30/0.20 weighting ensures context dominates, preventing anchor-only inflation.

Why all-MiniLM-L6-v2?

all-MiniLM-L6-v2 produces 384-dimensional sentence embeddings and scores 82.03 Spearman on the STS Benchmark. We tested Nomic (768-dim), but it compressed all scores into the 0.42-0.84 range, making tier differentiation impossible. MiniLM's wider spread maps naturally to meaningful quality tiers. ONNX Runtime ensures deterministic FP32 output: same inputs, same score, every time.

Threshold Calibration

Strong (≥0.45) requires genuine contextual alignment across multiple signals — not achievable by anchor match alone. Moderate (0.25-0.44) indicates topical connection with room to improve. Weak (<0.25) means the linking context has minimal semantic overlap. Calibrated against 1,186 real backlinks across 6 domains.
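
The claim that anchor match alone cannot reach Strong follows directly from the weights. The short calculation below is an illustrative check using the published weights and thresholds; the variable names are hypothetical.

```python
WEIGHTS = {"context_body": 0.50, "context_keywords": 0.30, "anchor_keywords": 0.20}
STRONG, MODERATE = 0.45, 0.25

# Best case for an anchor-only match: perfect anchor similarity, zero context similarity.
anchor_only_max = WEIGHTS["anchor_keywords"] * 1.0

# Weighted-average context similarity needed to reach Strong even with a perfect anchor.
context_weight = WEIGHTS["context_body"] + WEIGHTS["context_keywords"]
needed_context = (STRONG - anchor_only_max) / context_weight

print(anchor_only_max < MODERATE)
print(round(needed_context, 4))
```

A perfect anchor with no contextual overlap tops out at 0.20, which is still in the Weak tier, and even a perfect anchor needs a weighted-average context similarity of about 0.31 to reach Strong.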

WLDM Cosine Scoring Pipeline
─────────────────────────────

Step 1: Extract
  URL → fetch HTML → strip tags
  → clean body text (no nav/footer)

Step 2: Embed
  All text → all-MiniLM-L6-v2 (ONNX)
  → 384-dimensional unit vectors

Step 3: Compare
  Cosine similarity = dot product
  of L2-normalized vectors

  Typical range: 0.0 → 1.0
  (cosine can be negative, down to -1)

Step 4: Weight
  0.50 × cos(context ↔ body)    ← strongest
  0.30 × cos(context ↔ keywords)
  0.20 × cos(anchor  ↔ keywords)
  context signals = 80%, anchor = 20%

Step 5: Classify
  ● Strong   ≥ 0.45
  ● Moderate 0.25 – 0.44
  ● Weak     < 0.25

Deterministic Guarantee
  Same inputs → same ONNX graph
  → same FP32 result every time
  No randomness. No sampling.
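
Step 3 of the pipeline above (cosine similarity as the dot product of L2-normalized vectors) can be illustrated without any ML dependencies. This is a toy sketch in plain Python with hypothetical function names and 3-dimensional stand-ins for the 384-dimensional embeddings; it does not load the model.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (raises ValueError on a zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed as the dot product of the normalized inputs."""
    ua, ub = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(ua, ub))


# Toy vectors standing in for real embeddings.
print(round(cosine([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]), 6))   # parallel vectors
print(round(cosine([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 6))   # orthogonal vectors
print(round(cosine([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]), 6))  # opposite vectors
```

Because the inputs are normalized first, vector length has no effect on the result: only direction matters, which is why a short anchor and a long body text can be compared on the same scale.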

Want the full picture?

Our free backlink audit scores every link, maps competitors, and identifies the gaps holding you back.

Book a Free Audit Call