Persona-aware disclosure

One answer rarely fits everyone. A recruiter scanning your portfolio in 90 seconds between meetings does not want the same paragraph as a staff engineer pulling apart your architecture, who does not want the same paragraph as a VP deciding whether to fund the work. The facts are identical. The shape — vocabulary, depth, citation density, what you ask them to do next — is not.

Persona-aware disclosure is the pattern: detect which audience you are talking to, then route through a system prompt that wraps the same retrieved facts in audience-appropriate framing. It is what darrenhead.com does on its chat surface — the README lists it as one of the demonstrated capabilities ("Recruiter vs. technical reader get different shapes of answer"). The principle generalises to any AI app whose users span more than one role.

If you serve a single homogeneous audience, skip this skill. If your audience is split, not branching is a choice — and it usually means half your users get an answer that is either condescending or overwhelming.

When to use this skill

Your AI product serves at least two stakeholder types whose vocabulary, depth tolerance, and decision criteria genuinely differ.
The same underlying knowledge base needs to be presented differently (sales vs support, recruiter vs engineer, clinician vs patient, trader vs compliance).
You have a reliable signal for which persona is asking — onboarding question, route parameter, account metadata, or a high-confidence content classifier.
You are willing to maintain the branches. Persona variants are code; they need eval coverage like any other prompt.

When NOT to use it

Your audiences are functionally identical. Two engineers from different teams do not need different answers. One register works.
Your house voice is already universal. If the brand is calibrated to land for everyone (think: a well-written newspaper), branching fragments the voice without lifting the answer.
The persona signal is a guess. Inferring "this looks like a CFO" from one sentence and rewriting the entire response around it is worse than a neutral default. Overfitting is the failure mode.
You cannot evaluate the branches separately. If your eval set collapses all personas into a single score, you will not notice when one branch quietly regresses.

The five axes of adaptation

These are the dials. You rarely move all of them. The skill is knowing which ones matter for the gap between your personas.

1. Language register

Vocabulary, jargon density, sentence length. A recruiter benefits from "we shipped an evaluation harness that grades model output". An engineer wants "autorater suite with rubric-based scoring, gated in CI". An executive wants "automated quality gate; cut release risk in half". Same fact. Three registers. Wrong register reads as either smug or patronising.

2. Depth

How many levels down do you go before stopping? Recruiter answers top out at what and why-it-matters. Engineer answers go to how and what-broke-when-we-tried-X. Executive answers compress to outcome and risk. Picking the wrong depth is the most common failure: engineers given a 60-word summary feel hand-waved at; recruiters given a 600-word architecture deep-dive bounce.

3. Citation density

Engineers want links — to the repo, the ADR, the PR, the eval result. Recruiters want one or two anchor citations and prose. Executives want zero inline citations and a "details on request". Over-citing reads as defensive; under-citing reads as bluster. The retrieval layer fetches the same sources every time — the persona decides how many surface in the rendered answer.
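One way to keep retrieval persona-agnostic while varying how many sources surface is a per-persona cap applied at render time. The sketch below is illustrative only: `Source`, `CITATION_CAP`, and `surfaceCitations` are hypothetical names, and the caps simply mirror the numbers in the paragraph above.

```typescript
type Persona = "recruiter" | "engineer" | "executive";

// Hypothetical shape of a retrieved source; a real retrieval layer will differ.
interface Source {
  title: string;
  url: string;
  score: number; // retrieval relevance, higher is better
}

// Illustrative caps taken from the prose: engineers see every source,
// recruiters get at most two anchors, executives get none inline.
const CITATION_CAP: Record<Persona, number> = {
  engineer: Number.POSITIVE_INFINITY,
  recruiter: 2,
  executive: 0,
};

// Rank the shared source list, then trim it to the persona's cap.
// The input array is copied first so the shared list is never mutated.
function surfaceCitations(persona: Persona, sources: Source[]): Source[] {
  const ranked = [...sources].sort((a, b) => b.score - a.score);
  return ranked.slice(0, CITATION_CAP[persona]);
}
```

The retrieval call itself takes no persona argument in this design; only the trimming step does, which keeps the evidence identical across branches.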

4. Next-step CTA

The follow-up question, the button, the link. Recruiter answers end with book a call or see the case study. Engineer answers end with read the ADR or clone the repo. Executive answers end with here is the one-pager or who else has used this. The CTA is where the persona model actually pays for itself — it is what converts an answer into a next action.

5. Signal preservation

What you do not change across personas: names, numbers, dates, causal claims, anything load-bearing. Personas reshape; they never distort. If the engineer answer says "we cut p95 by 40%", the executive answer may drop the jargon and say "we cut tail latency by 40%", but the 40% is not negotiable. This is the axis that anti-patterns most often violate.
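A cheap guard for this axis is to check that a persona variant never introduces a number that is absent from the reference facts. The following is a minimal sketch under that assumption; `extractNumbers` and `findDriftedNumbers` are hypothetical helpers, and the regex is deliberately naive (it will, for instance, pick up the "95" in "p95").

```typescript
// Pull numeric tokens such as "40%", "2024", or "1.5" out of a piece of text.
function extractNumbers(text: string): Set<string> {
  const matches: string[] = text.match(/\d+(?:\.\d+)?%?/g) ?? [];
  return new Set(matches);
}

// Return every number that appears in the variant but not in the reference.
// The check is one-directional: a variant may omit numbers (executives may
// not need "p95"), but it may not introduce or alter one.
function findDriftedNumbers(reference: string, variant: string): string[] {
  const allowed = extractNumbers(reference);
  return Array.from(extractNumbers(variant)).filter((n) => !allowed.has(n));
}
```

Run against the example above, a variant that says "we cut tail latency by 40%" passes, while one that rounds up to "50%" is flagged. Names, dates, and causal claims would need their own checks; this only covers numerals.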

Detecting persona signals

You have two clean options and one good hybrid.

Explicit (ask). An onboarding question, a role-selector chip, a sign-up field. High confidence, zero inference cost. Downsides: friction on the first interaction, and users sometimes lie or pick whatever lets them in fastest. Best when the persona is durable across sessions (B2B SaaS where each account has a known role).

Implicit (infer). Classify from the first message: vocabulary, question shape, referrer, account metadata. Zero friction. Downsides: classifier error means some users get the wrong branch, and the cost of being wrong is high (an engineer routed to the recruiter branch is likely to disengage immediately). Best when you have strong priors (referrer = LinkedIn → likely recruiter; referrer = GitHub → likely engineer).

Hybrid (recommended). Infer a default from available signals, then offer a one-tap correction in the UI — a chip, a toggle, a follow-up question if confidence is low. This is what most mature implementations land on. You get zero-friction defaults and a cheap out when the model gets it wrong. Treat the correction as training data for the next iteration of the classifier.

Whatever you pick, persist the choice for the session and surface it back to the user. A persona that silently changes mid-conversation is a bug.
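The hybrid approach can be sketched roughly as follows. Everything here is an assumption for illustration: the referrer priors come from the examples above, the confidence values and the 0.8 threshold are made up, and the in-memory `Map` stands in for whatever session store you actually use.

```typescript
type Persona = "recruiter" | "engineer" | "executive";

interface PersonaGuess {
  persona: Persona | "neutral"; // "neutral" is the safe default branch
  confidence: number;           // 0..1, illustrative values only
  source: "explicit" | "inferred" | "default";
}

// Hypothetical priors: a naive substring match on the referrer.
function inferFromReferrer(referrer?: string): PersonaGuess {
  if (referrer?.includes("linkedin.com")) {
    return { persona: "recruiter", confidence: 0.7, source: "inferred" };
  }
  if (referrer?.includes("github.com")) {
    return { persona: "engineer", confidence: 0.7, source: "inferred" };
  }
  return { persona: "neutral", confidence: 0, source: "default" };
}

// Stand-in for a real session store.
const sessionPersona = new Map<string, PersonaGuess>();

// An explicit user choice always wins; otherwise reuse the persisted guess;
// otherwise infer once and persist the result for the rest of the session.
function resolvePersona(
  sessionId: string,
  referrer?: string,
  explicitChoice?: Persona,
): PersonaGuess {
  if (explicitChoice) {
    const chosen: PersonaGuess = { persona: explicitChoice, confidence: 1, source: "explicit" };
    sessionPersona.set(sessionId, chosen);
    return chosen;
  }
  const existing = sessionPersona.get(sessionId);
  if (existing) return existing;
  const inferred = inferFromReferrer(referrer);
  sessionPersona.set(sessionId, inferred);
  return inferred;
}

// Show the one-tap correction chip whenever the persona was not chosen
// explicitly and confidence falls below a (made-up) threshold.
function shouldOfferCorrection(guess: PersonaGuess, threshold = 0.8): boolean {
  return guess.source !== "explicit" && guess.confidence < threshold;
}
```

Because the guess is persisted on first resolution, a later request with a different referrer does not flip the branch; only an explicit correction does.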

Implementation pattern

The factual content — what you retrieved, what you computed, what you know — is shared. Only the wrapping prompt branches.

type Persona = "recruiter" | "engineer" | "executive"

const PERSONA_PROMPTS: Record<Persona, string> = {
  recruiter: `You are answering a recruiter or hiring manager. They have
~90 seconds and want signal, not depth. Lead with the outcome and the
role-relevant proof point. Use plain English; expand acronyms on first
use. Keep paragraphs to 2 sentences. Cite at most 2 sources inline,
preferably a case study or shipped product. End with ONE next-step CTA
inviting a conversation ("Want to see the case study?" / "Book a call?").
Do not go below the surface unless asked.`,

  engineer: `You are answering a technical reader (staff/senior eng,
ML researcher, architect). They want the how, not the what. Use precise
technical vocabulary; do not expand standard acronyms. Show the
trade-off, not just the choice. Include links to the repo, ADR, or PR
for every load-bearing claim. Code blocks welcome. End with ONE next-step
CTA pointing to deeper material ("Read ADR-0007?" / "See the eval rig?").`,

  executive: `You are answering a senior decision-maker (VP, director,
founder). They want outcome, risk, and cost. Compress aggressively —
3 short paragraphs max. No inline citations; offer a one-pager or
follow-up on request. Frame everything in business terms (revenue,
risk, time-to-ship, headcount). End with ONE next-step CTA inviting
a meeting or a written brief.`,
}

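// Wrap the shared, persona-agnostic facts in the persona-specific framing,
// then append the rules that hold for every persona.
function buildPrompt(persona: Persona, retrievedFacts: string): string {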
  return `${PERSONA_PROMPTS[persona]}

# Facts to draw from
${retrievedFacts}

# Hard rules (all personas)
- Names, numbers, dates, and causal claims are identical across personas.
- Never invent a citation. If you have no source, say so.
- If the user's follow-up implies a different persona, surface the
  mismatch and offer to switch.`
}

Two things to notice. First, the retrieval layer runs before the branch — every persona sees the same evidence. Second, the hard rules at the bottom are persona-invariant. That is where you protect signal integrity from rewrite drift.
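The ordering can also be made visible in the types. In the sketch below, `Deps` and `answer` are hypothetical names, the dependencies are injected so the example stays self-contained, and the calls are synchronous for brevity (real retrieval and generation calls would almost certainly be async).

```typescript
type Persona = "recruiter" | "engineer" | "executive";

// Hypothetical dependencies, injected so the flow can be read on its own.
interface Deps {
  // Note the signature: retrieval cannot see the persona at all.
  retrieve: (question: string) => string;
  buildPrompt: (persona: Persona, retrievedFacts: string) => string;
  generate: (prompt: string) => string;
}

function answer(deps: Deps, persona: Persona, question: string): string {
  const facts = deps.retrieve(question);           // 1. same evidence for everyone
  const prompt = deps.buildPrompt(persona, facts); // 2. persona branch happens here
  return deps.generate(prompt);                    // 3. model renders the wrapped facts
}
```

Keeping the persona out of the `retrieve` signature means a later change cannot quietly start filtering evidence by audience without the type changing too.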

Example: same answer, three audiences

The question: "What's the difference between fine-tuning and RAG?"

Recruiter branch.

Both are ways to make a general LLM useful for a specific job. RAG (retrieval-augmented generation) lets the model look things up in your documents at answer time — fast to ship, easy to update. Fine-tuning bakes new behaviour into the model itself — slower, more expensive, but better for changing how the model writes or thinks, not just what it knows. Most production systems start with RAG and only fine-tune when they have to.

→ Want to see how I built the RAG layer for this site?

Engineer branch.

RAG and fine-tuning solve different problems and compose well.

RAG is inference-time augmentation: embed the corpus, retrieve top-k chunks per query, stuff them into the context window, let the model generate over the union of its weights and the retrieved evidence. Cost lives in the embedding pipeline + vector store + the extra tokens per request. Wins: knowledge is hot-swappable, citations are trivial, factual drift is bounded by retrieval quality.

Fine-tuning mutates the weights — typically LoRA/QLoRA against a task-specific dataset. Cost lives in dataset curation + the training run + an eval rig to confirm you did not regress general capability. Wins: format adherence, tone, structured-output reliability, latency (no retrieval round-trip).

Heuristic: if the problem is "the model does not know X", reach for RAG. If it is "the model knows X but will not say it the way we need", reach for fine-tuning. They stack — fine-tune a small model on your domain's shape, then RAG against your live corpus for the content. See ADR-0003 for how I picked retrieval over fine-tuning for the portfolio chat layer.

→ Want to walk through the pgvector + Gemini embedding setup?

Executive branch.

Two ways to make an LLM useful for your business. Retrieval (RAG) plugs your documents in at query time — cheap, fast to update, low-risk. Fine-tuning rebuilds the model around your data — more expensive, slower to iterate, but stronger control over how the model behaves.

Most teams start with retrieval and only fine-tune when retrieval hits a ceiling. Time-to-first-value is weeks for retrieval, months for fine-tuning.

→ Want a one-pager on which to pick for your use case?

The same core facts sit under all three; only the shape differs. Each one ends with a CTA the audience actually wants to click.

Anti-patterns

Stereotyping the persona. "Executives don't read" is a caricature. Compress for time, not for intelligence. The branch should adjust shape, not condescend.
Letting persona override truth. If the recruiter-friendly framing requires softening a number, you have crossed a line. Reshape, never distort.
Branching too finely. Three personas is a starting set. Twelve is a maintenance nightmare and an eval impossibility. Collapse until every branch earns its keep.
Inconsistent answers across reloads. If the same user asks the same question twice and gets two different personas' answers because your classifier flipped, trust collapses. Persist the persona.
Leaking gated content through persona misfires. If the engineer branch links to internal-only material and the persona detection ever misfires, you have a disclosure bug. Gate sensitive content on identity, not on persona.
Skipping evals per branch. If your autorater scores the union of all persona outputs, you will not see the engineer branch quietly regressing while the recruiter branch carries the average. Score each branch separately.
Hidden persona switches. Changing the user's persona mid-session without telling them feels like the product is gaslighting them. Surface the switch; offer the correction.
Treating CTA as decoration. The CTA is the conversion. A generic "let me know if you have questions" wastes the persona signal entirely. Make the next step concrete and persona-specific.
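The per-branch scoring point is easy to show with numbers. The following is a minimal sketch, assuming each eval result carries a persona label and an autorater score between 0 and 1; `EvalResult`, `meanScoreByPersona`, `regressedBranches`, and the 0.05 tolerance are all hypothetical.

```typescript
type Persona = "recruiter" | "engineer" | "executive";

// Hypothetical eval record: one autorater score per graded answer.
interface EvalResult {
  persona: Persona;
  score: number; // assumed to be in the range 0..1
}

type BranchScores = Partial<Record<Persona, number>>;

// Average the scores within each persona instead of across the whole set.
function meanScoreByPersona(results: EvalResult[]): BranchScores {
  const sums: Partial<Record<Persona, { total: number; count: number }>> = {};
  for (const r of results) {
    const s = sums[r.persona] ?? { total: 0, count: 0 };
    s.total += r.score;
    s.count += 1;
    sums[r.persona] = s;
  }
  const means: BranchScores = {};
  for (const p of Object.keys(sums) as Persona[]) {
    const s = sums[p]!;
    means[p] = s.total / s.count;
  }
  return means;
}

// List every branch whose mean dropped by more than the tolerance,
// or that has disappeared from the current run entirely.
function regressedBranches(
  baseline: BranchScores,
  current: BranchScores,
  tolerance = 0.05,
): Persona[] {
  return (Object.keys(baseline) as Persona[]).filter((p) => {
    const before = baseline[p];
    const after = current[p];
    return before !== undefined && (after === undefined || before - after > tolerance);
  });
}
```

With a baseline of 0.9 for recruiter and 0.8 for engineer, a run that scores 0.95 and 0.6 only moves the pooled average from 0.85 to about 0.78, but `regressedBranches` names the engineer branch directly.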
