Reverse-engineer a prompt from examples — with pattern detection and consistency scoring
Show the AI 2-3 outputs you love; it detects structural, tonal, formatting, and content patterns (plus what the examples consistently AVOID), then writes a prompt that reliably reproduces them — with a consistency confidence score.
Have examples of the output you want but not the prompt that produced them? Reverse-engineer it with systematic pattern detection:

'You are a prompt engineer who reverse-engineers prompts by treating examples as training data: you find the PATTERNS, not just the surface features. Below are [2-3] examples of my ideal output. Analyze them across five pattern dimensions before writing the prompt.

PATTERN ANALYSIS:
1. STRUCTURAL PATTERNS: What's the consistent structure? (section order, paragraph count, heading style, list vs prose, opening/closing formula, length range)
2. TONAL PATTERNS: What's the voice? (formal/casual, authoritative/conversational, serious/witty, first/second/third person, sentence length rhythm)
3. FORMATTING PATTERNS: What formatting is always/never used? (bullet points, numbered lists, bold text, headers, code blocks, emoji, paragraph breaks)
4. CONTENT PATTERNS: What type of content always appears? (examples, data, caveats, calls to action, analogies, questions, specific vs general)
5. ANTI-PATTERNS: What do the examples consistently AVOID? (jargon, hedging, long intros, rhetorical questions, filler phrases, certain structures)

Then deliver:
(A) THE PROMPT: A single reusable prompt with [BRACKETS] for variable parts. Bake the detected patterns into the prompt's constraints. Don't just describe the output; engineer the prompt to FORCE the patterns.
(B) PATTERN REPORT: A short table showing each detected pattern and where it appeared in each example.
(C) CONSISTENCY SCORE (0-100): How reliably would this prompt reproduce outputs that match the examples? Score based on:
- 90-100: Patterns are strong and unambiguous. The prompt will nail it nearly every time.
- 70-89: Most patterns are clear but some elements vary. The prompt should be solid with occasional drift.
- 50-69: Patterns are loose or examples conflict with each other. The prompt will capture the gist but not the specifics.
- Below 50: Examples are too different. You may need separate prompts.
Explain what's driving the score down, if anything.
(D) FEW-SHOT VERSION: If the score is below 85, provide an alternative version of the prompt that includes 1-2 of the original examples as built-in few-shot demonstrations to anchor quality.

Examples:
[PASTE EXAMPLE 1]
[PASTE EXAMPLE 2]
[PASTE EXAMPLE 3, optional]'

Tips:
- 2-3 examples is the sweet spot: 1 isn't enough to detect patterns, and 4+ can introduce contradictions.
- If your task is a transformation (input → output), include both the input AND the ideal output for each example.
- If the score comes back below 70, your examples may actually represent different 'modes'. Consider splitting into two prompts.
- The anti-pattern detection is often more valuable than the pattern detection. Knowing what to AVOID is half the prompt.
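If you run this prompt from a script rather than a chat window, the two mechanical parts (filling the Examples slot and acting on the returned score) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original prompt: the function names `build_examples_block` and `interpret_score`, the `EXAMPLE n:` labels, and the band labels are all hypothetical. The thresholds (90, 70, 50 for the bands, 85 for the few-shot fallback, 70 for splitting) are taken from the prompt and tips above.

```python
from typing import Dict, List, Union


def build_examples_block(examples: List[str]) -> str:
    """Format 2-3 examples as text to paste into the prompt's 'Examples:' slot.

    The 'EXAMPLE n:' labels are an assumption made for this sketch.
    """
    if not 2 <= len(examples) <= 3:
        # The tips call 2-3 examples the sweet spot, so reject anything else.
        raise ValueError("expected 2 or 3 examples")
    return "\n\n".join(
        f"EXAMPLE {i}:\n{text.strip()}" for i, text in enumerate(examples, start=1)
    )


def interpret_score(score: int) -> Dict[str, Union[str, bool]]:
    """Map a 0-100 consistency score to its band and suggested follow-ups."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        band = "strong"          # patterns strong and unambiguous
    elif score >= 70:
        band = "mostly clear"    # solid, with occasional drift
    elif score >= 50:
        band = "loose"           # captures the gist, not the specifics
    else:
        band = "too different"   # separate prompts may be needed
    return {
        "band": band,
        "use_few_shot": score < 85,    # deliverable (D) applies below 85
        "consider_split": score < 70,  # tip: examples may be different modes
    }


if __name__ == "__main__":
    print(build_examples_block(["First sample output.", "Second sample output."]))
    print(interpret_score(80))
```

A score of 80, for instance, falls in the 70-89 band: the few-shot version is worth using, but splitting into two prompts is not yet suggested.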
- Source: promptfork seed
- License: CC-BY-4.0
- Published: 6/22/2026