Prompt Engineering Best Practices for Level 3

Level 3 (Prompt Engineering) has the lowest first-attempt pass rate of any quiz level. It's not because the concepts are harder — it's because candidates underestimate the jump from "using AI" to "engineering prompts systematically." Here's a study framework that bridges that gap.

Why Level 3 is different

Levels 1 and 2 test recognition — you identify the right concept or tool from a list. Level 3 tests application. Questions present a prompt, an output, and a problem, then ask you to diagnose what went wrong and how to fix it. You need to understand not just what techniques exist, but when to use each one and why it produces better results.

The six techniques you must understand cold

1. Zero-shot prompting — Asking the model to do something with no examples. Works well for simple, well-defined tasks. Know when it fails: tasks with ambiguous output formats, or where the model lacks context to calibrate tone/style.

2. Few-shot prompting — Providing 2–5 examples before the actual request. The key insight: examples communicate output format and reasoning pattern more efficiently than instructions. If your zero-shot prompt is producing inconsistent outputs, add examples first before adding more instructions.

3. Chain-of-thought (CoT) — Asking the model to reason step-by-step before answering. The trigger phrase "think step by step" or "let's work through this" activates it. Know when it helps (multi-step reasoning, math, classification with explanation) and when it wastes tokens (simple factual retrieval, formatting tasks).

4. System prompts — Instructions that define the model's role, constraints, and output format before any user message. A good system prompt specifies: the role ("you are a financial analyst"), the task scope ("your job is to summarise earnings calls"), the output format ("always return JSON with these fields"), and the constraints ("never speculate about future performance").

5. Output formatting — Asking for structured outputs (JSON, Markdown tables, numbered lists) dramatically improves reliability when outputs feed into downstream systems. Know the difference between asking for Markdown (human-readable) and asking for JSON (machine-readable) and when each is appropriate.

6. Role prompting — Assigning a specific persona or expertise ("you are a senior contract lawyer reviewing for risk"). This isn't just stylistic — it shifts the model's prior on what constitutes a good response. A question about contract risk asked to a "helpful assistant" gets a different answer than the same question asked to a "senior contracts partner who flags every potential liability."

A study method that works

For each technique, do this exercise: find a real task you've done or seen at work, then write three versions of the prompt — one zero-shot, one with system prompt + formatting, one with chain-of-thought or few-shot examples. Observe what changes in the output.

This experiential understanding transfers directly to the quiz. When you see a prompt-output pair and something looks wrong, you'll recognise the pattern because you've seen it in your own experiments.

Common wrong answers on Level 3

Assuming "more instructions = better prompt" — verbosity often hurts. Prompts that try to constrain every possible output usually produce worse results than clear, focused prompts with a strong example.
Confusing chain-of-thought with role prompting — they serve different purposes and are often used together, but they're not the same technique.
Thinking few-shot examples must exactly match the target task — examples that share the same reasoning *pattern* (not the same topic) still improve outputs.

After you pass

Level 3 unlocks project submissions. Don't wait — apply what you've just studied immediately by starting a project. Candidates who move from Level 3 to their first project within a week retain the material much better than those who wait.