Zero-Shot vs Few-Shot Prompting Explained

When you send a message to ChatGPT without any examples, you are using zero-shot prompting. When you include one or more examples of what you want, you are using few-shot prompting. These two strategies are the foundation of almost every prompting technique that exists.

Knowing when to use each one, and how to do it well, immediately improves the reliability of your AI outputs.

Zero-shot prompting

A zero-shot prompt gives the model a task and lets it draw entirely on its pre-trained knowledge to respond. No examples, no demonstrations, no prior context. Just the request.

This works well for tasks that are clear and self-contained, where the expected format and approach are already embedded in how the question is phrased.

ZERO-SHOT EXAMPLE

"What is the capital of Italy?"

The model uses its pre-trained knowledge to answer "Rome" without needing any examples. The task is straightforward and the expected output is obvious from the phrasing.

MORE DEMANDING ZERO-SHOT

"Classify the following customer review as positive, negative, or neutral: 'The delivery was fast but the packaging was damaged.'"

Still zero-shot, but now the task definition is built into the prompt. The model knows what categories to use and what text to classify.
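As a rough illustration, a zero-shot classification prompt like the one above could be assembled in code. The function name `build_zero_shot_prompt` and its parameters are hypothetical and used only for this sketch; it builds the prompt string and does not call any model.

```python
def build_zero_shot_prompt(text: str, labels: list[str]) -> str:
    """Return a zero-shot classification prompt: task definition only, no examples."""
    if len(labels) < 2:
        raise ValueError("need at least two labels")
    # Spell the categories out ("a, b, or c") so the task definition lives in the prompt.
    if len(labels) == 2:
        label_list = f"{labels[0]} or {labels[1]}"
    else:
        label_list = ", ".join(labels[:-1]) + f", or {labels[-1]}"
    return f"Classify the following customer review as {label_list}: '{text}'"


prompt = build_zero_shot_prompt(
    "The delivery was fast but the packaging was damaged.",
    ["positive", "negative", "neutral"],
)
print(prompt)
```

Everything the model needs (the categories and the text to classify) is in the single request, which is what keeps it zero-shot.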

When zero-shot works well

  • Factual questions with established answers
  • Summarization tasks where the format is obvious
  • Translation between languages
  • Simple classification with clearly defined categories
  • Explanation requests for well-documented concepts

Few-shot prompting

Few-shot prompting provides the model with a small number of examples before asking it to perform the task. Those examples act as a template: they show the model the structure, tone, and format of the output you want.

The term "few-shot" refers to the number of example "shots" you include. Two or three examples are usually enough. Adding more rarely helps and only lengthens the prompt.

FEW-SHOT EXAMPLE: SENTIMENT CLASSIFICATION

Classify each review as positive, negative, or neutral.


Review: "The food was delicious and the service was excellent."

Sentiment: Positive


Review: "The movie was okay, but I expected more from the plot."

Sentiment: Neutral


Review: "I had a terrible experience and would not recommend this place."

Sentiment: Negative


Review: "The staff were friendly but the wait time was unreasonable."

Sentiment:

Notice that the examples do two things simultaneously: they define the task format (label on a new line after each review) and they calibrate the model's judgment (what counts as "neutral" versus "negative" in this context). That calibration is hard to achieve with a description alone.
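One possible way to generate a prompt in this shape from a list of labeled examples is sketched below. The helper name `build_few_shot_prompt` is an assumption made for illustration; the reviews and labels are the ones from the example above.

```python
def build_few_shot_prompt(instruction, examples, new_review):
    """Build a few-shot prompt: instruction, labeled examples, then the unlabeled input."""
    lines = [instruction, ""]
    for review, sentiment in examples:
        # Every example uses the same two labels in the same order.
        lines.append(f'Review: "{review}"')
        lines.append(f"Sentiment: {sentiment}")
        lines.append("")
    # The final item repeats the pattern but leaves the label empty for the model to fill.
    lines.append(f'Review: "{new_review}"')
    lines.append("Sentiment:")
    return "\n".join(lines)


examples = [
    ("The food was delicious and the service was excellent.", "Positive"),
    ("The movie was okay, but I expected more from the plot.", "Neutral"),
    ("I had a terrible experience and would not recommend this place.", "Negative"),
]
prompt = build_few_shot_prompt(
    "Classify each review as positive, negative, or neutral.",
    examples,
    "The staff were friendly but the wait time was unreasonable.",
)
print(prompt)
```

Keeping the examples in a list makes it easy to swap out a weak example without touching the prompt layout.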

When few-shot works better than zero-shot

  • Tasks with a specific output structure the model might not infer from the question alone
  • Tone or style matching: when you want responses that sound a particular way
  • Domain-specific classification: where the categories require calibration
  • Generating content that must follow a template: product descriptions, email subject lines
  • Any task where zero-shot outputs are consistently missing the target

How to choose between them

Start with zero-shot. If the output is on-target, you are done. If it misses the structure, tone, or format you needed, switch to few-shot and provide two or three concrete examples of what you want.

The deciding question is: does the model need to see what "right" looks like, or is the task self-evident from the question? When the task involves judgment calls or non-obvious formatting, examples communicate more precisely than descriptions.

DECISION GUIDE

Simple factual question: Zero-shot
Complex task with obvious structure: Zero-shot
Specific output format required: Few-shot
Tone or style matching: Few-shot
Domain-specific classification: Few-shot
Zero-shot output keeps missing the mark: Switch to few-shot
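The "start with zero-shot, switch to few-shot if it misses" rule can be sketched as a small fallback routine. This is a minimal sketch under stated assumptions: `ask_model` stands in for whatever function sends a prompt to your model, and `fake_model` is a stub invented here so the example runs without any API.

```python
from typing import Callable


def answer_with_escalation(
    ask_model: Callable[[str], str],
    zero_shot_prompt: str,
    few_shot_prompt: str,
    is_on_target: Callable[[str], bool],
) -> tuple[str, str]:
    """Try zero-shot first; fall back to few-shot only if the output misses."""
    output = ask_model(zero_shot_prompt)
    if is_on_target(output):
        return "zero-shot", output
    return "few-shot", ask_model(few_shot_prompt)


def fake_model(prompt: str) -> str:
    # Stub for illustration only: answers with a bare label when the prompt
    # ends with a "Sentiment:" slot, and with free text otherwise.
    if prompt.rstrip().endswith("Sentiment:"):
        return "Neutral"
    return "The review seems mixed overall."


def is_bare_label(output: str) -> bool:
    # "On target" here means the output is exactly one of the allowed labels.
    return output.strip() in {"Positive", "Negative", "Neutral"}


strategy, answer = answer_with_escalation(
    fake_model,
    "Classify this review: 'Friendly staff, long wait.'",
    'Review: "Great food."\nSentiment: Positive\n\nReview: "Friendly staff, long wait."\nSentiment:',
    is_bare_label,
)
print(strategy, answer)
```

The `is_on_target` check is where you encode what "on-target" means for your task, such as a required format or an allowed set of labels.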

How to write effective few-shot examples

The quality of your examples determines the quality of few-shot outputs. A poorly chosen example teaches the model the wrong pattern. Here is what makes examples work.

Make examples representative, not exceptional

Your examples should reflect the full range of inputs the model will encounter, not just easy cases. If you are building a sentiment classifier and all three examples are strongly positive or strongly negative, the model will struggle with ambiguous inputs. Include at least one example that demonstrates a borderline case and how you want it handled.

Keep the format perfectly consistent

Every example must follow the exact same structure. If your first example uses "Input:" and "Output:" labels, every example must use those labels. If the label format shifts, the model picks up on the inconsistency and its outputs become inconsistent too.

CONSISTENT FORMAT

Input: "The battery drains in two hours."

Category: Hardware defect


Input: "The app crashes when I open the settings menu."

Category: Software bug


Input: "I can't find the export button."

Category:
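If examples are stored as text, a quick check can catch label drift before the prompt is sent. The sketch below is one possible approach and not a standard tool; `find_format_violations` is a hypothetical helper, and the third block is deliberately malformed, made-up data.

```python
def find_format_violations(example_blocks, labels=("Input:", "Category:")):
    """Return indexes of blocks whose non-empty lines do not start with the expected labels, in order."""
    violations = []
    for index, block in enumerate(example_blocks):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        follows_format = len(lines) == len(labels) and all(
            line.startswith(label) for line, label in zip(lines, labels)
        )
        if not follows_format:
            violations.append(index)
    return violations


blocks = [
    'Input: "The battery drains in two hours."\nCategory: Hardware defect',
    'Input: "The app crashes when I open the settings menu."\nCategory: Software bug',
    # Made-up example with drifting labels ("Query:" and "Type:"), included to trigger the check.
    'Query: "The screen flickers."\nType: Hardware defect',
]
print(find_format_violations(blocks))  # [2]
```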

Two to three examples is usually enough

Adding more examples beyond three rarely improves results and increases prompt length. The model identifies the pattern from two or three instances. If two examples are not enough to get consistent outputs, the issue is usually example quality, not quantity. Replace a weak example rather than adding a fourth.

Common mistakes with few-shot prompting

Using examples that all show the same output

If all three examples produce the label "Positive", the model learns a bias toward that label. Your examples should demonstrate the full range of outputs the task requires.
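A small check along these lines could flag labels that never appear in your examples. The helper name `missing_labels` is hypothetical and the example data is made up for illustration.

```python
from collections import Counter


def missing_labels(examples, allowed_labels):
    """Return the allowed labels that none of the (text, label) examples demonstrates."""
    seen = Counter(label for _, label in examples)
    return [label for label in allowed_labels if seen[label] == 0]


# Made-up examples that all carry the same label.
one_sided = [
    ("Loved it.", "Positive"),
    ("Great value for the price.", "Positive"),
    ("Would buy again.", "Positive"),
]
print(missing_labels(one_sided, ["Positive", "Negative", "Neutral"]))
# ['Negative', 'Neutral']
```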

Inconsistent label placement

Some prompts put the label before the input, others after. Pick one position and use it in every example. Inconsistency in example structure produces inconsistency in outputs.

Examples that are too similar to each other

Three examples of five-word product descriptions teach the model less than three examples at different lengths and complexities. Variety in your examples builds a more flexible pattern.

Adding examples when the real problem is task description

If the task itself is unclear, examples will not fix it. The model will match the example pattern but may still misunderstand what the task is asking for. Clarify the task description first, then add examples.

The third type: chain-of-thought

Zero-shot and few-shot handle most everyday tasks. For complex problems that require multi-step reasoning, there is a third strategy: chain-of-thought prompting, where you explicitly ask the model to work through the problem step by step before reaching a conclusion.

Chain-of-thought can be used in both zero-shot form ("Think through this step by step before answering") and few-shot form (where your examples themselves show step-by-step reasoning). See the full guide on ChatGPT prompt strategies for detailed examples.
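The zero-shot form amounts to appending the step-by-step instruction to an existing prompt, as in this minimal sketch. The function name `add_chain_of_thought` and the sample question are assumptions made for illustration.

```python
def add_chain_of_thought(prompt: str) -> str:
    """Append a step-by-step instruction to turn a prompt into zero-shot chain-of-thought."""
    return f"{prompt.rstrip()}\n\nThink through this step by step before answering."


# Made-up multi-step question used only to show the resulting prompt.
question = "A store sells pens at 3 for $2. How much do 12 pens cost?"
print(add_chain_of_thought(question))
```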

One-shot prompting: the middle ground

One-shot prompting provides exactly one example. It falls between zero-shot and few-shot and is often the right choice when you need to demonstrate a format without writing multiple examples.

One example is enough to anchor the model to a specific structure, tone, or length. Two or three examples give you more calibration, especially for tasks involving judgment calls. The decision comes down to what the task requires.

ONE-SHOT EXAMPLE: MEETING SUMMARY FORMAT

Summarize the following meeting transcript in this format:


Example:

Date: March 12

Decisions made: Approved Q2 budget. Postponed product launch to April.

Action items: Sarah to send revised timeline by Friday. Tom to update stakeholders.

Open questions: Final pricing model still under discussion.


Now summarize this transcript:

[paste transcript]

A single well-chosen example like this is often more effective than a detailed description of the format. The model extracts the pattern (field labels, sentence structure, level of detail) directly from the example rather than interpreting abstract instructions.

For more on how examples interact with prompt instructions, see the full ChatGPT prompt guide.

What to do now

Pick a repetitive task where ChatGPT's output is inconsistent. Write a few-shot prompt with two or three examples of the output you actually want. Run it several times and compare the consistency against your previous zero-shot attempts. The difference in reliability is usually substantial.
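To make the comparison concrete, one simple measure is the share of runs that produced the most common output. The sketch below assumes you have already collected the outputs yourself; `consistency_rate` is a hypothetical helper, and the two lists are placeholder strings, not real model results.

```python
from collections import Counter


def consistency_rate(outputs):
    """Share of runs that produced the most common output (1.0 means identical every time)."""
    if not outputs:
        raise ValueError("outputs must not be empty")
    # Normalize case and whitespace so "Neutral" and " neutral" count as the same answer.
    normalized = [output.strip().lower() for output in outputs]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


# Placeholder outputs for illustration; replace with the outputs from your own runs.
zero_shot_runs = ["Negative", "Mixed", "neutral", "Negative", "Somewhat negative"]
few_shot_runs = ["Neutral", "Neutral", "Neutral", "Negative", "Neutral"]

print(consistency_rate(zero_shot_runs))  # 0.4
print(consistency_rate(few_shot_runs))   # 0.8
```

Exact-match comparison suits short labels; for longer free-text outputs you would likely need a looser comparison, such as checking that the required fields are present.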

Practical Prompt Engineering by Vajo Lukic covers zero-shot, few-shot, and chain-of-thought techniques in depth, with template libraries for each and 250+ ready-to-use prompts across 12 categories.
