The Prompt Template I Use for Every Data Analysis Request

Two years ago I started using ChatGPT to help with data analysis tasks: writing queries, explaining outputs, drafting summaries for stakeholders. The early prompts were improvised, and the results were inconsistent in exactly the way improvised prompts tend to be.

Sometimes the model made assumptions about the schema that didn't match my tables. Sometimes it produced a query that worked syntactically but wasn't what I needed semantically. Sometimes it gave me Python when I needed SQL, or assumed a library version I wasn't using. Each of these required a follow-up correction that took more time than writing a better prompt upfront would have.

I gradually settled on a template. I use it for almost every data-related request now, with small modifications depending on the task. It took me longer than it should have to get there, so here it is.

The template

Context: [One sentence describing the data source and what it represents]

Task: [What I need done, as specifically as possible]

Format: [What the output should look like: SQL query, Python snippet, 
plain-English summary, table, etc.]

Constraints: [Anything to avoid or specific requirements: technology stack, 
column names, no hard-coded values, etc.]

Data sample (optional): [Paste a small sample if the structure helps the model]

Five fields. Most requests only need three or four. The optional field is there for tasks where the data structure is non-obvious and two rows of sample data would eliminate more ambiguity than two sentences of description.
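If you reuse the template often, it can help to assemble it in code so that skipped fields simply drop out. What follows is a minimal sketch, not a prescribed implementation: the function name build_prompt and its parameter names are my own choices for illustration, and only the five field labels come from the template above.

```python
from typing import Optional

# Field labels copied from the template; the order matches it.
FIELD_ORDER = ["Context", "Task", "Format", "Constraints", "Data sample"]


def build_prompt(
    context: str,
    task: str,
    output_format: Optional[str] = None,
    constraints: Optional[str] = None,
    data_sample: Optional[str] = None,
) -> str:
    """Assemble the template into one prompt string, skipping empty fields."""
    values = [context, task, output_format, constraints, data_sample]
    sections = [
        f"{label}: {value.strip()}"
        for label, value in zip(FIELD_ORDER, values)
        if value and value.strip()
    ]
    # One blank line between fields, matching the layout of the template.
    return "\n\n".join(sections)


# Hypothetical request that uses three of the five fields.
prompt = build_prompt(
    context="PostgreSQL database with an orders table. Each row is one completed order.",
    task="Find customers with more than 3 orders in the last 90 days and none in the last 30.",
    constraints="Use CTEs, not subqueries. Use CURRENT_DATE.",
)
print(prompt)
```

Fields left as None or whitespace are omitted entirely, so a three-field request produces a three-field prompt rather than one with empty labels.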

What it looks like in practice

Here's a real example from a few months ago. I needed to identify customers who had placed more than three orders in the previous 90 days but had not placed one in the last 30: recently active customers who might be lapsing.

Without the template, I might have written: "SQL query to find inactive customers who were recently active."

The model would have had to decide: What database? What's the table name? What columns exist? What counts as "inactive"? What counts as "recently active"? It would have produced something generic that required several rounds of correction.

With the template:

Context: PostgreSQL database with an orders table containing these columns: 
order_id (int), customer_id (int), order_date (timestamp), total_value (decimal).
Each row is one completed order.

Task: Find customers who placed more than 3 orders in the 90 days before today, 
but placed zero orders in the last 30 days. Return customer_id and their 
order count from that 90-day window.

Format: A single SQL query. Add a short comment above each major clause 
explaining what it does.

Constraints: Use CTEs, not subqueries. Do not hard-code today's date; 
use CURRENT_DATE.

The first response was correct and production-ready. No corrections needed.
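The model's actual response isn't reproduced here. As a rough sketch of the kind of query that prompt describes, the example below runs a CTE-based version against an in-memory SQLite database so it can be executed anywhere. Several things are assumptions on my part: the SQLite date syntax (PostgreSQL would write CURRENT_DATE - INTERVAL '90 days'), the CTE names, the sample rows, and the choice to treat an order placed exactly 30 days ago as falling inside the last 30 days.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# SQLite adaptation of the request; CTE and column alias names are illustrative.
LAPSING_CUSTOMERS_SQL = """
-- Orders placed in the 90 days before today
WITH recent_orders AS (
    SELECT customer_id, order_date
    FROM orders
    WHERE order_date >= date(CURRENT_DATE, '-90 days')
),
-- Per-customer order count and most recent order within that window
order_counts AS (
    SELECT customer_id,
           COUNT(*) AS order_count,
           MAX(order_date) AS last_order_date
    FROM recent_orders
    GROUP BY customer_id
)
-- More than 3 orders in the window, and none in the last 30 days
SELECT customer_id, order_count
FROM order_counts
WHERE order_count > 3
  AND last_order_date < date(CURRENT_DATE, '-30 days')
ORDER BY customer_id
"""


def days_ago(n: int) -> str:
    """ISO date n days before today in UTC, matching SQLite's CURRENT_DATE."""
    today_utc = datetime.now(timezone.utc).date()
    return (today_utc - timedelta(days=n)).isoformat()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, total_value REAL)"
)

# Made-up data: customer_id -> how many days ago each order was placed.
sample_orders = {
    1: [40, 50, 60, 70],        # 4 orders, none recent: should be returned
    2: [5, 40, 50, 60, 70],     # has an order in the last 30 days
    3: [40, 50, 60],            # only 3 orders, not "more than 3"
    4: [100, 110, 120, 130],    # all orders fall outside the 90-day window
}
rows = [
    (customer_id, days_ago(offset), 10.0)
    for customer_id, offsets in sample_orders.items()
    for offset in offsets
]
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, total_value) VALUES (?, ?, ?)",
    rows,
)

lapsing = conn.execute(LAPSING_CUSTOMERS_SQL).fetchall()
print(lapsing)
```

Each of the four made-up customers exercises one condition from the Task field, which is a cheap way to check that a generated query means what the prompt meant.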

Why each field carries its weight

Context tells the model what kind of data it's working with. Without it, the model makes assumptions. Those assumptions are sometimes right. When they're wrong, the output looks plausible but breaks on real data. A single sentence specifying the table name, relevant columns, and what each row represents eliminates most of these mismatches.

Task is where most people put all their effort and where the prompt often still falls short, because the task field alone doesn't tell the model what success looks like. "Find customers who were recently active but have now lapsed" could produce a dozen different interpretations. Specificity here means defining the exact threshold, the exact time windows, and what to return.

Format removes ambiguity about the output type. "Python snippet" and "plain-English explanation" are completely different responses to the same data task. Without this field, the model picks one. For data work, the format also covers things like whether to include comments, whether to use specific functions, and whether the output is meant to run as-is or serve as a starting point.

Constraints are the fastest way to prevent the specific problems you already know about. If the codebase uses CTEs, say so. If the team is on PostgreSQL 12 and features added in later versions aren't available, say so. If the output will be read by non-technical stakeholders and the summary should avoid jargon, say so. The model cannot know these things unless you tell it.

Data sample is optional but valuable when the structure is non-obvious, because a couple of real rows settle questions that a description leaves open. For standard schemas (e.g., an orders table with typical columns), it's usually not necessary.
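One way to produce a small sample for that field is to render the first two rows as CSV text. This is a minimal sketch under stated assumptions: format_data_sample is a hypothetical helper, the rows are invented to match the orders columns used earlier, and every row is assumed to share the same keys.

```python
import csv
import io


def format_data_sample(rows: list, max_rows: int = 2) -> str:
    """Render the first few rows (dicts with identical keys) as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows[:max_rows])
    return buffer.getvalue().strip()


# Invented rows for illustration only.
orders = [
    {"order_id": 1001, "customer_id": 17, "order_date": "2024-03-02 14:05:00", "total_value": "59.90"},
    {"order_id": 1002, "customer_id": 42, "order_date": "2024-03-02 16:40:00", "total_value": "12.00"},
    {"order_id": 1003, "customer_id": 17, "order_date": "2024-03-05 09:12:00", "total_value": "8.50"},
]

sample = format_data_sample(orders)
print(sample)
```

Keeping the header row matters more than the number of rows, since it gives the model the exact column names to use. Check the sample for sensitive values before pasting it into a prompt.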

What I skip

I leave out Format when the output type is obvious from context. If I ask for an explanation of why a query is slow, the format is clearly prose. No need to state it.

I leave out Data sample for standard schemas the model handles well from description alone.

I almost never leave out Context or Constraints. Those are the two fields that prevent the most rework. The time cost of writing them is almost always less than the time cost of one round of corrections.
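If the template lives in a script or a notes tool, this habit can be turned into a small pre-send check. The sketch below is illustrative only: the missing_required function and the dictionary layout are assumptions, not part of any existing tool.

```python
# The two fields I treat as effectively mandatory.
REQUIRED_FIELDS = ("Context", "Constraints")


def missing_required(fields: dict) -> list:
    """Return the required field names that are absent or blank."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not (fields.get(name) or "").strip()
    ]


# Hypothetical draft that forgot its Constraints.
draft = {
    "Context": "PostgreSQL orders table, one row per completed order.",
    "Task": "Find customers with more than 3 orders in the last 90 days.",
    "Constraints": "",
}

problems = missing_required(draft)
if problems:
    print("Fill in before sending:", ", ".join(problems))
```

Format and Data sample are deliberately left unchecked, since those are the fields that can be skipped when the output type or schema is obvious.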

Adapting the template for other tasks

The same structure works for non-SQL tasks with small adjustments. For Python data work, Context includes the library stack. For stakeholder summaries, Constraints includes the audience and tone. For data modeling questions, Context includes the system being designed and the scale.

The principle is the same: remove the decisions the model would otherwise make on your behalf, and make them yourself before the prompt is sent.

This is one of the six design principles for writing effective prompts covered in Practical Prompt Engineering. The others build on the same idea: every degree of freedom you leave in a prompt is a decision the model makes without knowing what you actually need.

If you want to understand the theory behind why explicit constraints improve outputs, the ChatGPT prompt guide covers the full range of techniques.

About the Author

Vajo Lukic
