Chain of Thought Prompting Explained (With Examples)
Apr 15, 2026 · 6 Min Read
If you’ve ever asked an AI a multi-step math problem and got a completely wrong answer, only for it to be right when you asked it to “think step by step”, you’ve already seen Chain of Thought prompting at work.
Most AI models are trained to predict the next most likely token. That works great for simple questions. But the moment you ask something that requires reasoning, like a logic puzzle, a word problem, or a multi-step decision, the model can easily go off track.
Chain of Thought (CoT) prompting fixes this by guiding the model to reason through a problem one step at a time before arriving at a final answer. The result? More accurate, more explainable, and more reliable outputs.
In this article, you’ll learn exactly how Chain of Thought prompting works, see it applied across different use cases, and walk away with practical techniques you can start using right away, whether you’re working with GPT-4, Claude, Gemini, or any other large language model.
TL;DR Summary
- Chain of Thought (CoT) prompting is a technique that guides AI models to reason step by step before arriving at a final answer, making outputs significantly more accurate on complex tasks.
- This article walks you through how CoT works under the hood, from how the model generates intermediate reasoning steps to why that leads to better results than standard prompting.
- You’ll learn the three main types of CoT prompting: few-shot, zero-shot, and Auto-CoT, and when to use each one depending on your task and model size.
- The article includes real, hands-on examples of CoT in action across math word problems, logical reasoning, and code debugging, so you can see the difference it makes firsthand.
- It also covers best practices for writing effective CoT prompts, common errors you may run into, and how CoT compares to other prompting techniques like Tree of Thought and Self-Consistency.
Table of contents
- What is Chain of Thought Prompting?
- How Chain of Thought Prompting Works
- Types of Chain of Thought Prompting
- Few-Shot Chain of Thought
- Zero-Shot Chain of Thought
- Auto-CoT (Automatic Chain of Thought)
- Chain of Thought Prompting: Real Examples
- Example 1: Math Word Problem
- Example 2: Logical Reasoning
- Example 3: Code Debugging
- When Should You Use Chain of Thought Prompting?
- Best Practices for Chain of Thought Prompting
- Be Explicit About the Reasoning Format
- Use Few-Shot Examples for Complex Tasks
- Keep Your Examples Consistent
- Verify the Reasoning, Not Just the Answer
- Combine CoT With Self-Consistency
- Common Errors and How to Fix Them
- Conclusion
- FAQs
- What is Chain of Thought prompting in simple terms?
- Who invented Chain of Thought prompting?
- Does Chain of Thought prompting work with all AI models?
- What is zero-shot Chain of Thought?
- Is Chain of Thought prompting the same as few-shot prompting?
What is Chain of Thought Prompting?
Chain of Thought prompting is a technique where you encourage a language model to show its reasoning process, step by step, before giving a final answer.
Instead of asking: “What is 15% of 240?”
You prompt it as: “Let’s think through this step by step. What is 15% of 240?”
The model then works through the logic out loud, which leads to a more accurate result.
The concept was formally introduced in a 2022 paper by Google researchers, “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” and it quickly became one of the most influential ideas in prompt engineering.
The core insight is simple: when a model is asked to generate intermediate reasoning steps, it performs significantly better on tasks that require logic, arithmetic, and multi-step thinking.
The original Chain of Thought paper by Google Brain researchers showed that CoT prompting improved the performance of PaLM (a 540-billion parameter model) on a math reasoning benchmark called GSM8K from around 17% to over 56% accuracy, just by changing how the prompt was written, with no changes to the model itself.
How Chain of Thought Prompting Works
At its core, CoT prompting works by adding reasoning steps into the prompt, either through examples or through a simple instruction like “Let’s think step by step.”
The model picks up on this pattern and replicates it when generating a response. Instead of predicting the final answer directly, it generates a chain of intermediate thoughts that lead to the answer.
Here’s a simple breakdown of what happens:
- You give the model a question (and optionally, examples with reasoning)
- The model generates intermediate reasoning steps
- The final answer follows naturally from those steps
- The answer is more accurate because it’s grounded in logic, not just pattern matching
This approach works best with large language models (typically 100B+ parameters), though smaller models have shown improvements too with the right prompting strategy.
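The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a specific library's API: `fake_model` is a stand-in for whatever LLM client you use, and the `Answer:` convention is an assumption you would bake into your own prompts.

```python
import re

COT_SUFFIX = "\n\nLet's think step by step. End with a line 'Answer: <result>'."

def build_cot_prompt(question: str) -> str:
    # Append a zero-shot CoT instruction so the model emits
    # intermediate reasoning before committing to an answer.
    return question + COT_SUFFIX

def extract_answer(response: str) -> str:
    # Pull the final answer out of the reasoning chain so the rest
    # of your pipeline doesn't have to parse free-form text.
    match = re.search(r"Answer:\s*(.+)", response)
    return match.group(1).strip() if match else response.strip()

# Stub standing in for a real LLM call (e.g. an OpenAI or Anthropic client).
def fake_model(prompt: str) -> str:
    return "15% of 240 is 0.15 × 240 = 36.\nAnswer: 36"

prompt = build_cot_prompt("What is 15% of 240?")
print(extract_answer(fake_model(prompt)))  # → 36
```

The key design point is the explicit `Answer:` marker: it lets the model reason freely while keeping the final result machine-readable.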
If you want to learn more about Prompt Engineering and how to implement it, read the blog – What is Prompt Engineering?
Types of Chain of Thought Prompting
Not all CoT prompting looks the same. Depending on your use case, you’ll likely use one of these three approaches.
Few-Shot Chain of Thought
This is the original approach from the 2022 paper. You provide the model with a few examples that include both the question and the step-by-step reasoning before giving it the actual question you want answered.
Example:
Q: A bag has 5 red balls and 3 blue balls. If you remove 2 red balls, how many balls are left?
A: There are 5 red + 3 blue = 8 balls in total. Removing 2 red balls leaves 8 − 2 = 6 balls.
Q: A store had 120 items. They sold 45 in the morning and 30 in the afternoon. How many are left?
A: [Model generates reasoning here]
By showing the model how to reason in your examples, it learns to apply the same pattern to new problems.
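Assembling a few-shot CoT prompt is mostly string formatting. Here is one minimal way to do it, using the bag-of-balls demonstration above; the `DEMOS` list is illustrative, and in practice you would write two or three demonstrations from your own domain.

```python
# Hypothetical demonstrations; each answer shows the reasoning format
# we want the model to imitate, ending in an explicit result.
DEMOS = [
    ("A bag has 5 red balls and 3 blue balls. If you remove 2 red balls, "
     "how many balls are left?",
     "There are 5 red + 3 blue = 8 balls in total. "
     "Removing 2 red balls leaves 8 - 2 = 6 balls. The answer is 6."),
]

def build_few_shot_prompt(demos, question: str) -> str:
    # Lay out each demo as a Q/A pair, then leave the final answer
    # slot open for the model to fill in.
    parts = [f"Q: {q}\nA: {a}" for q, a in demos]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

print(build_few_shot_prompt(
    DEMOS,
    "A store had 120 items. They sold 45 in the morning and 30 in the "
    "afternoon. How many are left?"))
```

Because the prompt ends with an open `A:`, the model's most natural continuation is reasoning in the same style as the demonstrations.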
Zero-Shot Chain of Thought
This is the simpler, more widely used approach. You don’t provide any examples; you just add a phrase like “Let’s think step by step” to your prompt.
Introduced by Kojima et al. in 2022, this technique works surprisingly well across a range of tasks.
Example:
Q: If a train travels at 60 km/h and needs to cover 210 km, how long will the journey take? Let’s think step by step.
The model then generates its own reasoning chain before answering.
Auto-CoT (Automatic Chain of Thought)
Auto-CoT automates the few-shot process by having the model generate its own reasoning examples. Instead of writing examples manually, you let the model create them, then use those as the few-shot demonstrations.
This is particularly useful when you’re working at scale and don’t want to hand-craft prompts for every task category.
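The idea can be sketched as a two-stage loop: first have the model write its own rationale for each seed question, then reuse those rationales as few-shot demonstrations. This is a simplified sketch (the published Auto-CoT method also clusters questions to pick diverse seeds, which is omitted here); `stub` stands in for a real sampled LLM call.

```python
def auto_cot_demos(questions, model):
    # Stage 1: have the model generate its own step-by-step rationale
    # for each seed question via zero-shot CoT.
    demos = []
    for q in questions:
        rationale = model(f"Q: {q}\nA: Let's think step by step.")
        demos.append((q, rationale))
    return demos

def build_prompt(demos, question):
    # Stage 2: reuse the generated rationales as few-shot demonstrations.
    blocks = [f"Q: {q}\nA: Let's think step by step. {r}" for q, r in demos]
    blocks.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(blocks)

# Stub model for illustration; swap in a real LLM call.
stub = lambda prompt: "First, there are 8 balls; removing 2 leaves 6. The answer is 6."

demos = auto_cot_demos(
    ["A bag has 8 balls; 2 are removed. How many remain?"], stub)
print(build_prompt(
    demos, "A train covers 210 km at 60 km/h. How long does it take?"))
```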
Chain of Thought Prompting: Real Examples
Let’s look at how CoT prompting actually performs across different types of tasks.
Example 1: Math Word Problem
Standard Prompt:
Sarah has 3 times as many apples as Tom. Tom has 8 apples. They give away 10 apples together. How many do they have left?
Model Output (without CoT): 14 (incorrect)
CoT Prompt:
Sarah has 3 times as many apples as Tom. Tom has 8 apples. They give away 10 apples together. How many do they have left? Let’s think step by step.
Model Output (with CoT):
Tom has 8 apples. Sarah has 3 × 8 = 24 apples. Together they have 8 + 24 = 32 apples. After giving away 10, they have 32 − 10 = 22 apples.
Answer: 22 (correct)
Example 2: Logical Reasoning
Standard Prompt:
All mammals are warm-blooded. Dolphins are mammals. Are dolphins warm-blooded?
This one is simple enough that both approaches work. But watch what happens with a more layered version:
CoT Prompt:
All mammals are warm-blooded. Whales are mammals. Fish are not mammals. Sharks are fish. Are sharks warm-blooded? Let’s think through this carefully.
Model Output:
Sharks are fish. Fish are not mammals. Since only mammals are stated to be warm-blooded in this context, and sharks are not mammals, sharks are not warm-blooded in this scenario.
Without CoT, models often confuse the syllogism. With it, the reasoning stays clean.
Example 3: Code Debugging
CoT prompting isn’t just for math; it works well for technical tasks too.
CoT Prompt:
The following Python function is supposed to return the largest number in a list, but it’s returning the wrong result. Let’s debug this step by step.
def find_max(nums):
    max_val = 0
    for n in nums:
        if n > max_val:
            max_val = n
    return max_val
Model Output:
Step 1: The function initialises max_val to 0.
Step 2: It iterates through the list and updates max_val when it finds a larger number.
Step 3: The bug is that if all numbers in the list are negative, max_val stays at 0, which is incorrect.
Fix: Initialise max_val to nums[0] or use float('-inf') instead of 0.
That’s a genuinely useful debugging walkthrough, and it came from a simple CoT prompt.
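Applying the fix the model identified (initialising from the data rather than from 0) gives a version that handles all-negative lists correctly. The empty-list guard is an extra safeguard not mentioned in the model's output.

```python
def find_max(nums):
    # Initialise from the data itself so lists containing only
    # negative numbers are handled correctly (float('-inf') also works).
    if not nums:
        raise ValueError("find_max() requires a non-empty list")
    max_val = nums[0]
    for n in nums[1:]:
        if n > max_val:
            max_val = n
    return max_val

print(find_max([-7, -3, -12]))  # → -3
```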
Chain of Thought prompting works significantly better with larger models. Research has shown that in models with fewer than 10 billion parameters, CoT prompting can sometimes hurt performance, because smaller models don’t have enough reasoning capacity to benefit from the technique. This is why model size matters when choosing your prompting strategy.
When Should You Use Chain of Thought Prompting?
CoT prompting shines in specific situations. Here’s where it makes the biggest difference:
- Multi-step arithmetic or algebra: Any problem where the final answer depends on intermediate calculations
- Logic and commonsense reasoning: Syllogisms, if-then scenarios, cause-and-effect questions
- Code generation and debugging: Especially when diagnosing complex bugs or designing algorithms
- Decision-making tasks: When you want the model to weigh options before recommending one
- Medical, legal, or scientific reasoning: Domains where the reasoning process matters as much as the conclusion
If your task involves multiple steps, dependencies between steps, or requires the model to consider trade-offs, CoT prompting is the right move.
Read More: ChatGPT Prompt Engineering for Developers
Best Practices for Chain of Thought Prompting
Getting the most out of CoT isn’t just about adding “think step by step”; how you construct the prompt matters too.
Be Explicit About the Reasoning Format
Instead of a vague instruction, tell the model exactly how you want it to think:
“Solve this problem by first identifying what information is given, then figuring out what formula applies, then computing the answer.”
This gives the model a clear structure to follow.
Use Few-Shot Examples for Complex Tasks
For highly specialised or technical domains, don’t rely on zero-shot CoT alone. Write 2–3 high-quality examples that demonstrate the kind of reasoning you want.
Keep Your Examples Consistent
If you’re using few-shot CoT, make sure all your examples follow the same reasoning format. Inconsistent examples confuse the model and produce inconsistent outputs.
Verify the Reasoning, Not Just the Answer
One of the biggest advantages of CoT is that you can check the model’s work. Make it a habit to read through the reasoning steps, not just the final answer, especially in high-stakes applications.
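Part of this verification can even be automated. As a toy illustration (not a general-purpose verifier), the sketch below scans a reasoning chain for simple "a op b = c" claims and recomputes each one, using the apple problem's reasoning from earlier as input.

```python
import re

def check_arithmetic(reasoning: str):
    # Scan a reasoning chain for simple "a op b = c" claims and
    # recompute each; returns the claims that don't add up.
    ops = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "×": lambda a, b: a * b}
    errors = []
    for a, op, b, c in re.findall(r"(\d+)\s*([+\-×])\s*(\d+)\s*=\s*(\d+)",
                                  reasoning):
        if ops[op](int(a), int(b)) != int(c):
            errors.append(f"{a} {op} {b} = {c}")
    return errors

chain = ("Tom has 8 apples. Sarah has 3 × 8 = 24 apples. "
         "Together they have 8 + 24 = 32 apples. "
         "After giving away 10, they have 32 - 10 = 22 apples.")
print(check_arithmetic(chain))  # → []
```

An empty list means every arithmetic claim in the chain checks out; any entries it returns point you straight at the faulty step.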
Combine CoT With Self-Consistency
Self-consistency is a technique where you run the same CoT prompt multiple times and take the majority answer. It’s one of the most effective ways to improve reliability, especially for math and logic tasks.
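A minimal self-consistency loop looks like this. `ask` is a stand-in for your own sampled LLM call (run with temperature above 0 so the reasoning paths actually differ); the stub here just replays canned answers for illustration.

```python
from collections import Counter

def self_consistent_answer(ask, prompt: str, runs: int = 5) -> str:
    # Sample the same CoT prompt several times and return the
    # majority final answer across the independent reasoning chains.
    answers = [ask(prompt) for _ in range(runs)]
    return Counter(answers).most_common(1)[0][0]

# Stub mimicking a slightly noisy model; replace with a real sampled LLM call.
samples = iter(["22", "22", "23", "22", "22"])
ask = lambda prompt: next(samples)

print(self_consistent_answer(ask, "the CoT prompt"))  # → 22
```

The occasional outlier ("23" above) gets outvoted, which is exactly why this pairing is so effective for math and logic tasks.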
Common Errors and How to Fix Them
Even with CoT, things can go wrong. Here are the issues you’re most likely to run into:
The model ignores the reasoning instruction: This usually happens with smaller models. Try making the instruction more explicit: “Before giving your final answer, write out each step of your reasoning clearly.”
The reasoning chain is correct, but the final answer is wrong: The model sometimes makes an arithmetic error in the last step, even when the logic is right. Add: “Double-check your final calculation before answering.”
The reasoning is verbose but unhelpful: The model is padding rather than reasoning. Use fewer but more focused examples in your few-shot demonstrations to guide it toward concise, structured reasoning.
CoT makes the output too long: Set a max token limit or ask the model to “keep each reasoning step to one sentence.”
Inconsistent outputs across runs: Use self-consistency. Run the prompt 3–5 times and select the most common answer.
Conclusion
Chain of Thought prompting is one of the most practical and impactful techniques in prompt engineering today. By guiding AI models to reason step by step, you get outputs that are more accurate, more transparent, and far more useful for complex tasks.
Whether you’re building AI-powered applications, working with LLMs professionally, or just trying to get better results from the tools you already use, understanding CoT gives you a real edge. As language models continue to evolve, the ability to prompt them effectively will only become more valuable.
FAQs
1. What is Chain of Thought prompting in simple terms?
It’s a technique where you ask an AI model to show its reasoning step by step before giving a final answer. This helps it arrive at more accurate results, especially for complex problems.
2. Who invented Chain of Thought prompting?
It was introduced by researchers at Google Brain in a 2022 paper titled “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” authored by Jason Wei and colleagues.
3. Does Chain of Thought prompting work with all AI models?
It works best with large language models (100B+ parameters). Smaller models may see limited or no improvement, and in some cases, CoT can reduce accuracy in very small models.
4. What is zero-shot Chain of Thought?
Zero-shot CoT means you don’t provide any examples. You simply add a phrase like “Let’s think step by step” to your prompt, and the model generates its own reasoning chain.
5. Is Chain of Thought prompting the same as few-shot prompting?
Not exactly. Few-shot prompting provides examples to guide the model. Chain of Thought prompting adds reasoning steps to those examples, or uses a zero-shot instruction. The two are often combined.