Artificial Intelligence and Machine Learning Articles

Get In Touch For Details! Request More Information

Name

Email ID

Phone Number

Education Qualification

Current Profile

Select your interested program

ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

What is DSPy? Build a Text-to-SQL App with Python (Complete Guide)

By Vaishali

Apr 10, 2026 5 Min Read 21 Views

(Last Updated)

Quick Answer: DSPy (Declarative Self-improving Python) is a framework that helps developers build reliable LLM applications by structuring prompts as programs instead of raw text. It enables automatic optimization, modular design, and evaluation-driven improvements. Using DSPy, you can build a Text-to-SQL app in Python that converts natural language queries into accurate SQL by defining modules, signatures, and optimization pipelines.

How do you convert a simple question like “Show total revenue last month” into a precise SQL query without manually crafting fragile prompts every time? Traditional prompt engineering often breaks in production due to inconsistencies and lack of evaluation.

This is where DSPy changes the game. Instead of treating prompts as static text, it treats them as structured programs that can be optimized, tested, and improved over time.

In this guide, you will learn how to build a Text-to-SQL app using DSPy in Python, step by step. By the end, you will understand how to create modular LLM pipelines that generate accurate SQL queries from natural language inputs.

What is DSPy?
What is a Text-to-SQL Application?
Step-by-Step Guide: Build a Text-to-SQL App with DSPy

Step 1: Install Dependencies
Step 2: Define the Signature
Step 3: Create a DSPy Module
Step 4: Configure the Language Model
Step 5: Add Training Examples
Step 6: Apply Optimizer
Step 7: Run the Application

Full Code Example
DSPy vs Traditional Prompt Engineering
Best Practices for Building a Text-to-SQL App with DSPy
Conclusion
FAQs

What is DSPy used for?
Is DSPy better than prompt engineering?
Do I need SQL knowledge for Text-to-SQL apps?
Can DSPy be used with any LLM?

What is DSPy?

DSPy (Declarative Self-improving Python) is a framework for building reliable LLM applications by treating prompts as structured programs instead of raw text. It uses signatures (input-output definitions), modules (reusable components), and optimizers (automatic improvements) to separate logic from prompt design.

What is a Text-to-SQL Application?

A Text-to-SQL application converts natural language queries into SQL statements.

Example:

Input: “Show top 5 products by sales”
Output:

SELECT product_name, SUM(sales)

FROM products

GROUP BY product_name

ORDER BY SUM(sales) DESC

LIMIT 5;

These applications are widely used in:

Business intelligence dashboards
Data analytics platforms
Chat-based database querying tools

Step-by-Step Guide: Build a Text-to-SQL App with DSPy

Step 1: Install Dependencies

The first step is to install the required Python packages. At minimum, you need the DSPy library. The official DSPy getting-started docs show installation with pip install -U dspy. If you want typed output validation, adding pedantic is also useful in many DSPy workflows.

You also need to prepare your LLM setup. DSPy works by connecting your program to a language model backend, so you must have access to a supported provider and an API key. In practice, this means selecting a model, exporting the provider key in your environment, and ensuring your Python environment can call that model successfully. DSPy’s tutorials show model setup through dspy.LM(…) and global configuration with dspy.configure(lm=lm).

pip install -U dspy

Step 2: Define the Signature

In DSPy, a signature defines the contract between the input and output of a task. For a Text-to-SQL app, the input is a natural language query, and the output is a SQL statement. This step is important because DSPy encourages you to declare what the system should do before worrying about the exact wording of prompts. That is a core part of its programming model.

A clean signature makes the Text-to-SQL pipeline easier to test, optimize, and reuse. You can keep it simple at first with just a user question and SQL output, then later extend it with database schema, business rules, or SQL dialect hints.

import dspy

class TextToSQL(dspy.Signature):

    """Convert a natural language question into a valid SQL query."""

    question: str = dspy.InputField(desc="User's natural language database question")

    sql: str = dspy.OutputField(desc="Executable SQL query")

Step 3: Create a DSPy Module

Once the signature is defined, wrap the logic in a reusable DSPy module, the core building block of the application. Instead of relying on a single prompt, modules enable reusable, inspectable, and independently improvable components, making the system more robust than brittle prompt strings.

For a Text-to-SQL app, use dspy.Predict inside a custom module to take a user query and schema context as input and generate SQL. This modular design supports scalability, allowing easy addition of validation, query explanation, and execution checks later.

class TextToSQLModule(dspy.Module):

    def __init__(self):

        super().__init__()

        self.generator = dspy.Predict(TextToSQL)

    def forward(self, question: str):

        return self.generator(question=question)

Step 4: Configure the Language Model

After creating the module, configure the language model DSPy will use. DSPy tutorials show this pattern clearly: initialize the LM, then register it using dspy.configure. This central configuration allows the rest of the DSPy program to call the model consistently.

You must set your API key before creating the model object. In a real application, keep credentials in environment variables rather than hardcoding them. You should also choose a model that is good at structured generation, because SQL tasks depend on syntax precision and schema alignment.

import os

import dspy

os.environ["OPENAI_API_KEY"] = "your_api_key_here"

lm = dspy.LM("openai/gpt-4o-mini")

dspy.configure(lm=lm)

Step 5: Add Training Examples

DSPy becomes much more powerful when you give it examples. Instead of manually rewriting prompts, you provide a small training set of natural-language questions and their expected SQL outputs. DSPy can then use those examples to improve how the program behaves. Its FAQ and documentation emphasize that DSPy programs can be compiled into optimized prompts or weights based on your pipeline and your data.

For a Text-to-SQL app, your examples should cover different query types such as filtering, aggregation, sorting, grouping, and date-based conditions. Better example coverage generally leads to better SQL generation quality.

trainset = [

    dspy.Example(

        question="Show all customers from Mumbai",

        sql="SELECT * FROM customers WHERE city = 'Mumbai';"

    ).with_inputs("question"),

    dspy.Example(

        question="Count total orders placed in 2025",

        sql="SELECT COUNT(*) FROM orders WHERE YEAR(order_date) = 2025;"

    ).with_inputs("question"),

    dspy.Example(

        question="Get the top 5 products by sales",

        sql="SELECT product_name, SUM(sales) AS total_sales FROM products GROUP BY product_name ORDER BY total_sales DESC LIMIT 5;"

    ).with_inputs("question"),

]

Step 6: Apply Optimizer

This is the step where DSPy stands out. The framework is designed to automatically optimize prompts and program behavior using your examples and evaluation signals, instead of relying only on hand-tuned prompt engineering. DSPy explicitly positions this capability as a key difference from prompt-only workflows.

In practical terms, you define your module, prepare examples, and then compile or optimize the program so DSPy can learn a better prompting strategy for the task. This is especially useful in Text-to-SQL, where small changes in wording can greatly affect SQL correctness.

from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot()

compiled_text_to_sql = optimizer.compile(

    student=TextToSQLModule(),

    trainset=trainset

)

Step 7: Run the Application

Once optimized, test the app with real queries. The DSPy Text-to-SQL system converts plain English into SQL, which can be displayed, validated, or executed after safety checks. Since SQL is executable, validation is essential in production. DSPy also emphasizes debugging and transparency as systems scale.

question = "List the names of customers who placed more than 3 orders"

result = compiled_text_to_sql(question=question)

print("Generated SQL:")

print(result.sql)

Full Code Example

import os

import dspy

from dspy.teleprompt import BootstrapFewShot

# Step 1: Set API key

os.environ["OPENAI_API_KEY"] = "your_api_key_here"

# Step 2: Configure LM

lm = dspy.LM("openai/gpt-4o-mini")

dspy.configure(lm=lm)

# Step 3: Define signature

class TextToSQL(dspy.Signature):

    """Convert a natural language question into a valid SQL query."""

    question: str = dspy.InputField(desc="User's natural language database question")

    sql: str = dspy.OutputField(desc="Executable SQL query")

# Step 4: Create module

class TextToSQLModule(dspy.Module):

    def __init__(self):

        super().__init__()

        self.generator = dspy.Predict(TextToSQL)

    def forward(self, question: str):

        return self.generator(question=question)

# Step 5: Add training examples

trainset = [

    dspy.Example(

        question="Show all customers from Mumbai",

        sql="SELECT * FROM customers WHERE city = 'Mumbai';"

    ).with_inputs("question"),

    dspy.Example(

        question="Count total orders placed in 2025",

        sql="SELECT COUNT(*) FROM orders WHERE YEAR(order_date) = 2025;"

    ).with_inputs("question"),

    dspy.Example(

        question="Get the top 5 products by sales",

        sql="SELECT product_name, SUM(sales) AS total_sales FROM products GROUP BY product_name ORDER BY total_sales DESC LIMIT 5;"

    ).with_inputs("question"),

]

# Step 6: Apply optimizer

optimizer = BootstrapFewShot()

compiled_text_to_sql = optimizer.compile(

    student=TextToSQLModule(),

    trainset=trainset

)

# Step 7: Run the application

question = "List the names of customers who placed more than 3 orders"

result = compiled_text_to_sql(question=question)

print("Generated SQL:")

print(result.sql)

DSPy vs Traditional Prompt Engineering

Feature	DSPy	Prompt Engineering
Structure	Programmatic (signatures, modules)	Manual prompt writing
Optimization	Automated via optimizers	Manual iteration
Scalability	High, modular and reusable	Limited and hard to scale
Reliability	Strong, evaluation-driven	Inconsistent and fragile
Maintainability	Easy to update components independently	Difficult to manage large prompts
Debugging	Structured and traceable	Hard to debug prompt behavior
Reusability	High, supports modular pipelines	Low, prompts are often task-specific

Build production-ready AI applications like Text-to-SQL systems with structured learning and expert guidance. Join HCL GUVI’s Artificial Intelligence and Machine Learning Program to learn from industry experts and Intel engineers through live online classes, master Python, SQL, ML, MLOps, Generative AI, and Agentic AI, and gain hands-on experience with 20+ industry-grade projects, 1:1 doubt sessions, and placement support with 1000+ hiring partners.

Best Practices for Building a Text-to-SQL App with DSPy

Use Clear and Structured Schema Context: Always provide the database schema (tables, columns, relationships) in a clean and structured format so the model generates accurate SQL without hallucinating fields.
Design Strong Signatures: Define precise input-output contracts in your DSPy signatures to ensure consistent and predictable SQL generation across different queries.
Evaluate Outputs Regularly: Use test datasets and evaluation metrics to continuously monitor accuracy and identify failure patterns.
Keep Modules Modular and Reusable: Break your pipeline into smaller DSPy modules so you can debug and scale components independently.
Handle Edge Cases and Ambiguity: Add fallback logic or clarification steps when user queries are vague or incomplete.
Optimize for SQL Dialect Compatibility: Ensure generated queries match your database system (MySQL, PostgreSQL, etc.) to avoid syntax errors in production.

Conclusion

Building a Text-to-SQL app with DSPy demonstrates how structured LLM programming outperforms traditional prompt engineering. By using signatures, modules, and optimizers, you can create scalable and reliable AI systems. As AI adoption grows, frameworks like DSPy will become essential for production-ready applications.

FAQs

What is DSPy used for?

DSPy is used to build structured, reliable LLM applications such as chatbots, RAG systems, and Text-to-SQL tools.

Is DSPy better than prompt engineering?

Yes, for production use cases because it enables automation and scalability.

Do I need SQL knowledge for Text-to-SQL apps?

Basic SQL understanding is recommended to validate outputs.

Can DSPy be used with any LLM?

Yes, DSPy supports multiple LLM providers through configurable interfaces.

Success Stories

About the Author

Vaishali

I'm a seasoned writer with four years of experience across technical, non-technical, and just about every genre or niche you can imagine. Adaptable and curious, I enjoy exploring new topics and making information engaging and easy to understand. Fueled by a steady stream of tea, I approach each project with creativity, reliability, and genuine enthusiasm for storytelling.

View all posts by Vaishali

Did you enjoy this article?

Recommended Courses

Artificial Intelligence and Machine Learning Course

Available in

English

Blog Categories

Interview Questions

Artificial Intelligence and Machine Learning Articles

What is DSPy? Build a Text-to-SQL App with Python (Complete Guide)

Table of contents

What is DSPy?

What is a Text-to-SQL Application?

Step-by-Step Guide: Build a Text-to-SQL App with DSPy

Step 1: Install Dependencies

Step 2: Define the Signature

Step 3: Create a DSPy Module

Step 4: Configure the Language Model

Step 5: Add Training Examples

Step 6: Apply Optimizer

Step 7: Run the Application

Full Code Example

DSPy vs Traditional Prompt Engineering

Best Practices for Building a Text-to-SQL App with DSPy

Conclusion

FAQs

What is DSPy used for?

Is DSPy better than prompt engineering?

Do I need SQL knowledge for Text-to-SQL apps?

Can DSPy be used with any LLM?

Success Stories

About the Author

Vaishali

Did you enjoy this article?

Recommended Courses

Most Popular

Artificial Intelligence and Machine Learning Course

Syllabus

Know More

Chatgpt for Everyone

Natural Language Processing Us...

Dalle in French

Machine Learning and AI Servic...

ChatGPT for Programmers

Keras for Beginners

Keras for Beginners in Hindi

Keras for Beginners in Telugu

Deep learning using Pytorch

Deep learning using Pytorch

Practical Machine Learning

Building a Virtual AI Assistan...

Schedule 1:1 free counselling

Similar Articles

Artificial Intelligence and Machine Learning Articles