Prompt engineering has evolved from simple text instructions to a sophisticated discipline essential for building reliable AI applications. As models become more capable, effective prompting becomes the key differentiator between mediocre and exceptional AI implementations. This guide covers proven techniques for getting consistent, high-quality outputs from modern LLMs.
Core Principles
- Be Specific: Vague instructions produce vague outputs
- Provide Context: Give the model relevant background information
- Use Examples: Few-shot learning dramatically improves consistency
- Structure Output: Specify the exact format you need
- Iterate and Test: Prompts require refinement based on results
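To make the first principle concrete, here is a minimal sketch contrasting a vague instruction with a specific one. The prompt strings and the `build_prompt` helper are illustrative, not from a production system:

```python
# Hypothetical example: the same task, phrased vaguely vs. specifically.
VAGUE_PROMPT = "Summarize this customer review."

SPECIFIC_PROMPT = (
    "Summarize the customer review below in exactly two sentences. "
    "Mention the product name, the main complaint or praise, and the "
    "reviewer's overall sentiment (positive, neutral, or negative). "
    "Return plain text with no preamble.\n\nReview: {review}"
)

def build_prompt(review: str) -> str:
    # Filling in a template keeps the constraints attached to every request
    return SPECIFIC_PROMPT.format(review=review)
```

The specific version pins length, required fields, and output format, so results stay comparable across inputs.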
Structured Output Patterns
Getting LLMs to produce consistently structured output is crucial for production applications. Here are proven patterns that work across models.
# Pattern 1: JSON Schema Enforcement
import json
from openai import OpenAI
client = OpenAI()
def extract_entities(text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": """You are an entity extraction system. Extract entities from the
provided text and return them in the exact JSON format specified.

Output Format:
{
  "people": [{"name": string, "role": string | null}],
  "organizations": [{"name": string, "type": string | null}],
  "locations": [{"name": string, "type": "city" | "country" | "address"}],
  "dates": [{"text": string, "normalized": "YYYY-MM-DD" | null}]
}

Rules:
- Return ONLY valid JSON, no other text
- Use null for unknown values, not empty strings
- Normalize dates to ISO format when possible"""
            },
            {
                "role": "user",
                "content": f"Extract entities from: {text}"
            }
        ],
        response_format={"type": "json_object"},
        temperature=0
    )
    return json.loads(response.choices[0].message.content)
# Usage
text = """Apple CEO Tim Cook announced the new iPhone at the company's
headquarters in Cupertino on September 12, 2024."""
entities = extract_entities(text)
print(json.dumps(entities, indent=2))
# Output:
# {
# "people": [{"name": "Tim Cook", "role": "CEO"}],
# "organizations": [{"name": "Apple", "type": "company"}],
# "locations": [{"name": "Cupertino", "type": "city"}],
# "dates": [{"text": "September 12, 2024", "normalized": "2024-09-12"}]
# }

# Pattern 2: Few-Shot Learning for Consistent Formatting
def classify_support_ticket(ticket: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": """You are a support ticket classifier. Analyze tickets and categorize them.

Categories: billing, technical, account, feature_request, other
Priority: low, medium, high, urgent
Sentiment: positive, neutral, negative, angry"""
            },
            # Few-shot examples
            {
                "role": "user",
                "content": "Ticket: I can't log in to my account. I've tried resetting my password but it still doesn't work."
            },
            {
                "role": "assistant",
                "content": json.dumps({
                    "category": "account",
                    "priority": "high",
                    "sentiment": "negative",
                    "summary": "User unable to log in despite password reset",
                    "suggested_action": "Check for account lock, verify email"
                })
            },
            {
                "role": "user",
                "content": "Ticket: Would be great if you could add dark mode to the app!"
            },
            {
                "role": "assistant",
                "content": json.dumps({
                    "category": "feature_request",
                    "priority": "low",
                    "sentiment": "positive",
                    "summary": "User requesting dark mode feature",
                    "suggested_action": "Add to feature backlog"
                })
            },
            # Actual ticket to classify
            {
                "role": "user",
                "content": f"Ticket: {ticket}"
            }
        ],
        response_format={"type": "json_object"},
        temperature=0
    )
    return json.loads(response.choices[0].message.content)

Chain-of-Thought Prompting
Chain-of-thought (CoT) prompting encourages the model to reason step by step, significantly improving accuracy on complex tasks. This is especially effective for math, logic, and multi-step reasoning.
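Even a minimal zero-shot variant, which simply appends a reasoning trigger such as "Let's think step by step" to the user message, often improves results. This is a hedged sketch of how those messages might be assembled; the system prompt wording is an assumption:

```python
# Zero-shot chain-of-thought: append a reasoning trigger to the user message.
# The trigger phrase is a widely used convention, not a model requirement.
COT_TRIGGER = "Let's think step by step."

def build_cot_messages(question: str) -> list[dict]:
    return [
        {"role": "system", "content": "You are a careful reasoner. Show your work."},
        {"role": "user", "content": f"{question}\n\n{COT_TRIGGER}"},
    ]

messages = build_cot_messages(
    "A train travels 120 km in 1.5 hours. What is its average speed?"
)
```

The richer pattern below goes further: it prescribes the reasoning steps and captures them in a structured field.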
# Chain-of-Thought Pattern
def analyze_code_complexity(code: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": """You are a code analysis expert. Analyze code complexity step by step.

Follow this reasoning process:
1. Identify the main function/method
2. Count nested loops and conditionals
3. Identify recursive calls
4. Assess cyclomatic complexity factors
5. Consider readability and maintainability
6. Provide final assessment

Output your reasoning in a "thinking" field, then provide structured results."""
            },
            {
                "role": "user",
                "content": f"""Analyze this code:

```python
{code}
```

Output format:
{{
  "thinking": "step by step reasoning...",
  "complexity_score": 1-10,
  "cyclomatic_complexity": number,
  "issues": ["list of concerns"],
  "recommendations": ["list of improvements"]
}}"""
            }
        ],
        response_format={"type": "json_object"},
        temperature=0.2
    )
    return json.loads(response.choices[0].message.content)

# The "thinking" field captures the reasoning process
# This improves accuracy and provides transparency

Role Prompting and Personas
Assigning specific roles or personas to the model can dramatically improve output quality for specialized tasks.
# Expert Persona Pattern
SECURITY_EXPERT = """
You are a senior security engineer with 15 years of experience in:
- Application security (OWASP Top 10)
- Penetration testing
- Secure code review
- Compliance (SOC 2, PCI-DSS, HIPAA)
When reviewing code or configurations:
1. Think like an attacker - what could be exploited?
2. Consider both obvious and subtle vulnerabilities
3. Prioritize findings by actual risk, not theoretical concerns
4. Provide specific, actionable remediation steps
5. Reference relevant standards (CWE, OWASP) when applicable
Be thorough but practical. Focus on real-world exploitability.
"""
def security_review(code: str, language: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": SECURITY_EXPERT},
            {
                "role": "user",
                "content": f"""Review this {language} code for security vulnerabilities:

```{language}
{code}
```

Provide findings in this format:
{{
  "risk_level": "critical" | "high" | "medium" | "low" | "info",
  "findings": [
    {{
      "title": "string",
      "severity": "critical" | "high" | "medium" | "low",
      "line_numbers": [int],
      "description": "string",
      "cwe_id": "CWE-XXX" | null,
      "remediation": "string",
      "code_fix": "string" | null
    }}
  ],
  "summary": "string"
}}"""
            }
        ],
        response_format={"type": "json_object"},
        temperature=0
    )
    return json.loads(response.choices[0].message.content)

Production Prompt Management
Managing prompts in production requires versioning, testing, and monitoring. Here's a pattern for maintainable prompt management.
# Production Prompt Management System
from dataclasses import dataclass
from typing import Dict, Any, Optional
import hashlib
import json
@dataclass
class PromptTemplate:
    """Versioned prompt template with metadata"""
    name: str
    version: str
    system_prompt: str
    user_template: str
    model: str = "gpt-4-turbo"
    temperature: float = 0.7
    max_tokens: Optional[int] = None
    response_format: Optional[Dict] = None

    @property
    def hash(self) -> str:
        """Generate hash for tracking prompt versions"""
        content = f"{self.system_prompt}{self.user_template}{self.model}{self.temperature}"
        return hashlib.sha256(content.encode()).hexdigest()[:12]

    def render(self, **variables) -> list:
        """Render the prompt with variables"""
        return [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": self.user_template.format(**variables)}
        ]

class PromptRegistry:
    """Central registry for all prompts"""

    def __init__(self):
        self._prompts: Dict[str, PromptTemplate] = {}

    def register(self, prompt: PromptTemplate):
        key = f"{prompt.name}:v{prompt.version}"
        self._prompts[key] = prompt
        # Also register as latest
        self._prompts[f"{prompt.name}:latest"] = prompt

    def get(self, name: str, version: str = "latest") -> PromptTemplate:
        key = f"{name}:v{version}" if version != "latest" else f"{name}:latest"
        if key not in self._prompts:
            raise KeyError(f"Prompt not found: {key}")
        return self._prompts[key]
# Initialize registry
registry = PromptRegistry()
# Register prompts
registry.register(PromptTemplate(
    name="entity_extraction",
    version="1.0",
    system_prompt="""You are an entity extraction system. Extract named entities
from text and return them as structured JSON.""",
    user_template="Extract entities from: {text}",
    temperature=0,
    response_format={"type": "json_object"}
))

registry.register(PromptTemplate(
    name="entity_extraction",
    version="2.0",
    system_prompt="""You are an advanced entity extraction system with improved
accuracy. Extract named entities and their relationships from text.""",
    user_template="""Extract entities and relationships from the following text.
Include confidence scores for each extraction.

Text: {text}""",
    temperature=0,
    response_format={"type": "json_object"}
))
# Usage with version pinning
prompt = registry.get("entity_extraction", "1.0") # Pin to v1.0
# or
prompt = registry.get("entity_extraction")  # Use latest

Handling Edge Cases
# Robust Prompt with Fallback Handling
def analyze_with_fallback(text: str, max_retries: int = 3) -> dict:
    """Analyze text with retry and fallback logic"""
    system_prompt = """
    Analyze the provided text. If the text is:
    - Empty or whitespace only: Return {"error": "empty_input", "result": null}
    - Not in English: Return {"error": "unsupported_language", "detected_language": "xx", "result": null}
    - Too short to analyze meaningfully: Return {"error": "insufficient_content", "result": null}

    For valid input, return:
    {
      "error": null,
      "result": {
        "summary": "string",
        "sentiment": "positive" | "neutral" | "negative",
        "topics": ["string"],
        "confidence": 0.0-1.0
      }
    }
    """
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4-turbo",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": f"Analyze: {text}"}
                ],
                response_format={"type": "json_object"},
                temperature=0
            )
            result = json.loads(response.choices[0].message.content)
            # Validate response structure
            if "error" not in result or "result" not in result:
                raise ValueError("Invalid response structure")
            return result
        except json.JSONDecodeError:
            if attempt == max_retries - 1:
                return {"error": "parse_error", "result": None}
        except Exception as e:
            if attempt == max_retries - 1:
                return {"error": str(e), "result": None}
    return {"error": "max_retries_exceeded", "result": None}

Best Practices Summary
Prompt Engineering Checklist
- Start with clear, specific instructions
- Use system prompts to set context and constraints
- Provide few-shot examples for consistent formatting
- Request structured output (JSON) for reliable parsing
- Use chain-of-thought for complex reasoning
- Assign expert personas for specialized tasks
- Handle edge cases explicitly in prompts
- Version control your prompts
- Test prompts with diverse inputs
- Monitor output quality in production
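The testing and monitoring items above can start small: running a schema validator over every model response catches format drift before it reaches users. This is an illustrative sketch; the key names are assumptions mirroring the ticket-classification example earlier:

```python
# Minimal response validator: check keys and allowed values before the
# output is trusted downstream. Key names mirror the ticket example above.
EXPECTED_KEYS = {"category", "priority", "sentiment", "summary", "suggested_action"}
ALLOWED_PRIORITIES = {"low", "medium", "high", "urgent"}

def validate_ticket_output(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    problems = []
    missing = EXPECTED_KEYS - payload.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if payload.get("priority") not in ALLOWED_PRIORITIES:
        problems.append(f"bad priority: {payload.get('priority')!r}")
    return problems

good = {"category": "account", "priority": "high", "sentiment": "negative",
        "summary": "Login failure", "suggested_action": "Check account lock"}
bad = {"category": "account", "priority": "ASAP"}
```

Logging validation failures alongside the prompt version (e.g., the template hash from the registry pattern) makes it easy to tell whether a prompt change caused a regression.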
Conclusion
Effective prompt engineering is both an art and a science. The techniques covered here (structured outputs, chain-of-thought reasoning, role prompting, and production prompt management) form the foundation for building reliable AI applications. As models continue to evolve, these patterns will remain relevant while new techniques emerge.
Need help optimizing your AI prompts or building production AI systems? Contact Jishu Labs for expert AI development services.
About Sarah Johnson
Sarah Johnson is the CTO at Jishu Labs with extensive experience building AI systems. She has developed prompt engineering frameworks used by enterprise clients across industries.