Lesson 06 · Become an AI Engineer — Practical Guide · 13 min read

Capstone: Build an AI Content Creation Pipeline

April 17, 2026

TL;DR

The capstone ties everything together: deep research with parallel sub-agents (L3+L4), RAG for brand-voice grounding (L2), LLM APIs with streaming for article writing (L1), reasoning models for planning and review (L4), multi-modal generation for images and audio (L5), plus a full evaluation pipeline and FastAPI backend. One project, every skill.

You’ve built a playground, a chatbot, a web agent, a research system, and a multi-modal generator. Now you’ll combine every single technique into one production-grade system.

The project: an AI Content Creation Pipeline that takes a topic and produces a complete, published article — researched, written, illustrated, narrated, and evaluated — without human intervention.

What You’re Building

AI Content Pipeline Architecture

The pipeline has five phases, each drawing on a different lesson:

| Phase | What Happens | Lessons Used |
|---|---|---|
| ① Research | Plan research strategy, search the web in parallel, extract facts with citations | L3 (Agents), L4 (Deep Research) |
| ② Ground | Retrieve brand guidelines and past content via RAG, match tone/style | L2 (RAG) |
| ③ Write | Generate outline with reasoning model, draft section-by-section with streaming | L1 (LLM APIs), L4 (Reasoning) |
| ④ Enrich | Generate cover image, section images, audio narration | L5 (Multi-modal) |
| ⑤ Evaluate | Fact-check, RAG triad evaluation, plagiarism detection, quality gates | L2 (Evaluation), L4 (Verification) |
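Conceptually, the orchestrator threads one state object through these five phases in order. A minimal sketch of that shape (the toy phase functions here are hypothetical stand-ins for the modules built below):

```python
from typing import Callable

# Each phase takes the accumulated pipeline state and returns it enriched.
Phase = Callable[[dict], dict]


def run_phases(topic: str, phases: list[Phase]) -> dict:
    """Thread a state dict through each phase in sequence."""
    state: dict = {"topic": topic}
    for phase in phases:
        state = phase(state)
    return state


# Toy stand-ins for research / ground / write:
demo = run_phases("AI coding assistants", [
    lambda s: {**s, "research": "findings"},
    lambda s: {**s, "style": "brand voice"},
    lambda s: {**s, "article": f"Article on {s['topic']}"},
])
```

Each real phase below follows this contract: it reads what earlier phases produced and adds its own output to the shared state.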

Skills Map — Every Lesson in One Project


Project Setup

mkdir ai-content-pipeline && cd ai-content-pipeline
python -m venv venv
source venv/bin/activate

pip install openai anthropic httpx chromadb tiktoken \
    tavily-python beautifulsoup4 fastapi uvicorn \
    python-dotenv pydantic

# .env
OPENAI_API_KEY=sk-your-key
ANTHROPIC_API_KEY=sk-ant-your-key
TAVILY_API_KEY=tvly-your-key
REPLICATE_API_TOKEN=r8-your-key

Directory Structure

ai-content-pipeline/
├── .env
├── server.py              # FastAPI backend
├── pipeline.py            # Main orchestrator
├── research/
│   ├── planner.py         # Research planning (o3)
│   ├── searcher.py        # Web search sub-agents
│   └── synthesizer.py     # Fact synthesis
├── grounding/
│   ├── indexer.py          # RAG: index brand docs
│   ├── retriever.py        # RAG: retrieve context
│   └── style_matcher.py   # Tone/voice matching
├── writer/
│   ├── outliner.py         # Article outline (o3)
│   ├── drafter.py          # Section-by-section writing
│   └── reviewer.py         # Self-review + revision
├── enrichment/
│   ├── image_gen.py        # Cover + section images
│   ├── audio_gen.py        # TTS narration
│   └── prompt_enhancer.py  # Modality-specific prompts
├── evaluation/
│   ├── fact_checker.py     # Claim verification
│   ├── quality_scorer.py   # RAG triad + plagiarism
│   └── cost_tracker.py     # Token/cost accounting
└── knowledge_base/         # Brand docs for RAG
    ├── brand_guidelines.md
    ├── style_guide.md
    └── past_articles/

Phase 1: Research (Lessons 3 + 4)

This phase uses tool calling (L3), parallel sub-agents (L4), and reasoning models (L4) to research a topic from scratch.

Research Planner

# research/planner.py
import json
from openai import OpenAI

client = OpenAI()


def plan_research(topic: str, target_depth: str = "comprehensive") -> dict:
    """Use a reasoning model to decompose a topic into research subtasks."""
    response = client.chat.completions.create(
        model="o3",
        messages=[{
            "role": "system",
            "content": f"""You are a research director planning an article.
Decompose this topic into 3-5 independent research subtasks.

Target depth: {target_depth}

Return JSON:
{{
  "title_suggestion": "Suggested article title",
  "angle": "What makes this article unique/valuable",
  "subtasks": [
    {{
      "id": 1,
      "focus": "What this subtask covers",
      "search_queries": ["query1", "query2"],
      "key_questions": ["What to find out"]
    }}
  ],
  "target_word_count": 1500
}}"""
        }, {
            "role": "user",
            "content": f"Plan research for an article about: {topic}"
        }],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

Parallel Web Searcher

# research/searcher.py
import json
import asyncio
from openai import AsyncOpenAI
import httpx
from bs4 import BeautifulSoup
import os

async_client = AsyncOpenAI()


async def search_web(query: str, num_results: int = 5) -> list[dict]:
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.tavily.com/search",
            json={
                "api_key": os.getenv("TAVILY_API_KEY"),
                "query": query,
                "max_results": num_results,
            },
            timeout=15,
        )
    return response.json().get("results", [])


async def fetch_page(url: str) -> str:
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(url, timeout=15, follow_redirects=True)
        soup = BeautifulSoup(response.text, "html.parser")
        for tag in soup(["script", "style", "nav", "footer"]):
            tag.decompose()
        return soup.get_text(separator="\n", strip=True)[:3000]
    except Exception:
        return ""


async def execute_subtask(subtask: dict) -> dict:
    """Run one research subtask: search + read + summarize."""
    all_results = []
    for query in subtask["search_queries"]:
        results = await search_web(query)
        all_results.extend(results)

    pages = []
    for result in all_results[:4]:
        content = await fetch_page(result["url"])
        if content:
            pages.append({
                "url": result["url"],
                "title": result["title"],
                "content": content,
            })

    pages_text = json.dumps(pages, indent=2)[:6000]
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "Summarize these search results into key findings. "
                       "Include specific data, quotes, and source URLs for every claim."
        }, {
            "role": "user",
            "content": f"Task: {subtask['focus']}\n\n"
                       f"Key questions: {subtask['key_questions']}\n\n"
                       f"Sources:\n{pages_text}"
        }],
        temperature=0.2,
    )

    return {
        "subtask_id": subtask["id"],
        "focus": subtask["focus"],
        "findings": response.choices[0].message.content,
        "sources": [p["url"] for p in pages],
    }


async def research_all(subtasks: list[dict]) -> list[dict]:
    """Run all subtasks in parallel."""
    results = await asyncio.gather(
        *[execute_subtask(st) for st in subtasks],
        return_exceptions=True,
    )
    return [r for r in results if not isinstance(r, Exception)]

Phase 2: Ground with RAG (Lesson 2)

Before writing, retrieve brand guidelines and past content so the article matches your voice.

Knowledge Base Indexer

# grounding/indexer.py
import os
import chromadb
from openai import OpenAI
from pathlib import Path

client = OpenAI()
chroma = chromadb.PersistentClient(path="./chroma_db")


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        start = end - overlap
    return [c.strip() for c in chunks if c.strip()]


def index_knowledge_base(kb_dir: str = "./knowledge_base"):
    collection = chroma.get_or_create_collection(
        name="brand_knowledge",
        metadata={"hnsw:space": "cosine"},
    )

    all_chunks = []
    all_metas = []
    all_ids = []
    idx = 0

    for root, dirs, files in os.walk(kb_dir):
        for fname in files:
            if not fname.endswith((".md", ".txt")):
                continue
            filepath = os.path.join(root, fname)
            text = Path(filepath).read_text(encoding="utf-8")
            chunks = chunk_text(text)
            for chunk in chunks:
                all_chunks.append(chunk)
                all_metas.append({"source": fname, "type": "brand"})
                all_ids.append(f"kb_{idx}")
                idx += 1

    if not all_chunks:
        return collection

    batch_size = 100
    for i in range(0, len(all_chunks), batch_size):
        batch = all_chunks[i:i + batch_size]
        resp = client.embeddings.create(model="text-embedding-3-small", input=batch)
        embeddings = [e.embedding for e in resp.data]
        collection.add(
            ids=all_ids[i:i + batch_size],
            documents=batch,
            embeddings=embeddings,
            metadatas=all_metas[i:i + batch_size],
        )

    print(f"Indexed {len(all_chunks)} chunks from knowledge base")
    return collection
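The chunker's stride is `chunk_size - overlap`, so consecutive chunks repeat their boundary characters. The same sliding-window logic, shrunk to toy sizes so the overlap is visible:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Same sliding-window chunker as in indexer.py
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return [c.strip() for c in chunks if c.strip()]


pieces = chunk_text("abcdefghij", chunk_size=5, overlap=2)
# Stride is 3, so each chunk repeats the last 2 chars of the previous one:
# ["abcde", "defgh", "ghij", "j"]
```

The overlap ensures a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.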

Style Matcher

# grounding/style_matcher.py
import chromadb
from openai import OpenAI

client = OpenAI()
chroma = chromadb.PersistentClient(path="./chroma_db")


def retrieve_brand_context(topic: str, top_k: int = 5) -> list[str]:
    """Retrieve relevant brand guidelines and past content."""
    collection = chroma.get_collection("brand_knowledge")
    resp = client.embeddings.create(
        model="text-embedding-3-small", input=[topic]
    )
    results = collection.query(
        query_embeddings=[resp.data[0].embedding],
        n_results=top_k,
    )
    return results["documents"][0]


def build_style_prompt(brand_context: list[str]) -> str:
    """Build a style instruction from retrieved brand docs."""
    context_block = "\n\n---\n\n".join(brand_context)
    return f"""Match the following brand voice and style guidelines:

{context_block}

Apply these guidelines consistently throughout the article. Match the tone,
terminology, and formatting conventions shown above."""

Phase 3: Write the Article (Lessons 1 + 4)

Uses reasoning models for outlining (L4), multi-provider LLM APIs for drafting (L1), and reflection for self-review (L4).

Outliner

# writer/outliner.py
import json
from openai import OpenAI

client = OpenAI()


def generate_outline(topic: str, research_findings: str,
                     style_prompt: str, word_count: int = 1500) -> dict:
    """Use a reasoning model to create a detailed article outline."""
    response = client.chat.completions.create(
        model="o3",
        messages=[{
            "role": "system",
            "content": f"""Create a detailed article outline.

{style_prompt}

Target: ~{word_count} words.

Return JSON:
{{
  "title": "Article title",
  "subtitle": "One-line subtitle",
  "sections": [
    {{
      "heading": "Section heading",
      "key_points": ["point1", "point2"],
      "target_words": 300,
      "needs_image": true/false,
      "image_concept": "what the image should show (if needed)"
    }}
  ],
  "meta_description": "SEO meta description (under 160 chars)"
}}"""
        }, {
            "role": "user",
            "content": f"Topic: {topic}\n\nResearch findings:\n{research_findings}"
        }],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

Section Drafter with Streaming

# writer/drafter.py
from openai import OpenAI

client = OpenAI()


def draft_section(heading: str, key_points: list[str],
                  research_context: str, style_prompt: str,
                  target_words: int = 300, stream: bool = False):
    """Draft a single article section.

    Returns the section text as a string, or a generator of text chunks
    when stream=True. (The streaming path lives in a nested helper: any
    function whose body contains `yield` is a generator, so a top-level
    `yield` would make the non-streaming branch unable to return a string.)
    """
    messages = [{
        "role": "system",
        "content": f"""You are writing one section of an article.

{style_prompt}

Rules:
- Write exactly the section described, nothing more
- Include specific data points and cite sources as [Source Name](URL)
- Target ~{target_words} words
- Use markdown formatting (headers, bold, lists where appropriate)
- Be engaging and informative — not generic"""
    }, {
        "role": "user",
        "content": f"Section: {heading}\n\n"
                   f"Key points to cover:\n"
                   + "\n".join(f"- {p}" for p in key_points)
                   + f"\n\nResearch context:\n{research_context}"
    }]

    if stream:
        def _stream():
            response = client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                temperature=0.5,
                stream=True,
            )
            for chunk in response:
                if chunk.choices[0].delta.content:
                    yield chunk.choices[0].delta.content
        return _stream()

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        temperature=0.5,
    )
    return response.choices[0].message.content


def draft_full_article(outline: dict, research_findings: str,
                       style_prompt: str) -> str:
    """Draft the complete article section by section."""
    sections = []
    sections.append(f"# {outline['title']}\n\n*{outline.get('subtitle', '')}*\n")

    for section in outline["sections"]:
        print(f"  Drafting: {section['heading']}...")
        text = draft_section(
            heading=section["heading"],
            key_points=section["key_points"],
            research_context=research_findings,
            style_prompt=style_prompt,
            target_words=section.get("target_words", 300),
        )
        sections.append(text)

    return "\n\n".join(sections)

Self-Review with Reflection

# writer/reviewer.py
from openai import OpenAI

client = OpenAI()


def review_article(article: str, research_findings: str,
                   style_prompt: str, max_revisions: int = 2) -> str:
    """Use reflection to review and improve the article."""
    current = article

    for i in range(max_revisions):
        critique = client.chat.completions.create(
            model="o3",
            messages=[{
                "role": "system",
                "content": """Review this article for:
1. Factual accuracy — are claims supported by the research?
2. Completeness — are key points covered?
3. Flow — does it read naturally?
4. Citations — are all claims sourced?
5. Engagement — is it interesting?

If the article is good, respond with "APPROVED".
Otherwise, list specific improvements needed."""
            }, {
                "role": "user",
                "content": f"Article:\n{current}\n\nResearch:\n{research_findings[:3000]}"
            }],
        ).choices[0].message.content

        if "APPROVED" in critique.upper():
            print(f"  Article approved after {i} revisions")
            return current

        print(f"  Revision {i + 1}: applying feedback...")
        current = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": f"""Revise this article based on the editorial feedback.
{style_prompt}
Make the requested improvements while keeping the overall structure.
Only return the revised article, no commentary."""
            }, {
                "role": "user",
                "content": f"Article:\n{current}\n\nFeedback:\n{critique}"
            }],
        ).choices[0].message.content

    return current

Phase 4: Enrich with Multi-modal (Lesson 5)

Generate visual assets and audio narration.

Image Generation

# enrichment/image_gen.py
from openai import OpenAI

client = OpenAI()


def enhance_image_prompt(concept: str, article_title: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "system",
            "content": "Create a detailed image generation prompt. Include style, "
                       "composition, lighting, and mood. Under 100 words."
        }, {
            "role": "user",
            "content": f"Article: {article_title}\nImage concept: {concept}"
        }],
        temperature=0.7,
    )
    return response.choices[0].message.content


def generate_article_images(outline: dict) -> list[dict]:
    """Generate images for sections that need them."""
    images = []

    # Cover image
    cover_prompt = enhance_image_prompt(
        f"Cover image for article: {outline['title']}",
        outline["title"],
    )
    cover = client.images.generate(
        model="dall-e-3",
        prompt=cover_prompt,
        size="1792x1024",
        quality="hd",
        n=1,
    )
    images.append({
        "type": "cover",
        "url": cover.data[0].url,
        "prompt": cover.data[0].revised_prompt,
    })

    # Section images
    for section in outline["sections"]:
        if not section.get("needs_image"):
            continue
        prompt = enhance_image_prompt(
            section.get("image_concept", section["heading"]),
            outline["title"],
        )
        result = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            size="1024x1024",
            quality="standard",
            n=1,
        )
        images.append({
            "type": "section",
            "section": section["heading"],
            "url": result.data[0].url,
            "prompt": result.data[0].revised_prompt,
        })

    return images

Audio Narration

# enrichment/audio_gen.py
import re
from openai import OpenAI

client = OpenAI()


def prepare_for_tts(markdown_text: str) -> str:
    """Clean markdown for natural speech."""
    text = re.sub(r'#+ ', '', markdown_text)
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
    text = re.sub(r'\*(.+?)\*', r'\1', text)
    text = re.sub(r'\[(.+?)\]\(.+?\)', r'\1', text)
    text = re.sub(r'^\s*[-*]\s+', '', text, flags=re.MULTILINE)
    text = re.sub(r'\|.+\|', '', text)
    text = re.sub(r'\n{3,}', '\n\n', text)
    return text.strip()


def generate_narration(article: str, voice: str = "nova") -> dict:
    """Generate audio narration of the article."""
    clean_text = prepare_for_tts(article)

    if len(clean_text) > 4096:
        clean_text = clean_text[:4096]

    response = client.audio.speech.create(
        model="tts-1-hd",
        voice=voice,
        input=clean_text,
    )

    output_path = "/tmp/article_narration.mp3"
    response.stream_to_file(output_path)

    return {
        "file_path": output_path,
        "voice": voice,
        "text_length": len(clean_text),
    }
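The cleanup step is pure regex, so its effect is easy to check in isolation. The same substitutions as `prepare_for_tts`, applied to a small markdown snippet:

```python
import re


def strip_markdown(md: str) -> str:
    # Same substitutions as prepare_for_tts above
    text = re.sub(r'#+ ', '', md)                      # headings
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)       # bold
    text = re.sub(r'\*(.+?)\*', r'\1', text)           # italics
    text = re.sub(r'\[(.+?)\]\(.+?\)', r'\1', text)    # links -> anchor text
    text = re.sub(r'^\s*[-*]\s+', '', text, flags=re.MULTILINE)  # bullets
    text = re.sub(r'\|.+\|', '', text)                 # table rows
    text = re.sub(r'\n{3,}', '\n\n', text)             # excess blank lines
    return text.strip()


spoken = strip_markdown("## Intro\n\nRead **this** and [the docs](https://example.com).")
# -> "Intro\n\nRead this and the docs."
```

Headings, emphasis markers, and link URLs all disappear; only the text a narrator should actually read survives.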

Phase 5: Evaluate (Lessons 2 + 4)

Quality gates before publishing.

Fact Checker

# evaluation/fact_checker.py
import json
from openai import OpenAI

client = OpenAI()


def extract_claims(article: str) -> list[str]:
    """Extract verifiable factual claims from the article."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Extract all specific factual claims from this article. "
                       f"Only include claims that state a fact (number, date, name, comparison). "
                       f"Return as a JSON array of strings.\n\n{article[:4000]}"
        }],
        response_format={"type": "json_object"},
        temperature=0.0,
    )
    data = json.loads(response.choices[0].message.content)
    return data.get("claims", [])


def verify_claims(claims: list[str], research_findings: str) -> dict:
    """Verify each claim against research sources."""
    results = []
    for claim in claims[:10]:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "user",
                "content": f"Is this claim supported by the research context? "
                           f"Respond with JSON: {{\"supported\": true/false, \"evidence\": \"...\"}}\n\n"
                           f"Claim: {claim}\n\nResearch:\n{research_findings[:3000]}"
            }],
            response_format={"type": "json_object"},
            temperature=0.0,
        )
        data = json.loads(response.choices[0].message.content)
        results.append({"claim": claim, **data})

    supported = sum(1 for r in results if r.get("supported"))
    return {
        "claims": results,
        "total": len(results),
        "supported": supported,
        "score": supported / len(results) if results else 0,
    }

Quality Scorer

# evaluation/quality_scorer.py
from openai import OpenAI

client = OpenAI()


def score_article_quality(article: str, research: str, brand_context: str) -> dict:
    """Comprehensive quality evaluation."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """Score this article on each dimension (0-10).
Return JSON:
{
  "factual_accuracy": {"score": N, "notes": "..."},
  "completeness": {"score": N, "notes": "..."},
  "readability": {"score": N, "notes": "..."},
  "brand_voice_match": {"score": N, "notes": "..."},
  "citation_quality": {"score": N, "notes": "..."},
  "originality": {"score": N, "notes": "..."},
  "overall": N,
  "publish_ready": true/false,
  "improvements_needed": ["..."]
}"""
        }, {
            "role": "user",
            "content": f"Article:\n{article[:4000]}\n\n"
                       f"Research basis:\n{research[:2000]}\n\n"
                       f"Brand guidelines:\n{brand_context[:1000]}"
        }],
        response_format={"type": "json_object"},
        temperature=0.2,
    )
    import json
    return json.loads(response.choices[0].message.content)
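The fact-check and quality scores can feed a simple publish gate before anything goes live. A sketch with illustrative thresholds (the cutoff values are assumptions, not from the course):

```python
def publish_gate(fact_check: dict, quality: dict,
                 min_fact_score: float = 0.8, min_overall: int = 7) -> bool:
    """Return True only when every quality gate passes."""
    return (
        fact_check.get("score", 0) >= min_fact_score        # enough claims verified
        and quality.get("overall", 0) >= min_overall        # LLM judge score
        and bool(quality.get("publish_ready", False))       # judge's explicit verdict
    )


ok = publish_gate({"score": 0.9}, {"overall": 8, "publish_ready": True})
blocked = publish_gate({"score": 0.5}, {"overall": 8, "publish_ready": True})
```

Anything that fails the gate can loop back through the reviewer instead of publishing.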

Cost Tracker

# evaluation/cost_tracker.py
from dataclasses import dataclass, field
from typing import ClassVar


@dataclass
class CostTracker:
    entries: list = field(default_factory=list)

    # USD per 1M tokens, or per unit where noted
    pricing: ClassVar[dict] = {
        "o3": {"input": 10.00, "output": 40.00},
        "gpt-4o": {"input": 2.50, "output": 10.00},
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},
        "text-embedding-3-small": {"input": 0.02, "output": 0.0},
        "dall-e-3-hd": {"per_image": 0.08},
        "dall-e-3-standard": {"per_image": 0.04},
        "tts-1-hd": {"per_1m_chars": 30.00},
        "tavily-search": {"per_search": 0.01},
    }

    def add(self, phase: str, model: str, input_tokens: int = 0,
            output_tokens: int = 0, units: int = 0):
        p = self.pricing.get(model, {})
        if "per_image" in p:
            cost = p["per_image"] * units
        elif "per_1m_chars" in p:
            cost = p["per_1m_chars"] * units / 1_000_000
        elif "per_search" in p:
            cost = p["per_search"] * units
        else:
            cost = (input_tokens * p.get("input", 0) +
                    output_tokens * p.get("output", 0)) / 1_000_000
        self.entries.append({
            "phase": phase,
            "model": model,
            "cost": cost,
        })

    def summary(self) -> dict:
        total = sum(e["cost"] for e in self.entries)
        by_phase = {}
        for e in self.entries:
            by_phase[e["phase"]] = by_phase.get(e["phase"], 0) + e["cost"]
        return {"total_cost": total, "by_phase": by_phase, "entries": self.entries}

The Complete Pipeline

# pipeline.py
import asyncio
import json
import time
from research.planner import plan_research
from research.searcher import research_all
from grounding.style_matcher import retrieve_brand_context, build_style_prompt
from writer.outliner import generate_outline
from writer.drafter import draft_full_article
from writer.reviewer import review_article
from enrichment.image_gen import generate_article_images
from enrichment.audio_gen import generate_narration
from evaluation.fact_checker import extract_claims, verify_claims
from evaluation.quality_scorer import score_article_quality
from evaluation.cost_tracker import CostTracker


class ContentPipeline:
    def __init__(self):
        self.cost = CostTracker()

    async def run(self, topic: str) -> dict:
        start = time.time()
        print(f"Starting content pipeline: {topic}\n")

        # ── Phase 1: Research ──
        print("Phase 1: Research...")
        plan = plan_research(topic)
        print(f"  Plan: {plan['title_suggestion']}")
        print(f"  Subtasks: {len(plan['subtasks'])}")

        findings = await research_all(plan["subtasks"])
        research_text = "\n\n".join(f["findings"] for f in findings)
        all_sources = []
        for f in findings:
            all_sources.extend(f.get("sources", []))
        sources = list(dict.fromkeys(s for s in all_sources if s))
        print(f"  Found {len(sources)} unique sources")

        # ── Phase 2: Ground ──
        print("\nPhase 2: Grounding with brand context...")
        brand_docs = retrieve_brand_context(topic)
        style_prompt = build_style_prompt(brand_docs)
        print(f"  Retrieved {len(brand_docs)} brand context chunks")

        # ── Phase 3: Write ──
        print("\nPhase 3: Writing article...")
        outline = generate_outline(
            topic, research_text, style_prompt,
            plan.get("target_word_count", 1500),
        )
        print(f"  Outline: {len(outline['sections'])} sections")

        draft = draft_full_article(outline, research_text, style_prompt)
        print(f"  Draft complete: {len(draft.split())} words")

        article = review_article(draft, research_text, style_prompt)
        print(f"  Review complete: {len(article.split())} words")

        # ── Phase 4: Enrich ──
        print("\nPhase 4: Generating multi-modal assets...")
        images = generate_article_images(outline)
        print(f"  Generated {len(images)} images")

        audio = generate_narration(article)
        print(f"  Generated audio narration")

        # ── Phase 5: Evaluate ──
        print("\nPhase 5: Evaluating quality...")
        claims_result = verify_claims(
            extract_claims(article), research_text
        )
        print(f"  Fact check: {claims_result['supported']}/{claims_result['total']} claims verified")

        quality = score_article_quality(
            article, research_text, "\n".join(brand_docs[:3])
        )
        print(f"  Quality score: {quality.get('overall', 'N/A')}/10")
        print(f"  Publish ready: {quality.get('publish_ready', False)}")

        elapsed = time.time() - start

        # ── Compile output ──
        output = {
            "article": {
                "title": outline["title"],
                "subtitle": outline.get("subtitle", ""),
                "content": article,
                "meta_description": outline.get("meta_description", ""),
                "word_count": len(article.split()),
            },
            "assets": {
                "images": images,
                "audio": audio,
            },
            "sources": sources,
            "evaluation": {
                "fact_check": claims_result,
                "quality": quality,
            },
            "metadata": {
                "topic": topic,
                "generation_time_seconds": elapsed,
                "cost": self.cost.summary(),
            },
        }

        print(f"\nDone in {elapsed:.1f}s")
        return output


async def main():
    pipeline = ContentPipeline()
    result = await pipeline.run(
        "The rise of AI coding assistants: how they work, "
        "which ones to use, and what it means for developers"
    )

    print("\n" + "=" * 70)
    print(f"Title: {result['article']['title']}")
    print(f"Words: {result['article']['word_count']}")
    print(f"Images: {len(result['assets']['images'])}")
    print(f"Sources: {len(result['sources'])}")
    print(f"Quality: {result['evaluation']['quality'].get('overall', 'N/A')}/10")
    print(f"Time: {result['metadata']['generation_time_seconds']:.0f}s")

    with open("output.json", "w") as f:
        json.dump(result, f, indent=2, default=str)
    print("\nFull output saved to output.json")


if __name__ == "__main__":
    asyncio.run(main())

FastAPI Backend

# server.py
import asyncio
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from pipeline import ContentPipeline

app = FastAPI(title="AI Content Pipeline")
app.add_middleware(
    CORSMiddleware, allow_origins=["*"],
    allow_methods=["*"], allow_headers=["*"],
)


class GenerateRequest(BaseModel):
    topic: str
    depth: str = "comprehensive"


@app.post("/generate")
async def generate(req: GenerateRequest):
    pipeline = ContentPipeline()
    result = await pipeline.run(req.topic)
    return result


@app.get("/health")
def health():
    return {"status": "ok"}

# Run the API
uvicorn server:app --reload --port 8000

# Test it
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"topic": "How reasoning models are changing AI engineering"}'
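Note that `/generate` holds the HTTP connection open for the full multi-minute run. A common fix is a poll-able job registry; a framework-agnostic sketch (in FastAPI you would kick the run off with `BackgroundTasks` and expose a hypothetical `/jobs/{id}` endpoint on top of this):

```python
import uuid

# In-memory registry; swap for Redis/a database in production
jobs: dict[str, dict] = {}


def start_job(topic: str) -> str:
    """Register a pipeline run and return an ID the client can poll."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "topic": topic, "result": None}
    return job_id


def complete_job(job_id: str, result: dict) -> None:
    jobs[job_id].update(status="done", result=result)


def job_status(job_id: str) -> dict:
    return jobs.get(job_id, {"status": "unknown"})


jid = start_job("demo topic")
complete_job(jid, {"title": "Demo"})
```

The client POSTs a topic, gets back a job ID immediately, and polls until `status` flips to `done`.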

Cost Analysis

Cost Breakdown Per Article

| Phase | Model(s) | Est. Cost |
|---|---|---|
| Research planning | o3 | $0.25–0.35 |
| Web search (4 subtasks) | Tavily + gpt-4o | $0.06–0.10 |
| RAG retrieval | text-embedding-3-small | $0.01–0.02 |
| Outline generation | o3 | $0.10–0.20 |
| Section drafting | gpt-4o | $0.15–0.30 |
| Self-review | o3 + gpt-4o | $0.15–0.35 |
| Image generation (3×) | DALL·E 3 | $0.12–0.24 |
| Audio narration | tts-1-hd | $0.03–0.06 |
| Evaluation | gpt-4o + gpt-4o-mini | $0.05–0.10 |
| **Total** | | **$0.92–1.72** |

That’s a comprehensive, researched, illustrated, narrated article for about a dollar.
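To sanity-check any row, token-based costs are just tokens × per-million price ÷ 1,000,000. For example, one drafting call of roughly 10k input and 3k output tokens on gpt-4o, using the prices from the CostTracker above (the token counts are illustrative):

```python
# gpt-4o pricing from the CostTracker: $2.50/M input, $10.00/M output
input_tokens, output_tokens = 10_000, 3_000

drafting_cost = (input_tokens * 2.50 + output_tokens * 10.00) / 1_000_000
# $0.025 + $0.030 = $0.055 for one call; drafting several sections
# per article lands in the $0.15–0.30 row above
```

The reasoning-model rows (o3) dominate the total, which is why the pipeline reserves o3 for planning and review rather than bulk generation.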


Extension Ideas

Once your pipeline works, consider these enhancements:

| Extension | What It Adds | Complexity |
|---|---|---|
| SEO optimization | Keyword analysis, meta tags, internal linking | Low |
| Social media variants | Twitter thread, LinkedIn post, Instagram caption from same research | Medium |
| Scheduled publishing | Cron-based pipeline that generates content on a schedule | Medium |
| Human-in-the-loop | Pause after outline for human approval before drafting | Medium |
| Multi-language | Translate the finished article into multiple languages | Low |
| A/B headline testing | Generate 5 headlines, test with audience | Medium |
| Video summary | Generate a 30-second video summary with Runway/Sora | High |
| Real-time topics | Monitor news feeds, auto-generate articles on trending topics | High |
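For the human-in-the-loop extension, one lightweight pattern is an approval callback between outlining and drafting. The names here are illustrative; `approve_fn` might wrap `input()` in a CLI or a review webhook in production:

```python
from typing import Callable


def outline_approval_gate(outline: dict,
                          approve_fn: Callable[[dict], bool]) -> dict:
    """Pause the pipeline until a human approves (or rejects) the outline."""
    if not approve_fn(outline):
        raise RuntimeError(f"Outline rejected: {outline.get('title', '?')}")
    return outline


# Auto-approve stand-in for a real reviewer:
approved = outline_approval_gate({"title": "Demo"}, lambda o: True)
```

Pausing at the outline stage is the cheapest checkpoint: a human review costs seconds there, versus re-drafting an entire article after the fact.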

Key Takeaways

  1. Real AI engineering is orchestration — the hard part isn’t calling one API, it’s wiring together research, grounding, writing, enrichment, and evaluation into a reliable pipeline
  2. Reasoning models are for planning and review, standard models are for generation — mixing model tiers optimizes cost and quality
  3. RAG grounds your content in reality — without retrieval, you’re just generating plausible-sounding text with no source of truth
  4. Evaluation isn’t optional — fact-checking, quality scoring, and plagiarism detection are what separate a toy from a product
  5. Cost tracking matters — at ~$1 per article, the economics work; but without tracking, costs can spiral with reasoning model calls
  6. Multi-modal enrichment multiplies value — an article with images and audio narration is far more useful than plain text alone
  7. Async and parallel execution are essential — sub-agents and image generation should always run concurrently
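As a concrete instance of takeaway 7: the blocking image and audio calls in Phase 4 run sequentially in `pipeline.py`, but they are independent and can run in parallel worker threads via `asyncio.to_thread`. A sketch with stand-in generators:

```python
import asyncio


async def enrich_concurrently(generate_images, generate_audio):
    """Run two blocking generation calls in parallel worker threads."""
    images, audio = await asyncio.gather(
        asyncio.to_thread(generate_images),
        asyncio.to_thread(generate_audio),
    )
    return images, audio


# Stand-ins for generate_article_images / generate_narration:
imgs, aud = asyncio.run(enrich_concurrently(
    lambda: ["cover.png"],
    lambda: {"file_path": "narration.mp3"},
))
```

With real API calls, the enrichment phase then takes as long as its slowest call instead of the sum of both.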

Course Complete

You’ve gone from zero to building production-grade AI applications:

  • Lesson 1: LLM APIs, streaming, multi-provider
  • Lesson 2: RAG, vector search, prompt engineering, evaluation
  • Lesson 3: Agents, tool calling, ReACT, MCP
  • Lesson 4: Reasoning models, inference-time scaling, deep research
  • Lesson 5: Multi-modal generation, diffusion models, orchestration
  • Lesson 6: Everything combined into a real product

The AI engineering field moves fast, but the fundamentals you’ve learned — API integration, retrieval, agents, reasoning, multi-modal, and evaluation — are the building blocks that every new technique builds on.

Now go build something.