
Workflow

Understanding XunLong's workflow helps you optimize your content generation process and troubleshoot issues effectively.

High-Level Workflow

XunLong runs every request through an eight-stage pipeline: requirement analysis, task decomposition, parallel search, content integration, intelligent generation, quality review, format conversion, and export.

Detailed Workflow Stages

Stage 1: Requirement Analysis 🔍

Goal: Understand user intent and prepare for execution.

Activities:

  1. Parse command-line arguments
  2. Identify content type (report/fiction/ppt)
  3. Extract parameters (style, depth, chapters, etc.)
  4. Validate configuration
  5. Initialize project structure

Output:

```python
{
    "content_type": "report",
    "query": "AI Industry Trends 2025",
    "style": "business",
    "depth": "comprehensive",
    "project_id": "20251005_143022"
}
```
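A minimal sketch of this stage, assuming an argparse-style CLI (the flag names and defaults here are illustrative, not XunLong's actual interface):

```python
import argparse
from datetime import datetime

def analyze_requirement(argv=None):
    # Parse CLI arguments and validate against known choices
    parser = argparse.ArgumentParser()
    parser.add_argument("query")
    parser.add_argument("--type", choices=["report", "fiction", "ppt"],
                        default="report")
    parser.add_argument("--style", default="business")
    parser.add_argument("--depth", choices=["overview", "standard", "comprehensive"],
                        default="standard")
    args = parser.parse_args(argv)
    return {
        "content_type": args.type,
        "query": args.query,
        "style": args.style,
        "depth": args.depth,
        # Project IDs in this guide follow a timestamp pattern
        "project_id": datetime.now().strftime("%Y%m%d_%H%M%S"),
    }
```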

Duration: ~1 second

Stage 2: Task Decomposition 📋

Goal: Break down complex request into manageable subtasks.

Process:

The orchestrator maps the validated configuration from Stage 1 onto type-specific subtasks; a dispatch sketch follows the examples below.

Example Decomposition:

::: tabs

== Report

```json
{
  "tasks": [
    {
      "type": "search",
      "queries": [
        "AI industry market size 2025",
        "Latest AI technology trends",
        "AI applications by sector"
      ]
    },
    {
      "type": "outline",
      "sections": [
        "Executive Summary",
        "Market Overview",
        "Technology Trends",
        "Applications",
        "Future Outlook"
      ]
    }
  ]
}
```

== Fiction

```json
{
  "tasks": [
    {
      "type": "plot",
      "structure": "three_act"
    },
    {
      "type": "characters",
      "count": 5,
      "depth": "detailed"
    },
    {
      "type": "chapters",
      "count": 12,
      "style": "mystery"
    }
  ]
}
```

== PPT

```json
{
  "tasks": [
    {
      "type": "outline",
      "slides": 15
    },
    {
      "type": "design",
      "style": "business",
      "colors": "professional"
    },
    {
      "type": "content",
      "distribution": "balanced"
    }
  ]
}
```

:::
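The dispatch sketch promised above — illustrative only, since XunLong's real planner is LLM-driven; the task shapes mirror the JSON examples:

```python
def decompose(config: dict) -> dict:
    """Map a validated request onto type-specific subtasks."""
    content_type = config["content_type"]
    if content_type == "report":
        return {"tasks": [
            {"type": "search", "queries": [f"{config['query']} market size",
                                           f"{config['query']} latest trends"]},
            {"type": "outline", "sections": ["Executive Summary", "Market Overview",
                                             "Technology Trends", "Applications",
                                             "Future Outlook"]},
        ]}
    if content_type == "fiction":
        return {"tasks": [
            {"type": "plot", "structure": "three_act"},
            {"type": "characters", "count": 5, "depth": "detailed"},
            {"type": "chapters", "count": 12, "style": config["style"]},
        ]}
    if content_type == "ppt":
        return {"tasks": [
            {"type": "outline", "slides": 15},
            {"type": "design", "style": config["style"], "colors": "professional"},
            {"type": "content", "distribution": "balanced"},
        ]}
    raise ValueError(f"unsupported content type: {content_type}")
```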

Duration: ~2-3 seconds

Stage 3: Parallel Search 🌐

Goal: Gather relevant information from the web.

Execution Flow:

All queries from the decomposition stage are executed concurrently; each query goes through the same five-step pipeline (a fan-out sketch follows the list below).

Search Process:

  1. Execute Query - Use Perplexity or Playwright
  2. Extract Content - Parse HTML and extract text
  3. Filter Results - Remove irrelevant content
  4. Summarize - Create concise summaries
  5. Cite Sources - Track URLs and dates
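A minimal sketch of the concurrent fan-out, assuming an async search client with a `search(query)` coroutine (the client interface is hypothetical):

```python
import asyncio

async def parallel_search(queries, client):
    # Fan out all queries at once; gather preserves input order
    results = await asyncio.gather(
        *(client.search(q) for q in queries),
        return_exceptions=True,  # one failed query shouldn't sink the batch
    )
    # Keep successful results; failures are handled by the retry logic
    # described under Error Recovery below
    return [r for r in results if not isinstance(r, Exception)]
```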

Duration: ~5-15 seconds (depends on query count)

Stage 4: Content Integration 📦

Goal: Organize and structure collected information.

Activities:

  1. Deduplicate information
  2. Categorize by topic
  3. Rank by relevance
  4. Create knowledge graph
  5. Prepare context for generation

Data Structure:

```python
{
    "topic": "AI Industry Trends",
    "sources": [
        {
            "url": "https://...",
            "date": "2025-09-15",
            "relevance": 0.95,
            "summary": "...",
            "key_points": [...]
        }
    ],
    "knowledge_graph": {
        "entities": [...],
        "relationships": [...]
    }
}
```
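The first three activities reduce to a few lines over source records like those above; a sketch (the dedup key and relevance cutoff are illustrative choices):

```python
def integrate(sources, min_relevance=0.5):
    # Deduplicate by URL, keeping the first occurrence
    seen, unique = set(), []
    for source in sources:
        if source["url"] not in seen:
            seen.add(source["url"])
            unique.append(source)
    # Drop weak matches, then rank by relevance, best first
    relevant = [s for s in unique if s["relevance"] >= min_relevance]
    return sorted(relevant, key=lambda s: s["relevance"], reverse=True)
```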

Duration: ~2-3 seconds

Stage 5: Intelligent Generation ✨

Goal: Create high-quality content based on gathered materials.

Generation Strategy:

Sections are generated sequentially so that each prompt carries the search context plus everything written so far, keeping later sections consistent with earlier ones.

LLM Usage Pattern:

```python
# Sequential generation with context: each section's prompt includes
# the search results plus all previously generated sections
generated_content = []
for section in outline:
    prompt = build_prompt(
        section=section,
        context=search_results,
        previous_sections=generated_content,
    )
    content = await llm.generate(prompt)
    generated_content.append(content)
```

Quality Metrics:

  • Coherence score
  • Factual accuracy
  • Citation coverage
  • Readability index

Duration: ~30-120 seconds (varies by length)

Stage 6: Quality Review ✅

Goal: Ensure content meets quality standards.

Review Checklist:

| Category | Checks | Auto-Fix |
|----------|--------|----------|
| Structure | Heading hierarchy, section balance | ✅ |
| Content | Fact-checking, completeness | ⚠️ Manual |
| Style | Tone consistency, grammar | ✅ |
| Format | Markdown syntax, citations | ✅ |
| Logic | Flow, transitions, coherence | ⚠️ Manual |

Review Process:

  1. Automated Checks - Run syntax and format validators
  2. LLM Review - Quality assessment by reviewer agent
  3. Score Calculation - Compute overall quality score
  4. Feedback Generation - Create improvement suggestions
  5. Revision (if needed) - Regenerate low-quality sections

Threshold:

  • Score ≥ 0.85: Approve
  • Score < 0.85: Request revision
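The thresholding logic, sketched below; the reviewer and generator interfaces are assumed, 0.85 matches the threshold above, and the revision cap is an illustrative choice:

```python
MAX_REVISIONS = 2  # assumed cap; the pipeline docs don't specify one

async def review_and_revise(sections, reviewer, generator):
    for _ in range(MAX_REVISIONS):
        scores = [await reviewer.score(s) for s in sections]
        if min(scores) >= 0.85:
            break  # approve: every section meets the bar
        # Regenerate only the sections that scored below threshold
        sections = [s if score >= 0.85 else await generator.revise(s)
                    for s, score in zip(sections, scores)]
    return sections
```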

Duration: ~5-10 seconds

Stage 7: Format Conversion 🔄

Goal: Transform Markdown to desired output formats.

Conversion Pipeline:

The pipeline renders the Markdown source to HTML, applies a styled template, and then produces each additional requested format (PDF, DOCX) in turn.

HTML Conversion:

```python
import markdown2

html = markdown2.markdown(
    content,
    extras=[
        'tables',
        'fenced-code-blocks',
        'header-ids',
        'toc',
        'metadata'
    ]
)
```

Template Application:

```html
<!DOCTYPE html>
<html>
<head>
    <style>
        /* Custom styles for professional look */
    </style>
</head>
<body>
    {{ content }}
</body>
</html>
```
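Applying the template can be done with any engine; here is a sketch using Jinja2 (the template string matches the skeleton above, and `html` is the output of the markdown2 step):

```python
from jinja2 import Template  # any template engine works; Jinja2 shown

TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><style>/* custom styles */</style></head>
<body>{{ content }}</body>
</html>""")

# Render the converted Markdown into the page skeleton
page = TEMPLATE.render(content=html)
```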

Duration: ~2-5 seconds per format

Stage 8: Export Output 📤

Goal: Save and deliver final output to user.

Export Structure:

```
storage/20251005_143022_AI_Industry_Trends/
├── metadata.json
├── intermediate/
│   ├── 01_task_decomposition.json
│   ├── 02_search_results.json
│   └── 03_content_outline.json
├── reports/
│   ├── FINAL_REPORT.md
│   └── FINAL_REPORT.html
└── exports/
    ├── report.pdf
    └── report.docx
```
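Creating this layout is straightforward with pathlib; a sketch (the function name and metadata contents are illustrative):

```python
import json
from pathlib import Path

def init_project_dirs(project_id: str, title: str, metadata: dict) -> Path:
    # e.g. storage/20251005_143022_AI_Industry_Trends/
    root = Path("storage") / f"{project_id}_{title.replace(' ', '_')}"
    for sub in ("intermediate", "reports", "exports"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    (root / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return root
```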

Completion Summary:

```
✅ Generation Complete!

📊 Statistics:
   - Duration: 2m 34s
   - Searches: 8 queries
   - Content: 5,432 words
   - Citations: 15 sources
   - Quality Score: 0.92

📁 Output Files:
   - Markdown: storage/.../FINAL_REPORT.md
   - HTML: storage/.../FINAL_REPORT.html
   - PDF: storage/.../exports/report.pdf

🔗 Project ID: 20251005_143022
```

Duration: ~1-2 seconds

Total Timeline

| Stage | Duration | Percentage |
|-------|----------|------------|
| Requirement Analysis | 1s | ~1% |
| Task Decomposition | 3s | ~3% |
| Parallel Search | 10s | ~8% |
| Content Integration | 3s | ~3% |
| Intelligent Generation | 90s | ~75% |
| Quality Review | 8s | ~7% |
| Format Conversion | 4s | ~3% |
| Export Output | 1s | <1% |
| **Total** | ~120s | 100% |

Performance Tip

Most time is spent on content generation (LLM calls). Use faster models like GPT-3.5 for drafts, then iterate with GPT-4 for quality.

Iteration Workflow

When you request modifications:

Key Differences from Initial Generation:

  • ✅ Preserves context
  • ✅ Targets specific sections
  • ✅ Maintains version history
  • ✅ Faster execution (~30-60s)

Monitoring & Observability

Real-Time Progress

During execution, XunLong displays:

```
🐉 XunLong - Content Generation

📋 Task Analysis
✅ Identified: Report Generation
✅ Style: Business
✅ Depth: Comprehensive

🔍 Searching Information
[████████████████████████████████] 100% (8/8 queries)
✅ Found 42 relevant sources

✨ Generating Content
[████████████████░░░░░░░░░░░░░░░░] 60% (3/5 sections)
⏱️  Elapsed: 1m 24s | Estimated: 48s remaining
```

LangFuse Tracking

All operations are logged to LangFuse:

Trace View:

```
Trace: report_generation_20251005_143022
├─ analyze_requirement (1.2s)
├─ decompose_tasks (2.8s)
├─ parallel_search (9.4s)
│  ├─ search_query_1 (3.2s)
│  ├─ search_query_2 (4.1s)
│  └─ search_query_3 (5.3s)
├─ generate_content (89.3s)
│  ├─ section_1 (15.2s, 1,234 tokens)
│  ├─ section_2 (18.7s, 1,567 tokens)
│  ├─ section_3 (22.1s, 1,892 tokens)
│  ├─ section_4 (16.8s, 1,345 tokens)
│  └─ section_5 (14.2s, 1,123 tokens)
├─ review_quality (7.9s)
└─ export_formats (3.4s)
```

Error Recovery

Automatic Retry

Failed steps are automatically retried:

```python
# Retry with exponential backoff, sketched with the tenacity library
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential())
async def execute_search(query):
    try:
        return await search_engine.search(query)
    except NetworkError:
        # Re-raising lets the decorator retry with backoff
        raise
```

Checkpointing

Progress is checkpointed at each stage:

```
intermediate/
├── 01_task_decomposition.json  ✅ Saved
├── 02_search_results.json      ✅ Saved
├── 03_content_outline.json     ✅ Saved
└── 04_generated_sections.json  ⏸️  In Progress
```
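A checkpoint is just the stage's output serialized before the next stage starts; roughly (the function and numbering scheme are illustrative):

```python
import json
from pathlib import Path

def checkpoint(intermediate_dir: Path, stage_num: int, name: str, payload: dict):
    # e.g. 02_search_results.json; resume replays from the last saved file
    path = intermediate_dir / f"{stage_num:02d}_{name}.json"
    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2))
```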

If generation fails, you can resume:

```bash
python xunlong.py resume 20251005_143022
```

Best Practices

1. Optimize Search Queries

  • Be specific in your requests
  • Include relevant keywords
  • Specify time ranges if needed

2. Choose Appropriate Depth

  • Overview: Quick 5-minute generation
  • Standard: Balanced 10-minute generation
  • Comprehensive: Detailed 20-minute generation

3. Monitor Token Usage

```bash
# Check token consumption
python xunlong.py stats 20251005_143022
```

4. Iterate Incrementally

  • Start with standard depth
  • Review output
  • Request targeted improvements

