FormFill

Advanced PDF form filling powered by Claude AI with multiple filling approaches for different types of forms.

🚀 Features

FormFill provides three powerful approaches for PDF form filling:

1. 🎯 Coordinate-Based Filling (⭐ Recommended)

Ultra-fast filling (seconds vs minutes)
Real-time visual preview with confidence indicators
Perfect for non-fillable/scanned PDFs
Direct PDF text writing at precise coordinates
Interactive adjustment mode for fine-tuning
Color-coded confidence levels (Green/Yellow/Red)

2. 🖱️ Computer Use Method

Simulates human form filling with mouse/keyboard
Works with interactive/fillable PDF forms
Based on Claude Computer Use API
Good for complex form interactions

3. 📋 Smart Data Integration

JSON data formatting and extraction
Support for complex nested data structures
Automatic field matching and mapping
CSV export capabilities

🔧 Installation

Prerequisites

macOS:

brew install poppler

Ubuntu/Debian:

sudo apt-get install poppler-utils

Install Dependencies

pip install -r requirements.txt
pip install opencv-python PyMuPDF

Authentication

Set your Anthropic API key:

export ANTHROPIC_API_KEY=sk-ant-api-***

📖 Usage

Quick Start - Coordinate-Based Filling

1. Preview Mode (Recommended First Step):

python3 coordinate_fill_cli.py examples/sample_form.pdf -j examples/sample_answers.json --preview-only

2. Full Filling Process:

python3 coordinate_fill_cli.py examples/sample_form.pdf -j examples/sample_answers.json

3. Interactive Mode for Fine-Tuning:

python3 coordinate_fill_cli.py examples/sample_form.pdf -j examples/sample_answers.json --interactive

Data Formatting

Extract and format data from JSON:

# View formatted data
python3 format_data.py examples/sample_answers.json

# Generate string format
python3 format_data.py examples/sample_answers.json --string

# Create CSV file
python3 format_data.py examples/sample_answers.json --csv output.csv

Computer Use Method

For fillable PDF forms:

python3 -m formfill.cli examples/sample_form.pdf -s "$(python3 format_data.py examples/sample_answers.json --string)"

Complete Workflow

One command for everything:

python3 fill_form_complete.py examples/sample_form.pdf examples/sample_answers.json

📁 File Structure

formfill/
├── formfill/                    # Core FormFill package
│   ├── coordinate_filler.py     # Coordinate-based filling engine
│   ├── cli.py                   # Computer use CLI
│   ├── fill.py                  # Form filling logic
│   └── ...
├── coordinate_fill_cli.py       # Coordinate-based CLI interface
├── format_data.py               # Data formatting utilities
├── fill_form_complete.py        # Complete workflow script
└── examples/
    ├── sample_form.pdf          # Example form
    └── sample_answers.json      # Example data

🎮 Advanced Usage

Interactive Coordinate Adjustment

In interactive mode, you can fine-tune text placement:

# Available commands in interactive mode:
adjust <index> <x> <y>     # Move text placement
remove <index>             # Remove a placement
add <name> <text> <x> <y>  # Add new placement
preview                    # Generate preview image
done                       # Finish and save

Custom Output Paths

python3 coordinate_fill_cli.py form.pdf -j data.json \
  --output custom_output.pdf \
  --preview custom_preview.png

Data Input Options

JSON with nested structure:

{
  "session_id": "test_123",
  "collected_answers": {
    "FIRST_NAME": "Jane",
    "LAST_NAME": "Doe",
    "EMAIL_ADDRESS": "jane.doe@email.com"
  }
}

Direct string input:

python3 coordinate_fill_cli.py form.pdf -s "Name: John Doe, Email: john@example.com, Phone: 555-0123"

🎯 When to Use Each Method

Method	Best For	Speed	Accuracy
Coordinate-Based	Non-fillable PDFs, scanned forms	⚡ Very Fast	🎯 High
Computer Use	Interactive PDFs, complex forms	🐌 Slower	🎯 High
Complete Workflow	Mixed form types, automation	⚡ Fast	🎯 High

🔍 How It Works

Coordinate-Based Approach

PDF Analysis: Claude examines the PDF form image
Field Detection: AI identifies form fields and optimal text positions
Coordinate Mapping: Precise (x, y) coordinates are determined
Preview Generation: Visual overlay shows planned text placement
Direct Writing: Text is written directly to PDF at coordinates

Computer Use Approach

Screen Simulation: PDF page becomes a "virtual screen"
AI Navigation: Claude determines mouse clicks and typing actions
Action Simulation: Clicks and typing are simulated programmatically
Text Placement: Form fields are filled through UI interaction

🔧 Configuration

Environment Variables

# Required
export ANTHROPIC_API_KEY="your-api-key-here"

# Optional - for computer use method
export HEIGHT="768"
export WIDTH="1024"

Confidence Levels

The coordinate-based system provides confidence scores:

🟢 Green (0.8+): High confidence, likely accurate
🟡 Yellow (0.5-0.8): Medium confidence, may need adjustment
🔴 Red (<0.5): Low confidence, review recommended

📊 Performance Comparison

Metric	Coordinate-Based	Computer Use
Speed	~10 seconds	~2-5 minutes
Accuracy	90-95%	95-98%
PDF Type	Any (fillable/non-fillable)	Fillable preferred
Preview	✅ Real-time	❌ No preview
Adjustment	✅ Interactive	❌ Manual retry

🐛 Troubleshooting

Common Issues

1. API Key Error:

Error: ANTHROPIC_API_KEY environment variable not set

Solution: Export your API key: export ANTHROPIC_API_KEY="your-key"

2. PDF Conversion Error:

Error: Unable to get page count. Is poppler installed?

Solution: Install poppler: brew install poppler (macOS) or sudo apt-get install poppler-utils (Ubuntu)

3. Low Confidence Placements:

Use --interactive mode to manually adjust coordinates
Check preview image before finalizing
Try different field descriptions in your data

4. Missing Dependencies:

pip install opencv-python PyMuPDF anthropic pdf2image Pillow

🤝 Contributing

FormFill bridges the gap between "LLM can only call APIs" and "LLM has complete VM control" by providing targeted PDF manipulation capabilities. Contributions are welcome!

📄 Examples

The examples/ directory contains:

sample_form.pdf - Example medical intake form
sample_answers.json - Comprehensive form data
Generated previews and filled PDFs

🎉 Success Stories

FormFill has been successfully tested on:

✅ Medical intake forms
✅ Photography permit applications
✅ Insurance claim forms
✅ Government applications
✅ Employment paperwork

Ready to automate your PDF form filling workflow! 🚀

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
examples		examples
formfill		formfill
.gitignore		.gitignore
README.md		README.md
coordinate_fill_cli.py		coordinate_fill_cli.py
fill_form_complete.py		fill_form_complete.py
format_data.py		format_data.py
photographyapp_filled.pdf		photographyapp_filled.pdf
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
sample_data.csv		sample_data.csv
sample_form_filled.pdf		sample_form_filled.pdf
sample_form_preview.png		sample_form_preview.png
setup.py		setup.py

akashjss/formfill

Folders and files

Latest commit

History

Repository files navigation