CommitBoy is a robust automation pipeline that clones a backend Python repository, uses an LLM to optimize code functions, validates correctness and performance via an MCP server, and pushes clean, optimized patches back to Git, all orchestrated with Apache Airflow.
- 🔁 Fully automated pipeline using Airflow
- 🤖 LLM-driven code optimization (GPT-4o, LLaMA, etc.)
- ✅ MCP-backed validation: unit tests + performance
- 🔍 Function-level AST parsing and patching
- 🧪 No regression allowed: validated with test & perf thresholds
- 🚀 Auto-pushes optimized code as Git PR or branch
- 🗂 Full artifact logging: prompt, metrics, diffs
```
Input: Python backend repo (with tests)
            │
            ▼
       [Airflow DAG]
        ├── Clone repo
        ├── Parse functions via AST
        ├── Optimize using LLM (GPT-4o / LLaMA)
        ├── Run tests + MCP benchmarks
        ├── Compare against baseline
        └── Push clean Git patch / PR
```
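The "Parse functions via AST" step can be sketched with the stdlib `ast` module (Python 3.8+). The helper name `extract_functions` is illustrative, not necessarily the project's actual API:

```python
import ast
from typing import Dict


def extract_functions(source: str) -> Dict[str, str]:
    """Map each top-level function name to its exact source segment."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

Isolating functions this way lets the pipeline send each one to the LLM individually and patch only the segments that pass validation.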
```
llm_optimizer/
├── dags/          # Airflow DAG
├── llm_engine/    # LLM optimizer logic
├── mcp/           # Performance + correctness validator
├── vcs/           # Git patch + PR automation
├── utils/         # Logging, AST tools
├── configs/       # Threshold configs (YAML)
├── repo/          # Cloned backend repo
└── .env           # Secrets (e.g. OpenAI key)
```
- Python 3.8+
- Airflow 2.6+
- Python packages: `openai`, `gitpython`, `pytest`, etc. (plus the stdlib `tracemalloc` module)
- Docker (optional, for sandboxed testing)
- LLM access:
  - OpenAI API key, or
  - Local HuggingFace model
Install Python dependencies:

```bash
pip install -r requirements.txt
```

Each optimized function is validated with:
- ✅ Unit test pass/fail
- ⏱ Latency in ms
- 📉 Memory (peak RSS)
- 🔁 Performance diff vs baseline
Thresholds are configured in `configs/thresholds.yaml`.
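The latency and peak-memory numbers above can be collected with nothing beyond the stdlib. A minimal sketch (the `benchmark` helper is illustrative, not the project's actual MCP interface):

```python
import time
import tracemalloc


def benchmark(fn, *args, **kwargs):
    """Run fn once and return (result, latency_ms, peak_bytes)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    latency_ms = (time.perf_counter() - t0) * 1000
    _, peak = tracemalloc.get_traced_memory()  # peak allocation since start()
    tracemalloc.stop()
    return result, latency_ms, peak
```

Comparing these numbers for the baseline and the optimized function gives the performance diff that is checked against the thresholds.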
Add a `.env` file:

```bash
OPENAI_KEY=your_openai_key
```

Optional: add performance thresholds in `configs/thresholds.yaml`.
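A threshold file might look like the following; the keys here are a hypothetical schema, so adjust them to whatever the validator actually reads:

```yaml
# Hypothetical schema — match the keys to mcp/mcp_runner.py's config.
max_latency_regression_pct: 5    # reject if latency grows more than 5%
max_memory_regression_pct: 10    # reject if peak memory grows more than 10%
require_all_tests_pass: true     # any failing unit test rejects the patch
```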
- Start Airflow:

  ```bash
  airflow standalone
  ```

- Trigger the DAG:

  ```bash
  airflow dags trigger robust_code_optimizer
  ```

Outputs:

- Git PR or new branch with only validated optimized functions
- Saved artifacts:
  - Prompt → Response logs
  - Code diff patches
  - Performance benchmarks
- Slack/email alerts (optional)
| Component | File | Purpose |
|---|---|---|
| DAG Logic | `dags/dag_code_optimizer.py` | Orchestration |
| LLM Interface | `llm_engine/code_optimizer.py` | Prompt/response + patch |
| MCP Validator | `mcp/mcp_runner.py` | Benchmarking + test runner |
| Git Handler | `vcs/git_patch.py` | Commit & PR logic |
| AST Parsing | `utils/function_extractor.py` | Function isolation |
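The Git handler's job can be sketched by shelling out to the `git` CLI. This is a sketch of what `vcs/git_patch.py` might do, not its actual interface; the function and argument names are illustrative, and pushing assumes the clone already has an `origin` remote with credentials:

```python
import subprocess


def push_patch(repo_dir, branch, files, message):
    """Commit the given files on a new branch and push it to origin."""
    def git(*args):
        subprocess.run(["git", "-C", repo_dir, *args], check=True)

    git("checkout", "-b", branch)        # isolate the patch on its own branch
    git("add", *files)                   # stage only the validated files
    git("commit", "-m", message)
    git("push", "-u", "origin", branch)  # opening the PR is a separate API call
```

Opening the actual pull request would then go through the hosting provider's API (e.g. GitHub), which is outside this sketch.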
- AST-based function filtering
- Multi-model LLM support
- Performance-aware validation
- Git-integrated PR auto-push
- Semantic diff engine (optional)
- Multi-repo batch scheduler
- Web dashboard for run metrics
MIT License
To streamline building and managing the entire stack, we provide a Makefile with common targets:
| Command | Description |
|---|---|
| `make` | Build all Docker images (alias for `build-images`) |
| `make build-images` | Build all service images |
| `make build-webhook` | Build webhook-listener image |
| `make build-mcp` | Build mcp-engine image |
| `make build-llm` | Build llm-service image |
| `make build-git` | Build git-cmd image |
| `make build-airflow` | Build custom Airflow image |
| `make up` | `docker-compose up -d` (start all services) |
| `make down` | `docker-compose down` (stop & remove containers) |
| `make logs` | `docker-compose logs -f` (stream container logs) |
| `make clean` | Remove all images built by this Makefile |
```bash
# 1️⃣ Build everything:
make

# 2️⃣ Bring up the stack in detached mode:
make up

# 3️⃣ Tail logs (press Ctrl+C to exit):
make logs

# 4️⃣ Tear down all services:
make down
```