A FastAPI-based automation system for creating, enriching, and managing journal articles. It integrates Google Gemini, Groq’s LLaMA, and the CORE API to deliver a seamless pipeline — from metadata input to fully structured, AI-generated journal outputs.
Table of Contents
- Overview
- Key Concepts
- Features
- Tech Stack
- Architecture & Data Flow
- Data Models & Metadata
- Getting Started
- Usage & API Endpoints
- AI Integrations
- Output Formats
- Deployment & DevOps
- Testing & Quality
- Security & Privacy
- Troubleshooting
- Roadmap
- Contributing
- Licensing
- Acknowledgments
- Releases
Overview
- This repository hosts a compact, production-ready automation system for journals. It uses a compound AI approach to orchestrate multiple AI agents, each handling a slice of the pipeline: from metadata intake and enrichment to writing, formatting, and publication readiness.
- The system is built on FastAPI, a lean Python web framework that provides clean RESTful APIs, validation driven by type hints, and high performance.
- The AI stack blends Google Gemini for semantic understanding, Groq’s LLaMA for generation, and the CORE API for retrieval-augmented generation and knowledge sourcing.
- Outputs can be structured JSON, LaTeX-ready content, Markdown manuscripts, and publication-ready PDFs. The design supports RAG workflows to pull relevant literature and ensure factual grounding.
Key Concepts
- Compound AI System: A coordinated set of AI agents that work in sequence or parallel to complete a task. Each agent focuses on a subtask — metadata normalization, outline generation, content drafting, citation management, and QA.
- Agent Orchestration: A lightweight director pattern that assigns tasks to specialized agents (see the sketch after this list). It ensures data provenance, versioning, and auditable changes through the pipeline.
- Retrieval-Augmented Generation (RAG): The system pulls external sources (via CORE) to ground generated content in credible literature. This reduces hallucinations and improves accuracy.
- Metadata-First Pipeline: The process starts with metadata input (title, keywords, authors, abstracts) and grows into a fully structured article with sections, figures, tables, references, and appendices.
- Output Flexibility: The pipeline can emit JSON schemas, LaTeX sources, Markdown manuscripts, and export-ready PDFs, enabling seamless integration with publishers and repositories.
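The director pattern described above is framework-agnostic. As a minimal sketch (illustrative names only, not the repository's actual classes), an orchestrator can run agents in order while recording provenance:

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical illustration of the director pattern; the repository's
# real agent classes may differ.
PipelineState = dict  # accumulates metadata, outline, draft, citations, ...

@dataclass
class Director:
    agents: list[tuple[str, Callable[[PipelineState], PipelineState]]]
    trace: list[str] = field(default_factory=list)

    def run(self, state: PipelineState) -> PipelineState:
        for name, agent in self.agents:
            state = agent(state)     # each agent handles one subtask
            self.trace.append(name)  # provenance: which agent ran, in order
        return state

def normalize_metadata(state: PipelineState) -> PipelineState:
    state["title"] = state["title"].strip().title()
    return state

director = Director(agents=[("metadata", normalize_metadata)])
print(director.run({"title": "  journal automation  "}))
```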
Features
- End-to-end journal automation: intake, enrichment, drafting, formatting, and export.
- AI collaboration: Gemini for understanding, Groq LLaMA for generation, CORE for sources.
- Structured outputs: JSON payloads, LaTeX sources, Markdown manuscripts, and PDF exports.
- LaTeX-ready pipelines: ready-to-compile templates and bibliography management.
- RAG-enabled literature integration: pull citations and snippets from trusted sources.
- Fine-grained control: adjustable generation temperature, max tokens, and agent budgets.
- Versioned outputs: deterministic seeds and content versioning for reproducibility.
- Authentication-friendly API: token-based access suitable for internal tooling.
- Extensible architecture: plug in new AI agents or data sources with minimal changes.
- Local and cloud-ready deployments: Docker, Kubernetes, or raw Python environments.
Tech Stack
- Language: Python
- Web framework: FastAPI
- AI components: Google Gemini, Groq LLaMA
- Knowledge source: CORE API
- Data formats: JSON, Markdown, LaTeX, PDF
- Deployment: Docker, optional Kubernetes
- Testing: pytest and HTTPX
- Documentation: Markdown, with ready LaTeX templates
Architecture & Data Flow
- Ingest: Users submit metadata and preferences via a REST API.
- Validate: Pydantic request models validate input, normalize terms, and resolve author details.
- Enrich: The system calls Gemini to interpret intent, extract key questions, and shape the article outline.
- Generate: LLaMA-based generators draft sections, figures, and captions under policy controls.
- Source: CORE fetches relevant literature; results are ranked and injected into the draft through a RAG loop.
- Assemble: The pipeline composes a complete manuscript, with bibliographic references, figures, and appendices.
- Output: The final artifacts are serialized to JSON, LaTeX, Markdown, and PDF.
- Audit: Every step is logged, with data lineage preserved for reproducibility and compliance.
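In condensed form, the stages chain together roughly like this; stub functions stand in for the real Gemini, LLaMA, and CORE calls, and the names are illustrative:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

# Stub stages mirroring the flow above; the real implementations call
# Gemini (enrich), Groq LLaMA (generate), and CORE (source).
def validate(article: dict) -> dict:
    assert article["title"], "title is required"
    return article

def enrich(article: dict) -> dict:
    article["outline"] = ["Introduction", "Methods", "Results"]
    return article

def generate(article: dict) -> dict:
    article["sections"] = {h: f"Draft text for {h}." for h in article["outline"]}
    return article

def run_pipeline(article: dict) -> str:
    for stage in (validate, enrich, generate):
        article = stage(article)
        log.info("stage %s complete", stage.__name__)  # audit: one log entry per step
    return json.dumps(article, indent=2)               # output: serialized JSON artifact

print(run_pipeline({"title": "A Novel Approach to Journal Automation"}))
```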
Data Models & Metadata
- Article: id, title, authors, affiliations, abstract, keywords, metadata version, publication date, language.
- Section: heading level, title, content, citations.
- Figures & Tables: captions, references, image/table payloads, sources.
- Citations: id, DOI, URL, retrieved_at, source_agency.
- Enrichment: notes, reliability_score, confidence, used_sources.
- AIRun: id, model, seed, prompts, tokens_used, duration, status.
- Provenance: input_version, output_version, data_pipeline_trace.
- Users & Permissions: roles, tokens, access scopes.
Getting Started: Prerequisites
- Python 3.11+ installed locally or in your environment.
- Virtual environment support (venv) or conda.
- Basic familiarity with REST and command line.
Installation
- Create a working directory and set up a virtual environment.
- Install dependencies with pip install -r requirements.txt.
- Prepare a .env file with API keys and endpoints for Gemini, LLaMA (Groq), and CORE.
Environment and Secrets
- GOOGLE_GEMINI_API_KEY: your Gemini API key
- GROQ_LLAMA_ENDPOINT: address of the Groq LLaMA service
- CORE_API_KEY: your CORE API key
- DATABASE_URL: connection string for metadata and artifacts storage
- APP_SECRET_KEY: used for session and token signing
An example .env snippet:

```
GOOGLE_GEMINI_API_KEY=your_key_here
GROQ_LLAMA_ENDPOINT=http://localhost:8001
CORE_API_KEY=your_core_key
DATABASE_URL=sqlite:///journal.db
APP_SECRET_KEY=change-me
```
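One way to load these values at startup is with pydantic-settings; this is a sketch under the assumption that the project uses Pydantic v2 (the package itself is not confirmed by this README):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Field names map to the environment variables above (case-insensitive).
    google_gemini_api_key: str
    groq_llama_endpoint: str = "http://localhost:8001"
    core_api_key: str
    database_url: str = "sqlite:///journal.db"
    app_secret_key: str

    model_config = SettingsConfigDict(env_file=".env")

settings = Settings()  # reads .env; raises a validation error on missing keys
```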
Project Structure
```
app/
  main.py                 # FastAPI application and router setup
  models.py               # Pydantic models for request/response schemas
  schemas.py              # database schemas and serialization helpers
  services/
    ai.py                 # orchestrates Gemini and LLaMA calls
    core.py               # pulls literature from CORE
    drafting.py           # content assembly and formatting
    export.py             # JSON, LaTeX, Markdown, and PDF generation
  routers/
    articles.py           # CRUD and generation endpoints
  templates/              # latex/ and markdown/ templates for outputs
tests/
  test_endpoints.py       # API contract tests
  test_ai_flows.py        # AI agent flow tests
docs/
  architecture.md
  api_reference.md
  usage_examples.md
scripts/
  install-fastapi-journal.sh  # installer script to bootstrap dependencies (see Releases)
docker/
  Dockerfile
  docker-compose.yml
```
Getting Started: Quick Start Guide
- Clone the repo: git clone https://github.com/aligoraya202/FastAPI-Journal-Automation-with-Generative-And-AI-Compound-AI-System.git
- Create a virtual environment and activate it: python -m venv venv && source venv/bin/activate
- Install dependencies: pip install -r requirements.txt
- Copy and fill the environment file: cp .env.example .env
- Start the API server: uvicorn app.main:app --reload --port 8000
- Access the API docs at: http://localhost:8000/docs
Usage & API Endpoints
- POST /articles/generate
  - Creates a new article draft from provided metadata
  - Request body includes: title, authors, abstract, keywords, metadata_version, preferred_language, target_journal
  - Response includes: article_id, status, generation_summary
- GET /articles/{article_id}
  - Retrieves a complete article draft with sections and citations
- PATCH /articles/{article_id}
  - Updates metadata or enrichment notes
- POST /articles/{article_id}/export
  - Triggers export to JSON, LaTeX, Markdown, or PDF
- POST /ai/feedback
  - Submits feedback to improve future generations
- DELETE /articles/{article_id}
  - Removes drafts and associated artifacts
Examples
- Submitting metadata
  - Title: "A Novel Approach to Journal Automation with AI"
  - Authors: ["Alex Doe", "Jamie Lee"]
  - Abstract: "This study explores automated generation of journal content using a compound AI system."
  - Keywords: ["journal automation", "AI", "NLP", "RAG", "LaTeX"]
- Generating content
  - Call POST /articles/generate with the above payload
  - The system routes tasks to Gemini for intent capture, LLaMA for drafting, and CORE for sources
- Exporting to LaTeX
  - POST /articles/{article_id}/export with format=latex
  - Returns a .tex file ready for compilation
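Put together as a script, the flow above might look like the following httpx sketch. The endpoint paths come from this README; the base URL and the way the export format is passed (query parameter vs. JSON body) are assumptions:

```python
import httpx

BASE_URL = "http://localhost:8000"  # assumes a local dev server

payload = {
    "title": "A Novel Approach to Journal Automation with AI",
    "authors": ["Alex Doe", "Jamie Lee"],
    "abstract": ("This study explores automated generation of journal "
                 "content using a compound AI system."),
    "keywords": ["journal automation", "AI", "NLP", "RAG", "LaTeX"],
}

with httpx.Client(base_url=BASE_URL, timeout=60.0) as client:
    # Kick off generation: Gemini plans, LLaMA drafts, CORE grounds.
    created = client.post("/articles/generate", json=payload)
    created.raise_for_status()
    article_id = created.json()["article_id"]

    # Export the finished draft as LaTeX (format passed as a query
    # parameter here; the real API may expect it in the body instead).
    exported = client.post(f"/articles/{article_id}/export",
                           params={"format": "latex"})
    exported.raise_for_status()
    print(exported.text)
```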
AI Integrations
- Google Gemini
  - Role: Semantic interpretation, task planning, and prompt shaping
  - Usage: Used in the early stage to understand intent and generate an outline
- Groq LLaMA
  - Role: High-quality text generation for sections, captions, and summaries
  - Usage: Generates draft content with controlled prompts and seeds to ensure determinism
- CORE API
  - Role: Literature retrieval and grounding
  - Usage: Pulls citations, snippets, and context to ground the manuscript
- RAG Loop
  - Pulls sources, ranks them by credibility, and injects them into the draft
  - Ensures citations align with the article content
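The ranking-and-injection step can be pictured with a toy sketch; the credibility scoring here is purely illustrative, not the system's actual heuristic:

```python
from dataclasses import dataclass

@dataclass
class SourceDoc:
    title: str
    snippet: str
    citation_count: int  # stand-in for a credibility signal

def rank_sources(query: str, docs: list[SourceDoc], top_k: int = 3) -> list[SourceDoc]:
    # Toy credibility ranking: keyword overlap weighted by citation count.
    terms = set(query.lower().split())
    def score(d: SourceDoc) -> float:
        overlap = len(terms & set(d.snippet.lower().split()))
        return overlap * (1 + d.citation_count / 100)
    return sorted(docs, key=score, reverse=True)[:top_k]

def inject(draft: str, sources: list[SourceDoc]) -> str:
    # Append grounded references; a real loop would interleave inline citations.
    refs = "\n".join(f"[{i + 1}] {s.title}" for i, s in enumerate(sources))
    return f"{draft}\n\nReferences:\n{refs}"

docs = [SourceDoc("RAG in practice", "retrieval augmented generation for journals", 120),
        SourceDoc("Unrelated work", "fluid dynamics of pipes", 300)]
print(inject("Draft paragraph on journal automation.",
             rank_sources("retrieval augmented journal generation", docs)))
```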
Output Formats
- JSON: Structured payload with all sections, citations, and metadata
- Markdown: Human-readable manuscript with headers, lists, and embedded citations
- LaTeX: Clean, publisher-ready source with bibliographies
- PDF: Ready-to-publish PDF generated from LaTeX or Markdown pipelines
LaTeX & Publication Readiness
- The LaTeX templates are designed to be drop-in replacements for common journal formats
- Bibliography management relies on BibTeX or Biber, depending on the template
- Figures and tables are positioned to meet typical submission guidelines
- The system can embed DOI links and cross-references to related works
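To make the export step concrete, here is a sketch of filling a LaTeX template from Python using string.Template; the repository's actual templating mechanism is not specified in this README and may well be something else, such as Jinja2:

```python
from string import Template

# $title, $authors, and $body are placeholders in an illustrative template.
LATEX_TEMPLATE = Template(r"""\documentclass{article}
\title{$title}
\author{$authors}
\begin{document}
\maketitle
$body
\end{document}
""")

tex = LATEX_TEMPLATE.substitute(
    title="A Novel Approach to Journal Automation with AI",
    authors="Alex Doe \\and Jamie Lee",
    body="Draft content goes here.",
)
print(tex)  # ready to hand to pdflatex or xelatex
```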
Deployment & DevOps
- Local development: Simple Python environment with uvicorn
- Docker: A ready-to-run container image with all dependencies
- Kubernetes: Small deployment manifests that scale the AI worker pool
- CI: GitHub Actions workflows for linting, tests, and build of artifacts
- Observability: Basic logging, plus optional OpenTelemetry instrumentation for traces
Security & Privacy
- The API uses token-based authentication for protected routes
- Data is stored with versioning to enable audit trails
- All AI prompts are designed to minimize the leakage of sensitive content
- Access to external services (Gemini, LLaMA, CORE) is controlled by API keys and endpoints
Testing & Quality
- End-to-end tests simulate article creation from metadata to export
- Unit tests cover AI orchestration, formatting, and export logic
- Mock services stand in for Gemini, LLaMA, and CORE during tests
- Continuous integration runs tests on each pull request
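A mocked-service test might look like the following self-contained sketch using FastAPI's TestClient; the repository's real tests would import app.main and override its AI/CORE dependencies the same way:

```python
from fastapi import Depends, FastAPI
from fastapi.testclient import TestClient

# Self-contained stand-in app; names are illustrative.
app = FastAPI()

def ai_service() -> dict:
    return {"draft": "real model output"}  # would call Gemini/LLaMA in production

@app.post("/articles/generate")
def generate(ai: dict = Depends(ai_service)) -> dict:
    return {"article_id": "a1", "status": "drafted", "summary": ai["draft"]}

def test_generate_uses_mocked_ai():
    # Swap the AI dependency for a deterministic mock, as described above.
    app.dependency_overrides[ai_service] = lambda: {"draft": "mocked output"}
    client = TestClient(app)
    response = client.post("/articles/generate")
    assert response.status_code == 200
    assert response.json()["summary"] == "mocked output"
```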
Troubleshooting
- Common issues: missing environment variables, incorrect API keys, or network restrictions
- Check the logs for AIRun entries to see seeds, prompts, and durations
- If CORE returns empty results, verify CORE_API_KEY and network access
- If LaTeX export fails, confirm LaTeX toolchain (pdflatex/xelatex) availability
- For Docker users, ensure proper volumes and environment variable passthrough
Roadmap
- Expand multi-language support for metadata and drafts
- Add templates for more publishers and journal formats
- Improve citation disambiguation and DOI resolution
- Enhance accessibility features and screen reader support
- Introduce a plugin system to integrate other AI providers
Contributing
- We welcome contributions. Start by forking the repository and creating a feature branch.
- Follow the code style used in the project: type hints, clear variable names, and small functions.
- Add or update tests for any new feature.
- Run tests locally before opening a pull request.
- Documentation updates are welcome for new endpoints, workflows, or templates.
Licensing
- This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
- Special thanks to the teams behind Google Gemini, Groq LLaMA, and the CORE API for enabling research and practical tooling in journal automation.
- Gratitude to the open-source community for sharing best practices in FastAPI, AI workflows, and document preparation.
Releases
- For the latest installers and release artifacts, visit the Releases page: https://github.com/aligoraya202/FastAPI-Journal-Automation-with-Generative-And-AI-Compound-AI-System/releases
- From the Releases page, download install-fastapi-journal.sh and execute it to bootstrap the system. The installer sets up dependencies, config, and sample data to help you get started quickly.
Notes on Usage and Best Practices
- Start with a small metadata package to validate the flow. Gradually scale to longer manuscripts.
- Use the enrichment notes to guide the AI in tone, structure, and audience.
- Keep generation seeds consistent when you want repeatable outputs across environments.
- Leverage CORE to fetch diverse sources while maintaining a credible citation strategy.
- Rotate API keys regularly, and monitor usage to catch drift in AI behavior early.
Glossary
- AI Agent: A module that performs a dedicated task in the pipeline.
- RAG: Retrieval-Augmented Generation, a method that augments generation with external sources.
- LaTeX: A high-quality typesetting system commonly used for scientific documents.
- CORE: An API providing access to a wide literature corpus for grounding content.
- Gemini: Google’s AI model used for semantic understanding and task framing.
- LLaMA: Meta’s family of open language models, served in this project through Groq’s inference platform.
Techniques and Best Practices
- Keep prompts concise but expressive. The more precise you are, the better the result.
- Define a clear outline before drafting long sections.
- Predefine a citation strategy to ensure consistency across the manuscript.
- Maintain a clean data model that tracks provenance and edits.
Visuals and Design
- Hero image showcases the fusion of AI and scholarly writing.
- Architecture diagram highlights the flow between ingestion, AI processing, RAG integration, and output.
- The design favors readability, with a calm color palette and accessible typography.
Appendix: API Reference Highlights
- POST /articles/generate
  - Input: title, authors, abstract, keywords, metadata_version, language, target_journal
  - Output: article_id, status, summary
- GET /articles/{article_id}
  - Output: full article with sections, figures, references
- POST /articles/{article_id}/export
  - Input: format (json|latex|markdown|pdf)
  - Output: export artifact or download link
- POST /ai/feedback
  - Input: article_id, feedback_type, notes
  - Output: acknowledgement and next steps
Appendix: Environment Variables (Expanded)
- GOOGLE_GEMINI_API_KEY: Gemini service key
- GROQ_LLAMA_ENDPOINT: LLaMA server address
- CORE_API_KEY: CORE service key
- DATABASE_URL: local or cloud database connection
- APP_SECRET_KEY: session and token signing
- LOG_LEVEL: debug|info|warning|error
- ENABLE_HEDGING: true|false, controls safety hedges in generation
- OUTPUT_FORMATS: json, latex, markdown, pdf (comma-separated)
Appendix: Sample Commands
- Create a virtual environment and activate it
- Install dependencies from requirements
- Copy example environment file and fill in keys
- Start the API with uvicorn
- Use the API docs for endpoint exploration
Final Note
- This repository embraces a practical, step-by-step approach to automating journal article production with a compound AI system. It blends semantic understanding, high-quality generation, and credible literature grounding to produce publication-ready manuscripts.