
🎨 ConCreat

ConCreat App Screenshot

A cutting-edge web application for creating and managing multimedia content with integrated AI-powered capabilities

Version License: MIT Next.js React TypeScript Python

📖 Documentation • 🐛 Report Bug • ✨ Request Feature


📋 Table of Contents

  • ✨ Features
  • 🤖 ComfyUI Integration
  • ⚡ Quick Start
  • 📂 Project Architecture
  • 🔌 API Reference
  • 🛠️ Development Workflow
  • 🤝 Contributing
  • 📄 License
  • 🛠️ Technology Stack
  • 🙏 Acknowledgments

✨ Features

🎯 Core Capabilities

| Feature | Description |
|---------|-------------|
| 💬 AI Chat Assistant | Interactive conversations powered by Ollama local models |
| 🎨 AI Image Generation | Create stunning images using ComfyUI workflows with GGUF models |
| 🎬 Video Processing | Advanced video generation and processing with HunyuanVideo integration |
| 🗣️ Text-to-Speech | Natural voice synthesis powered by Chatterbox technology |
| 🎡 Voice Cloning | Personalize audio content with advanced voice replication |
| 📱 Modern UI/UX | Sleek interface built with Next.js, React, and Tailwind CSS |
| 🔗 ComfyUI Integration | Node-based AI workflows for professional content creation |

🚀 Advanced Features

  • GGUF Model Support - Optimized quantized models for efficient inference
  • Customizable Pipelines - Node-based workflows that can be modified and extended
  • High-Quality Output - Support for various formats with configurable quality settings
  • Prompt Engineering - Advanced text encoding with positive/negative prompts
  • Real-time Processing - Fast generation with optimized model architectures
  • Cross-platform Compatibility - Works on Windows, macOS, and Linux

🤖 ComfyUI Integration

ConCreat leverages ComfyUI, a powerful node-based interface for AI image and video generation, to provide advanced creative tools.

📋 Included Workflows

🎨 Image Generation Workflow (workflows/imagemaker.json)

Advanced image creation using GGUF models like z_image_turbo, with support for LoRA models and custom prompts

🎬 Video Generation Workflow (workflows/video.json)

Video creation using HunyuanVideo15 models for high-quality video generation from images

🔧 Workflow Features

  • ⚡ GGUF Model Support: Optimized quantized models for efficient inference
  • 🔄 Customizable Pipelines: Node-based workflows that can be modified and extended
  • 🎯 High-Quality Output: Support for various image and video formats with configurable quality settings
  • 💬 Prompt Engineering: Advanced text encoding with positive and negative prompts
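The included workflows are plain JSON files, so they can be customized programmatically before queuing. The sketch below assumes the file is saved in ComfyUI's API ("prompt") format, where each node is keyed by ID with a `class_type` and `inputs`; node ID `"6"` is hypothetical, and the real IDs in workflows/imagemaker.json will differ:

```python
import json

def set_prompt(workflow: dict, node_id: str, text: str) -> dict:
    """Overwrite the `text` input of a CLIPTextEncode node in an API-format workflow."""
    node = workflow[node_id]
    if node.get("class_type") != "CLIPTextEncode":
        raise ValueError(f"node {node_id} is not a CLIPTextEncode node")
    node["inputs"]["text"] = text
    return workflow

# Stand-in for json.load(open("workflows/imagemaker.json")) — real node IDs differ:
wf = {"6": {"class_type": "CLIPTextEncode",
            "inputs": {"text": "", "clip": ["4", 0]}}}
set_prompt(wf, "6", "a watercolor fox, soft lighting")

# ComfyUI's HTTP API accepts this wrapped as {"prompt": wf}
request_body = json.dumps({"prompt": wf})
```

POSTing `request_body` to `http://127.0.0.1:8188/prompt` queues the job on a locally running ComfyUI instance.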

📥 ComfyUI Model Requirements

To use the included ComfyUI workflows, you'll need to download the following models and place them in your ComfyUI models directory:

🎨 Required Models for Image Generation

| Model | Filename | Download Link | Location |
|-------|----------|---------------|----------|
| VAE | ae.safetensors | Hugging Face | ComfyUI/models/vae/ |
| CLIP | Qwen3-4B-UD-Q6_K_XL.gguf | Hugging Face | ComfyUI/models/clip/ |
| Unet | z_image_turbo-Q8_0.gguf | Hugging Face | ComfyUI/models/unet/ |

🎬 Required Models for Video Generation

| Model | Filename | Download Link | Location |
|-------|----------|---------------|----------|
| Checkpoint | HV15-Rapid-AIO-v1.safetensors | Hugging Face | ComfyUI/models/checkpoints/ |
| CLIP Vision | sigclip_vision_patch14_384.safetensors | Hugging Face | ComfyUI/models/clip_vision/ |
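After downloading, you can sanity-check that every file from the tables above landed in the right directory. This small helper is not part of the repository, just a convenience sketch:

```python
from pathlib import Path

# Expected model locations relative to the ComfyUI install (from the tables above)
REQUIRED_MODELS = {
    "models/vae/ae.safetensors",
    "models/clip/Qwen3-4B-UD-Q6_K_XL.gguf",
    "models/unet/z_image_turbo-Q8_0.gguf",
    "models/checkpoints/HV15-Rapid-AIO-v1.safetensors",
    "models/clip_vision/sigclip_vision_patch14_384.safetensors",
}

def missing_models(comfyui_root: str) -> list[str]:
    """Return the required model files that are not yet present under comfyui_root."""
    root = Path(comfyui_root)
    return sorted(p for p in REQUIRED_MODELS if not (root / p).exists())
```

Running `missing_models("~/ComfyUI")` (expanded to your actual install path) before launching saves a failed generation later.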

πŸ› οΈ ComfyUI Installation

# Clone ComfyUI repository
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Install dependencies
pip install -r requirements.txt

# Download required custom nodes
# ComfyUI-GGUF: https://github.com/city96/ComfyUI-GGUF
# rgthree-comfy: https://github.com/rgthree/rgthree-comfy

πŸ’‘ Note: Model file sizes can be large (several GB). Ensure you have sufficient disk space and a stable internet connection for downloads.


⚡ Quick Start

📋 System Requirements

  • Node.js: Version 18 or higher
  • Python: Version 3.11 or above
  • Git: For version control
  • Ollama: Local AI model server (latest version recommended)
  • Storage: At least 10GB free space for models

🚀 Automated Installation

  1. Clone the Repository

     ```bash
     git clone https://github.com/kliewerdaniel/ConCreat.git
     cd ConCreat
     ```

  2. Execute Setup Script

     ```bash
     npm run setup
     ```

     This command installs all dependencies and creates the Python virtual environment automatically.

  3. Launch Development Server

     ```bash
     npm run dev
     ```

  4. Access the Application

     Open your browser and navigate to http://localhost:3000


🔧 Manual Setup

For those preferring step-by-step installation:

📦 Frontend Dependencies

```bash
npm install
```

🐍 Python Environment Setup

```bash
python3 -m venv venv
source venv/bin/activate  # Use `venv\Scripts\activate` on Windows
```

📚 Python Dependencies

```bash
pip install -r requirements.txt
```

🤖 Ollama Setup

ConCreat uses Ollama for local AI chat functionality. Install and set it up:

```bash
# Install Ollama (follow instructions for your OS at https://ollama.ai)
# For macOS/Linux:
curl -fsSL https://ollama.ai/install.sh | sh

# For Windows: Download from https://ollama.ai/download

# Pull recommended models
ollama pull gemma      # Default chat model
ollama pull llama2     # Alternative model
ollama pull mistral    # Additional model option

# Start the Ollama service (runs in the background)
ollama serve
```

💡 Note: Ollama listens on localhost:11434 by default. The application automatically detects whether Ollama is running and falls back to mock responses if it is not.
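The detect-and-fall-back pattern described above can be sketched in Python using Ollama's documented HTTP endpoints (`/api/tags` to probe, `/api/generate` to chat). The app's real detection logic lives in the Next.js route handlers, so treat this as an illustration only:

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def ollama_available(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if an Ollama server responds at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def chat(prompt: str, base_url: str = OLLAMA_URL) -> str:
    """Ask Ollama for a reply, falling back to a canned mock response."""
    if not ollama_available(base_url):
        return f"[mock] echo: {prompt}"
    body = json.dumps({"model": "gemma", "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(f"{base_url}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `ollama serve` running, `chat("Hello!")` returns a model reply; without it, the mock string comes back instead of an error.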

▶️ Application Launch

```bash
npm run dev
```

📂 Project Architecture

```
ConCreat/
├── 📁 src/app/
│   ├── 📁 api/
│   │   ├── 📁 chat/          # 💬 Chat system endpoints
│   │   ├── 📁 generate/      # 🎨 Content creation APIs
│   │   ├── 📁 images/        # 🖼️ Image manipulation APIs
│   │   ├── 📁 tts/           # 🗣️ Text-to-speech conversion
│   │   ├── 📁 videos/        # 🎬 Video processing endpoints
│   │   └── 📁 voices/        # 🎡 Voice management system
│   ├── 🎨 globals.css        # Global stylesheet
│   ├── 📱 layout.tsx         # Application layout component
│   └── 🏠 page.tsx           # Main page component
├── 🌐 public/                # Static resources
├── 🔧 workflows/             # Workflow configuration files
├── 🐍 tts_service.py         # Python TTS service implementation
├── ⚙️ setup.sh               # Automated setup script
├── 📋 requirements.txt       # Python package requirements
├── 📦 package.json           # Node.js project configuration
└── 📖 README.md              # Project documentation
```

📝 Note: The chatterbox/ directory containing TTS models is generated during setup and not part of the repository.


🔌 API Reference

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/chat | GET/POST | 💬 Interactive chat functionality |
| /api/generate | POST | 🎨 AI content generation services |
| /api/images | GET/POST | 🖼️ Image processing and management |
| /api/tts | POST | 🗣️ Text-to-speech conversion endpoint |
| /api/videos | GET/POST | 🎬 Video content operations |
| /api/voices | GET/POST | 🎡 Voice cloning and management |
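As an illustration, the text-to-speech endpoint takes a JSON POST. The field names below (`"text"`, `"voice"`) are assumptions for the sketch; check the route handler in src/app/api/tts/ for the actual schema:

```python
import json

def build_tts_request(text: str, voice: str = "default") -> tuple[str, bytes]:
    """Build the URL and JSON body for a POST to the /api/tts endpoint.

    Field names here are illustrative, not confirmed against the handler.
    """
    url = "http://localhost:3000/api/tts"
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return url, payload

url, body = build_tts_request("Hello from ConCreat")
```

Sending `body` to `url` with a `Content-Type: application/json` header (via `fetch`, `curl`, or `urllib`) exercises the endpoint while the dev server is running.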

πŸ” Environment Configuration

Create a .env.local file in the project root:

# Optional: Hugging Face authentication token for model access
HF_TOKEN=your_huggingface_token_here
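Next.js loads .env.local automatically, but the Python side (e.g. tts_service.py) does not get that for free. A minimal dotenv-style loader, shown as a sketch (the repository may handle this differently, e.g. via python-dotenv):

```python
from pathlib import Path

def load_env_local(path: str = ".env.local") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a dotenv-style file.

    Comments and blank lines are ignored; a minimal stand-in for a
    full dotenv loader.
    """
    env: dict[str, str] = {}
    p = Path(path)
    if not p.exists():
        return env
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```

For example, `load_env_local().get("HF_TOKEN")` yields the token (or `None` when the file or key is absent).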

πŸ› οΈ Development Workflow

πŸƒβ€β™‚οΈ Available Commands

Command Description
npm run dev πŸš€ Start development server
npm run build πŸ”¨ Create production build
npm run start ▢️ Start production server
npm run lint πŸ” Run ESLint code quality checks
npm run setup βš™οΈ Complete environment setup

✨ Quality Assurance

The project maintains high code standards with:

  • πŸ” ESLint: JavaScript/TypeScript code quality enforcement
  • πŸ“ TypeScript: Enhanced type safety and developer experience
  • 🎨 Tailwind CSS: Consistent and responsive styling

🤝 Contributing

We ❀️ contributions! Please follow these steps:

πŸ“ How to Contribute

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch: git checkout -b feature/amazing-feature
  3. 💻 Make your changes and commit: git commit -am 'Add amazing feature'
  4. 📤 Push your changes: git push origin feature/amazing-feature
  5. 🔄 Open a Pull Request

πŸ› Bug Reports & Feature Requests

📋 Development Guidelines

  • Follow the existing code style
  • Write clear, concise commit messages
  • Update documentation as needed
  • Add tests for new features
  • Ensure all tests pass

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.



πŸ› οΈ Technology Stack

🎨 Frontend

Next.js React TypeScript Tailwind CSS

🖥️ Backend & AI

Python PyTorch Ollama ComfyUI

☁️ Deployment & Tools

Vercel Git ESLint


πŸ™ Acknowledgments

  • Ollama - Local AI model server for chat functionality
  • ComfyUI - Powerful node-based AI interface
  • Chatterbox - Advanced TTS technology
  • HunyuanVideo - High-quality video generation models
  • Next.js - The React framework for production
  • Tailwind CSS - A utility-first CSS framework

Made with ❀️ by Daniel Kliewer

⭐ Star this repo if you found it helpful!

⬆️ Back to Top
