Full-stack advanced notes app with OpenAI audio transcription and AI organization (Flask, React/Vite, Clerk, SQLite/Postgres).


riefer02/advanced-notes


Chisos

A production-ready advanced notes application with audio transcription, built with:

  • Frontend: Vite + React + TypeScript + Tailwind CSS
  • Backend: Flask + OpenAI GPT-4o-mini-transcribe (API-based)
  • Python tooling: uv (fast package manager from Astral)
  • Deployment-friendly: No GPU required, pure API calls

📚 Documentation

Complete documentation: See docs/ folder

🚀 Quick Start

Prerequisites

  • Node.js 20.19+ or 22.12+ (Vite requirements)
  • Python 3.11+
  • uv (install with: curl -LsSf https://astral.sh/uv/install.sh | sh)
  • OpenAI API key (for transcription + AI categorization) - See environment setup

Backend Setup

cd backend

# Install dependencies (creates .venv and installs from pyproject.toml)
uv sync

# Create .env file with your OpenAI API key
echo "OPENAI_API_KEY=sk-your-key-here" > .env

# Optional: enable audio clips (S3-backed)
# echo "AUDIO_CLIPS_ENABLED=true" >> .env
# echo "S3_BUCKET=your-bucket-name" >> .env

# Start Flask server on http://localhost:5001
./run.sh

First run: No model download needed! Uses OpenAI API for transcription.

Frontend Setup

Open a new terminal:

cd frontend

# Install Node dependencies
npm install

# Start Vite dev server on http://localhost:5173
npm run dev

Visit http://localhost:5173 and record or upload audio (MP3, WAV, WebM, M4A, etc.) to see the transcription.


πŸ“ Project Structure

chisos/
├─ README.md              # This file
├─ .gitignore
├─ Makefile               # Optional: run both servers with 'make dev'
│
├─ backend/               # Flask + OpenAI API
│  ├─ pyproject.toml      # uv project definition
│  ├─ uv.lock             # Dependency lock file (auto-generated)
│  ├─ run.sh              # Dev server launcher
│  ├─ wsgi.py             # WSGI entry point
│  └─ app/
│     ├─ __init__.py      # Flask app factory with CORS
│     ├─ asr.py           # OpenAI transcription API client
│     ├─ routes.py        # REST API endpoints (11 total)
│     └─ services/        # AI categorization + storage
│
└─ frontend/              # Vite + React + TS + Tailwind
   ├─ package.json
   ├─ vite.config.ts
   ├─ tailwind.config.ts
   ├─ postcss.config.js
   ├─ tsconfig.json
   ├─ index.html
   ├─ .env.local          # API URL: VITE_API_URL=http://localhost:5001
   └─ src/
      ├─ main.tsx
      ├─ App.tsx
      ├─ index.css        # Tailwind directives
      └─ components/
         └─ AudioUploader.tsx  # Upload UI + transcription display

🔧 How It Works

Backend (backend/)

  1. Transcription (app/asr.py):

    • Uses OpenAI GPT-4o-mini-transcribe API
    • Supports: MP3, WAV, WebM, M4A, MP4, MPEG, MPGA (up to 25MB)
    • No local model download required
    • Fast, reliable, API-based transcription
  2. AI Categorization (app/services/ai_categorizer.py):

    • Uses OpenAI GPT-4o-mini for semantic analysis
    • Generates folder paths, filenames, tags automatically
    • Structured JSON outputs for reliability
  3. Storage (app/services/storage.py):

    • SQLite database with FTS5 full-text search
    • Database-only storage (no file system)
    • CRUD operations, folder hierarchy, tag management
  4. REST API (app/routes.py):

    • Endpoints for transcription, notes, folders, tags, search, and Ask Notes
    • Returns JSON responses with comprehensive metadata
  5. CORS: Enabled via flask-cors so Vite dev server (:5173) can call Flask (:5001)
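The FTS5-backed storage described above can be sketched with the standard library alone. This is a minimal illustration, not the actual app/services/storage.py: the table names (notes, notes_fts) and column layout are assumptions.

```python
# Minimal sketch of SQLite + FTS5 storage with full-text search.
# Table/column names are illustrative, not the app's actual schema.
import sqlite3


def init_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE notes (
               id INTEGER PRIMARY KEY,
               folder_path TEXT,
               filename TEXT,
               body TEXT)"""
    )
    # External-content FTS5 index over the body column,
    # kept in sync manually on each insert.
    db.execute(
        "CREATE VIRTUAL TABLE notes_fts USING fts5("
        "body, content='notes', content_rowid='id')"
    )
    return db


def add_note(db, folder_path, filename, body):
    cur = db.execute(
        "INSERT INTO notes (folder_path, filename, body) VALUES (?, ?, ?)",
        (folder_path, filename, body),
    )
    db.execute(
        "INSERT INTO notes_fts (rowid, body) VALUES (?, ?)",
        (cur.lastrowid, body),
    )
    return cur.lastrowid


def search(db, query):
    # bm25() ranks matches; lower scores are better matches.
    rows = db.execute(
        """SELECT n.folder_path, n.filename
           FROM notes_fts f JOIN notes n ON n.id = f.rowid
           WHERE notes_fts MATCH ?
           ORDER BY bm25(notes_fts)""",
        (query,),
    )
    return rows.fetchall()
```

The same pattern extends naturally to folder hierarchy and tag tables referenced by note id.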

Frontend (frontend/)

  • Vite: Fast dev server with HMR
  • React + TypeScript: Type-safe components
  • Tailwind CSS: Utility-first styling
  • AudioUploader component: File upload β†’ POST to /api/transcribe β†’ display transcript + metadata

📊 API Reference

POST /api/transcribe

Request:

  • Multipart form-data with file field, OR
  • Raw audio bytes in body

Response:

{
  "text": "transcribed speech text",
  "meta": {
    "device": "openai-api",
    "model": "gpt-4o-mini-transcribe",
    "language": "en",
    "duration": 3.45
  },
  "categorization": {
    "note_id": "abc123",
    "folder_path": "Ideas/Product",
    "filename": "new_feature_idea.txt",
    "tags": ["product", "feature"],
    "confidence": 0.95,
    "reasoning": "This appears to be a product feature idea..."
  }
}

Error:

{
  "error": "error message"
}
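For reference, the endpoint above can be called from Python with only the standard library. The encode_multipart helper and transcribe wrapper are illustrative names (not part of the app), and the URL is the local dev default:

```python
# Hypothetical stdlib-only client for POST /api/transcribe.
import json
import mimetypes
import urllib.request
import uuid


def encode_multipart(field, filename, data):
    """Build a multipart/form-data body containing one file field."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_url="http://localhost:5001"):
    """POST an audio file and return the parsed JSON response."""
    with open(path, "rb") as f:
        body, ctype = encode_multipart("file", path, f.read())
    req = urllib.request.Request(
        f"{api_url}/api/transcribe",
        data=body,
        headers={"Content-Type": ctype},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the backend running, `transcribe("clip.wav")["text"]` would return the transcript shown in the response schema above.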

GET /api/health

Response:

{
  "status": "ok"
}

POST /api/ask

Ask a natural-language question about your notes. The backend creates a structured query plan and performs hybrid retrieval (filters + full-text search + embeddings) before generating a markdown answer with sources.
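The hybrid-retrieval step can be illustrated with a toy scoring function. This is a sketch under stated assumptions, not the actual service code: hybrid_rank, the alpha weight, and the score shapes (keyword relevance normalized higher-is-better) are all illustrative.

```python
# Toy illustration of hybrid retrieval: blend full-text relevance
# with embedding cosine similarity, then rank notes.
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def hybrid_rank(fts_scores, embeddings, query_vec, alpha=0.5):
    """Rank note ids by a weighted blend of keyword relevance
    (fts_scores: {note_id: score, higher = better}) and embedding
    similarity to the query vector."""
    scores = {}
    for note_id, kw in fts_scores.items():
        sim = cosine(embeddings[note_id], query_vec)
        scores[note_id] = alpha * kw + (1 - alpha) * sim
    return sorted(scores, key=scores.get, reverse=True)
```

Setting alpha toward 1.0 favors exact keyword matches; toward 0.0, semantic similarity dominates. The top-ranked notes would then be passed to the model as sources for the markdown answer.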


πŸ› οΈ Development Commands

Backend

cd backend

# Install/update dependencies
uv sync

# Add a new dependency
uv add <package-name>

# Run Flask dev server
./run.sh

# Or manually:
uv run flask run --host 0.0.0.0 --port 5001

Frontend

cd frontend

# Install dependencies
npm install

# Dev server (http://localhost:5173)
npm run dev

# Production build
npm run build

# Preview production build
npm run preview

# Lint
npm run lint

Both (Optional Makefile)

# Start both backend and frontend
make dev

☁️ Deployment Notes

  • No GPU required: Pure API-based transcription
  • No model downloads: Everything runs via OpenAI API
  • Lightweight: Only ~30 Python packages (vs 166 with local models)
  • Platform-agnostic: Works on any OS with Python 3.11+
  • Easy scaling: API handles all compute, just scale your Flask app

Check API usage via the meta.model field in responses.


πŸ” Troubleshooting

Backend

"OPENAI_API_KEY is required"

  • Create backend/.env containing OPENAI_API_KEY=sk-your-key-here (see Backend Setup)

Ask Notes returns 500

  • Verify OPENAI_API_KEY is set
  • Ensure your OpenAI project has access to the configured models:
    • OPENAI_MODEL (default: gpt-4o-mini)
    • OPENAI_EMBEDDING_MODEL (default: text-embedding-3-small)

Transcription fails with 401 error

  • Check your API key is valid
  • Ensure you have credits/billing set up on OpenAI

Transcription too slow

  • Large files take longer to upload; OpenAI accepts audio up to 25MB per request
  • Transcription runs on OpenAI's servers, so check your network connection first

Frontend

CORS errors

  • Ensure backend is running on :5001
  • Check frontend/.env.local has VITE_API_URL=http://localhost:5001

Build errors

  • Delete node_modules/ and run npm install again
  • Verify Node version: node -v (should be 20.19+ or 22.12+)

Audio recording/upload fails

  • Check browser console for errors
  • Ensure microphone permissions are granted
  • OpenAI supports: MP3, MP4, MPEG, MPGA, M4A, WAV, WebM (max 25MB)

📦 Production Considerations

Backend

  • CORS: Restrict origins in production (edit app/__init__.py)
  • File size limits: Add max file size checks in routes.py
  • Rate limiting: Use flask-limiter
  • Authentication: Add API keys or OAuth
  • Streaming: Implement chunked upload + streaming ASR (if NeMo supports)
  • Deployment: Use Gunicorn/uWSGI instead of Flask dev server
    uv add gunicorn
    uv run gunicorn -w 4 -b 0.0.0.0:5001 wsgi:app

Frontend

  • Build for production:
    npm run build
    # Output in frontend/dist/
  • Environment variables: Set VITE_API_URL to production backend URL
  • Static hosting: Deploy dist/ to Vercel, Netlify, or Cloudflare Pages
  • API proxy: Configure Vite proxy in production or use Nginx

Transcription

  • Cost monitoring: Track OpenAI API usage on dashboard
  • Model upgrade: Switch to gpt-4o-transcribe for higher quality (more expensive)
  • Diarization: Use gpt-4o-transcribe-diarize for speaker labels
  • Streaming: Enable stream=True for real-time transcription
  • Prompting: Add custom prompts to improve accuracy for specific domains

📚 Key Technologies & References

  • OpenAI Transcription: Speech to Text Docs
  • GPT-4o-mini: Model Docs
  • TanStack Query: React Query Docs
  • SQLite FTS5: Full-Text Search
  • Vite: Getting Started
  • Tailwind CSS: Vite Setup
  • Flask: Quickstart
  • uv: Installation

🎯 Acceptance Criteria ✅

  • βœ… Vite frontend with Tailwind for recording/uploading audio
  • βœ… OpenAI API transcription (gpt-4o-mini-transcribe)
  • βœ… AI-powered categorization (GPT-4o-mini)
  • βœ… SQLite database storage with FTS5 search
  • βœ… 11 REST API endpoints for full CRUD
  • βœ… Split-pane layout with folder navigation
  • βœ… TanStack Query for state management
  • βœ… Keyboard navigation and accessibility
  • βœ… CORS configured for local dev
  • βœ… Deployment-ready (no GPU required)

🚧 Future Enhancements

  • Speaker diarization (gpt-4o-transcribe-diarize)
  • Streaming transcription with WebSocket
  • Note editing UI with inline updates
  • Bulk operations (move, delete, export)
  • Export as Markdown/PDF
  • User authentication & multi-user support
  • PostgreSQL for production
  • Docker Compose setup
  • CI/CD pipeline (GitHub Actions)
  • Real-time collaboration

πŸ“ License

This is a proof-of-concept template.


Built with ❤️ for easy deployment
