A Chrome extension that reads web page content aloud using Pocket TTS - a lightweight text-to-speech model that runs on your CPU.
- Read any web page content aloud
- Paragraph-by-paragraph processing - audio starts playing quickly even for long documents
- Multiple voice options (8 different voices)
- Automatic content extraction (focuses on main article content)
- Simple playback controls (play/stop)
- Works entirely locally - no cloud services required
pocket-reader/
├── server/ # Python TTS server
│ ├── server.py # Flask server using Pocket TTS
│ └── pyproject.toml # UV/Python dependencies
├── extension/ # Chrome extension
│ ├── manifest.json # Extension manifest
│ ├── popup.html # Extension popup UI
│ ├── popup.css # Popup styles
│ ├── popup.js # Popup logic
│ ├── content.js # Content extraction script
│ ├── background.js # Background service worker
│ └── icons/ # Extension icons
└── generate_icons.py # Script to generate extension icons
- Python 3.10 or later
- UV package manager
- Chrome or Chromium-based browser
First, generate the extension icons:

    uv run --with pillow generate_icons.py

Navigate to the server directory and start the server:

    cd server
    uv run server.py

The server will:
- Download the Pocket TTS model on first run (~100MB)
- Preload the default voice
- Listen on http://localhost:5050
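
To confirm the server came up, you can poll the `/health` endpoint listed in this README. The following is a minimal sketch using only the Python standard library. It assumes that `/health` answers a plain GET with HTTP 200 when the server is ready; the response body format is not documented here, so the sketch ignores it. The helper name `server_is_up` is made up for illustration.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:5050"  # address given in this README


def server_is_up(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success signal)."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```

On first run the model download may take a while, so a negative result right after startup does not necessarily mean something is broken.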
- Open Chrome and go to `chrome://extensions/`
- Enable "Developer mode" (toggle in top right)
- Click "Load unpacked"
- Select the `extension` folder from this project
Note: You may need to restart Chrome for the extension to be fully active. Pocket Reader may also not appear in your toolbar by default.
- Make sure the TTS server is running
- Navigate to any web page you want to read
- Click the Pocket Reader extension icon
- Select a voice (optional)
- Click "Read Page"
The extension will:
- Extract the main content from the page
- Split the text into paragraphs
- Generate and play audio paragraph by paragraph (so you hear audio quickly, even for long articles)
Click "Stop" to stop playback at any time.
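
The extract, split, and play-per-paragraph flow described above can be sketched roughly as follows. This is an illustration, not the extension's actual code (which lives in the JavaScript files): the blank-line splitting rule and the names `split_paragraphs`, `read_aloud`, `synthesize`, and `play` are assumptions chosen for the example.

```python
import re
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and collapse inner whitespace (assumed rule)."""
    chunks = re.split(r"\n\s*\n", text)
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


def read_aloud(
    text: str,
    synthesize: Callable[[str], bytes],
    play: Callable[[bytes], None],
) -> int:
    """Synthesize and play one paragraph at a time; return the paragraph count.

    Working paragraph by paragraph means the first audio can start
    before the rest of the document has been synthesized.
    """
    count = 0
    for paragraph in split_paragraphs(text):
        audio = synthesize(paragraph)  # hypothetical callback, e.g. a POST to /synthesize
        play(audio)                    # hypothetical callback that plays the audio
        count += 1
    return count
```

Passing `synthesize` and `play` in as callbacks keeps the loop independent of how audio is actually produced or played.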
- Alba - Default casual voice
- Marius - Male voice
- Javert - Male voice
- Jean - Male voice
- Fantine - Female voice
- Cosette - Female voice
- Eponine - Female voice
- Azelma - Female voice
The TTS server provides the following endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/voices` | GET | List available voices |
| `/paragraphs` | POST | Split text into paragraphs |
| `/synthesize` | POST | Convert text to speech |
| `/preload` | POST | Preload model and voices |
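
If you would rather call `/synthesize` from Python than from the command line, something like the sketch below should work. The JSON fields `text` and `voice` are taken from the curl example in this README; the helper names, the 120-second timeout, and the assumption that the response body is the raw audio file are illustrative guesses, not documented behavior.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5050"  # address given in this README


def build_synthesize_request(
    text: str, voice: str = "alba", base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST request for /synthesize with a JSON body."""
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/synthesize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str, voice: str = "alba",
                       timeout: float = 120.0) -> int:
    """Send the request and write the response body to `path`.

    Assumes the server returns the audio bytes directly in the response.
    Returns the number of bytes written.
    """
    request = build_synthesize_request(text, voice)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        audio = resp.read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

With the server running, `synthesize_to_file("Hello world!", "speech.wav")` would mirror the curl example.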
curl -X POST http://localhost:5050/synthesize \
-H "Content-Type: application/json" \
-d '{"text": "Hello world!", "voice": "alba"}' \
  --output speech.wav

- Make sure the server is running (`uv run server.py` in the server directory)
- Check that port 5050 is not in use by another application
- Check browser console for errors
- Ensure your browser allows audio playback
- The extension tries to find the main article content automatically
- Some pages with unusual layouts may not extract well
- You may need to reload the page so that the extension's content script restarts on it
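
For the connection checks above, a quick way to see whether anything is listening on port 5050 is a plain TCP connect. This is a minimal sketch with a made-up helper name, `port_is_open`. It only tells you that some process accepts connections on the port, not that the process is the TTS server.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 5050,
                 timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 5050 is in use" if port_is_open() else "nothing is listening on port 5050")
```

If the port is in use while the server is stopped, another application has claimed it; if it is free while you expect the server to be running, the server likely did not start.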
This project uses Pocket TTS, which is licensed under the MIT License. See the Pocket TTS project for more details.
