A command-line tool to extract fenced code blocks from markdown files and save them as individual source files.
- Automatic extension detection: files get appropriate extensions based on code language (Go → `.go`, Python → `.py`, etc.)
- Extract code blocks from markdown files or stdin
- Preserve language information from fenced code blocks
- Customize output filenames and extensions
- Support for multiple code blocks in a single markdown file
- Configurable via CLI flags, config file, or environment variables
Install with Homebrew:
brew tap spandigital/tap
brew install codeblocks
Install with Go:
go install github.com/spandigitial/codeblocks@latest
Build from source:
git clone https://github.com/SPANDigital/codeblocks.git
cd codeblocks
go build
Extract code blocks from stdin:
cat README.md | codeblocks
Extract code blocks from a file:
codeblocks -i documentation.md
Specify output directory, file extension, and filename prefix:
codeblocks -i tutorial.md -o ./examples -e go -f example
Extract code examples from documentation:
# Extract all code blocks from API documentation
codeblocks -i api-docs.md -o ./code-samples -f api-example
Process markdown from a URL via pipe:
# Download and extract code blocks
curl -s https://raw.githubusercontent.com/user/repo/main/README.md | codeblocks -o ./extracted
Extract code for testing:
# Extract test examples from documentation
codeblocks -i tests.md -e test.go -f integration -o ./test-cases
Batch process multiple files:
# Extract code from all markdown files
for file in docs/*.md; do
  codeblocks -i "$file" -o ./code-samples -f "$(basename "$file" .md)"
done
By default, codeblocks detects the programming language from each fenced code block's language identifier and uses the matching file extension, so extracted files are immediately usable.
Input markdown (example.md):
```go
package main
func main() { println("Hello from Go!") }
```
```python
def greet():
    print("Hello from Python!")
```
```javascript
function greet() {
  console.log("Hello from JavaScript!");
}
```
Extract with auto-detected extensions:
$ codeblocks -i example.md
Saving file: sourcecode-0.go in /current/directory
Saving file: sourcecode-1.py in /current/directory
Saving file: sourcecode-2.js in /current/directory
The tool automatically recognizes 40+ programming languages and data formats:
- Compiled languages: Go (`.go`), Rust (`.rs`), C (`.c`), C++ (`.cpp`), Java (`.java`), Kotlin (`.kt`), Swift (`.swift`)
- Scripting languages: Python (`.py`), Ruby (`.rb`), Perl (`.pl`), PHP (`.php`), Lua (`.lua`)
- Web technologies: JavaScript (`.js`), TypeScript (`.ts`), HTML (`.html`), CSS (`.css`), JSX (`.jsx`), TSX (`.tsx`)
- Shell scripts: Bash/Shell (`.sh`), Fish (`.fish`), PowerShell (`.ps1`)
- Data formats: JSON (`.json`), YAML (`.yaml`), TOML (`.toml`), XML (`.xml`)
- Markup: Markdown (`.md`), LaTeX (`.tex`)
- Database: SQL (`.sql`)
- Other: Dockerfile, Makefile, and more...
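
The lookup behind this list can be pictured as a simple table from language identifier to extension. The following is a minimal sketch, not the tool's actual code: the `extensions` map is a small illustrative subset, the name `extensionFor` is hypothetical, and case-insensitive matching is an assumption. The `txt` fallback for unknown or missing identifiers follows the behavior this README describes.

```go
package main

import (
	"fmt"
	"strings"
)

// extensions is a small, illustrative subset of a language-to-extension
// table. The real tool's table and any alias handling may differ.
var extensions = map[string]string{
	"go":         "go",
	"python":     "py",
	"javascript": "js",
	"typescript": "ts",
	"rust":       "rs",
	"bash":       "sh",
	"yaml":       "yaml",
	"sql":        "sql",
}

// extensionFor returns the extension for a fenced code block's language
// identifier, falling back to "txt" when the language is unknown or missing.
// Lower-casing the identifier is an assumption made for this sketch.
func extensionFor(lang string) string {
	key := strings.ToLower(strings.TrimSpace(lang))
	if ext, ok := extensions[key]; ok {
		return ext
	}
	return "txt"
}

func main() {
	for _, lang := range []string{"go", "Python", "unknownlang", ""} {
		fmt.Printf("%q -> .%s\n", lang, extensionFor(lang))
	}
}
```

A plain map keeps the lookup easy to extend: supporting another language is one more entry.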
If you need all files to have the same extension, use the --extension flag to override auto-detection:
# Force all code blocks to use .txt extension
$ codeblocks -i example.md --extension txt
Saving file: sourcecode-0.txt
Saving file: sourcecode-1.txt
Saving file: sourcecode-2.txt
This is useful when:
- You want uniform extensions regardless of language
- You're extracting code snippets for documentation
- You need compatibility with systems that expect specific extensions
Code blocks with unknown or missing language identifiers automatically fall back to `.txt`:
Input markdown (example.md):
```unknownlang
some code in an unrecognized language
```
Output:
$ codeblocks -i example.md
Saving file: sourcecode.txt

| Flag | Short | Description | Default |
|---|---|---|---|
| `--input` | `-i` | Input markdown file | stdin |
| `--extension` | `-e` | File extension for output files (overrides auto-detection) | Auto-detected from language |
| `--filename-prefix` | `-f` | Prefix for output filenames | `sourcecode` |
| `--output-directory` | `-o` | Output directory | Current directory |
| `--config` | | Config file path | `$HOME/.codeblocks.yaml` |
| `--help` | `-h` | Show help information | |

You can configure codeblocks using:
- Command-line flags (highest priority)
- Environment variables (prefix with `CODEBLOCKS_`, e.g., `CODEBLOCKS_EXTENSION=go`)
- Config file at `$HOME/.codeblocks.yaml` (lowest priority)
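
The priority order above can be sketched as a first-non-empty lookup. This is an illustration only and not how codeblocks is necessarily implemented: `resolve` and `envKey` are hypothetical names, and the dash-to-underscore mapping for multi-word settings is an assumption (the README only confirms `CODEBLOCKS_EXTENSION`).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envKey converts a setting name such as "filename-prefix" into an
// environment variable name such as "CODEBLOCKS_FILENAME_PREFIX".
// The dash-to-underscore mapping is an assumption for illustration.
func envKey(name string) string {
	return "CODEBLOCKS_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

// resolve returns the first non-empty value in priority order:
// flag value, environment variable, config file value, default.
// lookupEnv is injected so the function can be exercised without
// touching the real process environment.
func resolve(name, flagValue string, lookupEnv func(string) (string, bool),
	fileValues map[string]string, def string) string {
	if flagValue != "" {
		return flagValue
	}
	if v, ok := lookupEnv(envKey(name)); ok && v != "" {
		return v
	}
	if v, ok := fileValues[name]; ok && v != "" {
		return v
	}
	return def
}

func main() {
	// Stand-in for values parsed from $HOME/.codeblocks.yaml.
	fileValues := map[string]string{"extension": "go", "filename-prefix": "example"}

	// A flag value, when given, wins over everything else.
	fmt.Println(resolve("extension", "txt", os.LookupEnv, fileValues, ""))
	// Without a flag, the environment is consulted before the config file.
	fmt.Println(resolve("extension", "", os.LookupEnv, fileValues, ""))
	// With neither flag nor environment variable, the config file value is used.
	fmt.Println(resolve("filename-prefix", "", os.LookupEnv, fileValues, "sourcecode"))
}
```

Passing the environment lookup as a function keeps the precedence logic easy to test in isolation.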
Create $HOME/.codeblocks.yaml:
extension: go
filename-prefix: example
output-directory: ./code-samples
codeblocks parses markdown using goldmark, walks the AST to find fenced code blocks, and extracts them with their language information. Each code block is saved as a separate file with the specified prefix and extension.
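
To make the extract-and-name step concrete, here is a rough sketch. It deliberately does not use goldmark or an AST: it is a simplified line scanner built on the standard library only, so it ignores cases a real CommonMark parser handles (tilde fences, indented fences, longer fences). The names `block`, `extractBlocks`, and `fileName` are hypothetical, and the `prefix-index.ext` pattern is inferred from the sample output in this README.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fence is the three-backtick marker that opens and closes a code block.
var fence = strings.Repeat("`", 3)

// block holds one fenced code block: its language identifier and its body.
type block struct {
	Lang string
	Code string
}

// extractBlocks scans markdown line by line and collects backtick-fenced
// code blocks in document order. This is a simplification of what an AST
// walk over a parsed document would do.
func extractBlocks(markdown string) []block {
	var blocks []block
	var body []string
	var lang string
	inBlock := false

	sc := bufio.NewScanner(strings.NewReader(markdown))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case !inBlock && strings.HasPrefix(line, fence):
			// Opening fence: the rest of the line is the language identifier.
			inBlock = true
			lang = strings.TrimSpace(strings.TrimPrefix(line, fence))
			body = nil
		case inBlock && strings.TrimSpace(line) == fence:
			// Closing fence: store the collected block.
			inBlock = false
			blocks = append(blocks, block{Lang: lang, Code: strings.Join(body, "\n")})
		case inBlock:
			body = append(body, line)
		}
	}
	return blocks
}

// fileName builds "prefix-index.ext", the pattern seen in the sample output.
func fileName(prefix string, index int, ext string) string {
	return fmt.Sprintf("%s-%d.%s", prefix, index, ext)
}

func main() {
	md := "Intro\n" + fence + "go\npackage main\n" + fence + "\n" +
		fence + "python\nprint('hi')\n" + fence + "\n"
	exts := map[string]string{"go": "go", "python": "py"} // illustrative subset
	for i, b := range extractBlocks(md) {
		fmt.Printf("%s (%d bytes)\n", fileName("example", i, exts[b.Lang]), len(b.Code))
	}
}
```

A parser-based approach is more robust than this scanner because fence rules in markdown have several edge cases; the sketch only shows the overall flow from block to filename.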
Example Input:
# My Tutorial
Here's a Go example:
```go
package main
func main() {
	println("Hello, World!")
}
```
And a Python example:
```python
def hello():
    print("Hello, World!")
```
Command:
codeblocks -i tutorial.md -f example
Output (with automatic extension detection):
- `example-0.go` (contains the Go code)
- `example-1.py` (contains the Python code)
Requirements:
- Go 1.25 or higher
Build:
go build -v ./...
Run tests:
go test -v ./...
Run locally against a sample block:
echo '```go
package main
func main() { println("test") }
```' | go run main.go -e go
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see LICENSE file for details.