LVMSync

LVMSync is a high-performance incremental data replication tool for LVM snapshots. It efficiently transfers only changed blocks using metadata from snapshot COW (Copy-On-Write) devices and communicates with LVM through native Go bindings rather than shell commands.

For benchmark methodology and reproducible performance numbers, see perf.md.

For details on running with minimal privileges and sudoers examples, see SECURITY.md and docs/sudoers.md. For snapshot cleanup, resuming transfers, and verify-only rollback procedures, see docs/danger_rollback.md.

Features

Incremental Block-Level Synchronization: Transfers only changed blocks.
Zero-Copy Transfers: Utilizes splice() for efficient data movement.
Parallel Execution: Configurable concurrency for optimal performance.
Adaptive Transport Concurrency: Maintains ~1–2×BDP of in-flight data and can be overridden with --concurrency.
Rate-Limiting: Control bandwidth usage during transfers.
Compression: Samples 8 KiB per chunk and skips compression when the ratio exceeds --compress-threshold. Auto mode selects Zstd on CPUs with AVX2 or NEON support, falling back to LZ4 when those features are absent. Compression levels are tuned via --lz4-level and --zstd-level. See compression documentation for pipeline details.
Checksum Verification: Ensures data integrity using SHA-256 or BLAKE3, automatically selecting BLAKE3 on CPUs with AVX2, AVX-512, or NEON and SHA-256 otherwise.
Native LVM2 Integration: Uses Go bindings to liblvm2cmd instead of shelling out.
Generic Block Device Support: Access raw /dev/* paths and regular files (including loopback images) through a unified device abstraction.
Deduplication Strategies: Detect unchanged blocks using checksum, rolling hash, or a Bloom filter with optional FastCDC content-defined chunking and mmap-backed index.
Hashing: Hardware-accelerated XXH3 provides fast deduplication hints while BLAKE3 digests are stored in manifests for integrity.
Remote Execution via SSH: Replicates data over SSH with support for pre/post-scripts.
Resume Support: Ability to resume interrupted transfers with verification enabled by default (use --verify=none to skip).
Crash-Safe WAL: Records committed ranges in a write-ahead log so interrupted runs can recover. See WAL documentation for layout and replay details.
Probe and Verification Modes: --probe-only validates devices and privileges without writing and prints size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch to stdout, while --verify-only scans both sides and reports mismatches.
Dry-run Estimates: --dry-run samples the manifest to project bytes and ETA without transferring data.
Planning: --plan prints resolved configuration with secrets redacted, transport order, estimated bytes, and compression decisions as JSON without transferring data.
Device Identity Tuple: Each run records (size_bytes, kernel_uuid, gpt_uuid, mbr_signature, fs_uuid, major, minor, manifest_epoch) to prevent writing to the wrong destination.
Handshake Timeouts: All transports, including rsync, apply context deadlines during handshakes and clear them once negotiation succeeds.
Sparse Destination Optimization: Detects runs of zero bytes and punches holes when the filesystem supports it. Use --sparse=never to always write zeros instead.
Aligned I/O Buffers and NUMA Pinning: --odirect allocates block-size aligned slabs from a sync.Pool and can pin worker goroutines to a device's NUMA node (--numa-pin) or an explicit node (--numa-node).
LVM Snapshot Management:
- Automatic snapshot creation and removal.
- Configurable snapshot size (absolute or percentage-based) via --snapshot-size, LVMSYNC_SNAPSHOT_SIZE, or the snapshot_size config key.
- Configurable snapshot usage threshold via --snapshot-max-usage.
- Configurable volume group for constructing the snapshot device path.
- Auto-selection of target volume groups with sufficient free space.
- Automatic privilege escalation (defaulting to sudo -n).
- Snapshot health monitoring that fails fast if usage exceeds a threshold.
- Snapshot monitor goroutine closes its error channel on exit; cleanup only cancels monitoring, avoiding send-on-closed-channel panics (see TestCreateSnapshotCleanupNoPanic).
- See LVM snapshot documentation for snapshot lifecycle and mount checks.
Graceful Shutdown: Signal handling ensures snapshots are cleaned up on interruption.
Flexible Configuration: Flags, environment variables, or config.yaml. See Configuration. Configuration values follow flag > environment variable > config file precedence.
Configuration Validation: Checks key parameters (e.g., volume group existence, escalation command) before starting operations.

Resume, verification, and safe overwrite flows

Transfers store the device identity tuple (size_bytes, kernel_uuid, gpt_uuid, mbr_signature, fs_uuid, major, minor, manifest_epoch) and compare it against the destination before writing. Partition-table mismatches return a precondition failure to avoid accidental overwrites. Use --force to bypass this check when intentionally overwriting.

Example --probe-only output showing size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch:

lvmsync run --probe-only /dev/vg0/snap0 /dev/vg0/target
# 10737418240 12345678-9abc-def0-1234-56789abcdef0 9abcdef0-1234-5678-90ab-cdef12345678 1a2b3c4d 0fedcba9-8765-4321-0fed-cba987654321 253 0 1700000000
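
For tooling that consumes the probe line, the following sketch parses the eight whitespace-separated fields into a struct and compares two tuples field by field. The Identity type and ParseProbeLine helper are hypothetical names for illustration, not part of LVMSync, and the field types are assumptions based on the sample output above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Identity mirrors the fields printed by --probe-only, in order.
type Identity struct {
	SizeBytes     uint64
	KernelUUID    string
	GPTUUID       string
	MBRSignature  string
	FSUUID        string
	Major         uint32
	Minor         uint32
	ManifestEpoch int64
}

// ParseProbeLine converts one probe output line into an Identity.
func ParseProbeLine(line string) (Identity, error) {
	f := strings.Fields(line)
	if len(f) != 8 {
		return Identity{}, fmt.Errorf("expected 8 fields, got %d", len(f))
	}
	size, err := strconv.ParseUint(f[0], 10, 64)
	if err != nil {
		return Identity{}, fmt.Errorf("size_bytes: %w", err)
	}
	major, err := strconv.ParseUint(f[5], 10, 32)
	if err != nil {
		return Identity{}, fmt.Errorf("major: %w", err)
	}
	minor, err := strconv.ParseUint(f[6], 10, 32)
	if err != nil {
		return Identity{}, fmt.Errorf("minor: %w", err)
	}
	epoch, err := strconv.ParseInt(f[7], 10, 64)
	if err != nil {
		return Identity{}, fmt.Errorf("manifest_epoch: %w", err)
	}
	return Identity{
		SizeBytes: size, KernelUUID: f[1], GPTUUID: f[2], MBRSignature: f[3],
		FSUUID: f[4], Major: uint32(major), Minor: uint32(minor), ManifestEpoch: epoch,
	}, nil
}

func main() {
	const line = "10737418240 12345678-9abc-def0-1234-56789abcdef0 " +
		"9abcdef0-1234-5678-90ab-cdef12345678 1a2b3c4d " +
		"0fedcba9-8765-4321-0fed-cba987654321 253 0 1700000000"
	src, err := ParseProbeLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	same := src
	other := src
	other.FSUUID = "00000000-0000-0000-0000-000000000000"
	// Identity is a comparable struct, so == checks every field of the tuple.
	fmt.Println(src == same, src == other) // prints "true false"
}
```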

--resume=statefile continues an interrupted run (verification runs unless --verify=none).
--verify-only reads both devices and reports mismatches without writing data.

Resume after failure

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/target

Resume with verification

lvmsync run --dry-run --resume=verify /dev/vg0/snap0 /dev/vg0/target

Verification only

lvmsync run --dry-run --verify-only /dev/vg0/snap0 /dev/vg0/target

Safe overwrite procedure

lvmsync run --dry-run --probe-only /dev/vg0/snap0 /dev/vg0/target
lvmsync run --dry-run --verify-only /dev/vg0/snap0 /dev/vg0/target
lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/target

Exit code 3 signals verification mismatches. See operations guide for detailed recovery steps.

Supported Platforms

LVMSync supports Linux only. A runtime check in main.go aborts execution on other operating systems with exit code 1. The project is regularly tested on amd64 and arm64 architectures.

To cross-compile for another Linux architecture, set GOOS=linux and the desired GOARCH:

GOOS=linux GOARCH=arm64 go build ./...

Device Support Matrix

Device type	Source	Destination	Notes
LVM snapshot	✅	❌	snapshots are auto-created
Raw block device	✅	✅	requires `--offline` or `--fs-freeze-command`/`--fs-thaw-command` when used as a source
Regular file	✅	✅	includes loopback images

Override automatic detection with --source-type and --dest-type when a device's type is known in advance.

Offline requirements

Raw sources must be quiescent or provide filesystem freeze/thaw hooks using --fs-freeze-command and --fs-thaw-command. These command paths must be absolute. LVM snapshots are consistent by design, while regular files require no additional coordination.

Transport Options

LVMSync negotiates transports in the order provided by --transport (default ssh,tcp+tls,h2,quic). If a transport fails to connect, the next transport is tried and each attempt is logged. All transports require TLS 1.3 with mutual authentication or SSH host key verification unless --allow-insecure is set. The rsync transport is plaintext and refuses to initialize unless --allow-insecure acknowledges the lack of encryption. Enabling this transport logs a warning noting the plaintext connection. The client sends the destination device identity and the server refuses to write if it differs, returning a precondition failure. See docs/transports.md for details.

Transport	Security defaults	Notes
`quic`	TLS 1.3, BBR congestion	UDP-based transport
`h2`	TLS 1.3	HTTP/2 streams
`tcp+tls`	TLS 1.3	Plain TCP wrapped in TLS
`ssh`	Host key verification	Uses OpenSSH-style authentication
`rsync`	Plaintext, requires `--allow-insecure`	rsync wire protocol, enforces destination identity

Examples

Select multiple transports and a custom port:

lvmsync run --dry-run --transport ssh,tcp+tls,h2,quic --tcp-port 9443 /dev/vg0/source /dev/vg0/backup

Force SSH only:

lvmsync run --dry-run --transport ssh /dev/vg0/source user@backup:/dev/vg1/target

The CLI groups transport flags using pflag and binds them to viper while emitting structured logs via zap. All flags use kebab-case (e.g., --client-cert, --allow-insecure) for consistency across commands:

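package main

import (
    "github.com/spf13/pflag"
    "github.com/spf13/viper"
    "go.uber.org/zap"
)

func main() {
    logger, err := zap.NewProduction()
    if err != nil {
        panic(err)
    }
    defer logger.Sync()

    // Group the transport flags in a dedicated FlagSet.
    transport := pflag.NewFlagSet("transport", pflag.ExitOnError)
    transport.String("transport", "ssh,tcp+tls,h2,quic", "ordered transports")
    transport.Int("tcp-port", 9443, "TCP listener port")

    // Bind the group to Viper so flag values are visible through v.Get*.
    v := viper.New()
    if err := v.BindPFlags(transport); err != nil {
        logger.Error("bind transport flags", zap.Error(err))
    }
}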

Manifest Lifecycle

Transfers rely on a manifest that tracks chunk offsets and digests:

lvmsync manifest rebuild <device> refreshes or creates the manifest.
lvmsync run <source> <destination> streams blocks, skipping chunks already recorded in the manifest.
lvmsync verify <source> <destination> compares the destination with the manifest and logs any mismatches.

Garbage Collection & Atomic Commit describes how obsolete entries are pruned and rewritten safely.

TLS-based transports require mutual authentication. Provide certificate files with --server-cert, --server-key, --client-cert, --client-key, and --ca-cert. Insecure mode disables certificate and host key verification and can be enabled with --allow-insecure, but it logs a warning and should only be used for testing.

Configuration can be supplied via flags, LVMSYNC_-prefixed environment variables, or configuration files. Flags override environment variables, which override configuration files.

lvmsync daemon

The lvmsyncd binary loads optional modules and listens on one or more URIs. Use --listen repeatedly to specify addresses and --module to load plugin modules.

Configuration can come from flags, LVMSYNC_DAEMON_ environment variables, or a lvmsyncd.yaml file. Multi-value environment variables are comma-separated.

Flag	Environment variable	Config key	Description
`--listen`	`LVMSYNC_DAEMON_LISTEN`	`listen`	comma-separated list of listen URIs
`--module`	`LVMSYNC_DAEMON_MODULE`	`module`	comma-separated module paths

Example:

LVMSYNC_DAEMON_LISTEN=unix:///run/lvmsyncd.sock,tcp+tls://:9000 lvmsyncd --module ./mod.so
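
To show how a comma-separated multi-value setting such as LVMSYNC_DAEMON_LISTEN might be split into individual listen URIs, here is a small sketch using the standard net/url package. ParseListen is a hypothetical helper; how lvmsyncd actually parses and validates these values is not specified here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ParseListen splits a comma-separated value into parsed listen URIs.
// Empty entries are ignored; entries without a scheme are rejected.
func ParseListen(value string) ([]*url.URL, error) {
	var uris []*url.URL
	for _, raw := range strings.Split(value, ",") {
		raw = strings.TrimSpace(raw)
		if raw == "" {
			continue
		}
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("listen %q: %w", raw, err)
		}
		if u.Scheme == "" {
			return nil, fmt.Errorf("listen %q: missing scheme", raw)
		}
		uris = append(uris, u)
	}
	return uris, nil
}

func main() {
	uris, err := ParseListen("unix:///run/lvmsyncd.sock,tcp+tls://:9000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range uris {
		fmt.Printf("scheme=%s host=%s path=%s\n", u.Scheme, u.Host, u.Path)
	}
}
```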

See daemon documentation for module configuration, ACLs, and listener options.

Resume and Verify Workflows

Resume interrupted transfers with a state file:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Generate a manifest and verify a destination:

lvmsync manifest rebuild /dev/vg0/snap0
lvmsync verify /dev/vg0/snap0 /dev/vg0/data

Resume files track the last completed chunk and are removed after a successful transfer. See docs/manifest.md for manifest and verification details.

Safety Notes

Run manifest rebuild and verify against quiescent devices.
Use --offline or freeze/thaw hooks when scanning live filesystems to keep manifests consistent.
Network transports enforce mutual TLS or host key verification; --allow-insecure disables these checks, logs a warning, and should only be used for testing.
Back up destination data before running transfers; writes are destructive.

Roadmap

Pluggable data plane: QUIC, HTTP/2, TLS/TCP, SSH
Hybrid fixed + CDC deduplication with Bloom filter index
Adaptive compression using LZ4 or Zstd with per-chunk sampling
Throughput mode presets for high-bandwidth links

See AGENTS.md for contributor tasks and design guidelines.

Architecture

LVMSync is organized into modular packages to keep concerns separated:

lvm – manages snapshot creation, monitoring, and cleanup.
device – opens and queries generic block devices such as raw /dev/* paths and regular files.
transfer – performs block-level synchronization, compression, deduplication, and resume logic.
- Internally split into focused modules: progress.go, handshake.go, and block_writer.go for clearer responsibilities.
remote – wraps SSH functionality for running commands on remote hosts and coordinating transfers. Callers must provide a context.Context with a timeout when starting the privileged helper to allow cancellation if the remote command fails to launch.
internal/config – parses and validates configuration files and CLI options.
dedup – houses Bloom filter helpers, chunking logic, and other deduplication utilities.
common and internal – shared helpers and internal utilities such as multi-error handling.
internal/client – coordinates snapshot preparation and client transfer execution.
cmd/dump – handles snapshot dumping and transport selection.
cmd/root – configures the application and routes to subcommands.
cmd/lvmsync – CLI orchestrator with a signals subpackage for signal handling and cleanup.
cmd/lvmsyncd – module loading daemon accepting multiple listen URIs.

This structure allows individual packages to be developed and tested in isolation.

Refactoring Notes

Snapshot preparation helpers (ensureVolumeGroups, checkDiskSpaceForSnapshot, createSnapshotIfNeeded, PrepareSnapshot) and client execution logic are consolidated under internal/client.
These helpers no longer rely on global variables; configuration and loggers are passed explicitly.
main.go now delegates to cmd/root, which wires together cmd/dump.

Logging

LVMSync emits structured logs using zap. Errors are logged with structured fields instead of being written to stderr, and the logger is flushed on shutdown to ensure all entries are persisted. A production logger is initialized immediately so even configuration failures during startup are reported through the same structured format. When --progress is enabled, progress updates are emitted as structured log entries, allowing external tooling to track transfer completion.

Expectations

Use zap for all logging and avoid fmt.Print* or log.* calls.
Pass loggers explicitly to commands and helpers; cmd/lvmsync.Execute requires a *zap.Logger and accepts a *lvmsync.Runner for dependency injection instead of relying on zap.L().
All commands receive an explicit *zap.Logger and default to zap.NewNop() when no logger is supplied.
Device constructors return an error when the logger is nil; transport constructors default to zap.NewNop() when no logger is supplied.
Log field keys in snake_case and include units where relevant (for example, duration_ms).
Provide raw byte values alongside human-readable sizes (for example, block_size and block_size_bytes).
Always defer syncLogger(logger) to flush buffers and log if the sync fails.

The example below demonstrates these conventions:

package main

import (
    "time"

    "go.uber.org/zap"
)

func syncLogger(logger *zap.Logger) {
    if err := logger.Sync(); err != nil {
        logger.Error("sync failed", zap.Error(err))
    }
}

func main() {
    logger, _ := zap.NewProduction()
    defer syncLogger(logger)
    start := time.Now()

    src := "/dev/vg0/source"
    dst := "/dev/vg0/backup"

    logger.Info("snapshot complete",
        zap.String("source_path", src),
        zap.String("dest_path", dst),
        zap.Int64("duration_ms", time.Since(start).Milliseconds()),
    )
}

Errors during block operations log the byte offset and block size explicitly:

logger.Warn("Zero-copy transfer failed",
    zap.Int64("offset", offset),
    zap.Int("size_bytes", blockSize),
    zap.Int("attempt", attempt+1),
    zap.Error(err),
)

Field	Description
`offset`	Byte offset from the start of the device
`size_bytes`	Size of the block being processed
`attempt`	Current retry attempt

Configuration

LVMSync uses pflag and viper to accept options from flags, environment variables, and a YAML file. Flag groups are organized into dedicated FlagSets that are registered with the root command and bound to Viper. The CLI exposes subcommands using cobra, with run handling transfers, manifest rebuild regenerating manifests, and verify checking source and destination data. Source and destination paths for run and verify are provided as positional arguments after any flags:

lvmsync run --dry-run [flags] <source> <dest>

When run with --dry-run, LVMSync loads any manifest at --manifest-path and samples up to 100 blocks to estimate the bytes that would be transmitted. The estimate, expected duration in milliseconds (estimated_duration_ms), and bandwidth in bits per second (estimated_bandwidth_bps) are logged without sending data. For example:

{"level":"info","msg":"dry run","size_bytes":4096,"estimated_tx_bytes":4096,"estimated_duration_ms":2000,"estimated_bandwidth_bps":16000}

Running with --plan emits a JSON document describing the resolved configuration (with sensitive fields like SSH passwords and TLS keys redacted), transport order, estimated transfer bytes, and the compression algorithm selected for each chunk size class.

Examples

Set the parallel worker count using any configuration source:

CLI flag:

lvmsync run --dry-run --parallel 16

Environment variable:

LVMSYNC_PARALLEL=16 lvmsync run --dry-run

config.yaml:

parallel: 16
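
When more than one source sets the same option, flags take precedence over environment variables, which take precedence over the config file. The sketch below models that lookup order with plain maps. The Sources type, Lookup method, and EnvName helper are hypothetical; LVMSync itself delegates this to viper, and the config key derivation shown (hyphens to underscores) is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// EnvName derives the variable for an ungrouped flag,
// for example --ssh-user -> LVMSYNC_SSH_USER.
func EnvName(flag string) string {
	name := strings.TrimLeft(flag, "-")
	return "LVMSYNC_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

// Sources holds the values seen in each configuration layer.
type Sources struct {
	Flags  map[string]string // flags explicitly set on the command line
	Env    map[string]string // environment variables
	Config map[string]string // keys read from config.yaml
}

// Lookup resolves a flag name using flag > environment > config > default
// and reports which layer supplied the value.
func (s Sources) Lookup(flag, def string) (value, origin string) {
	name := strings.TrimLeft(flag, "-")
	if v, ok := s.Flags[name]; ok {
		return v, "flag"
	}
	if v, ok := s.Env[EnvName(name)]; ok {
		return v, "env"
	}
	if v, ok := s.Config[strings.ReplaceAll(name, "-", "_")]; ok {
		return v, "config"
	}
	return def, "default"
}

func main() {
	s := Sources{
		Flags:  map[string]string{"parallel": "16"},
		Env:    map[string]string{"LVMSYNC_PARALLEL": "8"},
		Config: map[string]string{"parallel": "4"},
	}
	fmt.Println(s.Lookup("parallel", "1")) // prints "16 flag"

	delete(s.Flags, "parallel")
	fmt.Println(s.Lookup("parallel", "1")) // prints "8 env"

	delete(s.Env, "LVMSYNC_PARALLEL")
	fmt.Println(s.Lookup("parallel", "1")) // prints "4 config"
}
```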

Flag groups

Flags are grouped in the CLI help:

General Options – worker counts, speed limits, progress controls.
SSH Options – credentials and connection settings.
Remote Options – remote hooks and lvmsync path.
Deduplication Options – dedup strategy and state storage.
Compression Options – algorithm and level tuning.
LVM Options – snapshot management and privilege escalation.
Transport Options – configure data transports (QUIC, HTTP/2, TCP+TLS, SSH).
Manifest Options – manifest path overrides and related settings.

Internally, each group is set up through a dedicated helper such as initGeneralFlags, initSSHFlags, or initCompressionFlags, keeping flag definitions focused and easy to maintain.

Example:

lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

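import (
    "strings"

    "github.com/spf13/pflag"
    "github.com/spf13/viper"
)

func initConfig() *viper.Viper {
    v := viper.New()

    // Each flag group lives in its own FlagSet.
    general := pflag.NewFlagSet("general", pflag.ExitOnError)
    general.Bool("progress", true, "show progress")

    lvm := pflag.NewFlagSet("lvm", pflag.ExitOnError)
    lvm.String("volume-group", "", "target volume group") // kebab-case, like every other flag

    // Register the groups on the process-wide flag set and bind them to Viper.
    pflag.CommandLine.AddFlagSet(general)
    pflag.CommandLine.AddFlagSet(lvm)
    v.BindPFlags(pflag.CommandLine)

    // Replace '-' with '_' when deriving LVMSYNC_* variable names from flag names.
    v.SetEnvPrefix("LVMSYNC")
    v.SetEnvKeyReplacer(strings.NewReplacer("-", "_"))
    v.AutomaticEnv()

    // Look for config.yaml in the working directory; the caller still calls v.ReadInConfig().
    v.SetConfigName("config")
    v.AddConfigPath(".")
    return v
}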

Grouped help

Each subcommand prints its relevant flag groups:

$ lvmsync run --help
General Options:
      --parallel int   number of worker goroutines (default 4)
...
Transport Options:
      --transport string   transport modes (comma-separated)

$ lvmsync manifest rebuild --help
General Options:
      --dry-run   skip execution
Manifest Options:
      --manifest-path string   manifest file path

$ lvmsync verify --help
General Options:
      --block-size string   block size for comparisons
Manifest Options:
      --manifest-path string   manifest to verify against

This groups related flags once and lets Viper merge values from flags, `LVMSYNC_*` variables, and the
`config.yaml` file.

The overall loading flow now passes an explicit `FlagSet` and argument slice:

1. `registerFlags(flagSets, fs)` adds all flag groups to the provided flag set.
2. `config.NewBuilder(defaults).Build(fs, args)` parses the arguments, binds flags and `LVMSYNC_*` environment variables with Viper,
   merges them with defaults and any `config.yaml` file, and returns the effective configuration plus leftover positional arguments.

`cmd/root.Configure` surfaces those leftover arguments so `Run` operates purely on provided inputs.

### New and updated flags

Recent refactors added several configuration options:

- `--tcp-port` and `--ssh-port` expose TCP+TLS and SSH endpoints.
- `--tcp-parallel` controls the number of parallel TCP connections (2–4).
- `--tcp-lowat` sets TCP_NOTSENT_LOWAT to limit unsent bytes.
- `--sync-interval` controls how many bytes are written between `fdatasync` calls. Accepts size suffixes like `64KB` or `1GB`; invalid values return an error.
- `--checkpoint-interval` sets how often resume state is persisted.
- `--checkpoint-bytes` sets how many bytes are written between resume checkpoints.
- `--block-size` sets the transfer block size (use `auto` for detection).

### I/O tuning

- `--block-size` selects the transfer block size. Use `auto` to match the destination's physical sector size.
- `--sync-interval` sets how many bytes are written between `fdatasync` calls. Accepts size suffixes like `64KB` or `1GB`; invalid values cause startup errors.
- `--odirect` uses O_DIRECT with block-size aligned buffers.
- `--numa-pin` pins worker goroutines to CPUs local to the source device's NUMA node. If `/sys` lacks NUMA details, LVMSync logs a warning and continues without pinning. Use `--numa-node` to override.
- `--numa-node` pins worker goroutines to the specified NUMA node.
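
As a rough illustration of how size values such as `64KB` or `1GB` could be parsed and rejected when invalid, consider the sketch below. `ParseSize` is a hypothetical helper, and treating the suffixes as binary multiples (1 KB = 1024 bytes) is an assumption; the document does not say whether LVMSync uses binary or decimal units.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// units is ordered so that longer suffixes are matched before the bare "B".
var units = []struct {
	suffix string
	mult   uint64
}{
	{"GB", 1 << 30},
	{"MB", 1 << 20},
	{"KB", 1 << 10},
	{"B", 1},
}

// ParseSize converts strings like "64KB" or "1GB" into a byte count.
// A bare number is interpreted as bytes.
func ParseSize(s string) (uint64, error) {
	in := strings.ToUpper(strings.TrimSpace(s))
	mult := uint64(1)
	for _, u := range units {
		if strings.HasSuffix(in, u.suffix) {
			in = strings.TrimSpace(strings.TrimSuffix(in, u.suffix))
			mult = u.mult
			break
		}
	}
	n, err := strconv.ParseUint(in, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	if n > math.MaxUint64/mult {
		return 0, fmt.Errorf("invalid size %q: value overflows 64 bits", s)
	}
	return n * mult, nil
}

func main() {
	for _, in := range []string{"64KB", "1GB", "4096", "fast"} {
		n, err := ParseSize(in)
		fmt.Println(in, n, err)
	}
}
```

Under the binary-unit assumption, `64KB` yields 65536 and `1GB` yields 1073741824, while `fast` returns an error.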

### Device types

LVMSync works with three kinds of source and destination devices. Auto-detection
examines the path to select the correct handling:

| Type | Detection | Notes |
|------|-----------|-------|
| `lvm` | `/dev/<vg>/<lv>` or `/dev/mapper/<vg>-<lv>` | A snapshot is created and removed automatically |
| `raw` | Other block devices | Require `--skip-snapshot-creation` and either `--offline` or `--fs-freeze-command`/`--fs-thaw-command` |
| `file` | Regular files | Used as-is with no snapshot |

Override detection with `--source-type` and `--dest-type` when necessary.

Internally, `device.Detect` delegates to dedicated helpers:

```go
esc, err := privilege.New(ctx, logger)
if err != nil {
    // handle error
}
dev, err := device.Detect(ctx, "/dev/sdb", true, true, "auto", "", "", "", 0, 0, esc, logger, device.NewRunner())
// detectFileDevice, detectLVMDevice, or detectRawDevice is selected based on the path.
```

Snapshots provide a crash-consistent view of a device. LVM volumes are snapshotted automatically and removed after transfer. Raw block devices and regular files do not have a snapshot mechanism; to avoid inconsistent reads you must either take them offline with --offline or freeze the filesystem with --fs-freeze-command and --fs-thaw-command. Snapshot creation requires root privileges, so non-root invocations must permit escalation via sudo -n. The escalation command is checked during device detection and operations abort immediately if escalation fails.

Examples:

lvmsync --source-type lvm /dev/vg0/origin /tmp/dump
lvmsync --dest-type raw dumpfile /dev/sdb
lvmsync --source-type raw --offline /dev/sdb /tmp/dump
lvmsync --source-type raw --fs-freeze-command "/usr/sbin/fsfreeze -f '/mnt/data dir'" --fs-thaw-command "/usr/sbin/fsfreeze -u '/mnt/data dir'" /dev/sdb /tmp/dump

Raw device safety

Reading from a live block device can corrupt data if writes occur during the transfer. Ensure a consistent view with one of the following options:

--offline – assert that no process will write to the source device.
--fs-freeze-command/--fs-thaw-command – run commands that freeze and thaw the filesystem around the read. Command paths must be absolute. Arguments are parsed with shell-style quoting, so wrap paths containing spaces in quotes.
Time out freeze and thaw helpers with --freeze-timeout and --thaw-timeout (default 10s).

Freeze and thaw commands are validated before execution. Each command must be set, its path must be absolute, the executable name must match ^[a-zA-Z0-9._-]+$, neither the path nor any argument may contain NUL bytes, and the executable must exist. If any check fails, lvmsync returns an error.

Example using the provided scripts:

lvmsync --source-type raw \
  --fs-freeze-command "$(pwd)/docs/fsfreeze-freeze.sh /mnt" \
  --fs-thaw-command "$(pwd)/docs/fsfreeze-thaw.sh /mnt" \
  /dev/sdb /tmp/dump

docs/fsfreeze-freeze.sh and docs/fsfreeze-thaw.sh demonstrate basic freeze and thaw operations. Because command paths must be absolute, reference the scripts by their full path as shown above rather than relying on $PATH.

Configuration sources and precedence

LVMSync uses pflag and viper so every option can be set via flags, environment variables, or the config.yaml file. Values are resolved with the following precedence (highest first):

Command-line flags
LVMSYNC_* environment variables
config.yaml
Built-in defaults

This precedence applies to duration timeouts, filesystem paths, and security flags. Unknown keys in config.yaml generate warnings, and settings like allow_insecure must be explicitly acknowledged via the flag.

Environment variables use the flag name in uppercase with underscores, e.g.:

export LVMSYNC_PARALLEL=8
export LVMSYNC_SSH_USER=backup

For example, if config.yaml sets dedup_strategy: bloom and the environment specifies LVMSYNC_DEDUP_STRATEGY=checksum, running lvmsync --dedup-strategy auto resolves to auto. Likewise, --transport quic overrides LVMSYNC_TRANSPORT_TRANSPORT=ssh and the transport key in config.yaml.

--sanitize-env and filesystem freeze/thaw commands follow the same precedence. With a config.yaml containing:

fs-freeze-command: "/usr/sbin/fsfreeze -f '/mnt/yaml dir'"
fs-thaw-command: "/usr/sbin/fsfreeze -u '/mnt/yaml dir'"
sanitize_env: false

and these environment variables:

export LVMSYNC_FS_FREEZE_COMMAND="/usr/sbin/fsfreeze -f '/mnt/env dir'"
export LVMSYNC_FS_THAW_COMMAND="/usr/sbin/fsfreeze -u '/mnt/env dir'"
export LVMSYNC_SANITIZE_ENV=0

Running:

lvmsync --fs-freeze-command "/usr/sbin/fsfreeze -f '/mnt/flag dir'" \
        --fs-thaw-command "/usr/sbin/fsfreeze -u '/mnt/flag dir'" \
        --sanitize-env run /dev/sdb /tmp/dump

uses the flag-supplied command paths and enables sanitization.

For boolean options, the same precedence applies. If config.yaml specifies check_partition: true but LVMSYNC_CHECK_PARTITION=false is set, lvmsync --check-partition enables the check. Omitting the flag leaves partition checks disabled because the environment value overrides the YAML configuration.

With a config.yaml containing:

parallel: 4

running LVMSYNC_PARALLEL=8 lvmsync run --parallel 16 results in parallel=16 because flags override environment variables, which override the config file.

For a boolean option:

dry_run: true

With this setting, running LVMSYNC_DRY_RUN=true lvmsync run --dry-run=false src dst performs a real transfer because the --dry-run flag overrides both the environment variable and the config file. Likewise, LVMSYNC_DRY_RUN=true lvmsync verify --dry-run=false src dst performs a full verification for the same reason.

A similar hierarchy applies to duration values:

retry_delay: 1s

Running LVMSYNC_RETRY_DELAY=2s lvmsync run --retry-delay 3s uses a retry delay of 3s.

Environment variables for the lvmsync daemon use the LVMSYNC_DAEMON_ prefix. Multi-value settings are comma-separated:

LVMSYNC_DAEMON_LISTEN=unix:///run/lvmsyncd.sock,tcp+tls://:9000 lvmsyncd

Grouped options use dedicated prefixes such as LVMSYNC_DEDUP_, LVMSYNC_COMPRESSION_, LVMSYNC_TRANSPORT_, LVMSYNC_LVM_, and LVMSYNC_DAEMON_. For example:

LVMSYNC_LVM_SNAPSHOT_SIZE=25% lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

Option reference

Flags override environment variables, which override config.yaml values.

Flag	Environment variable	Config key	Description
`--config`	`LVMSYNC_CONFIG`	`config`	Path to config YAML file
`--stdout`	`LVMSYNC_STDOUT`	`stdout`	Write change dump to STDOUT (prompts when TTY, requires `--yes-i-know` otherwise)
`--strict-config`	`LVMSYNC_STRICT_CONFIG`	`strict-config`	Treat configuration warnings as errors
`--yes-i-know`	`LVMSYNC_YES_I_KNOW`	`yes_i_know`	Confirm destructive write operations in non-interactive sessions
`--source-type`	`LVMSYNC_SOURCE_TYPE`	`source-type`	Source device type: `auto`, `file`, `raw`, or `lvm`
`--dest-type`	`LVMSYNC_DEST_TYPE`	`dest-type`	Destination device type: `auto`, `file`, `raw`, or `lvm`
`--offline`	`LVMSYNC_OFFLINE`	`offline`	Assume source raw device is offline
`--fs-freeze-command`	`LVMSYNC_FS_FREEZE_COMMAND`	`fs-freeze-command`	Command to freeze filesystem before reading raw source; path must be absolute, arguments are split with shell-style quoting and executable name must match `^[a-zA-Z0-9._-]+$`
`--fs-thaw-command`	`LVMSYNC_FS_THAW_COMMAND`	`fs-thaw-command`	Command to thaw filesystem after reading raw source; path must be absolute, arguments are split with shell-style quoting and executable name must match `^[a-zA-Z0-9._-]+$`
`--freeze-timeout`	`LVMSYNC_FREEZE_TIMEOUT`	`freeze-timeout`	Timeout for filesystem freeze command
`--thaw-timeout`	`LVMSYNC_THAW_TIMEOUT`	`thaw-timeout`	Timeout for filesystem thaw command
`--mode`	`LVMSYNC_MODE`	`mode`	Configuration preset: `default` or `throughput`; unknown modes fail validation
`--parallel`	`LVMSYNC_PARALLEL`	`parallel`	Number of concurrent workers
`--concurrency`	`LVMSYNC_TRANSPORT_CONCURRENCY`	`concurrency`	Stream concurrency (0 to autotune based on BDP)
`--zerocopy`	`LVMSYNC_ZEROCOPY`	`zerocopy`	Enable zero-copy transfers
`--odirect`	`LVMSYNC_ODIRECT`	`odirect`	Use O_DIRECT for device I/O when possible
`--numa-pin`	`LVMSYNC_NUMA_PIN`	`numa_pin`	Pin worker goroutines to device NUMA node; logs a warning and continues if NUMA data is missing
`--numa-node`	`LVMSYNC_NUMA_NODE`	`numa_node`	Pin worker goroutines to specified NUMA node, overriding automatic detection
`--max-retries`	`LVMSYNC_MAX_RETRIES`	`max_retries`	Maximum number of retries per block
`--retry-delay`	`LVMSYNC_RETRY_DELAY`	`retry_delay`	Initial delay between retries
`--resume`	`LVMSYNC_RESUME`	`resume`	Path to resume state file (verification runs unless `--verify=none`)
`--verify-only`	`LVMSYNC_VERIFY_ONLY`	`verify_only`	Verify destination against source without writing data
`--speed`	`LVMSYNC_SPEED`	`speed`	Transfer speed limit
`--sync-interval`	`LVMSYNC_SYNC_INTERVAL`	`sync_interval`	Bytes between fdatasync calls (accepts size suffixes like `64KB`; invalid values error)
`--checkpoint-bytes`	`LVMSYNC_CHECKPOINT_BYTES`	`checkpoint_bytes`	Bytes between resume checkpoints
`--checkpoint-interval`	`LVMSYNC_CHECKPOINT_INTERVAL`	`checkpoint_interval`	Duration between checkpoints
`--block-size`	`LVMSYNC_BLOCK_SIZE`	`block_size`	Block size for data transfer; specify 'auto' or 0 for automatic detection
`--verbose`	`LVMSYNC_VERBOSE`	`verbose`	Verbosity level
`--verify-checksum`	`LVMSYNC_VERIFY_CHECKSUM`	`verify_checksum`	Enable checksum verification
`--verify`	`LVMSYNC_VERIFY`	`verify`	Verification level: `inline`, `post`, or `none`
`--digest`	`LVMSYNC_DIGEST`	`digest`	Digest algorithm: `auto`, `blake3`, or `sha256` (`auto` selects `blake3` when AVX2, AVX-512, or NEON is available, otherwise `sha256`)
`--progress`	`LVMSYNC_PROGRESS`	`progress`	Show progress during transfer
`--output`	`LVMSYNC_OUTPUT`	`output`	Output format: `text`, `json`, or `yaml`
`--delta`	`LVMSYNC_DELTA`	`delta`	Delta algorithm: `none` or `rsync`
`--manifest-path`	`LVMSYNC_MANIFEST_PATH`	`manifest_path`	Path to manifest file
`--manifest-progress-interval`	`LVMSYNC_MANIFEST_PROGRESS_INTERVAL`	`manifest_progress_interval`	Interval between progress logs during manifest rebuild
`--manifest-timeout`	`LVMSYNC_MANIFEST_TIMEOUT`	`manifest_timeout`	Timeout for manifest rebuild (0 disables)
`--manifest-allow-mounted`	`LVMSYNC_MANIFEST_ALLOW_MOUNTED`	`manifest_allow_mounted`	Allow rebuilding when device is mounted read-write
`--ssh-host`	`LVMSYNC_SSH_HOST`	`ssh_host`	SSH host
`--ssh-user`	`LVMSYNC_SSH_USER`	`ssh_user`	SSH username
`--ssh-key`	`LVMSYNC_SSH_KEY`	`ssh_key`	Path to SSH private key
`--ssh-host-key-path`	`LVMSYNC_SSH_HOST_KEY_PATH`	`ssh_host_key_path`	Path to SSH host private key
`--ssh-agent`	`LVMSYNC_SSH_AGENT`	`ssh_agent`	Use SSH agent for authentication
`--ssh-port`	`LVMSYNC_SSH_PORT`	`ssh_port`	SSH port
`--ssh-timeout`	`LVMSYNC_SSH_TIMEOUT`	`ssh_timeout`	SSH connection timeout
`--ssh-keepalive`	`LVMSYNC_SSH_KEEPALIVE`	`ssh_keepalive`	SSH keepalive interval
`--ssh-host-key`	`LVMSYNC_SSH_HOST_KEY`	`ssh_host_key`	Expected SSH host public key
`--known-hosts`	`LVMSYNC_KNOWN_HOSTS`	`known_hosts`	Path to known_hosts file
`--strict-host-key-checking`	`LVMSYNC_STRICT_HOST_KEY_CHECKING`	`strict_host_key_checking`	Require host keys to be present in `known_hosts`; when `false`, host key verification is disabled
`--lvmsync-path`	`LVMSYNC_LVMSYNC_PATH`	`lvmsync_path`	Remote command to run (basename sanitized; only `[a-zA-Z0-9._-]+` allowed)
`--remote-pre-script`	`LVMSYNC_REMOTE_PRE_SCRIPT`	`remote_pre_script`	Remote script to run before transfer (times out after `ssh_timeout`)
`--remote-post-script`	`LVMSYNC_REMOTE_POST_SCRIPT`	`remote_post_script`	Remote script to run after transfer (separate `ssh_timeout`)
`--dedup-strategy`	`LVMSYNC_DEDUP_STRATEGY`	`dedup_strategy`	Deduplication strategy: `none`, `auto`, `checksum`, `rolling_hash`, or `bloom`
`--dedup-state-file`	`LVMSYNC_DEDUP_STATE_FILE`	`dedup_state_file`	Path to deduplication state file
`--intra-dedup`	`LVMSYNC_DEDUP_INTRA_DEDUP`	`intra_dedup`	Enable intra-run deduplication
`--cdc-min`	`LVMSYNC_DEDUP_CDC_MIN`	`cdc_min`	Minimum chunk size for CDC (must be at least 64 bytes)
`--cdc-avg`	`LVMSYNC_DEDUP_CDC_AVG`	`cdc_avg`	Target average chunk size for CDC
`--cdc-max`	`LVMSYNC_DEDUP_CDC_MAX`	`cdc_max`	Maximum chunk size for CDC
`--chunk-seed`	`LVMSYNC_DEDUP_CHUNK_SEED`	`chunk_seed`	Seed for chunking
`--bloom-entries`	`LVMSYNC_DEDUP_BLOOM_ENTRIES`	`bloom_entries`	Estimated number of entries for bloom filter
`--bloom-fp-rate`	`LVMSYNC_DEDUP_BLOOM_FP_RATE`	`bloom_fp_rate`	False positive rate for bloom filter
`--bloom-mbits`	`LVMSYNC_DEDUP_BLOOM_MBITS`	`bloom_mbits`	Bloom filter m bits power
`--compress`	`LVMSYNC_COMPRESSION_COMPRESS`	`compress`	Compression type: `none`, `lz4`, `zstd`, or `auto`
`--zstd-level`	`LVMSYNC_COMPRESSION_ZSTD_LEVEL`	`zstd_level`	Zstd compression level (`1-5`)
`--lz4-level`	`LVMSYNC_COMPRESSION_LZ4_LEVEL`	`lz4_level`	LZ4 compression level: `fast` or `hc`
`--compress-concurrency`	`LVMSYNC_COMPRESSION_COMPRESS_CONCURRENCY`	`compress_concurrency`	Compression concurrency (0 to use `GOMAXPROCS`)
`--compress-threshold`	`LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD`	`compress_threshold`	Skip compression when estimated ratio exceeds this value
`--skip-snapshot-creation`	`LVMSYNC_SKIP_SNAPSHOT_CREATION`	`skip_snapshot_creation`	Skip automatic snapshot creation (requires `--force`)
`--skip-disk-check`	`LVMSYNC_SKIP_DISK_CHECK`	`skip_disk_check`	Skip disk space check before snapshot creation
`--snapshot-size`	`LVMSYNC_SNAPSHOT_SIZE`	`snapshot_size`	Snapshot size (e.g., `20G` or `20%`)
`--snapshot-max-usage`	`LVMSYNC_SNAPSHOT_MAX_USAGE`	`snapshot_max_usage`	Maximum allowed snapshot usage percent before aborting
`--lvm-escalation`	`LVMSYNC_LVM_ESCALATION`	`lvm_escalation`	Command used to escalate privileges for LVM commands; parsed with shell-style quoting and validated at startup
`--sanitize-env`	`LVMSYNC_SANITIZE_ENV`	`sanitize_env`	Drop dangerous variables like `LD_PRELOAD` and remove `PATH`/`LANG` during escalation (disabled by default)
`--no-new-privs`	`LVMSYNC_NO_NEW_PRIVS`	`no_new_privs`	Set `PR_SET_NO_NEW_PRIVS` before invoking `sudo`
`--lvm-timeout`	`LVMSYNC_LVM_TIMEOUT`	`lvm_timeout`	Timeout for LVM operations and privilege checks
`--sig-cache-ttl`	`LVMSYNC_LVM_SIG_CACHE_TTL`	`sig-cache-ttl`	TTL for cached LVM signatures
`--sig-cache-max`	`LVMSYNC_LVM_SIG_CACHE_MAX`	`sig-cache-max`	Maximum cached LVM signatures
`--volume-group`	`LVMSYNC_VOLUME_GROUP`	`volume_group`	Source volume group; derived from the source device path when empty
`--target-volume-group`	`LVMSYNC_TARGET_VOLUME_GROUP`	`target_volume_group`	Volume group name of the target LVM volume
`--target-vgs`	`LVMSYNC_TARGET_VGS`	`target_vgs`	Candidate target volume groups for auto-selection
`--create-dest-lv`	`LVMSYNC_CREATE_DEST_LV`	`create_dest_lv`	Create destination logical volume when missing (requires `--force` or confirmation)
`--force`	`LVMSYNC_FORCE`	`force`	Override safety checks and proceed on mounted destination
`--force-offline`	`LVMSYNC_FORCE_OFFLINE`	`force_offline`	Allow direct device writes; prompts for `double-confirm` when interactive (requires `--yes-i-know` otherwise)
`--allow-overwrite`	`LVMSYNC_ALLOW_OVERWRITE`	`allow_overwrite`	Allow overwriting existing data; requires `--yes-i-know` for non-interactive sessions
`--check-partition`	`LVMSYNC_CHECK_PARTITION`	`check_partition`	Verify partition signatures for source and destination
`--discard`	`LVMSYNC_DISCARD`	`discard`	Issue BLKDISCARD before writing blocks and verify discarded regions
`--dry-run`	`LVMSYNC_DRY_RUN`	`dry_run`	Log estimated transfer bytes without sending data; uses manifest sampling when available
`--enable-quic`	`LVMSYNC_ENABLE_QUIC`	`enable_quic`	Enable QUIC transport registration
`--plan`	`LVMSYNC_PLAN`	`plan`	Print configuration plan as JSON and exit
`--verify-only`	`LVMSYNC_VERIFY_ONLY`	`verify_only`	Read source and destination and report mismatches without writing data
`--probe-only`	`LVMSYNC_PROBE_ONLY`	`probe_only`	Validate devices and privileges and print `size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch` without transferring data
`--sparse`	`LVMSYNC_SPARSE`	`sparse`	Sparse file handling: `auto` punches holes, `never` writes zero blocks
`--transport`	`LVMSYNC_TRANSPORT_TRANSPORT`	`transport`	Ordered transports to try (e.g., `ssh,tcp+tls,h2,quic`)
`--tcp-port`	`LVMSYNC_TRANSPORT_TCP_PORT`	`tcp_port`	TCP+TLS port
`--tcp-parallel`	`LVMSYNC_TRANSPORT_TCP_PARALLEL`	`tcp_parallel`	Number of parallel TCP connections
`--tcp-lowat`	`LVMSYNC_TRANSPORT_TCP_LOWAT`	`tcp_lowat`	TCP_NOTSENT_LOWAT in bytes
`--client-cert`	`LVMSYNC_CLIENT_CERT`	`client_cert`	Client TLS certificate file
`--client-key`	`LVMSYNC_CLIENT_KEY`	`client_key`	Client TLS key file
`--ca-cert`	`LVMSYNC_CA_CERT`	`ca_cert`	CA certificate file
`--allow-insecure`	-	`allow_insecure`	Allow insecure (no TLS)

If --ssh-key is empty, lvmsync contacts the SSH agent referenced by SSH_AUTH_SOCK. The agent connection uses --ssh-timeout as its deadline. SSH transport negotiation also derives read and write deadlines from the caller's context; when the context expires, the handshake fails quickly and deadlines are cleared afterward.

Common deployment scenarios

Local disk to disk:

lvmsync run /dev/vg0/source /dev/vg0/backup

Remote over SSH:

lvmsync run /dev/vg0/source user@backup:/dev/vg1/target --ssh-key ~/.ssh/id_ed25519

Rsync delta with dedup and compression:
```
lvmsync run --delta=rsync --dedup-strategy bloom --compress lz4 --transport rsync --allow-insecure --dry-run /tmp/src /tmp/dst
```
The rsync transport is excluded from binaries by default. Compile with go build -tags rsync (and run tests with go test -tags rsync) to enable it.

Throughput-optimized transfer:

lvmsync run --mode throughput /dev/vg0/source /dev/vg1/target

lvmsyncd Examples

lvmsyncd exposes replication endpoints with the --listen flag. Each URI scheme selects a transport and optional parameters configure authentication.

Start a TLS/TCP listener:

lvmsyncd --listen tcp+tls://:9443 --server-cert server.pem --server-key server.key --client-cert client.pem --client-key client.key --ca-cert ca.pem

Activate an SSH listener:

lvmsyncd --listen ssh://:2222 --ssh-host-key-path host_key

Both transports require explicit keys; the daemon exits if any are missing and never generates self-signed certificates. Use --allow-insecure only for development.

`config.yaml` example

parallel: 4               # General Options
ssh_host: backup          # SSH Options
ssh_user: backup          # SSH Options
remote_pre_script: pre.sh # Remote Options
dedup_strategy: bloom     # Deduplication Options
compress: auto            # Compression Options
zstd_level: 3             # Compression Options
lz4_level: hc             # Compression Options
compress_threshold: 0.9   # Compression Options
snapshot_size: 20%        # LVM Options
create_dest_lv: false     # LVM Options

Use --config to point to a different file.

Invocation examples

With flags:

lvmsync run --dry-run --parallel 8 \
  --compress auto --zstd-level 3 --lz4-level hc --compress-threshold 0.9 \
  --snapshot-size 10% /dev/vg0/snap0 /mnt/backup

With environment variables:

LVMSYNC_PARALLEL=8 LVMSYNC_SNAPSHOT_SIZE=10% lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

With a config file:

lvmsync run --dry-run --config config.yaml /dev/vg0/snap0 /mnt/backup

Transport Registry

Transport selection is controlled by the --transport flag, which accepts a comma-separated ordered list of transports to attempt (for example ssh,tcp+tls,h2,quic). The quic transport runs over TLS 1.3 with mutual authentication, negotiates the lvmsync ALPN, and exposes both bidirectional streams and datagrams. The h2 transport also requires TLS 1.3 with client certificates and negotiates the h2 ALPN.

Provide certificates via --server-cert, --server-key, --client-cert, --client-key, and --ca-cert. TLS transports require a trusted CA certificate and refuse connections when no roots are provided unless insecure mode is explicitly acknowledged with the --allow-insecure flag. This bypasses certificate verification and is intended for development only; configuration files alone cannot enable it. Client certificates must be supplied explicitly; transports no longer generate self-signed certificates automatically.

The transport documentation covers each option in depth. The flags below configure transport behavior.

Flags and environment variables

| Flag | Environment variable | Description | mTLS |
|------|----------------------|-------------|------|
| --transport | LVMSYNC_TRANSPORT_TRANSPORT | Ordered transports to try (e.g., ssh,tcp+tls,h2,quic) | |
| --concurrency | LVMSYNC_TRANSPORT_CONCURRENCY | Stream concurrency (0 to autotune based on BDP) | |
| --tcp-port | LVMSYNC_TRANSPORT_TCP_PORT | TCP+TLS port | |
| --h2-port | LVMSYNC_H2_PORT | HTTP/2 port | |
| --ssh-port | LVMSYNC_SSH_PORT | SSH port | ❌ |
| --client-cert | LVMSYNC_CLIENT_CERT | Client TLS certificate file | ✅ |
| --client-key | LVMSYNC_CLIENT_KEY | Client TLS key file | ✅ |
| --ca-cert | LVMSYNC_CA_CERT | CA certificate file | ✅ |
| --tcp-parallel | LVMSYNC_TRANSPORT_TCP_PARALLEL | Number of parallel TCP connections | n/a |
| --tcp-lowat | LVMSYNC_TRANSPORT_TCP_LOWAT | TCP_NOTSENT_LOWAT in bytes | n/a |

Usage examples

Multiple transports

lvmsync run --dry-run --transport ssh,tcp+tls,h2,quic --tcp-port 9443 /dev/vg0/snap0 /mnt/backup

QUIC

0-RTT data is disabled by default.

lvmsync run --dry-run --transport quic --client-cert cert.pem --client-key key.pem --ca-cert ca.pem
# or
LVMSYNC_TRANSPORT_TRANSPORT=quic LVMSYNC_CLIENT_CERT=cert.pem LVMSYNC_CLIENT_KEY=key.pem LVMSYNC_CA_CERT=ca.pem lvmsync run --dry-run

TCP+TLS

lvmsync run --dry-run --transport tcp+tls --tcp-port 9443
# or
LVMSYNC_TRANSPORT_TRANSPORT=tcp+tls LVMSYNC_TRANSPORT_TCP_PORT=9443 lvmsync run --dry-run

HTTP/2

lvmsync run --dry-run --transport h2 --h2-port 9443 --client-cert cert.pem --client-key key.pem --ca-cert ca.pem

SSH

lvmsync run --dry-run --transport ssh backup@host:/dev/vg1/target --ssh-port 2222
# or
LVMSYNC_TRANSPORT_TRANSPORT=ssh LVMSYNC_SSH_PORT=2222 lvmsync run --dry-run backup@host:/dev/vg1/target

Hybrid Deduplication and Adaptive Compression

Hybrid dedup combines fixed-size and content-defined chunking. Enable it with --dedup hybrid and tune FastCDC with --cdc-min, --cdc-avg, and --cdc-max.

Flag (`--cdc-*`)	Environment variable	Config key	Description
`--cdc-min`	`LVMSYNC_DEDUP_CDC_MIN`	`cdc_min`	Minimum chunk size (must be at least 64 bytes)
`--cdc-avg`	`LVMSYNC_DEDUP_CDC_AVG`	`cdc_avg`	Target average chunk size
`--cdc-max`	`LVMSYNC_DEDUP_CDC_MAX`	`cdc_max`	Maximum chunk size

The three values must be positive, with --cdc-min at least 64 bytes, and satisfy --cdc-min ≤ --cdc-avg ≤ --cdc-max. LVMSync aborts when the sizes are non-positive, below the minimum, or unordered.
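The ordering rules above can be sketched as a small validation function. This is an illustration only: `validateCDC` and `minCDCChunk` are hypothetical names, not the project's API, and the error wording is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// minCDCChunk mirrors the documented 64-byte floor for --cdc-min.
const minCDCChunk = 64

// validateCDC is a hypothetical helper that applies the documented rules:
// all sizes positive, min at least 64 bytes, and min <= avg <= max.
func validateCDC(minSize, avgSize, maxSize int) error {
	switch {
	case minSize <= 0 || avgSize <= 0 || maxSize <= 0:
		return errors.New("cdc sizes must be positive")
	case minSize < minCDCChunk:
		return fmt.Errorf("cdc-min %d is below the %d-byte minimum", minSize, minCDCChunk)
	case minSize > avgSize || avgSize > maxSize:
		return fmt.Errorf("cdc sizes must satisfy min <= avg <= max, got %d/%d/%d",
			minSize, avgSize, maxSize)
	}
	return nil
}

func main() {
	// The documented defaults pass validation.
	fmt.Println(validateCDC(262144, 1048576, 4194304))
	// A minimum below 64 bytes is rejected.
	fmt.Println(validateCDC(32, 1024, 4096))
}
```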

The Bloom filter de-duplicates previously seen chunks. Size it with --bloom-entries and the desired false positive rate via --bloom-fp-rate. For an mmap-backed index, --bloom-mbits sets the bitmap size as a power of two (2^m bits). The defaults (--bloom-entries=1000000, --bloom-fp-rate=0.01) consume about 1.14 MiB and yield ~1% false positives, while --bloom-mbits=27 allocates 2^27 bits (16 MiB) for ~0.8% false positives. False positives are rare but possible; if a chunk collides in the Bloom filter it is treated as already transferred. A final SHA-256 digest over the transfer detects any mismatches so retries can resend the affected data. The mmap-backed index (*.idx) is truncated to zero on startup so each run begins with a clean bitset.
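The memory figures quoted above follow from the textbook Bloom filter sizing formula m = -n·ln(p)/(ln 2)². The sketch below reproduces that arithmetic; the function names are hypothetical, and whether LVMSync rounds the bit count or picks the hash count this way is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// bloomBits returns the optimal bit count for n entries at false positive
// rate p, using m = -n*ln(p)/(ln 2)^2.
func bloomBits(n uint64, p float64) uint64 {
	return uint64(math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)))
}

// bloomHashes returns the matching hash count k = (m/n)*ln 2, rounded.
func bloomHashes(m, n uint64) int {
	return int(math.Round(float64(m) / float64(n) * math.Ln2))
}

// mib converts a bit count to mebibytes.
func mib(bits uint64) float64 {
	return float64(bits) / 8 / (1 << 20)
}

func main() {
	const entries = 1_000_000
	m := bloomBits(entries, 0.01)
	fmt.Printf("entries=%d fp=0.01: %d bits (%.2f MiB), k=%d\n",
		entries, m, mib(m), bloomHashes(m, entries))
	// A bitmap of 2^27 bits corresponds to --bloom-mbits=27.
	fmt.Printf("2^27 bits: %.0f MiB\n", mib(1<<27))
}
```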

Compression samples 8 KiB from each chunk and skips compression for that chunk when the estimated ratio exceeds --compress-threshold. --compress auto selects Zstd when AVX2 or NEON is available, falling back to LZ4 otherwise.

CLI:

lvmsync run --dry-run --dedup hybrid --cdc-min 262144 --cdc-avg 1048576 --cdc-max 4194304 /dev/vg0/snap0 /mnt/backup

Environment:

LVMSYNC_DEDUP=hybrid \
LVMSYNC_DEDUP_CDC_MIN=262144 \
LVMSYNC_DEDUP_CDC_AVG=1048576 \
LVMSYNC_DEDUP_CDC_MAX=4194304 \
lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

YAML:

dedup: hybrid
cdc_min: 262144
cdc_avg: 1048576
cdc_max: 4194304

Dedup configuration

The dedup package exposes a LoadConfig helper that reads tuning parameters from flags, LVMSYNC_* environment variables, or keys in a YAML file. Values are resolved with the following precedence (highest first):

Command-line flags
LVMSYNC_* environment variables
config.yaml
Built-in defaults
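The precedence order above amounts to a first-match lookup across layers. The following sketch shows the idea with plain maps; it does not use the real LoadConfig helper, and the `source` type, `resolve` function, and sample values are all hypothetical.

```go
package main

import "fmt"

// source is one configuration layer; name is used only for reporting.
type source struct {
	name   string
	values map[string]string
}

// resolve walks the layers from highest to lowest precedence and returns
// the first value found for key.
func resolve(key string, layers []source) (value, from string, ok bool) {
	for _, l := range layers {
		if v, found := l.values[key]; found {
			return v, l.name, true
		}
	}
	return "", "", false
}

// exampleLayers lists made-up inputs in the documented order:
// flags, environment, config.yaml, built-in defaults.
func exampleLayers() []source {
	return []source{
		{"flag", map[string]string{"ram_bytes": "536870912"}},
		{"env", map[string]string{"ram_bytes": "268435456", "false_positive_rate": "0.01"}},
		{"config.yaml", map[string]string{"false_positive_rate": "0.05", "min_chunk_size": "8192"}},
		{"default", map[string]string{
			"min_chunk_size": "4096", "max_chunk_size": "1048576",
			"false_positive_rate": "0.001", "ram_bytes": "1073741824",
		}},
	}
}

func main() {
	keys := []string{"ram_bytes", "false_positive_rate", "min_chunk_size", "max_chunk_size"}
	for _, key := range keys {
		v, from, _ := resolve(key, exampleLayers())
		fmt.Printf("%s=%s (from %s)\n", key, v, from)
	}
}
```

Each key is reported with the layer that supplied it, which makes it easy to see a flag shadowing an environment variable or a config file entry shadowing a default.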

Flag	Environment variable	Config key	Description	Default
`--min-chunk-size`	`LVMSYNC_MIN_CHUNK_SIZE`	`min_chunk_size`	Minimum chunk size in bytes	`4096`
`--max-chunk-size`	`LVMSYNC_MAX_CHUNK_SIZE`	`max_chunk_size`	Maximum chunk size in bytes	`1048576`
`--false-positive-rate`	`LVMSYNC_FALSE_POSITIVE_RATE`	`false_positive_rate`	Bloom filter false positive rate	`0.001`
`--ram-bytes`	`LVMSYNC_RAM_BYTES`	`ram_bytes`	RAM budget for the Bloom filter	`1073741824`
`--volume-size`	`LVMSYNC_VOLUME_SIZE`	`volume_size`	Size of the volume being processed	`0`
`--hash-key`	`LVMSYNC_HASH_KEY`	`hash_key`	Optional hex-encoded key for BLAKE3 hashing	`""`

Two presets are available via --mode: default and throughput. Any other value causes configuration validation to fail.

Throughput Mode Presets

--mode throughput applies a set of options tuned for high-bandwidth links:

transport order ssh,tcp+tls,h2,quic
concurrency 8
deduplication mode hybrid
compression auto
enables --odirect

CLI:

lvmsync run --dry-run --mode throughput /dev/vg0/snap0 /mnt/backup

Environment:

LVMSYNC_MODE=throughput lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

YAML:

mode: throughput

Logging and progress

Logs are emitted with zap to stderr. When --output=text (default), progress updates are logged to stderr when --progress is enabled. Set --output=json to emit progress events on stdout as line-delimited JSON objects while preserving the regular logs on stderr. The verify subcommand and --verify-only mode support --output=json or --output=yaml to write structured verification results to stdout. Each progress object follows this schema:

{"event": "progress", "bytes_transferred": 12345, "bytes_total": 67890, "progress_percent": 18.2}

A final object with {"event": "complete", "progress_percent": 100} marks completion. Disable progress entirely with --progress=false.

Installation

Requirements

Go 1.22+
Linux only (tested on amd64 and arm64 architectures)
pkg-config
LVM2 with development headers providing liblvm2cmd (liblvm2-dev)
- A recent LVM2 release providing the modern liblvm2cmd API (e.g., 2.03.21+) is required.
SSH client & server (for remote transfers)

Installing LVM2 Development Headers

CGO uses pkg-config to locate the LVM2 and device-mapper libraries. Install the development headers and pkg-config package on your system:

# Debian/Ubuntu
sudo apt install -y lvm2 liblvm2-dev pkg-config

# RHEL/CentOS
sudo yum install -y lvm2-devel pkgconfig

If the .pc files are installed in a non-standard location, set PKG_CONFIG_PATH so that pkg-config can find them.

Build

Clone the repository and build the binary using Go modules with CGO enabled. A helper target checks for the required native libraries and pkg-config:

make deps  # verify pkg-config, device-mapper, and LVM2 headers

The make build target runs this check automatically.

Then build the binaries:

git clone https://github.com/oferchen/lvmsync_go.git
cd lvmsync_go
go mod tidy
CGO_ENABLED=1 go build -o lvmsync .

To build on systems without LVM2, disable CGO. This uses stub implementations and omits LVM features:

CGO_ENABLED=0 go build -o lvmsync .

Makefile

make build   # build binaries
make test    # run tests

Usage

Show Version

lvmsync --version

Outputs Version Commit BuildDate, for example:

v0.2.0 abcdef1 2024-01-02T15:04:05Z

Basic Syntax

lvmsync run [--dry-run] [--transport ssh,tcp+tls,h2,quic] <snapshot|lvm device> <destination>

The tool supports both local and remote transfers. Use --dry-run to print planned actions without executing and --transport to provide an ordered list of transports to try.

Resume, Manifest, and Verify

Run an initial transfer and write a manifest for later verification or incremental runs:

lvmsync run --dry-run --manifest-path snapshot.manifest /dev/vg0/source /dev/vg1/target

See the manifest documentation for details on the binary format and rebuild options.

Resume an interrupted transfer using a checkpointed state file:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Rebuild a manifest index for an existing device. The command verifies the current manifest and rewrites it when digests or device metadata have changed:

lvmsync manifest rebuild /dev/vg0/lv0

Progress logs are emitted every 10s by default; adjust with --manifest-progress-interval. The command times out after 1m unless overridden with --manifest-timeout (0 disables). Rebuild refuses to run if the device is mounted read-write; pass --manifest-allow-mounted to override. Mount detection parses /proc/self/mountinfo using github.com/moby/sys/mountinfo, correctly handling bind mounts, repeated entries, and devices with spaces or special characters. Rebuild fails if the device reports a block size of 0.

Manifests embed a persistent device identifier in a fixed 64-byte field. The manifest rebuild command fails if the identifier exceeds this limit.

Verify that a source and destination match:

lvmsync verify /dev/vg0/source /dev/vg1/target

Supply options such as block size or deduplication mode to control how data is compared. For example, to estimate verification without reading data:

lvmsync verify --dry-run /dev/vg0/source /dev/vg1/target

To verify using 4 KiB blocks and a manifest generated earlier:

lvmsync verify --block-size 4K --manifest-path snapshot.manifest /dev/vg0/source /dev/vg1/target

Flags are parsed via Viper, so the same settings can be provided through LVMSYNC_* environment variables or a config.yaml file.

Options

General Options

Option	Description	Default
`--config`	Path to a YAML configuration file	`""`
`--parallel`	Number of concurrent workers	`4`
`--zerocopy`	Enable zero-copy transfers (only used in sequential mode)	`false`
`--max-retries`	Maximum number of retries per block	`3`
`--retry-delay`	Initial delay between retries	`100ms`
`--resume`	Path to resume state file (verification runs unless `--verify=none`)	`""`

SSH Options

Option	Description	Default
`--ssh-host`	SSH host	`"localhost"`
`--ssh-user`	SSH username	`"root"`
`--ssh-key`	Path to SSH private key	`""`
`--ssh-host-key-path`	Path to SSH host private key (generates one if empty)	`""`
`--ssh-agent`	Use the SSH agent for authentication	`false`
`--ssh-port`	SSH port number	`22`
`--known-hosts`	Path to known_hosts file (defaults to `$HOME/.ssh/known_hosts`)	`$HOME/.ssh/known_hosts`
`--strict-host-key-checking`	Require host keys to be present in `known_hosts`; when `false`, host key verification is disabled	`true`
`--ssh-host-key`	Expected SSH host public key (authorized_keys format)	`""`

Unknown hosts are rejected unless their keys are present in known_hosts or match --ssh-host-key. The host key can also be supplied via LVMSYNC_SSH_HOST_KEY_PATH or the ssh_host_key_path YAML option; precedence follows flags over environment variables over YAML.

Programmatic use of the SSH transport requires a configuration populated with fields like SSHUser, SSHKeyPath, HostKeyPath, SSHUseAgent, SSHPort, KnownHosts, StrictHostKeyCheck, SSHTimeout, SSHKeepAliveInterval, and MaxRetries. The constructor also requires a *zap.Logger:

logger, _ := zap.NewProduction()
defer logger.Sync() // flush buffered log entries on exit
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
tr, err := ssh.New(ctx, cfg)
if err != nil {
    // handle error
}
_ = tr

Remote Options

The --lvmsync-path value is sanitized to its basename and must match [a-zA-Z0-9._-]+ to prevent shell injection.
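The sanitization rule above can be illustrated as follows. This is a sketch of the described behavior, not the project's code: `sanitizeRemoteCommand` is a hypothetical name, and the extra rejection of `.` and `..` is an assumption added here because both would otherwise match the character class.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// safeName is the documented allow-list for the remote command basename.
var safeName = regexp.MustCompile(`^[a-zA-Z0-9._-]+$`)

// sanitizeRemoteCommand reduces p to its basename and rejects anything
// outside the allow-list.
func sanitizeRemoteCommand(p string) (string, error) {
	base := path.Base(p)
	// Assumption of this sketch: "." and ".." are not usable command names.
	if base == "." || base == ".." || !safeName.MatchString(base) {
		return "", fmt.Errorf("unsafe remote command name %q", base)
	}
	return base, nil
}

func main() {
	for _, p := range []string{"/usr/bin/lvmsync", "lvmsync;id", ""} {
		name, err := sanitizeRemoteCommand(p)
		fmt.Printf("%q -> %q, err=%v\n", p, name, err)
	}
}
```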

Option	Description	Default
`--lvmsync-path`	Remote command to run (sanitized basename)	`"lvmsync"`
`--remote-pre-script`	Remote script to run before starting the transfer (`ssh_timeout` applies)	`""`
`--remote-post-script`	Remote script to run after finishing the transfer (uses a fresh `ssh_timeout`)	`""`

Each script is bounded by the configured ssh_timeout. The post script gets its own fresh timeout and still runs when the main transfer fails; timeouts and cancellations are reported separately.

Deduplication Options

Option	Description	Default
`--dedup`	Deduplication mode ("fixed", "cdc", or "hybrid")	"fixed"
`--cdc-min`	Minimum chunk size for CDC	262144
`--cdc-avg`	Average chunk size for CDC	1048576
`--cdc-max`	Maximum chunk size for CDC	4194304
`--dedup-strategy`	Deduplication strategy ("none", "auto", "checksum", "rolling_hash", or "bloom"); use `none` to disable	"none"
`--dedup-state-file`	Path to deduplication state file	~/.lvmsync_dedup
`--bloom-entries`	Estimated number of entries for bloom filter	1000000
`--bloom-fp-rate`	False positive rate for bloom filter	0.01
`--bloom-mbits`	Bloom filter bitmap size as a power of two, 2^m bits (mmap index)	0

Compression Options

Option	Description	Default
`--compress`	Compression type (options: `"none"`, `"lz4"`, `"zstd"`, `"auto"`)	`"auto"`
`--zstd-level`	Zstd compression level (`1-5`)	`1`
`--lz4-level`	LZ4 compression level: `fast` or `hc`	`fast`
`--compress-concurrency`	Number of goroutines used for compression (`0` to use all cores)	`0`
`--compress-threshold`	Skip compression when estimated ratio exceeds this value	`0.9`

LVM Options

Option	Description	Default
`--skip-snapshot-creation`	Skip automatic snapshot creation	`false`
`--skip-disk-check`	Skip disk space check before snapshot creation	`false`
`--snapshot-size`	Snapshot size as an absolute value (e.g., "20G") or as a percentage (e.g., "20%")	"20%"
`--snapshot-max-usage`	Maximum allowed snapshot usage percent before aborting	`80`
`--volume-group`	Source volume group. Derived from the source device path when empty	""
`--target-volume-group`	Volume group name of the target LVM volume	""
`--target-vgs`	Candidate target volume groups for auto-selection	[]
`--lvm-escalation`	Command used to re-execute the program with elevated privileges when not running as root (e.g., `sudo -p "my prompt" -n`); parsed with shell-style quoting and validated at startup	"sudo -n"
`--lvm-timeout`	Timeout for LVM operations and privilege checks	10s
`--sig-cache-ttl`	TTL for cached LVM signatures	24h
`--sig-cache-max`	Maximum cached LVM signatures	128

The --lvm-escalation value must be a single executable and its arguments; pipes, redirects, and other shell operators are rejected. Quote any paths or argument values containing spaces, for example:

--lvm-escalation '"/usr/bin/sudo wrapper" -p "no password" -n'

lvm_timeout also bounds the startup privilege check to avoid hanging when escalation commands stall.
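The --lvm-escalation parsing rules (shell-style quoting, no shell operators) can be sketched with a small tokenizer. This is a simplified stand-in, not the parser LVMSync uses: `splitCommand` is a hypothetical name, backslash escapes are not handled, and the exact set of rejected characters is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitCommand splits s into an executable and its arguments, honoring
// single and double quotes and rejecting unquoted shell operators.
func splitCommand(s string) ([]string, error) {
	var args []string
	var cur strings.Builder
	var quote rune // 0 when outside quotes, otherwise the opening quote
	inArg := false
	for _, r := range s {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0
			} else {
				cur.WriteRune(r)
			}
		case r == '\'' || r == '"':
			quote = r
			inArg = true
		case r == ' ' || r == '\t':
			if inArg {
				args = append(args, cur.String())
				cur.Reset()
				inArg = false
			}
		case strings.ContainsRune("|&;<>`$()", r):
			return nil, fmt.Errorf("shell operator %q is not allowed", r)
		default:
			cur.WriteRune(r)
			inArg = true
		}
	}
	if quote != 0 {
		return nil, errors.New("unterminated quote")
	}
	if inArg {
		args = append(args, cur.String())
	}
	if len(args) == 0 {
		return nil, errors.New("empty escalation command")
	}
	return args, nil
}

func main() {
	args, err := splitCommand(`"/usr/bin/sudo wrapper" -p "no password" -n`)
	fmt.Printf("%q %v\n", args, err)
	_, err = splitCommand("sudo -n | cat")
	fmt.Println(err)
}
```

Passing the resulting slice directly to an exec-style call, rather than to a shell, is what keeps operators from being interpreted.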

The client aborts dialing if a connection cannot be established within

Option	Description	Default
`--keepalive-time`	Interval between server pings	`2m`
`--keepalive-timeout`	Timeout waiting for keepalive ack	`20s`
`--request-timeout`	Deadline for unary RPCs	`15s`
`--client-cert`	Client TLS certificate file	`""`
`--client-key`	Client TLS key file	`""`
`--ca-cert`	CA certificate file	`""`
`--allow-insecure`	Allow insecure (disable TLS)	`false`

Examples

Local Transfer

Transfer changes from a snapshot to a destination device locally:

lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

Remote Transfer

Replicate data to a remote host. The destination must be specified in host:device format (optionally including a username, e.g., user@host:/dev/vg0/data):

lvmsync run --dry-run /dev/vg0/snap0 user@remote:/dev/vg0/data

Using Compression

Estimate the compression ratio from a sample of each chunk and compress only when it is worthwhile.

CLI:

lvmsync run --dry-run --compress auto --zstd-level 2 --compress-threshold 0.85 /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_COMPRESSION_COMPRESS=auto \
LVMSYNC_COMPRESSION_ZSTD_LEVEL=2 \
LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD=0.85 \
lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

compress: auto
zstd_level: 2
compress_threshold: 0.85

Rate Limiting

Transfers can be throttled using a token bucket accurate to ±3% of the target. Each writer has its own limiter, so multiple transfers with different limits run independently. Limit the transfer speed to 50MB/s:

lvmsync run --dry-run --speed 50MB /dev/vg0/snap0 /dev/vg0/data

Resuming a Transfer

Resume an interrupted transfer using a resume state file. The file records the last chunk boundaries and digests for fixed, CDC, and hybrid modes. Progress is checkpointed every --checkpoint-bytes or --checkpoint-interval, and the resume file is removed on successful completion. Changing the transport, compression, checksum algorithm, or dedup mode invalidates the checkpoint:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Full LVM Operation Example

lvmsync run --dry-run --skip-disk-check=false --snapshot-size "25%" --volume-group "vg_data" --lvm-escalation "sudo -n" /dev/vg_data/original /dev/vg_data/destination

In this example, LVMSync will:

Validate that the volume group vg_data exists.
Create a snapshot of /dev/vg_data/original sized at 25% of the original volume.
- Automatically re-exec with sudo -n if not running as root.
Monitor snapshot usage (failing fast if usage exceeds 80% by default; configurable via --snapshot-max-usage).
Perform the block-level transfer.
Remove the snapshot upon completion.
Clean up gracefully if interrupted.

Manifest Rebuild and Verification

Rebuild a manifest for an existing device when the index is missing or stale:

lvmsync manifest rebuild /dev/vg0/lv0

Progress logs are emitted every 10s by default; adjust with --manifest-progress-interval. The command times out after 1m unless overridden with --manifest-timeout (0 disables).

Compare source and destination devices against a manifest:

lvmsync verify /dev/vg0/snap0 /mnt/backup

Use --dry-run with verify to inspect planned operations without modifying the destination:

lvmsync verify --dry-run /dev/vg0/source /dev/vg1/target

Configuration Sources

LVMSync binds its command line flags to Viper, allowing configuration through flags, environment variables, or a YAML file. The resolution order is:

command line flags
environment variables (LVMSYNC_*)
the configuration file (default: config.yaml)

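Environment variables use the LVMSYNC_ prefix and generally match flag names converted to upper case with hyphens replaced by underscores. Some option groups add a group segment (for example LVMSYNC_DEDUP_CDC_MIN for --cdc-min and LVMSYNC_COMPRESSION_COMPRESS for --compress), so consult the option table for the exact name. The --config flag can point to an alternative YAML file.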

Unused or unknown keys in the YAML file produce runtime warnings to surface typos. For example, a config.yaml containing an unrecognized unused_key triggers:

{"level":"warn","msg":"unknown configuration key \"unused_key\""}

Examples by Option Group

General

CLI:

lvmsync run --dry-run --parallel 8 --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_PARALLEL=8 LVMSYNC_RESUME=statefile lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML (config.yaml):

parallel: 8
resume: statefile

SSH

CLI:

lvmsync run --dry-run --ssh-user backup --ssh-port 2222 /dev/vg0/snap0 backup:/dev/vg0/data

Environment:

LVMSYNC_SSH_HOST=backup LVMSYNC_SSH_USER=backup LVMSYNC_SSH_PORT=2222 lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

ssh_host: backup
ssh_user: backup
ssh_port: 2222

Remote

CLI:

lvmsync run --dry-run --lvmsync-path /usr/bin/lvmsync --remote-pre-script /tmp/pre.sh /dev/vg0/snap0 user@host:/dev/vg0/data

Environment:

LVMSYNC_LVMSYNC_PATH=/usr/bin/lvmsync LVMSYNC_REMOTE_PRE_SCRIPT=/tmp/pre.sh lvmsync run --dry-run /dev/vg0/snap0 user@host:/dev/vg0/data

YAML:

lvmsync_path: /usr/bin/lvmsync
remote_pre_script: /tmp/pre.sh

Deduplication

CLI:

lvmsync run --dry-run --dedup-strategy bloom --dedup-state-file ~/.lvmsync_state /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_DEDUP_STRATEGY=bloom LVMSYNC_DEDUP_STATE_FILE=~/.lvmsync_state lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

dedup_strategy: bloom
dedup_state_file: ~/.lvmsync_state

When --dedup-strategy auto is used, lvmsync chooses a strategy based on available RAM, volume size, and CPU capabilities.

Condition	Selected strategy
Bloom filter fits in RAM	`bloom`
Doesn't fit, checksum acceleration available	`checksum`
Doesn't fit, no acceleration	`rolling_hash`

See docs/dedup_strategies.md for more details.
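The selection table above reduces to a three-way decision. The sketch below encodes it directly; `chooseStrategy` is a hypothetical name, and how LVMSync derives the Bloom filter size from the volume size or detects checksum acceleration is not shown and is left as input here.

```go
package main

import "fmt"

// chooseStrategy follows the documented table: prefer the Bloom filter
// when it fits in the RAM budget, otherwise pick checksum when hardware
// acceleration is available, otherwise fall back to rolling_hash.
func chooseStrategy(bloomBytes, ramBytes uint64, checksumAccel bool) string {
	switch {
	case bloomBytes <= ramBytes:
		return "bloom"
	case checksumAccel:
		return "checksum"
	default:
		return "rolling_hash"
	}
}

func main() {
	const ram = 1 << 30 // 1 GiB budget, chosen for illustration
	fmt.Println(chooseStrategy(16<<20, ram, false))
	fmt.Println(chooseStrategy(4<<30, ram, true))
	fmt.Println(chooseStrategy(4<<30, ram, false))
}
```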

LVMSync automatically reloads the deduplication state file (--dedup-state-file, default ~/.lvmsync_dedup) on startup. Delete it to reset deduplication: rm ~/.lvmsync_dedup. When saving Bloom filter state, LVMSync logs dedup_bloom_stats with entries, configured_fp_rate, and observed_fp_rate.

Compression

LVMSync samples 8 KiB from each chunk to gauge compression efficiency. If the compressed sample ratio is greater than or equal to --compress-threshold, the chunk is sent uncompressed. In auto mode, Zstd is used when AVX2 or NEON is available; otherwise LZ4 is selected. The compression threshold is tunable via --compress-threshold (LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD or compress_threshold), where values near 1 favor compression and lower values skip high-entropy data. Levels can be tuned with --zstd-level (1-5) or --lz4-level (fast or hc).

CLI:

lvmsync run --dry-run --compress auto --zstd-level 2 --compress-threshold 0.85 /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_COMPRESSION_COMPRESS=auto LVMSYNC_COMPRESSION_ZSTD_LEVEL=2 LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD=0.85 lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

compress: auto
zstd_level: 2
compress_threshold: 0.85

LVM

CLI:

lvmsync run --dry-run --snapshot-size 25% --volume-group vg_data /dev/vg_data/original /dev/vg_data/destination

Environment:

LVMSYNC_SNAPSHOT_SIZE=25% LVMSYNC_VOLUME_GROUP=vg_data lvmsync run --dry-run /dev/vg_data/original /dev/vg_data/destination

YAML:

snapshot_size: "25%"
volume_group: vg_data


Use --config to provide an alternate config file path.

Configuration Validation

Before starting, LVMSync validates key configuration parameters:

Verifies that the specified volume groups exist.
Ensures the escalation command is available if not running as root.

Invalid configurations will cause the tool to abort with a clear error message.

Exit Codes

LVMSync signals success and failure with structured exit codes. Refer to the operations guide for the complete list and recommended recovery steps.

Code	Meaning	Recovery Step
`0`	Success	None
`10`	Privilege or capability check failed	Run as root or adjust `--lvm-escalation`.
`20`	Device error	Verify device paths and snapshot health.
`25`	Snapshot exhausted	Extend the snapshot or reduce source writes before resuming.
`30`	Unsupported platform	Run on a supported Linux platform.
`40`	Configuration error	Review flags, environment variables, and `config.yaml`.
`50`	Runtime failure	Inspect logs and fix the issue.
`60`	Verification mismatch	Investigate mismatched data before retrying.
`70`	Partial transfer	Address the error and resume with `--resume`.
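A wrapper program can act on these codes. The sketch below runs the dry-run command from the earlier examples and looks up the meaning and recovery step from the table above; the outcome type and the describe and exitCode helpers are hypothetical names, not part of LVMSync.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// outcome pairs a meaning with its recovery step, copied from the table above.
type outcome struct{ meaning, recovery string }

var outcomes = map[int]outcome{
	0:  {"Success", "None"},
	10: {"Privilege or capability check failed", "Run as root or adjust --lvm-escalation."},
	20: {"Device error", "Verify device paths and snapshot health."},
	25: {"Snapshot exhausted", "Extend the snapshot or reduce source writes before resuming."},
	30: {"Unsupported platform", "Run on a supported Linux platform."},
	40: {"Configuration error", "Review flags, environment variables, and config.yaml."},
	50: {"Runtime failure", "Inspect logs and fix the issue."},
	60: {"Verification mismatch", "Investigate mismatched data before retrying."},
	70: {"Partial transfer", "Address the error and resume with --resume."},
}

// describe returns the table entry for code, or a generic entry for codes
// that are not listed in this README.
func describe(code int) outcome {
	if o, ok := outcomes[code]; ok {
		return o
	}
	return outcome{"Unknown exit code", "Refer to the operations guide."}
}

// exitCode extracts the process exit status from the error returned by
// (*exec.Cmd).Run. It returns -1 when the process never produced one,
// for example because the binary could not be started.
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}

func main() {
	cmd := exec.Command("lvmsync", "run", "--dry-run", "/dev/vg0/snap0", "/dev/vg0/data")
	code := exitCode(cmd.Run())
	o := describe(code)
	fmt.Printf("exit %d: %s\nnext step: %s\n", code, o.meaning, o.recovery)
}
```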

Credits

LVMSync is written in Go by Ofer Chen, inspired by mpalmer/lvmsync.

Contributing

Contributions are welcome. See AGENTS.md for detailed contributor instructions and open TODO items. Please follow the project's coding guidelines, include appropriate logging and error handling, and update documentation as needed.

Single-Responsibility Functions

Keep functions and packages focused on a single task to simplify maintenance and testing. Break up large components when behavior grows to preserve clarity.

Dependency Injection

Decouple modules by injecting dependencies through interfaces or constructor parameters. This approach makes components easier to test and swap during refactoring.

For example, the privilege package exposes an Escalator interface so tests can stub command execution:

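// Bound the privilege checks with a timeout.
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

// zap.NewNop() discards log output; pass a real logger outside of tests.
esc, err := privilege.New(ctx, zap.NewNop())
if err != nil {
    // handle error
}
if err := esc.Ensure(ctx); err != nil {
    // handle missing capabilities or sudo
}
// Commands are built through the escalator rather than called directly.
cmd := esc.Command(ctx, "lvs", "--version")
_ = cmd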

Test Coverage

Every change should include unit tests. Run go test -cover ./... to ensure coverage remains high and regressions are caught early.

Compression detection uses benchmark-driven selection between LZ4 and Zstd and now includes dedicated tests verifying algorithm choice and cache resets.

Production readiness

Structured logging uses zap; always defer a logger sync to flush entries.
Configuration is parsed with pflag and viper. Every option can be set via CLI flags, LVMSYNC_* environment variables, or the config.yaml file.
Related flags are organized into thematic FlagSets for concise help output.
Each function in the codebase includes unit tests covering both success and failure paths.
Before sending patches, run go build ./..., go test -cover ./..., and golangci-lint run.
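The three configuration sources named above can be pictured as layers that are consulted in order. The sketch below assumes LVMSync keeps viper's default precedence, in which flags override environment variables and environment variables override the config file; the layer type and resolve function are hypothetical and use plain maps rather than pflag or viper.

```go
package main

import "fmt"

// layer is one configuration source with the keys it explicitly sets.
type layer struct {
	name   string
	values map[string]string
}

// resolve returns the value from the first layer that sets key, so layers
// must be passed from highest to lowest precedence.
func resolve(key string, layers ...layer) (value, from string, ok bool) {
	for _, l := range layers {
		if v, found := l.values[key]; found {
			return v, l.name, true
		}
	}
	return "", "", false
}

func main() {
	// Illustrative values only; keys use the config-file spelling.
	flags := layer{name: "flag", values: map[string]string{"compress_threshold": "0.85"}}
	env := layer{name: "env", values: map[string]string{"compress_threshold": "0.9", "zstd_level": "2"}}
	file := layer{name: "config.yaml", values: map[string]string{"compress": "auto", "zstd_level": "3"}}

	for _, key := range []string{"compress", "zstd_level", "compress_threshold", "snapshot_size"} {
		if v, from, ok := resolve(key, flags, env, file); ok {
			fmt.Printf("%s=%s (from %s)\n", key, v, from)
		} else {
			fmt.Printf("%s is unset\n", key)
		}
	}
}
```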

Development

Prerequisites

LVMSync requires Go 1.24.3 or later. Verify your installation with:

go version

Development Setup

The Super-Linter workflow validates the entire repository.

Linting

The .golangci.yml config uses standard Go formatters such as gci, gofmt, gofumpt, goimports, and golines. A misconfigured swaggo formatter entry was removed. Run the linter locally to mirror CI:

golangci-lint run ./...

Testing

Run unit tests with coverage:

go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out

Some privileged tests are skipped when run without root access.

Integration tests manipulate LVM volumes and loop devices. Running them requires root privileges and utilities such as lvremove, pvremove, and losetup. These tests are skipped when prerequisites are missing.

The workflow enforces a minimum of 50% total coverage.
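The same threshold can be checked locally. This is a minimal sketch, not the project's CI step: it parses the "total:" line that go tool cover -func prints last and compares it with 50%. The function names and the sample report in main are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// totalCoverage extracts the percentage from the "total:" line of
// `go tool cover -func` output.
func totalCoverage(report string) (float64, error) {
	for _, line := range strings.Split(report, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 3 && fields[0] == "total:" {
			return strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
		}
	}
	return 0, errors.New("no total line found in coverage report")
}

// meetsThreshold reports whether total coverage is at least min percent.
func meetsThreshold(report string, min float64) (bool, error) {
	total, err := totalCoverage(report)
	if err != nil {
		return false, err
	}
	return total >= min, nil
}

func main() {
	// Made-up sample; in practice use the output of
	// `go tool cover -func=coverage.out`.
	sample := "example.com/demo/a.go:10:\tOpen\t\t75.0%\n" +
		"example.com/demo/a.go:42:\tClose\t\t50.0%\n" +
		"total:\t\t\t\t(statements)\t62.5%\n"

	ok, err := meetsThreshold(sample, 50)
	fmt.Printf("meets 50%% threshold: %v (err=%v)\n", ok, err)
}
```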

License

GPLv3 License. See the LICENSE file for details.


