oferchen/lvmsync_go


LVMSync

LVMSync is a high-performance incremental data replication tool for LVM snapshots. It efficiently transfers only changed blocks using metadata from snapshot COW (Copy-On-Write) devices and communicates with LVM through native Go bindings rather than shell commands.

For benchmark methodology and reproducible performance numbers, see perf.md.

For details on running with minimal privileges and sudoers examples, see SECURITY.md and docs/sudoers.md. For snapshot cleanup, resuming transfers, and verify-only rollback procedures, see docs/danger_rollback.md.

Features

  • Incremental Block-Level Synchronization: Transfers only changed blocks.

  • Zero-Copy Transfers: Utilizes splice() for efficient data movement.

  • Parallel Execution: Configurable concurrency for optimal performance.

  • Adaptive Transport Concurrency: Maintains ~1–2×BDP of in-flight data and can be overridden with --concurrency.

  • Rate-Limiting: Control bandwidth usage during transfers.

  • Compression: Samples 8 KiB per chunk and skips compression when the ratio exceeds --compress-threshold. Auto mode selects Zstd on CPUs with AVX2 or NEON support, falling back to LZ4 when those features are absent. Compression levels are tuned via --lz4-level and --zstd-level. See compression documentation for pipeline details.

  • Checksum Verification: Ensures data integrity using SHA-256 or BLAKE3, automatically selecting BLAKE3 on CPUs with AES-NI, AVX2/AVX-512, or NEON.

  • Native LVM2 Integration: Uses Go bindings to liblvm2cmd instead of shelling out.

  • Generic Block Device Support: Access raw /dev/* paths and regular files (including loopback images) through a unified device abstraction.

  • Deduplication Strategies: Detect unchanged blocks using checksum, rolling hash, or a Bloom filter with optional FastCDC content-defined chunking and mmap-backed index.

  • Hashing: Hardware-accelerated XXH3 provides fast deduplication hints while BLAKE3 digests are stored in manifests for integrity.

  • Remote Execution via SSH: Replicates data over SSH with support for pre/post-scripts.

  • Resume Support: Ability to resume interrupted transfers with verification enabled by default (use --verify=none to skip).

  • Crash-Safe WAL: Records committed ranges in a write-ahead log so interrupted runs can recover. See WAL documentation for layout and replay details.

  • Probe and Verification Modes: --probe-only validates devices and privileges without writing and prints size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch to stdout, while --verify-only scans both sides and reports mismatches.

  • Dry-run Estimates: --dry-run samples the manifest to project bytes and ETA without transferring data.

  • Planning: --plan prints resolved configuration with secrets redacted, transport order, estimated bytes, and compression decisions as JSON without transferring data.

  • Device Identity Tuple: Each run records (size_bytes, kernel_uuid, gpt_uuid, mbr_signature, fs_uuid, major, minor, manifest_epoch) to prevent writing to the wrong destination.

  • Handshake Timeouts: All transports, including rsync, apply context deadlines during handshakes and clear them once negotiation succeeds.

  • Sparse Destination Optimization: Detects runs of zero bytes and punches holes when the filesystem supports it. Use --sparse=never to always write zeros instead.

  • Aligned I/O Buffers and NUMA Pinning: --odirect allocates block-size aligned slabs from a sync.Pool and can pin worker goroutines to a device's NUMA node (--numa-pin) or an explicit node (--numa-node).

  • LVM Snapshot Management:

    • Automatic snapshot creation and removal.
    • Configurable snapshot size (absolute or percentage-based) via --snapshot-size, LVMSYNC_SNAPSHOT_SIZE, or the snapshot_size config key.
    • Configurable snapshot usage threshold via --snapshot-max-usage.
    • Configurable volume group for constructing the snapshot device path.
    • Auto-selection of target volume groups with sufficient free space.
    • Automatic privilege escalation (defaulting to sudo -n).
    • Snapshot health monitoring that fails fast if usage exceeds a threshold.
    • Snapshot monitor goroutine closes its error channel on exit; cleanup only cancels monitoring, avoiding send-on-closed-channel panics (see TestCreateSnapshotCleanupNoPanic).
    • See LVM snapshot documentation for snapshot lifecycle and mount checks.
  • Graceful Shutdown: Signal handling ensures snapshots are cleaned up on interruption.

  • Flexible Configuration: Flags, environment variables, or config.yaml. See Configuration. Configuration values follow flag > environment variable > config file precedence.

  • Configuration Validation: Checks key parameters (e.g., volume group existence, escalation command) before starting operations.
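The Sparse Destination Optimization feature listed above depends on finding runs of zero bytes before deciding whether to punch a hole. The following is a minimal sketch of that detection step only. The `zeroRuns` helper, the `span` type, and the minimum run length are assumptions made for illustration and are not taken from the LVMSync source.

```go
package main

import "fmt"

// span is a half-open byte range [Start, End) within a buffer.
type span struct{ Start, End int }

// zeroRuns returns every run of zero bytes in buf that is at least minRun
// bytes long. Shorter runs are skipped because punching very small holes
// rarely pays off. Name and threshold are illustrative assumptions.
func zeroRuns(buf []byte, minRun int) []span {
	var runs []span
	start := -1 // index where the current zero run began, or -1 if none
	for i, b := range buf {
		if b == 0 {
			if start < 0 {
				start = i
			}
			continue
		}
		if start >= 0 && i-start >= minRun {
			runs = append(runs, span{start, i})
		}
		start = -1
	}
	// Close a run that extends to the end of the buffer.
	if start >= 0 && len(buf)-start >= minRun {
		runs = append(runs, span{start, len(buf)})
	}
	return runs
}

func main() {
	buf := make([]byte, 16)
	buf[0], buf[9] = 1, 1
	fmt.Println(zeroRuns(buf, 4))
}
```

A caller could hand each returned range to a hole-punching routine when the filesystem supports it, and fall back to writing zeros otherwise (the behavior `--sparse=never` forces).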

Resume, verification, and safe overwrite flows

Transfers store the device identity tuple (size_bytes, kernel_uuid, gpt_uuid, mbr_signature, fs_uuid, major, minor, manifest_epoch) and compare it against the destination before writing. Partition-table mismatches return a precondition failure to avoid accidental overwrites. Use --force to bypass this check when intentionally overwriting.
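The comparison described above can be sketched as a plain struct equality check. This is an illustration under stated assumptions: the `DeviceIdentity` field names and types, the `ErrPrecondition` sentinel, and the `checkIdentity` helper are hypothetical, and the sketch compares the whole tuple rather than singling out partition-table fields.

```go
package main

import (
	"errors"
	"fmt"
)

// DeviceIdentity mirrors the tuple listed above. Field names and types are
// assumptions made for this sketch.
type DeviceIdentity struct {
	SizeBytes     uint64
	KernelUUID    string
	GPTUUID       string
	MBRSignature  string
	FSUUID        string
	Major, Minor  uint32
	ManifestEpoch int64
}

// ErrPrecondition is a hypothetical sentinel for an identity mismatch.
var ErrPrecondition = errors.New("precondition failed: destination identity mismatch")

// checkIdentity refuses to proceed when the recorded and observed tuples
// differ, unless force is set (modelling --force).
func checkIdentity(recorded, observed DeviceIdentity, force bool) error {
	if recorded == observed || force {
		return nil
	}
	return fmt.Errorf("%w: recorded %+v, observed %+v", ErrPrecondition, recorded, observed)
}

func main() {
	recorded := DeviceIdentity{SizeBytes: 10737418240, GPTUUID: "9abcdef0-1234-5678-90ab-cdef12345678", Major: 253}
	observed := recorded
	observed.GPTUUID = "00000000-0000-0000-0000-000000000000"

	fmt.Println(checkIdentity(recorded, recorded, false))
	fmt.Println(checkIdentity(recorded, observed, false) != nil)
	fmt.Println(checkIdentity(recorded, observed, true))
}
```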

Example --probe-only output showing size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch:

lvmsync run --probe-only /dev/vg0/snap0 /dev/vg0/target
# 10737418240 12345678-9abc-def0-1234-56789abcdef0 9abcdef0-1234-5678-90ab-cdef12345678 1a2b3c4d 0fedcba9-8765-4321-0fed-cba987654321 253 0 1700000000

  • --resume=statefile continues an interrupted run (verification runs unless --verify=none).
  • --verify-only reads both devices and reports mismatches without writing data.

Resume after failure

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/target

Resume with verification

lvmsync run --dry-run --resume=verify /dev/vg0/snap0 /dev/vg0/target

Verification only

lvmsync run --dry-run --verify-only /dev/vg0/snap0 /dev/vg0/target

Safe overwrite procedure

lvmsync run --dry-run --probe-only /dev/vg0/snap0 /dev/vg0/target
lvmsync run --dry-run --verify-only /dev/vg0/snap0 /dev/vg0/target
lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/target

Exit code 3 signals verification mismatches. See operations guide for detailed recovery steps.

Supported Platforms

LVMSync supports Linux only. A runtime check in main.go aborts execution on other operating systems with exit code 1. The project is regularly tested on amd64 and arm64 architectures.
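A runtime check of this kind is typically a comparison against `runtime.GOOS`. The sketch below shows one way to write it. The `checkPlatform` helper and the error wording are assumptions, not the contents of the project's main.go.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// checkPlatform reports an error for any operating system other than Linux.
// The helper name and message are illustrative.
func checkPlatform(goos string) error {
	if goos != "linux" {
		return fmt.Errorf("unsupported operating system %q: LVMSync requires Linux", goos)
	}
	return nil
}

func main() {
	if err := checkPlatform(runtime.GOOS); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // matches the documented exit code for unsupported platforms
	}
	fmt.Println("platform ok")
}
```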

To cross-compile for another Linux architecture, set GOOS=linux and the desired GOARCH:

GOOS=linux GOARCH=arm64 go build ./...

Device Support Matrix

| Device type | Source | Destination | Notes |
|-------------|--------|-------------|-------|
| LVM snapshot | | | snapshots are auto-created |
| Raw block device | | | requires --offline or --fs-freeze-command/--fs-thaw-command when used as a source |
| Regular file | | | includes loopback images |

Override automatic detection with --source-type and --dest-type when a device's type is known in advance.

Offline requirements

Raw sources must be quiescent or provide filesystem freeze/thaw hooks using --fs-freeze-command and --fs-thaw-command. These command paths must be absolute. LVM snapshots are consistent by design, while regular files require no additional coordination.

Transport Options

LVMSync negotiates transports in the order provided by --transport (default ssh,tcp+tls,h2,quic). If a transport fails to connect, the next transport is tried and each attempt is logged. All transports require TLS 1.3 with mutual authentication or SSH host key verification unless --allow-insecure is set. The rsync transport is plaintext and refuses to initialize unless --allow-insecure acknowledges the lack of encryption. Enabling this transport logs a warning noting the plaintext connection. The client sends the destination device identity and the server refuses to write if it differs, returning a precondition failure. See docs/transports.md for details.

| Transport | Security defaults | Notes |
|-----------|-------------------|-------|
| `quic` | TLS 1.3, BBR congestion | UDP-based transport |
| `h2` | TLS 1.3 | HTTP/2 streams |
| `tcp+tls` | TLS 1.3 | Plain TCP wrapped in TLS |
| `ssh` | Host key verification | Uses OpenSSH-style authentication |
| `rsync` | Plaintext, requires `--allow-insecure` | rsync wire protocol, enforces destination identity |
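The ordered fallback described in this section can be sketched as a loop over the comma-separated transport list. Everything named here (`dialFunc`, `negotiate`, and the stub dialers) is hypothetical, and the real negotiation also applies handshake deadlines and logs each attempt through zap.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// dialFunc stands in for a transport-specific connect routine (hypothetical).
type dialFunc func() error

// Two stub dialers used only for the demonstration below.
func alwaysFail() error { return errors.New("connection refused") }
func alwaysOK() error   { return nil }

// negotiate tries each transport in the given order and returns the name of
// the first one that connects. Failures are collected so the caller can see
// why every earlier transport was skipped.
func negotiate(order string, dialers map[string]dialFunc) (string, error) {
	var failures []string
	for _, name := range strings.Split(order, ",") {
		name = strings.TrimSpace(name)
		dial, ok := dialers[name]
		if !ok {
			failures = append(failures, name+": unknown transport")
			continue
		}
		if err := dial(); err != nil {
			failures = append(failures, name+": "+err.Error())
			continue
		}
		return name, nil
	}
	return "", fmt.Errorf("all transports failed: %s", strings.Join(failures, "; "))
}

func main() {
	dialers := map[string]dialFunc{"ssh": alwaysFail, "tcp+tls": alwaysOK}
	name, err := negotiate("ssh,tcp+tls,h2,quic", dialers)
	fmt.Println(name, err)
}
```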

Examples

Select multiple transports and a custom port:

lvmsync run --dry-run --transport ssh,tcp+tls,h2,quic --tcp-port 9443 /dev/vg0/source /dev/vg0/backup

Force SSH only:

lvmsync run --dry-run --transport ssh /dev/vg0/source user@backup:/dev/vg1/target

The CLI groups transport flags using pflag and binds them to viper while emitting structured logs via zap. All flags use kebab-case (e.g., --client-cert, --allow-insecure) for consistency across commands:

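package main

import (
    "github.com/spf13/pflag"
    "github.com/spf13/viper"
    "go.uber.org/zap"
)

func main() {
    logger, err := zap.NewProduction()
    if err != nil {
        panic(err)
    }
    defer logger.Sync()

    // Group the transport flags in their own FlagSet.
    transport := pflag.NewFlagSet("transport", pflag.ExitOnError)
    transport.String("transport", "ssh,tcp+tls,h2,quic", "ordered transports")
    transport.Int("tcp-port", 9443, "TCP listener port")

    // Bind the group to Viper so flag values merge with other sources.
    v := viper.New()
    if err := v.BindPFlags(transport); err != nil {
        logger.Error("bind transport flags", zap.Error(err))
        return
    }
}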

Manifest Lifecycle

Transfers rely on a manifest that tracks chunk offsets and digests:

  1. lvmsync manifest rebuild <device> refreshes or creates the manifest.
  2. lvmsync run <source> <destination> streams blocks, skipping chunks already recorded in the manifest.
  3. lvmsync verify <source> <destination> compares the destination with the manifest and logs any mismatches.

Garbage Collection & Atomic Commit describes how obsolete entries are pruned and rewritten safely.
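The skip-if-recorded behavior in step 2 can be illustrated with a map from chunk offset to digest. This is a sketch under assumptions: the `Manifest` type and `changedChunks` helper are hypothetical, chunks are fixed-size, and SHA-256 is used only so the example needs nothing beyond the standard library (the features list mentions BLAKE3 digests in manifests).

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Manifest maps a chunk offset to the digest recorded for that chunk.
// The in-memory layout is an assumption made for this sketch.
type Manifest map[int64][sha256.Size]byte

// changedChunks splits data into fixed-size chunks and returns the offsets
// whose digest is missing from, or differs from, the manifest. The manifest
// is updated so a second pass over unchanged data reports nothing.
func changedChunks(data []byte, chunkSize int, m Manifest) []int64 {
	if chunkSize <= 0 {
		return nil
	}
	var changed []int64
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		sum := sha256.Sum256(data[off:end])
		if prev, ok := m[int64(off)]; ok && prev == sum {
			continue // already recorded with the same digest: skip
		}
		m[int64(off)] = sum
		changed = append(changed, int64(off))
	}
	return changed
}

func main() {
	m := Manifest{}
	data := []byte("aaaabbbbcccc")
	fmt.Println(changedChunks(data, 4, m)) // first pass: every chunk is new
	data[5] = 'x'
	fmt.Println(changedChunks(data, 4, m)) // second pass: only the modified chunk
}
```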

TLS-based transports use mutual authentication. Provide certificate files with --server-cert, --server-key, --client-cert, --client-key, and --ca-cert. Insecure mode disables certificate and host key verification and can be enabled with --allow-insecure, but it logs a warning and should only be used for testing.

Configuration can be supplied via flags, LVMSYNC_-prefixed environment variables, or configuration files. Flags override environment variables, which override configuration files.

lvmsync daemon

The lvmsyncd binary loads optional modules and listens on one or more URIs. Use --listen repeatedly to specify addresses and --module to load plugin modules.

Configuration can come from flags, LVMSYNC_DAEMON_ environment variables, or a lvmsyncd.yaml file. Multi-value environment variables are comma-separated.

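| Flag | Environment variable | Config key | Description |
|------|----------------------|------------|-------------|
| `--listen` | `LVMSYNC_DAEMON_LISTEN` | `listen` | comma-separated list of listen URIs |
| `--module` | `LVMSYNC_DAEMON_MODULE` | `module` | comma-separated module paths |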

Example:

LVMSYNC_DAEMON_LISTEN=unix:///run/lvmsyncd.sock,tcp+tls://:9000 lvmsyncd --module ./mod.so

See daemon documentation for module configuration, ACLs, and listener options.

Resume and Verify Workflows

Resume interrupted transfers with a state file:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Generate a manifest and verify a destination:

lvmsync manifest rebuild /dev/vg0/snap0
lvmsync verify /dev/vg0/snap0 /dev/vg0/data

Resume files track the last completed chunk and are removed after a successful transfer. See docs/manifest.md for manifest and verification details.
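One common way to persist this kind of state safely is to write a temporary file and rename it into place. The sketch below shows that pattern. The JSON layout, the `resumeState` type, and the helper names are all hypothetical, because the actual state file format is not described here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// resumeState is a hypothetical on-disk layout for a resume file.
type resumeState struct {
	LastChunk int64 `json:"last_chunk"`
}

// saveState writes the state to a temporary file in the same directory and
// renames it into place, so a crash never leaves a half-written state file.
func saveState(path string, s resumeState) error {
	data, err := json.Marshal(s)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".resume-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// loadState returns the zero state when no state file exists yet.
func loadState(path string) (resumeState, error) {
	var s resumeState
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return s, nil
	}
	if err != nil {
		return s, err
	}
	return s, json.Unmarshal(data, &s)
}

// roundTrip saves and reloads a checkpoint in a scratch directory, then
// removes it, mirroring the cleanup after a successful transfer.
func roundTrip(chunk int64) (int64, error) {
	dir, err := os.MkdirTemp("", "lvmsync-resume")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "statefile")
	if err := saveState(path, resumeState{LastChunk: chunk}); err != nil {
		return 0, err
	}
	s, err := loadState(path)
	return s.LastChunk, err
}

func main() {
	fmt.Println(roundTrip(42))
}
```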

Safety Notes

  • Run manifest rebuild and verify against quiescent devices.
  • Use --offline or freeze/thaw hooks when scanning live filesystems to keep manifests consistent.
  • Network transports enforce mutual TLS or host key verification; --allow-insecure disables these checks, logs a warning, and should only be used for testing.
  • Back up destination data before running transfers; writes are destructive.

Roadmap

  • Pluggable data plane: QUIC, HTTP/2, TLS/TCP, SSH
  • Hybrid fixed + CDC deduplication with Bloom filter index
  • Adaptive compression using LZ4 or Zstd with per-chunk sampling
  • Throughput mode presets for high-bandwidth links

See AGENTS.md for contributor tasks and design guidelines.

Architecture

LVMSync is organized into modular packages to keep concerns separated:

  • lvm – manages snapshot creation, monitoring, and cleanup.
  • device – opens and queries generic block devices such as raw /dev/* paths and regular files.
  • transfer – performs block-level synchronization, compression, deduplication, and resume logic.
    • Internally split into focused modules: progress.go, handshake.go, and block_writer.go for clearer responsibilities.
  • remote – wraps SSH functionality for running commands on remote hosts and coordinating transfers. Callers must provide a context.Context with a timeout when starting the privileged helper to allow cancellation if the remote command fails to launch.
  • internal/config – parses and validates configuration files and CLI options.
  • dedup – houses Bloom filter helpers, chunking logic, and other deduplication utilities.
  • common and internal – shared helpers and internal utilities such as multi-error handling.
  • internal/client – coordinates snapshot preparation and client transfer execution.
  • cmd/dump – handles snapshot dumping and transport selection.
  • cmd/root – configures the application and routes to subcommands.
  • cmd/lvmsync – CLI orchestrator with a signals subpackage for signal handling and cleanup.
  • cmd/lvmsyncd – module loading daemon accepting multiple listen URIs.

This structure allows individual packages to be developed and tested in isolation.

Refactoring Notes

  • Snapshot preparation helpers (ensureVolumeGroups, checkDiskSpaceForSnapshot, createSnapshotIfNeeded, PrepareSnapshot) and client execution logic are consolidated under internal/client.
  • These helpers no longer rely on global variables; configuration and loggers are passed explicitly.
  • main.go now delegates to cmd/root, which wires together cmd/dump.

Logging

LVMSync emits structured logs using zap. Errors are logged with structured fields instead of being written to stderr, and the logger is flushed on shutdown to ensure all entries are persisted. A production logger is initialized immediately so even configuration failures during startup are reported through the same structured format. When --progress is enabled, progress updates are emitted as structured log entries, allowing external tooling to track transfer completion.

Expectations

  • Use zap for all logging and avoid fmt.Print* or log.* calls.
  • Pass loggers explicitly to commands and helpers; cmd/lvmsync.Execute requires a *zap.Logger and accepts a *lvmsync.Runner for dependency injection instead of relying on zap.L().
  • All commands receive an explicit *zap.Logger and default to zap.NewNop() when no logger is supplied.
  • Device constructors return an error when the logger is nil; transport constructors default to zap.NewNop() when no logger is supplied.
  • Log field keys in snake_case and include units where relevant (for example, duration_ms).
  • Provide raw byte values alongside human-readable sizes (for example, block_size and block_size_bytes).
  • Always defer syncLogger(logger) to flush buffers and log if the sync fails.

The example below demonstrates these conventions:

package main

import (
    "time"

    "go.uber.org/zap"
)

func syncLogger(logger *zap.Logger) {
    if err := logger.Sync(); err != nil {
        logger.Error("sync failed", zap.Error(err))
    }
}

func main() {
    logger, _ := zap.NewProduction()
    defer syncLogger(logger)
    start := time.Now()

    src := "/dev/vg0/source"
    dst := "/dev/vg0/backup"

    logger.Info("snapshot complete",
        zap.String("source_path", src),
        zap.String("dest_path", dst),
        zap.Int64("duration_ms", time.Since(start).Milliseconds()),
    )
}

Errors during block operations log the byte offset and block size explicitly:

Logger.Warn("Zero-copy transfer failed",
    zap.Int64("offset", offset),
    zap.Int("size_bytes", blockSize),
    zap.Int("attempt", attempt+1),
    zap.Error(err),
)

| Field | Description |
|-------|-------------|
| `offset` | Byte offset from the start of the device |
| `size_bytes` | Size of the block being processed |
| `attempt` | Current retry attempt |

Configuration

LVMSync uses pflag and viper to accept options from flags, environment variables, and a YAML file. Flag groups are organized into dedicated FlagSets that are registered with the root command and bound to Viper. The CLI exposes subcommands using cobra, with run handling transfers, manifest rebuild regenerating manifests, and verify checking source and destination data. Source and destination paths for run and verify are provided as positional arguments after any flags:

lvmsync run --dry-run [flags] <source> <dest>

When run with --dry-run, LVMSync loads any manifest at --manifest-path and samples up to 100 blocks to estimate the bytes that would be transmitted. The estimate, expected duration in milliseconds (estimated_duration_ms), and bandwidth in bits per second (estimated_bandwidth_bps) are logged without sending data. For example:

{"level":"info","msg":"dry run","size_bytes":4096,"estimated_tx_bytes":4096,"estimated_duration_ms":2000,"estimated_bandwidth_bps":16000}
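The sampling idea can be sketched as follows: inspect up to 100 evenly spaced blocks, extrapolate the changed fraction to the whole device, and divide by an assumed link bandwidth. The `estimateTransfer` helper, the `changed` predicate (a stand-in for a manifest comparison), and the even-stride sampling strategy are assumptions. The sketch does not claim to reproduce LVMSync's actual estimator or the log line above.

```go
package main

import "fmt"

// estimate holds the projected transfer size and duration.
type estimate struct {
	TxBytes    int64
	DurationMs int64
}

// estimateTransfer samples at most 100 evenly spaced blocks, scales the
// fraction reported as changed up to the whole device, and converts the
// result to a duration using bandwidthBps (bits per second).
func estimateTransfer(totalBlocks, blockSize int64, changed func(block int64) bool, bandwidthBps int64) estimate {
	const maxSamples = 100
	if totalBlocks <= 0 || blockSize <= 0 {
		return estimate{}
	}
	samples := totalBlocks
	if samples > maxSamples {
		samples = maxSamples
	}
	stride := totalBlocks / samples
	var hits int64
	for i := int64(0); i < samples; i++ {
		if changed(i * stride) {
			hits++
		}
	}
	e := estimate{TxBytes: totalBlocks * blockSize * hits / samples}
	if bandwidthBps > 0 {
		e.DurationMs = e.TxBytes * 8 * 1000 / bandwidthBps
	}
	return e
}

func main() {
	// Pretend every 20th block differs from the manifest.
	changed := func(block int64) bool { return block%20 == 0 }
	e := estimateTransfer(1000, 4096, changed, 8_000_000)
	fmt.Printf("estimated_tx_bytes=%d estimated_duration_ms=%d\n", e.TxBytes, e.DurationMs)
}
```

Because only a sample is inspected, the projection is approximate; regular change patterns that align with the sampling stride can skew it noticeably.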

Running with --plan emits a JSON document describing the resolved configuration (with sensitive fields like SSH passwords and TLS keys redacted), transport order, estimated transfer bytes, and the compression algorithm selected for each chunk size class.

Examples

Set the parallel worker count using any configuration source:

CLI flag:

lvmsync run --dry-run --parallel 16

Environment variable:

LVMSYNC_PARALLEL=16 lvmsync run --dry-run

config.yaml:

parallel: 16

Flag groups

Flags are grouped in the CLI help:

  • General Options – worker counts, speed limits, progress controls.
  • SSH Options – credentials and connection settings.
  • Remote Options – remote hooks and lvmsync path.
  • Deduplication Options – dedup strategy and state storage.
  • Compression Options – algorithm and level tuning.
  • LVM Options – snapshot management and privilege escalation.
  • Transport Options – configure data transports (QUIC, HTTP/2, TCP+TLS, SSH).
  • Manifest Options – manifest path overrides and related settings.

Internally, each group is set up through a dedicated helper such as initGeneralFlags, initSSHFlags, or initCompressionFlags, keeping flag definitions focused and easy to maintain.

Example:

lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

func initConfig() *viper.Viper {
    v := viper.New()

    general := pflag.NewFlagSet("general", pflag.ExitOnError)
    general.Bool("progress", true, "show progress")

    lvm := pflag.NewFlagSet("lvm", pflag.ExitOnError)
    lvm.String("volume_group", "", "target volume group")

    pflag.CommandLine.AddFlagSet(general)
    pflag.CommandLine.AddFlagSet(lvm)

    v.BindPFlags(pflag.CommandLine)
    v.SetEnvPrefix("LVMSYNC")
    v.AutomaticEnv()
    v.SetConfigName("config")
    v.AddConfigPath(".")
    return v
}

Grouped help

Each subcommand prints its relevant flag groups:

$ lvmsync run --help
General Options:
      --parallel int   number of worker goroutines (default 4)
...
Transport Options:
      --transport string   transport modes (comma-separated)

$ lvmsync manifest rebuild --help
General Options:
      --dry-run   skip execution
Manifest Options:
      --manifest-path string   manifest file path

$ lvmsync verify --help
General Options:
      --block-size string   block size for comparisons
Manifest Options:
      --manifest-path string   manifest to verify against

This groups related flags once and lets Viper merge values from flags, `LVMSYNC_*` variables, and the
`config.yaml` file.

The overall loading flow now passes an explicit `FlagSet` and argument slice:

1. `registerFlags(flagSets, fs)` adds all flag groups to the provided flag set.
2. `config.NewBuilder(defaults).Build(fs, args)` parses the arguments, binds flags and `LVMSYNC_*` environment variables with Viper,
   merges them with defaults and any `config.yaml` file, and returns the effective configuration plus leftover positional arguments.

`cmd/root.Configure` surfaces those leftover arguments so `Run` operates purely on provided inputs.

### New and updated flags

Recent refactors added several configuration options:

- `--tcp-port` and `--ssh-port` expose TCP+TLS and SSH endpoints.
- `--tcp-parallel` controls the number of parallel TCP connections (2–4).
- `--tcp-lowat` sets TCP_NOTSENT_LOWAT to limit unsent bytes.
- `--sync-interval` controls how many bytes are written between `fdatasync` calls. Accepts size suffixes like `64KB` or `1GB`; invalid values return an error.
- `--checkpoint-interval` sets how often resume state is persisted.
- `--checkpoint-bytes` sets how many bytes are written between resume checkpoints.
- `--block-size` sets the transfer block size (use `auto` for detection).

### I/O tuning

- `--block-size` selects the transfer block size. Use `auto` to match the destination's physical sector size.
- `--sync-interval` sets how many bytes are written between `fdatasync` calls. Accepts size suffixes like `64KB` or `1GB`; invalid values cause startup errors.
- `--odirect` uses O_DIRECT with block-size aligned buffers.
- `--numa-pin` pins worker goroutines to CPUs local to the source device's NUMA node. If `/sys` lacks NUMA details, LVMSync logs a warning and continues without pinning. Use `--numa-node` to override.
- `--numa-node` pins worker goroutines to the specified NUMA node.
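Parsing values such as `64KB` or `1GB` for `--sync-interval` can be sketched as below. The `parseSize` helper is hypothetical, and the sketch assumes binary multiples (1 KB = 1024 bytes); the source does not say whether LVMSync treats the suffixes as binary or decimal.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// units is ordered so that two-letter suffixes are tried before plain "B".
// Binary multiples are an assumption of this sketch.
var units = []struct {
	suffix string
	mult   int64
}{
	{"GB", 1 << 30}, {"MB", 1 << 20}, {"KB", 1 << 10}, {"B", 1},
}

// parseSize converts strings such as "64KB" or "1GB" to a byte count and
// returns an error for anything it cannot interpret.
func parseSize(input string) (int64, error) {
	const maxInt64 = 1<<63 - 1
	s := strings.ToUpper(strings.TrimSpace(input))
	mult := int64(1)
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			s = strings.TrimSpace(strings.TrimSuffix(s, u.suffix))
			mult = u.mult
			break
		}
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n < 0 || n > maxInt64/mult {
		return 0, fmt.Errorf("invalid size %q", input)
	}
	return n * mult, nil
}

func main() {
	for _, v := range []string{"64KB", "1GB", "512", "bogus"} {
		n, err := parseSize(v)
		fmt.Println(v, n, err)
	}
}
```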

### Device types

LVMSync works with three kinds of source and destination devices. Auto-detection
examines the path to select the correct handling:

| Type | Detection | Notes |
|------|-----------|-------|
| `lvm` | `/dev/<vg>/<lv>` or `/dev/mapper/<vg>-<lv>` | A snapshot is created and removed automatically |
| `raw` | Other block devices | Require `--skip-snapshot-creation` and either `--offline` or `--fs-freeze-command`/`--fs-thaw-command` |
| `file` | Regular files | Used as-is with no snapshot |

Override detection with `--source-type` and `--dest-type` when necessary.

Internally, `device.Detect` delegates to dedicated helpers:

```go
esc, err := privilege.New(ctx, logger)
if err != nil {
    // handle error
}
dev, err := device.Detect(ctx, "/dev/sdb", true, true, "auto", "", "", "", 0, 0, esc, logger, device.NewRunner())
// detectFileDevice, detectLVMDevice, or detectRawDevice is selected based on the path.
```

Snapshots provide a crash-consistent view of a device. LVM volumes are snapshotted automatically and removed after transfer. Raw block devices and regular files do not have a snapshot mechanism; to avoid inconsistent reads you must either take them offline with --offline or freeze the filesystem with --fs-freeze-command and --fs-thaw-command. Snapshot creation requires root privileges, so non-root invocations must permit escalation via sudo -n. The escalation command is checked during device detection and operations abort immediately if escalation fails.

Examples:

lvmsync --source-type lvm /dev/vg0/origin /tmp/dump
lvmsync --dest-type raw dumpfile /dev/sdb
lvmsync --source-type raw --offline /dev/sdb /tmp/dump
lvmsync --source-type raw --fs-freeze-command "/usr/sbin/fsfreeze -f '/mnt/data dir'" --fs-thaw-command "/usr/sbin/fsfreeze -u '/mnt/data dir'" /dev/sdb /tmp/dump

Raw device safety

Reading from a live block device can corrupt data if writes occur during the transfer. Ensure a consistent view with one of the following options:

  • --offline – assert that no process will write to the source device.
  • --fs-freeze-command/--fs-thaw-command – run commands that freeze and thaw the filesystem around the read. Command paths must be absolute. Arguments are parsed with shell-style quoting, so wrap paths containing spaces in quotes.
  • Time out freeze and thaw helpers with --freeze-timeout and --thaw-timeout (default 10s).

Freeze and thaw commands are validated before execution. The command must be set, its path must be absolute, the executable name must match ^[a-zA-Z0-9._-]+$, neither the command nor any argument may contain NUL bytes, and the executable must exist; otherwise lvmsync returns an error.
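These rules can be sketched as a single validation function over an already-split argument vector. The `validateCommand` helper is hypothetical, shell-style splitting is out of scope, and the existence check is injected as a callback so the sketch can run without touching the filesystem (a real check would more likely call `os.Stat`).

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// nameRE is the executable-name pattern quoted in the text above.
var nameRE = regexp.MustCompile(`^[a-zA-Z0-9._-]+$`)

// validateCommand applies the documented rules to argv, where argv[0] is the
// command path and the rest are its arguments.
func validateCommand(argv []string, exists func(path string) bool) error {
	if len(argv) == 0 || argv[0] == "" {
		return errors.New("command must be set")
	}
	for _, a := range argv {
		if strings.IndexByte(a, 0) >= 0 {
			return errors.New("command and arguments must not contain NUL bytes")
		}
	}
	if !filepath.IsAbs(argv[0]) {
		return fmt.Errorf("command path %q must be absolute", argv[0])
	}
	if !nameRE.MatchString(filepath.Base(argv[0])) {
		return fmt.Errorf("executable name %q contains disallowed characters", filepath.Base(argv[0]))
	}
	if !exists(argv[0]) {
		return fmt.Errorf("executable %q does not exist", argv[0])
	}
	return nil
}

func main() {
	present := func(string) bool { return true } // stand-in for a filesystem check
	fmt.Println(validateCommand([]string{"/usr/sbin/fsfreeze", "-f", "/mnt/data dir"}, present))
	fmt.Println(validateCommand([]string{"fsfreeze", "-f", "/mnt"}, present))
}
```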

Example using the provided scripts:

lvmsync --source-type raw \
  --fs-freeze-command "$(pwd)/docs/fsfreeze-freeze.sh /mnt" \
  --fs-thaw-command "$(pwd)/docs/fsfreeze-thaw.sh /mnt" \
  /dev/sdb /tmp/dump

docs/fsfreeze-freeze.sh and docs/fsfreeze-thaw.sh demonstrate basic freeze and thaw operations; add the scripts to your $PATH to use them.

Configuration sources and precedence

LVMSync uses pflag and viper so every option can be set via flags, environment variables, or the config.yaml file. Values are resolved with the following precedence (highest first):

  1. Command-line flags
  2. LVMSYNC_* environment variables
  3. config.yaml
  4. Built-in defaults

This precedence applies to duration timeouts, filesystem paths, and security flags. Unknown keys in config.yaml generate warnings, and settings like allow_insecure must be explicitly acknowledged via the flag.

Environment variables use the flag name in uppercase with underscores, e.g.:

export LVMSYNC_PARALLEL=8
export LVMSYNC_SSH_USER=backup

For example, if config.yaml sets dedup_strategy: bloom and the environment specifies LVMSYNC_DEDUP_STRATEGY=checksum, running lvmsync --dedup-strategy auto resolves to auto. Likewise, --transport quic overrides LVMSYNC_TRANSPORT_TRANSPORT=ssh and the transport key in config.yaml.
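The precedence rule and the basic flag-to-variable naming convention can be sketched with two small helpers. Both `envName` and `resolve` are hypothetical. The sketch covers only the ungrouped `LVMSYNC_<FLAG>` form and does not model grouped prefixes such as `LVMSYNC_TRANSPORT_`. In the real tool, Viper performs this merging.

```go
package main

import (
	"fmt"
	"strings"
)

// envName converts a flag name to its ungrouped environment variable,
// for example "--ssh-user" becomes "LVMSYNC_SSH_USER".
func envName(flag string) string {
	name := strings.TrimLeft(flag, "-")
	return "LVMSYNC_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

// resolve returns the first value that is set, checking the flag, then the
// environment, then the config file, and finally the built-in default.
// A nil pointer means "not set at this level".
func resolve(flagVal, envVal, fileVal *string, def string) string {
	for _, v := range []*string{flagVal, envVal, fileVal} {
		if v != nil {
			return *v
		}
	}
	return def
}

func main() {
	flagVal, envVal, fileVal := "auto", "checksum", "bloom"
	fmt.Println(envName("--dedup-strategy"), "=>", resolve(&flagVal, &envVal, &fileVal, "none"))
	// Without the flag, the environment value wins over the config file.
	fmt.Println(resolve(nil, &envVal, &fileVal, "none"))
}
```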

--sanitize-env and filesystem freeze/thaw commands follow the same precedence. With a config.yaml containing:

fs-freeze-command: "/usr/sbin/fsfreeze -f '/mnt/yaml dir'"
fs-thaw-command: "/usr/sbin/fsfreeze -u '/mnt/yaml dir'"
sanitize_env: false

and the following environment variables:

export LVMSYNC_FS_FREEZE_COMMAND="/usr/sbin/fsfreeze -f '/mnt/env dir'"
export LVMSYNC_FS_THAW_COMMAND="/usr/sbin/fsfreeze -u '/mnt/env dir'"
export LVMSYNC_SANITIZE_ENV=0

Running:

lvmsync --fs-freeze-command "/usr/sbin/fsfreeze -f '/mnt/flag dir'" \
        --fs-thaw-command "/usr/sbin/fsfreeze -u '/mnt/flag dir'" \
        --sanitize-env run /dev/sdb /tmp/dump

uses the flag-supplied command paths and enables sanitization.

For boolean options, the same precedence applies. If config.yaml specifies check_partition: true but LVMSYNC_CHECK_PARTITION=false is set, lvmsync --check-partition enables the check. Omitting the flag leaves partition checks disabled because the environment value overrides the YAML configuration.

With a config.yaml containing:

parallel: 4

running LVMSYNC_PARALLEL=8 lvmsync run --parallel 16 results in parallel=16 because flags override environment variables, which override the config file.

For a boolean option:

dry_run: true

Running LVMSYNC_DRY_RUN=true lvmsync run --dry-run=false src dst performs a real transfer because the --dry-run flag overrides both the environment variable and the config file. Likewise, LVMSYNC_DRY_RUN=true lvmsync verify --dry-run=false src dst performs a full verification for the same reason.

A similar hierarchy applies to duration values:

retry_delay: 1s

Running LVMSYNC_RETRY_DELAY=2s lvmsync run --retry-delay 3s uses a retry delay of 3s.

Environment variables for the lvmsync daemon use the LVMSYNC_DAEMON_ prefix. Multi-value settings are comma-separated:

LVMSYNC_DAEMON_LISTEN=unix:///run/lvmsyncd.sock,tcp+tls://:9000 lvmsyncd

Grouped options use dedicated prefixes such as LVMSYNC_DEDUP_, LVMSYNC_COMPRESSION_, LVMSYNC_TRANSPORT_, LVMSYNC_LVM_, and LVMSYNC_DAEMON_. For example:

LVMSYNC_LVM_SNAPSHOT_SIZE=25% lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

Option reference

Flags override environment variables, which override config.yaml values.

| Flag | Environment variable | Config key | Description |
|------|----------------------|------------|-------------|
| `--config` | `LVMSYNC_CONFIG` | `config` | Path to config YAML file |
| `--stdout` | `LVMSYNC_STDOUT` | `stdout` | Write change dump to STDOUT (prompts when TTY, requires --yes-i-know otherwise) |
| `--strict-config` | `LVMSYNC_STRICT_CONFIG` | `strict-config` | Treat configuration warnings as errors |
| `--yes-i-know` | `LVMSYNC_YES_I_KNOW` | `yes_i_know` | Confirm destructive write operations in non-interactive sessions |
| `--source-type` | `LVMSYNC_SOURCE_TYPE` | `source-type` | Source device type: auto, file, raw, or lvm |
| `--dest-type` | `LVMSYNC_DEST_TYPE` | `dest-type` | Destination device type: auto, file, raw, or lvm |
| `--offline` | `LVMSYNC_OFFLINE` | `offline` | Assume source raw device is offline |
| `--fs-freeze-command` | `LVMSYNC_FS_FREEZE_COMMAND` | `fs-freeze-command` | Command to freeze filesystem before reading raw source; path must be absolute, arguments are split with shell-style quoting and executable name must match `^[a-zA-Z0-9._-]+$` |
| `--fs-thaw-command` | `LVMSYNC_FS_THAW_COMMAND` | `fs-thaw-command` | Command to thaw filesystem after reading raw source; path must be absolute, arguments are split with shell-style quoting and executable name must match `^[a-zA-Z0-9._-]+$` |
| `--freeze-timeout` | `LVMSYNC_FREEZE_TIMEOUT` | `freeze-timeout` | Timeout for filesystem freeze command |
| `--thaw-timeout` | `LVMSYNC_THAW_TIMEOUT` | `thaw-timeout` | Timeout for filesystem thaw command |
| `--mode` | `LVMSYNC_MODE` | `mode` | Configuration preset: default or throughput; unknown modes fail validation |
| `--parallel` | `LVMSYNC_PARALLEL` | `parallel` | Number of concurrent workers |
| `--concurrency` | `LVMSYNC_TRANSPORT_CONCURRENCY` | `concurrency` | Stream concurrency (0 to autotune based on BDP) |
| `--zerocopy` | `LVMSYNC_ZEROCOPY` | `zerocopy` | Enable zero-copy transfers |
| `--odirect` | `LVMSYNC_ODIRECT` | `odirect` | Use O_DIRECT for device I/O when possible |
| `--numa-pin` | `LVMSYNC_NUMA_PIN` | `numa_pin` | Pin worker goroutines to device NUMA node; logs a warning and continues if NUMA data is missing |
| `--numa-node` | `LVMSYNC_NUMA_NODE` | `numa_node` | Pin worker goroutines to specified NUMA node, overriding automatic detection |
| `--max-retries` | `LVMSYNC_MAX_RETRIES` | `max_retries` | Maximum number of retries per block |
| `--retry-delay` | `LVMSYNC_RETRY_DELAY` | `retry_delay` | Initial delay between retries |
| `--resume` | `LVMSYNC_RESUME` | `resume` | Path to resume state file (verification runs unless --verify=none) |
| `--verify-only` | `LVMSYNC_VERIFY_ONLY` | `verify_only` | Verify destination against source without writing data |
| `--speed` | `LVMSYNC_SPEED` | `speed` | Transfer speed limit |
| `--sync-interval` | `LVMSYNC_SYNC_INTERVAL` | `sync_interval` | Bytes between fdatasync calls (accepts size suffixes like 64KB; invalid values error) |
| `--checkpoint-bytes` | `LVMSYNC_CHECKPOINT_BYTES` | `checkpoint_bytes` | Bytes between resume checkpoints |
| `--checkpoint-interval` | `LVMSYNC_CHECKPOINT_INTERVAL` | `checkpoint_interval` | Duration between checkpoints |
| `--block-size` | `LVMSYNC_BLOCK_SIZE` | `block_size` | Block size for data transfer; specify 'auto' or 0 for automatic detection |
| `--verbose` | `LVMSYNC_VERBOSE` | `verbose` | Verbosity level |
| `--verify-checksum` | `LVMSYNC_VERIFY_CHECKSUM` | `verify_checksum` | Enable checksum verification |
| `--verify` | `LVMSYNC_VERIFY` | `verify` | Verification level: inline, post, or none |
| `--digest` | `LVMSYNC_DIGEST` | `digest` | Digest algorithm: auto, blake3, or sha256 (auto selects blake3 when AVX2, AVX-512, or NEON is available, otherwise sha256) |
| `--progress` | `LVMSYNC_PROGRESS` | `progress` | Show progress during transfer |
| `--output` | `LVMSYNC_OUTPUT` | `output` | Output format: text, json, or yaml |
| `--delta` | `LVMSYNC_DELTA` | `delta` | Delta algorithm: none or rsync |
| `--manifest-path` | `LVMSYNC_MANIFEST_PATH` | `manifest_path` | Path to manifest file |
| `--manifest-progress-interval` | `LVMSYNC_MANIFEST_PROGRESS_INTERVAL` | `manifest_progress_interval` | Interval between progress logs during manifest rebuild |
| `--manifest-timeout` | `LVMSYNC_MANIFEST_TIMEOUT` | `manifest_timeout` | Timeout for manifest rebuild (0 disables) |
| `--manifest-allow-mounted` | `LVMSYNC_MANIFEST_ALLOW_MOUNTED` | `manifest_allow_mounted` | Allow rebuilding when device is mounted read-write |
| `--ssh-host` | `LVMSYNC_SSH_HOST` | `ssh_host` | SSH host |
| `--ssh-user` | `LVMSYNC_SSH_USER` | `ssh_user` | SSH username |
| `--ssh-key` | `LVMSYNC_SSH_KEY` | `ssh_key` | Path to SSH private key |
| `--ssh-host-key-path` | `LVMSYNC_SSH_HOST_KEY_PATH` | `ssh_host_key_path` | Path to SSH host private key |
| `--ssh-agent` | `LVMSYNC_SSH_AGENT` | `ssh_agent` | Use SSH agent for authentication |
| `--ssh-port` | `LVMSYNC_SSH_PORT` | `ssh_port` | SSH port |
| `--ssh-timeout` | `LVMSYNC_SSH_TIMEOUT` | `ssh_timeout` | SSH connection timeout |
| `--ssh-keepalive` | `LVMSYNC_SSH_KEEPALIVE` | `ssh_keepalive` | SSH keepalive interval |
| `--ssh-host-key` | `LVMSYNC_SSH_HOST_KEY` | `ssh_host_key` | Expected SSH host public key |
| `--known-hosts` | `LVMSYNC_KNOWN_HOSTS` | `known_hosts` | Path to known_hosts file |
| `--strict-host-key-checking` | `LVMSYNC_STRICT_HOST_KEY_CHECKING` | `strict_host_key_checking` | Require host keys to be present in known_hosts; when false, host key verification is disabled |
| `--lvmsync-path` | `LVMSYNC_LVMSYNC_PATH` | `lvmsync_path` | Remote command to run (basename sanitized; only `[a-zA-Z0-9._-]+` allowed) |
| `--remote-pre-script` | `LVMSYNC_REMOTE_PRE_SCRIPT` | `remote_pre_script` | Remote script to run before transfer (times out after ssh_timeout) |
| `--remote-post-script` | `LVMSYNC_REMOTE_POST_SCRIPT` | `remote_post_script` | Remote script to run after transfer (separate ssh_timeout) |
| `--dedup-strategy` | `LVMSYNC_DEDUP_STRATEGY` | `dedup_strategy` | Deduplication strategy: none, auto, checksum, rolling_hash, or bloom |
| `--dedup-state-file` | `LVMSYNC_DEDUP_STATE_FILE` | `dedup_state_file` | Path to deduplication state file |
| `--intra-dedup` | `LVMSYNC_DEDUP_INTRA_DEDUP` | `intra_dedup` | Enable intra-run deduplication |
| `--cdc-min` | `LVMSYNC_DEDUP_CDC_MIN` | `cdc_min` | Minimum chunk size for CDC (must be at least 64 bytes) |
| `--cdc-avg` | `LVMSYNC_DEDUP_CDC_AVG` | `cdc_avg` | Target average chunk size for CDC |
| `--cdc-max` | `LVMSYNC_DEDUP_CDC_MAX` | `cdc_max` | Maximum chunk size for CDC |
| `--chunk-seed` | `LVMSYNC_DEDUP_CHUNK_SEED` | `chunk_seed` | Seed for chunking |
| `--bloom-entries` | `LVMSYNC_DEDUP_BLOOM_ENTRIES` | `bloom_entries` | Estimated number of entries for bloom filter |
| `--bloom-fp-rate` | `LVMSYNC_DEDUP_BLOOM_FP_RATE` | `bloom_fp_rate` | False positive rate for bloom filter |
| `--bloom-mbits` | `LVMSYNC_DEDUP_BLOOM_MBITS` | `bloom_mbits` | Bloom filter m bits power |
| `--compress` | `LVMSYNC_COMPRESSION_COMPRESS` | `compress` | Compression type: none, lz4, zstd, or auto |
| `--zstd-level` | `LVMSYNC_COMPRESSION_ZSTD_LEVEL` | `zstd_level` | Zstd compression level (1-5) |
| `--lz4-level` | `LVMSYNC_COMPRESSION_LZ4_LEVEL` | `lz4_level` | LZ4 compression level: fast or hc |
--compress-concurrency LVMSYNC_COMPRESSION_COMPRESS_CONCURRENCY compress_concurrency Compression concurrency (0 to use GOMAXPROCS)
--compress-threshold LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD compress_threshold Skip compression when estimated ratio exceeds this value
--skip-snapshot-creation LVMSYNC_SKIP_SNAPSHOT_CREATION skip_snapshot_creation Skip automatic snapshot creation (requires --force)
--skip-disk-check LVMSYNC_SKIP_DISK_CHECK skip_disk_check Skip disk space check before snapshot creation
--snapshot-size LVMSYNC_SNAPSHOT_SIZE snapshot_size Snapshot size (e.g., 20G or 20%)
--snapshot-max-usage LVMSYNC_SNAPSHOT_MAX_USAGE snapshot_max_usage Maximum allowed snapshot usage percent before aborting
--lvm-escalation LVMSYNC_LVM_ESCALATION lvm_escalation Command used to escalate privileges for LVM commands; parsed with shell-style quoting and validated at startup
--sanitize-env LVMSYNC_SANITIZE_ENV sanitize_env Drop dangerous variables like LD_PRELOAD and remove PATH/LANG during escalation (disabled by default)
--no-new-privs LVMSYNC_NO_NEW_PRIVS no_new_privs Set PR_SET_NO_NEW_PRIVS before invoking sudo
--lvm-timeout LVMSYNC_LVM_TIMEOUT lvm_timeout Timeout for LVM operations and privilege checks
--sig-cache-ttl LVMSYNC_LVM_SIG_CACHE_TTL sig-cache-ttl TTL for cached LVM signatures
--sig-cache-max LVMSYNC_LVM_SIG_CACHE_MAX sig-cache-max Maximum cached LVM signatures
--volume-group LVMSYNC_VOLUME_GROUP volume_group Source volume group; derived from the source device path when empty
--target-volume-group LVMSYNC_TARGET_VOLUME_GROUP target_volume_group Volume group name of the target LVM volume
--target-vgs LVMSYNC_TARGET_VGS target_vgs Candidate target volume groups for auto-selection
--create-dest-lv LVMSYNC_CREATE_DEST_LV create_dest_lv Create destination logical volume when missing (requires --force or confirmation)
--force LVMSYNC_FORCE force Override safety checks and proceed on mounted destination
--force-offline LVMSYNC_FORCE_OFFLINE force_offline Allow direct device writes; prompts for double-confirm when interactive (requires --yes-i-know otherwise)
--allow-overwrite LVMSYNC_ALLOW_OVERWRITE allow_overwrite Allow overwriting existing data; requires --yes-i-know for non-interactive sessions
--check-partition LVMSYNC_CHECK_PARTITION check_partition Verify partition signatures for source and destination
--discard LVMSYNC_DISCARD discard Issue BLKDISCARD before writing blocks and verify discarded regions
--dry-run LVMSYNC_DRY_RUN dry_run Log estimated transfer bytes without sending data; uses manifest sampling when available
--enable-quic LVMSYNC_ENABLE_QUIC enable_quic Enable QUIC transport registration
--plan LVMSYNC_PLAN plan Print configuration plan as JSON and exit
--verify-only LVMSYNC_VERIFY_ONLY verify_only Read source and destination and report mismatches without writing data
--probe-only LVMSYNC_PROBE_ONLY probe_only Validate devices and privileges and print size_bytes kernel_uuid gpt_uuid mbr_signature fs_uuid major minor manifest_epoch without transferring data
--sparse LVMSYNC_SPARSE sparse Sparse file handling: auto punches holes, never writes zero blocks
--transport LVMSYNC_TRANSPORT_TRANSPORT transport Ordered transports to try (e.g., ssh,tcp+tls,h2,quic)
--tcp-port LVMSYNC_TRANSPORT_TCP_PORT tcp_port TCP+TLS port
--tcp-parallel LVMSYNC_TRANSPORT_TCP_PARALLEL tcp_parallel Number of parallel TCP connections
--tcp-lowat LVMSYNC_TRANSPORT_TCP_LOWAT tcp_lowat TCP_NOTSENT_LOWAT in bytes
--client-cert LVMSYNC_CLIENT_CERT client_cert Client TLS certificate file
--client-key LVMSYNC_CLIENT_KEY client_key Client TLS key file
--ca-cert LVMSYNC_CA_CERT ca_cert CA certificate file
--allow-insecure - allow_insecure Allow insecure (no TLS)

If --ssh-key is empty, lvmsync contacts the SSH agent referenced by SSH_AUTH_SOCK. The agent connection uses --ssh-timeout as its deadline. SSH transport negotiation also derives read and write deadlines from the caller's context; when the context expires, the handshake fails quickly and deadlines are cleared afterward.

Common deployment scenarios

  • Local disk to disk:

    lvmsync run /dev/vg0/source /dev/vg0/backup
  • Remote over SSH:

    lvmsync run /dev/vg0/source user@backup:/dev/vg1/target --ssh-key ~/.ssh/id_ed25519
  • Rsync delta with dedup and compression:

    lvmsync run --delta=rsync --dedup-strategy bloom --compress lz4 --transport rsync --allow-insecure --dry-run /tmp/src /tmp/dst

    The rsync transport is excluded from binaries by default. Compile with go build -tags rsync (and run tests with go test -tags rsync) to enable it.

  • Throughput-optimized transfer:

    lvmsync run --mode throughput /dev/vg0/source /dev/vg1/target

lvmsyncd Examples

lvmsyncd exposes replication endpoints with the --listen flag. Each URI scheme selects a transport and optional parameters configure authentication.

Start a TLS/TCP listener:

lvmsyncd --listen tcp+tls://:9443 --server-cert server.pem --server-key server.key --client-cert client.pem --client-key client.key --ca-cert ca.pem

Activate an SSH listener:

lvmsyncd --listen ssh://:2222 --ssh-host-key-path host_key

Both transports require explicit keys; the daemon exits if any are missing and never generates self-signed certificates. Use --allow-insecure only for development.

config.yaml example

parallel: 4               # General Options
ssh_host: backup          # SSH Options
ssh_user: backup          # SSH Options
remote_pre_script: pre.sh # Remote Options
dedup_strategy: bloom     # Deduplication Options
compress: auto            # Compression Options
zstd_level: 3             # Compression Options
lz4_level: hc             # Compression Options
compress_threshold: 0.9   # Compression Options
snapshot_size: 20%        # LVM Options
create_dest_lv: false     # LVM Options

Use --config to point to a different file.

Invocation examples

With flags:

lvmsync run --dry-run --parallel 8 \
  --compress auto --zstd-level 3 --lz4-level hc --compress-threshold 0.9 \
  --snapshot-size 10% /dev/vg0/snap0 /mnt/backup

With environment variables:

LVMSYNC_PARALLEL=8 LVMSYNC_SNAPSHOT_SIZE=10% lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

With a config file:

lvmsync run --dry-run --config config.yaml /dev/vg0/snap0 /mnt/backup

Transport Registry

Transport selection is controlled by the --transport flag, which accepts a comma-separated ordered list of transports to attempt (for example ssh,tcp+tls,h2,quic). The quic transport runs over TLS 1.3 with mutual authentication, negotiates the lvmsync ALPN, and exposes both bidirectional streams and datagrams. The h2 transport also requires TLS 1.3 with client certificates and negotiates the h2 ALPN. Provide certificates via --server-cert, --server-key, --client-cert, --client-key, and --ca-cert. TLS transports require a trusted CA certificate and refuse connections when no roots are provided unless insecure mode is explicitly acknowledged with the --allow-insecure flag. This bypasses certificate verification and is intended for development only; configuration files alone cannot enable it. Client certificates must be supplied explicitly; transports no longer generate self-signed certificates automatically. The transport documentation covers each option in depth. The flags below configure transport behavior.

Flags and environment variables

| Flag | Environment variable | Description | mTLS |
|------|----------------------|-------------|------|
| --transport | LVMSYNC_TRANSPORT_TRANSPORT | Ordered transports to try (e.g., ssh,tcp+tls,h2,quic) | |
| --concurrency | LVMSYNC_TRANSPORT_CONCURRENCY | Stream concurrency (0 to autotune based on BDP) | |
| --tcp-port | LVMSYNC_TRANSPORT_TCP_PORT | TCP+TLS port | |
| --h2-port | LVMSYNC_H2_PORT | HTTP/2 port | |
| --tcp-parallel | LVMSYNC_TRANSPORT_TCP_PARALLEL | Number of parallel TCP connections | n/a |
| --tcp-lowat | LVMSYNC_TRANSPORT_TCP_LOWAT | TCP_NOTSENT_LOWAT in bytes | n/a |
| --ssh-port | LVMSYNC_SSH_PORT | SSH port | ❌ |
| --client-cert | LVMSYNC_CLIENT_CERT | Client TLS certificate file | ✅ |
| --client-key | LVMSYNC_CLIENT_KEY | Client TLS key file | ✅ |
| --ca-cert | LVMSYNC_CA_CERT | CA certificate file | ✅ |

Usage examples

Multiple transports

lvmsync run --dry-run --transport ssh,tcp+tls,h2,quic --tcp-port 9443 /dev/vg0/snap0 /mnt/backup
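
The ordered list can be read as "try each transport in turn and use the first one that connects". The following Go sketch illustrates that reading only; it is not LVMSync's registry code. The names parseTransports, dialFirst, dialFunc, and errUnavailable are hypothetical, and the fall-through-on-failure behavior is an assumption drawn from the flag description.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// dialFunc is a hypothetical stand-in for a transport's connect routine.
type dialFunc func() error

// errUnavailable is a placeholder error used by the demo dialers.
var errUnavailable = errors.New("unavailable")

// parseTransports splits a comma-separated list and keeps its order.
func parseTransports(list string) ([]string, error) {
	var out []string
	for _, name := range strings.Split(list, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			return nil, errors.New("empty transport name")
		}
		out = append(out, name)
	}
	return out, nil
}

// dialFirst tries each transport in order and returns the first that succeeds.
// Failures are collected so the caller can see why earlier transports were skipped.
func dialFirst(order []string, dialers map[string]dialFunc) (string, error) {
	if len(order) == 0 {
		return "", errors.New("no transports configured")
	}
	var errs []error
	for _, name := range order {
		dial, ok := dialers[name]
		if !ok {
			errs = append(errs, fmt.Errorf("%s: not registered", name))
			continue
		}
		if err := dial(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", name, err))
			continue
		}
		return name, nil
	}
	return "", errors.Join(errs...)
}

func main() {
	order, err := parseTransports("ssh,tcp+tls,h2,quic")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Demo dialers: ssh fails, tcp+tls succeeds, the rest are never reached.
	dialers := map[string]dialFunc{
		"ssh":     func() error { return errUnavailable },
		"tcp+tls": func() error { return nil },
	}
	name, err := dialFirst(order, dialers)
	fmt.Println("selected:", name, "error:", err)
}
```

Collecting every failure with errors.Join keeps the reason for each skipped transport visible when none of them connects.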

QUIC

0-RTT data is disabled by default.

lvmsync run --dry-run --transport quic --client-cert cert.pem --client-key key.pem --ca-cert ca.pem
# or
LVMSYNC_TRANSPORT_TRANSPORT=quic LVMSYNC_CLIENT_CERT=cert.pem LVMSYNC_CLIENT_KEY=key.pem LVMSYNC_CA_CERT=ca.pem lvmsync run --dry-run

TCP+TLS

lvmsync run --dry-run --transport tcp+tls --tcp-port 9443
# or
LVMSYNC_TRANSPORT_TRANSPORT=tcp+tls LVMSYNC_TRANSPORT_TCP_PORT=9443 lvmsync run --dry-run

HTTP/2

lvmsync run --dry-run --transport h2 --h2-port 9443 --client-cert cert.pem --client-key key.pem --ca-cert ca.pem

SSH

lvmsync run --dry-run --transport ssh backup@host:/dev/vg1/target --ssh-port 2222
# or
LVMSYNC_TRANSPORT_TRANSPORT=ssh LVMSYNC_SSH_PORT=2222 lvmsync run --dry-run backup@host:/dev/vg1/target

Hybrid Deduplication and Adaptive Compression

Hybrid dedup combines fixed-size and content-defined chunking. Enable it with --dedup hybrid and tune FastCDC with --cdc-min, --cdc-avg, and --cdc-max.

Flag (--cdc-*) Environment variable Config key Description
--cdc-min LVMSYNC_DEDUP_CDC_MIN cdc_min Minimum chunk size (must be at least 64 bytes)
--cdc-avg LVMSYNC_DEDUP_CDC_AVG cdc_avg Target average chunk size
--cdc-max LVMSYNC_DEDUP_CDC_MAX cdc_max Maximum chunk size

The three values must be positive, with --cdc-min at least 64 bytes, and satisfy --cdc-min ≤ --cdc-avg ≤ --cdc-max. LVMSync aborts when the sizes are non-positive, below the minimum, or unordered.
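
The constraints above can be expressed as a small check. This is a minimal Go sketch of the stated rules, not LVMSync's validation code; the function name validateCDC and the error wording are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// minCDCChunk is the lower bound on --cdc-min stated above.
const minCDCChunk = 64

// validateCDC enforces: all sizes positive, minSize >= 64, minSize <= avgSize <= maxSize.
func validateCDC(minSize, avgSize, maxSize int) error {
	switch {
	case minSize <= 0 || avgSize <= 0 || maxSize <= 0:
		return errors.New("cdc sizes must be positive")
	case minSize < minCDCChunk:
		return fmt.Errorf("cdc-min %d is below %d bytes", minSize, minCDCChunk)
	case minSize > avgSize || avgSize > maxSize:
		return fmt.Errorf("cdc sizes must satisfy min <= avg <= max, got %d/%d/%d",
			minSize, avgSize, maxSize)
	}
	return nil
}

func main() {
	// Values taken from the hybrid dedup example in this section.
	fmt.Println(validateCDC(262144, 1048576, 4194304))
	// An unordered configuration is rejected.
	fmt.Println(validateCDC(1048576, 262144, 4194304))
}
```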

The Bloom filter de-duplicates previously seen chunks. Size it with --bloom-entries and the desired false positive rate via --bloom-fp-rate. For an mmap-backed index, --bloom-mbits controls the bitmap size; the flag table describes the value as the "m bits power", and the 16 MiB figure below matches 2^27 bits. The defaults (--bloom-entries=1000000, --bloom-fp-rate=0.01) consume about 1.14 MiB and yield ~1% false positives, while --bloom-mbits=27 allocates roughly 16 MiB for ~0.8% false positives.

False positives are rare but possible; if a chunk collides in the Bloom filter it is treated as already transferred. A final SHA-256 digest over the transfer detects any mismatches so retries can resend the affected data. The mmap-backed index (*.idx) is truncated to zero on startup so each run begins with a clean bitset.
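
The 1.14 MiB figure for the defaults is consistent with the textbook Bloom filter sizing formula m = -n·ln(p) / (ln 2)². The Go sketch below reproduces that arithmetic; whether LVMSync uses exactly this formula internally is an assumption, and the helper names bloomBits and bloomHashes are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// bloomBits returns the optimal bitmap size in bits for n entries at
// false positive rate p: m = -n*ln(p) / (ln 2)^2.
func bloomBits(n uint64, p float64) uint64 {
	m := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	return uint64(math.Ceil(m))
}

// bloomHashes returns the optimal number of hash functions k = (m/n)*ln 2.
func bloomHashes(m, n uint64) int {
	return int(math.Round(float64(m) / float64(n) * math.Ln2))
}

func main() {
	const mib = 1 << 20
	// Defaults quoted above: --bloom-entries=1000000, --bloom-fp-rate=0.01.
	n, p := uint64(1_000_000), 0.01
	m := bloomBits(n, p)
	fmt.Printf("bits=%d size=%.2f MiB hashes=%d\n", m, float64(m)/8/mib, bloomHashes(m, n))
	// For comparison, a bitmap of 2^27 bits occupies exactly 16 MiB.
	fmt.Printf("2^27 bits = %d MiB\n", (1<<27)/8/mib)
}
```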

Compression samples 8 KiB from each chunk and skips when the estimated ratio exceeds --compress-threshold. --compress auto selects Zstd when AVX2 or NEON is available, falling back to LZ4 otherwise.

CLI:

lvmsync run --dry-run --dedup hybrid --cdc-min 262144 --cdc-avg 1048576 --cdc-max 4194304 /dev/vg0/snap0 /mnt/backup

Environment:

LVMSYNC_DEDUP=hybrid \
LVMSYNC_DEDUP_CDC_MIN=262144 \
LVMSYNC_DEDUP_CDC_AVG=1048576 \
LVMSYNC_DEDUP_CDC_MAX=4194304 \
lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

YAML:

dedup: hybrid
cdc_min: 262144
cdc_avg: 1048576
cdc_max: 4194304

Dedup configuration

The dedup package exposes a LoadConfig helper that reads tuning parameters from flags, LVMSYNC_* environment variables, or keys in a YAML file. Values are resolved with the following precedence (highest first):

  1. Command-line flags
  2. LVMSYNC_* environment variables
  3. config.yaml
  4. Built-in defaults
Flag Environment variable Config key Description Default
--min-chunk-size LVMSYNC_MIN_CHUNK_SIZE min_chunk_size Minimum chunk size in bytes 4096
--max-chunk-size LVMSYNC_MAX_CHUNK_SIZE max_chunk_size Maximum chunk size in bytes 1048576
--false-positive-rate LVMSYNC_FALSE_POSITIVE_RATE false_positive_rate Bloom filter false positive rate 0.001
--ram-bytes LVMSYNC_RAM_BYTES ram_bytes RAM budget for the Bloom filter 1073741824
--volume-size LVMSYNC_VOLUME_SIZE volume_size Size of the volume being processed 0
--hash-key LVMSYNC_HASH_KEY hash_key Optional hex-encoded key for BLAKE3 hashing ""

Two presets are available via --mode: default and throughput. Any other value causes configuration validation to fail.

Throughput Mode Presets

--mode throughput applies a set of options tuned for high-bandwidth links:

  • transport order ssh,tcp+tls,h2,quic
  • concurrency 8
  • deduplication mode hybrid
  • compression auto
  • enables --odirect

CLI:

lvmsync run --dry-run --mode throughput /dev/vg0/snap0 /mnt/backup

Environment:

LVMSYNC_MODE=throughput lvmsync run --dry-run /dev/vg0/snap0 /mnt/backup

YAML:

mode: throughput

Logging and progress

Logs are emitted with zap to stderr. When --output=text (default), progress updates are logged to stderr when --progress is enabled. Set --output=json to emit progress events on stdout as line-delimited JSON objects while preserving the regular logs on stderr. The verify subcommand and --verify-only mode support --output=json or --output=yaml to write structured verification results to stdout. Each progress object follows this schema:

{"event": "progress", "bytes_transferred": 12345, "bytes_total": 67890, "progress_percent": 18.2}

A final object with {"event": "complete", "progress_percent": 100} marks completion. Disable progress entirely with --progress=false.
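
A consumer can read these events one line at a time. The Go sketch below decodes the two objects shown above; the type ProgressEvent and the function parseEvents are illustrative names, and the struct only covers the fields that appear in the schema.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// ProgressEvent mirrors the fields shown in the schema above.
type ProgressEvent struct {
	Event            string  `json:"event"`
	BytesTransferred int64   `json:"bytes_transferred"`
	BytesTotal       int64   `json:"bytes_total"`
	ProgressPercent  float64 `json:"progress_percent"`
}

// parseEvents decodes one JSON object per line and stops after the
// "complete" event or at end of input.
func parseEvents(input string) ([]ProgressEvent, error) {
	var events []ProgressEvent
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var ev ProgressEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			return events, fmt.Errorf("decode progress line: %w", err)
		}
		events = append(events, ev)
		if ev.Event == "complete" {
			break
		}
	}
	return events, sc.Err()
}

func main() {
	// Sample input copied from the schema above. In practice this would be
	// the stdout of an lvmsync run with --output=json.
	input := `{"event": "progress", "bytes_transferred": 12345, "bytes_total": 67890, "progress_percent": 18.2}
{"event": "complete", "progress_percent": 100}
`
	events, err := parseEvents(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ev := range events {
		fmt.Printf("%s %.1f%%\n", ev.Event, ev.ProgressPercent)
	}
}
```

Because logs stay on stderr, a consumer only needs to read stdout to receive these objects.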

Installation

Requirements

  • Go 1.22+
  • Linux only (tested on amd64 and arm64 architectures)
  • pkg-config
  • LVM2 with development headers providing liblvm2cmd (liblvm2-dev)
    • A recent LVM2 release providing the modern liblvm2cmd API (e.g., 2.03.21+) is required.
  • SSH client & server (for remote transfers)

Installing LVM2 Development Headers

CGO uses pkg-config to locate the LVM2 and device-mapper libraries. Install the development headers and pkg-config package on your system:

# Debian/Ubuntu
sudo apt install -y lvm2 liblvm2-dev pkg-config

# RHEL/CentOS
sudo yum install -y lvm2-devel pkgconfig

If the .pc files are installed in a non-standard location, set PKG_CONFIG_PATH so that pkg-config can find them.

Build

Clone the repository and build the binary using Go modules with CGO enabled. A helper target checks for the required native libraries and pkg-config:

make deps  # verify pkg-config, device-mapper, and LVM2 headers

The make build target runs this check automatically.

Then build the binaries:

git clone https://github.com/oferchen/lvmsync_go.git
cd lvmsync_go
go mod tidy
CGO_ENABLED=1 go build -o lvmsync .

To build on systems without LVM2, disable CGO. This uses stub implementations and omits LVM features:

CGO_ENABLED=0 go build -o lvmsync .

Makefile

make build   # build binaries
make test    # run tests

Usage

Show Version

lvmsync --version

Outputs Version Commit BuildDate, for example:

v0.2.0 abcdef1 2024-01-02T15:04:05Z

Basic Syntax

lvmsync run [--dry-run] [--transport ssh,tcp+tls,h2,quic] <snapshot|lvm device> <destination>

The tool supports both local and remote transfers. Use --dry-run to print planned actions without executing and --transport to provide an ordered list of transports to try.

Resume, Manifest, and Verify

Run an initial transfer and write a manifest for later verification or incremental runs:

lvmsync run --dry-run --manifest-path snapshot.manifest /dev/vg0/source /dev/vg1/target

See the manifest documentation for details on the binary format and rebuild options.

Resume an interrupted transfer using a checkpointed state file:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Rebuild a manifest index for an existing device. The command verifies the current manifest and rewrites it when digests or device metadata have changed:

lvmsync manifest rebuild /dev/vg0/lv0

Progress logs are emitted every 10s by default; adjust with --manifest-progress-interval. The command times out after 1m unless overridden with --manifest-timeout (0 disables). Rebuild refuses to run if the device is mounted read-write; pass --manifest-allow-mounted to override. Mount detection parses /proc/self/mountinfo using github.com/moby/sys/mountinfo, correctly handling bind mounts, repeated entries, and devices with spaces or special characters. Rebuild fails if the device reports a block size of 0.

Manifests embed a persistent device identifier in a fixed 64-byte field. The manifest rebuild command fails if the identifier exceeds this limit.

Verify that a source and destination match:

lvmsync verify /dev/vg0/source /dev/vg1/target

Supply options such as block size or deduplication mode to control how data is compared. For example, to estimate verification without reading data:

lvmsync verify --dry-run /dev/vg0/source /dev/vg1/target

To verify using 4 KiB blocks and a manifest generated earlier:

lvmsync verify --block-size 4K /dev/vg0/source /dev/vg1/target

Flags are parsed via Viper, so the same settings can be provided through LVMSYNC_* environment variables or a config.yaml file.

Options

General Options

Option Description Default
--config Path to a YAML configuration file ""
--parallel Number of concurrent workers 4
--zerocopy Enable zero-copy transfers (only used in sequential mode) false
--max-retries Maximum number of retries per block 3
--retry-delay Initial delay between retries 100ms
--resume Path to resume state file (verification runs unless --verify=none) ""
--verify-only Verify destination against source without writing data false
--speed Transfer speed limit (e.g., "100MB") "100MB"
-v, --verbose Verbosity level (e.g., -v, -vv, -vvv) 0
--verify-checksum Enable checksum verification for data integrity false
--progress Show progress percentage during the transfer true
--block-size Block size for data transfer (e.g., "4K", "64K", "512K", "1M"), use 0 for automatic detection "4K"
--delta Delta algorithm (none or rsync) "none"
--dry-run Print actions without executing false
--plan Print configuration plan as JSON and exit false
--discard Issue BLKDISCARD before writing blocks false
--sparse Sparse file handling: auto punches holes, never writes zero blocks "auto"
--offline Assume source raw device is offline false
--fs-freeze-command Command to freeze filesystem before reading raw source; path must be absolute and arguments use shell-style quoting ""
--fs-thaw-command Command to thaw filesystem after reading raw source; path must be absolute and arguments use shell-style quoting ""
--freeze-timeout Timeout for filesystem freeze command 10s
--thaw-timeout Timeout for filesystem thaw command 10s
--transport Ordered transports to try (e.g., ssh,tcp+tls,h2,quic) ""

SSH Options

Option Description Default
--ssh-host SSH host "localhost"
--ssh-user SSH username "root"
--ssh-key Path to SSH private key ""
--ssh-host-key-path Path to SSH host private key (generates one if empty) ""
--ssh-agent Use the SSH agent for authentication false
--ssh-port SSH port number 22
--known-hosts Path to known_hosts file (defaults to $HOME/.ssh/known_hosts) $HOME/.ssh/known_hosts
--strict-host-key-checking Require host keys to be present in known_hosts; when false, host key verification is disabled true
--ssh-host-key Expected SSH host public key (authorized_keys format) ""

Unknown hosts are rejected unless their keys are present in known_hosts or match --ssh-host-key. The host key can also be supplied via LVMSYNC_SSH_HOST_KEY_PATH or the ssh_host_key_path YAML option; precedence follows flags over environment variables over YAML.

Programmatic use of the SSH transport requires a configuration populated with fields like SSHUser, SSHKeyPath, HostKeyPath, SSHUseAgent, SSHPort, KnownHosts, StrictHostKeyCheck, SSHTimeout, SSHKeepAliveInterval, and MaxRetries. The constructor also requires a *zap.Logger:

logger, _ := zap.NewProduction()
defer logger.Sync()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
tr, err := ssh.New(ctx, cfg)

Remote Options

The --lvmsync-path value is sanitized to its basename and must match [a-zA-Z0-9._-]+ to prevent shell injection.

Option Description Default
--lvmsync-path Remote command to run (sanitized basename) "lvmsync"
--remote-pre-script Remote script to run before starting the transfer (ssh_timeout applies) ""
--remote-post-script Remote script to run after finishing the transfer (uses a fresh ssh_timeout) ""

Each script is bounded by the configured ssh_timeout. The post script gets its own fresh timeout and still attempts to execute even when the main transfer fails; timeouts or cancellations are reported separately.
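
The --lvmsync-path rule described at the top of this section (reduce to the basename, then require [a-zA-Z0-9._-]+) can be sketched in Go as follows. The function name sanitizeRemoteCommand is hypothetical, and rejecting "." and ".." is an extra precaution of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// safeName matches the allowed character set from the option description.
var safeName = regexp.MustCompile(`^[a-zA-Z0-9._-]+$`)

// sanitizeRemoteCommand strips any directory part and validates the basename.
func sanitizeRemoteCommand(p string) (string, error) {
	base := filepath.Base(p)
	// "." and ".." satisfy the pattern but are not command names
	// (assumption of this sketch, not taken from the documentation).
	if base == "." || base == ".." {
		return "", fmt.Errorf("remote command %q is not a file name", p)
	}
	if !safeName.MatchString(base) {
		return "", fmt.Errorf("remote command %q contains disallowed characters", base)
	}
	return base, nil
}

func main() {
	for _, p := range []string{"/usr/bin/lvmsync", "/tmp/evil;id", ""} {
		name, err := sanitizeRemoteCommand(p)
		fmt.Printf("%q -> %q, err=%v\n", p, name, err)
	}
}
```

Keeping only the basename means the remote side resolves the command through its own PATH, and the character whitelist leaves no room for shell metacharacters.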

Deduplication Options

Option Description Default
--dedup Deduplication mode ("fixed", "cdc", or "hybrid") "fixed"
--cdc-min Minimum chunk size for CDC 262144
--cdc-avg Average chunk size for CDC 1048576
--cdc-max Maximum chunk size for CDC 4194304
--dedup-strategy Deduplication strategy ("none", "auto", "checksum", "rolling_hash", or "bloom"); use none to disable "none"
--dedup-state-file Path to deduplication state file ~/.lvmsync_dedup
--bloom-entries Estimated number of entries for bloom filter 1000000
--bloom-fp-rate False positive rate for bloom filter 0.01
--bloom-mbits Size of Bloom filter bitmap in megabits (mmap index) 0

Compression Options

Option Description Default
--compress Compression type (options: "none", "lz4", "zstd", "auto") "auto"
--zstd-level Zstd compression level (1-5) 1
--lz4-level LZ4 compression level: fast or hc fast
--compress-concurrency Number of goroutines used for compression (0 to use all cores) 0
--compress-threshold Skip compression when estimated ratio exceeds this value 0.9

LVM Options

Option Description Default
--skip-snapshot-creation Skip automatic snapshot creation false
--skip-disk-check Skip disk space check before snapshot creation false
--snapshot-size Snapshot size as an absolute value (e.g., "20G") or as a percentage (e.g., "20%") "20%"
--snapshot-max-usage Maximum allowed snapshot usage percent before aborting 80
--volume-group Source volume group. Derived from the source device path when empty ""
--target-volume-group Volume group name of the target LVM volume ""
--target-vgs Candidate target volume groups for auto-selection []
--lvm-escalation Command used to re-execute the program with elevated privileges when not running as root (e.g., sudo -p "my prompt" -n); parsed with shell-style quoting and validated at startup "sudo -n"
--lvm-timeout Timeout for LVM operations and privilege checks 10s
--sig-cache-ttl TTL for cached LVM signatures 24h
--sig-cache-max Maximum cached LVM signatures 128

The --lvm-escalation value must be a single executable and its arguments; pipes, redirects, and other shell operators are rejected. Quote any paths or argument values containing spaces, for example:

--lvm-escalation '"/usr/bin/sudo wrapper" -p "no password" -n'

lvm_timeout also bounds the startup privilege check to avoid hanging when escalation commands stall.

The client aborts dialing if a connection cannot be established within

Option Description Default
--keepalive-time Interval between server pings 2m
--keepalive-timeout Timeout waiting for keepalive ack 20s
--request-timeout Deadline for unary RPCs 15s
--client-cert Client TLS certificate file ""
--client-key Client TLS key file ""
--ca-cert CA certificate file ""
--allow-insecure Allow insecure (disable TLS) false

Examples

Local Transfer

Transfer changes from a snapshot to a destination device locally:

lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

Remote Transfer

Replicate data to a remote host. The destination must be specified in host:device format (optionally including a username, e.g., user@host:/dev/vg0/data):

lvmsync run --dry-run /dev/vg0/snap0 user@remote:/dev/vg0/data

Using Compression

Estimate the compression ratio from a sample of each chunk and compress only when it is worthwhile.

CLI:

lvmsync run --dry-run --compress auto --zstd-level 2 --compress-threshold 0.85 /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_COMPRESSION_COMPRESS=auto \
LVMSYNC_COMPRESSION_ZSTD_LEVEL=2 \
LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD=0.85 \
lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

compress: auto
zstd_level: 2
compress_threshold: 0.85

Rate Limiting

Transfers can be throttled using a token bucket accurate to ±3% of the target. Each writer has its own limiter, so multiple transfers with different limits run independently. Limit the transfer speed to 50MB/s:

lvmsync run --dry-run --speed 50MB /dev/vg0/snap0 /dev/vg0/data
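
For readers unfamiliar with token buckets, here is a minimal Go sketch of the idea. It is not LVMSync's limiter and makes no claim about the ±3% accuracy; the type Bucket, its methods, and the 1 MiB burst are assumptions chosen for illustration. The clock is passed in as seconds so the example runs instantly; a real writer would use the wall clock and sleep for the returned duration.

```go
package main

import "fmt"

// Bucket is a token bucket measured in bytes.
type Bucket struct {
	rate   float64 // refill rate in bytes per second
	burst  float64 // maximum number of tokens that can accumulate
	tokens float64 // tokens currently available (negative means debt)
	last   float64 // time of the last refill, in seconds
}

// NewBucket returns a full bucket created at time now (seconds).
func NewBucket(bytesPerSec, burst, now float64) *Bucket {
	return &Bucket{rate: bytesPerSec, burst: burst, tokens: burst, last: now}
}

// Reserve takes n bytes worth of tokens at time now and returns how many
// seconds the caller should wait before sending them.
func (b *Bucket) Reserve(n int, now float64) float64 {
	if elapsed := now - b.last; elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	b.tokens -= float64(n)
	if b.tokens >= 0 {
		return 0
	}
	return -b.tokens / b.rate
}

func main() {
	const rate = 50e6 // 50 MB/s, as in the command above
	b := NewBucket(rate, 1<<20, 0)
	clock := 0.0 // virtual clock in seconds
	sent := 0
	for i := 0; i < 100; i++ {
		clock += b.Reserve(1<<20, clock) // "sleep" until the write is allowed
		sent += 1 << 20
	}
	fmt.Printf("sent %d bytes in %.2fs (virtual time)\n", sent, clock)
}
```

Giving each writer its own Bucket value is what lets transfers with different limits run independently.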

Resuming a Transfer

Resume an interrupted transfer using a resume state file. The file records the last chunk boundaries and digests for fixed, CDC, and hybrid modes. Progress is checkpointed every --checkpoint-bytes or --checkpoint-interval, and the resume file is removed on successful completion. Changing the transport, compression, checksum algorithm, or dedup mode invalidates the checkpoint:

lvmsync run --dry-run --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Full LVM Operation Example

lvmsync run --dry-run --skip-disk-check=false --snapshot-size "25%" --volume-group "vg_data" --lvm-escalation "sudo -n" /dev/vg_data/original /dev/vg_data/destination

In this example, LVMSync will:

  • Validate that the volume group vg_data exists.
  • Create a snapshot of /dev/vg_data/original sized at 25% of the original volume.
    • Automatically re-exec with sudo -n if not running as root.
  • Monitor snapshot usage (failing fast if usage exceeds 80% by default; configurable via --snapshot-max-usage).
  • Perform the block-level transfer.
  • Remove the snapshot upon completion.
  • Clean up gracefully if interrupted.

Manifest Rebuild and Verification

Rebuild a manifest for an existing device when the index is missing or stale:

lvmsync manifest rebuild /dev/vg0/lv0

Progress logs are emitted every 10s by default; adjust with --manifest-progress-interval. The command times out after 1m unless overridden with --manifest-timeout (0 disables).

Compare source and destination devices against a manifest:

lvmsync verify /dev/vg0/snap0 /mnt/backup

Use --dry-run with verify to inspect planned operations without modifying the destination:

lvmsync verify --dry-run /dev/vg0/source /dev/vg1/target

Configuration Sources

LVMSync binds its command line flags to Viper, allowing configuration through flags, environment variables, or a YAML file. The resolution order is:

  1. command line flags
  2. environment variables (LVMSYNC_*)
  3. the configuration file (default: config.yaml)

Environment variables use the LVMSYNC_ prefix and match flag names converted to upper case with hyphens replaced by underscores. The --config flag can point to an alternative YAML file.
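
The naming rule and resolution order can be sketched in a few lines of Go. This is an illustration of the rules as stated, not how Viper implements them; envName and resolve are hypothetical helpers, and the final fallback to a built-in default follows the precedence list given in the dedup configuration section. Some variables in the flag tables carry an extra group segment (for example LVMSYNC_DEDUP_CDC_MIN for --cdc-min), so treat the tables as authoritative.

```go
package main

import (
	"fmt"
	"strings"
)

// envName applies the stated rule: LVMSYNC_ prefix, upper case,
// hyphens replaced by underscores.
func envName(flag string) string {
	name := strings.TrimLeft(flag, "-")
	name = strings.ReplaceAll(name, "-", "_")
	return "LVMSYNC_" + strings.ToUpper(name)
}

// resolve returns the first value that is set, in precedence order:
// flag, environment variable, config file, then the built-in default.
// A nil pointer means "not set at this level".
func resolve(flagVal, envVal, fileVal *string, def string) string {
	for _, v := range []*string{flagVal, envVal, fileVal} {
		if v != nil {
			return *v
		}
	}
	return def
}

func main() {
	fmt.Println(envName("--ssh-port"))

	env, file := "8", "4"
	// No flag given: the environment value wins over the config file.
	fmt.Println(resolve(nil, &env, &file, "4"))
}
```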

Unused or unknown keys in the YAML file produce runtime warnings to surface typos. For example, a config.yaml containing an unrecognized unused_key triggers:

{"level":"warn","msg":"unknown configuration key \"unused_key\""}

Examples by Option Group

General

CLI:

lvmsync run --dry-run --parallel 8 --resume=statefile /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_PARALLEL=8 LVMSYNC_RESUME=statefile lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML (config.yaml):

parallel: 8
resume: statefile

SSH

CLI:

lvmsync run --dry-run --ssh-user backup --ssh-port 2222 /dev/vg0/snap0 backup:/dev/vg0/data

Environment:

LVMSYNC_SSH_HOST=backup LVMSYNC_SSH_USER=backup LVMSYNC_SSH_PORT=2222 lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

ssh_host: backup
ssh_user: backup
ssh_port: 2222

Remote

CLI:

lvmsync run --dry-run --lvmsync-path /usr/bin/lvmsync --remote-pre-script /tmp/pre.sh /dev/vg0/snap0 user@host:/dev/vg0/data

Environment:

LVMSYNC_LVMSYNC_PATH=/usr/bin/lvmsync LVMSYNC_REMOTE_PRE_SCRIPT=/tmp/pre.sh lvmsync run --dry-run /dev/vg0/snap0 user@host:/dev/vg0/data

YAML:

lvmsync_path: /usr/bin/lvmsync
remote_pre_script: /tmp/pre.sh

Deduplication

CLI:

lvmsync run --dry-run --dedup-strategy bloom --dedup-state-file ~/.lvmsync_state /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_DEDUP_STRATEGY=bloom LVMSYNC_DEDUP_STATE_FILE=~/.lvmsync_state lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

dedup_strategy: bloom
dedup_state_file: ~/.lvmsync_state

When --dedup-strategy auto is used, lvmsync chooses a strategy based on available RAM, volume size, and CPU capabilities.

Condition Selected strategy
Bloom filter fits in RAM bloom
Doesn't fit, checksum acceleration available checksum
Doesn't fit, no acceleration rolling_hash
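
The table above reduces to a three-way decision. A minimal Go sketch, assuming the caller already knows the estimated Bloom filter size, the RAM budget, and whether checksum acceleration is present; how lvmsync derives those inputs is not shown here, and chooseStrategy is a hypothetical name.

```go
package main

import "fmt"

// chooseStrategy mirrors the decision table: prefer bloom when the filter
// fits in the RAM budget, otherwise checksum with hardware acceleration,
// otherwise rolling_hash.
func chooseStrategy(bloomBytes, ramBytes uint64, checksumAccel bool) string {
	switch {
	case bloomBytes <= ramBytes:
		return "bloom"
	case checksumAccel:
		return "checksum"
	default:
		return "rolling_hash"
	}
}

func main() {
	const gib = 1 << 30
	fmt.Println(chooseStrategy(16<<20, 1*gib, false)) // filter fits in RAM
	fmt.Println(chooseStrategy(4*gib, 1*gib, true))   // too large, acceleration available
	fmt.Println(chooseStrategy(4*gib, 1*gib, false))  // too large, no acceleration
}
```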

See docs/dedup_strategies.md for more details.

LVMSync automatically reloads the deduplication state file on startup. Delete it to reset deduplication; for the default path, run rm ~/.lvmsync_dedup. When saving Bloom filter state, LVMSync logs dedup_bloom_stats with entries, configured_fp_rate, and observed_fp_rate.

Compression

LVMSync samples 8 KiB from each chunk to gauge compression efficiency. If the compressed sample ratio is greater than or equal to --compress-threshold, the chunk is sent uncompressed. In auto mode, Zstd is used when AVX2 or NEON is available; otherwise LZ4 is selected. The compression threshold is tunable via --compress-threshold (LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD or compress_threshold), where values near 1 favor compression and lower values skip high-entropy data. Levels can be tuned with --zstd-level (1-5) or --lz4-level (fast or hc).
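
The sampling decision can be sketched in Go as follows. This is an illustration under stated assumptions: DEFLATE from the standard library stands in for LZ4 and Zstd so the example has no external dependencies, the sample is taken from the start of the chunk, and the names sampleRatio, shouldCompress, and pseudoRandom are hypothetical.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"math/rand"
)

const sampleSize = 8 * 1024 // 8 KiB sample, as described above

// sampleRatio compresses up to sampleSize bytes of chunk and returns
// compressed size / original size. DEFLATE is a stand-in codec (assumption).
func sampleRatio(chunk []byte) (float64, error) {
	sample := chunk
	if len(sample) > sampleSize {
		sample = sample[:sampleSize]
	}
	if len(sample) == 0 {
		return 1, nil
	}
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.BestSpeed)
	if err != nil {
		return 0, err
	}
	if _, err := w.Write(sample); err != nil {
		return 0, err
	}
	if err := w.Close(); err != nil {
		return 0, err
	}
	return float64(buf.Len()) / float64(len(sample)), nil
}

// shouldCompress reports whether the sampled ratio is below the threshold.
// A ratio greater than or equal to the threshold means "send uncompressed".
func shouldCompress(chunk []byte, threshold float64) (bool, error) {
	ratio, err := sampleRatio(chunk)
	if err != nil {
		return false, err
	}
	return ratio < threshold, nil
}

// pseudoRandom returns n bytes of deterministic, poorly compressible data.
func pseudoRandom(n int) []byte {
	buf := make([]byte, n)
	rand.New(rand.NewSource(1)).Read(buf)
	return buf
}

func main() {
	zeros := make([]byte, 64*1024)
	noise := pseudoRandom(64 * 1024)
	for _, c := range []struct {
		name string
		data []byte
	}{{"zeros", zeros}, {"noise", noise}} {
		ok, err := shouldCompress(c.data, 0.9)
		fmt.Printf("%s: compress=%v err=%v\n", c.name, ok, err)
	}
}
```

Sampling a fixed 8 KiB keeps the cost of the estimate constant regardless of chunk size, at the price of misjudging chunks whose first bytes are not representative.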

CLI:

lvmsync run --dry-run --compress auto --zstd-level 2 --compress-threshold 0.85 /dev/vg0/snap0 /dev/vg0/data

Environment:

LVMSYNC_COMPRESSION_COMPRESS=auto LVMSYNC_COMPRESSION_ZSTD_LEVEL=2 LVMSYNC_COMPRESSION_COMPRESS_THRESHOLD=0.85 lvmsync run --dry-run /dev/vg0/snap0 /dev/vg0/data

YAML:

compress: auto
zstd_level: 2
compress_threshold: 0.85

LVM

CLI:

lvmsync run --dry-run --snapshot-size 25% --volume-group vg_data /dev/vg_data/original /dev/vg_data/destination

Environment:

LVMSYNC_SNAPSHOT_SIZE=25% LVMSYNC_VOLUME_GROUP=vg_data lvmsync run --dry-run /dev/vg_data/original /dev/vg_data/destination

YAML:

snapshot_size: "25%"
volume_group: vg_data
Flag Environment variable Config key Description

Precedence example:

EOF
# effective port: 3333

CLI:

Environment:

tls-cert: cert.pem

Use --config to provide an alternate config file path.

Configuration Validation

Before starting, LVMSync validates key configuration parameters:

  • Verifies that the specified volume groups exist.
  • Ensures the escalation command is available if not running as root.

Invalid configurations will cause the tool to abort with a clear error message.

Exit Codes

LVMSync signals success and failure with structured exit codes. Refer to the operations guide for the complete list and recommended recovery steps.

Code | Meaning | Recovery Step
0 | Success | None
10 | Privilege or capability check failed | Run as root or adjust --lvm-escalation.
20 | Device error | Verify device paths and snapshot health.
25 | Snapshot exhausted | Extend the snapshot or reduce source writes before resuming.
30 | Unsupported platform | Run on a supported Linux platform.
40 | Configuration error | Review flags, environment variables, and config.yaml.
50 | Runtime failure | Inspect logs and fix the issue.
60 | Verification mismatch | Investigate mismatched data before retrying.
70 | Partial transfer | Address the error and resume with --resume.
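
A wrapper script or orchestration tool can branch on these codes. The sketch below shows one way to do that from Go, assuming lvmsync is on PATH. The exitMeanings map copies the table above; meaningFor and describeExit are hypothetical helper names, and the command line is taken from the earlier dry-run examples.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// exitMeanings copies the exit code table from this README.
var exitMeanings = map[int]string{
	0:  "Success",
	10: "Privilege or capability check failed",
	20: "Device error",
	25: "Snapshot exhausted",
	30: "Unsupported platform",
	40: "Configuration error",
	50: "Runtime failure",
	60: "Verification mismatch",
	70: "Partial transfer",
}

// meaningFor returns the documented meaning or a fallback for unknown codes.
func meaningFor(code int) string {
	if m, ok := exitMeanings[code]; ok {
		return m
	}
	return "Unknown exit code"
}

// describeExit maps the error returned by (*exec.Cmd).Run to an exit code
// and its meaning. It returns -1 when the process never started.
func describeExit(err error) (int, string) {
	if err == nil {
		return 0, meaningFor(0)
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		code := exitErr.ExitCode()
		return code, meaningFor(code)
	}
	return -1, "Command did not start: " + err.Error()
}

func main() {
	cmd := exec.Command("lvmsync", "run", "--dry-run", "/dev/vg0/snap0", "/dev/vg0/data")
	code, meaning := describeExit(cmd.Run())
	fmt.Printf("exit %d: %s\n", code, meaning)
}
```

A caller would typically retry with --resume only on code 70 and stop for operator attention on codes such as 25 or 60.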

Credits

LVMSync is written in Go by Ofer Chen, inspired by mpalmer/lvmsync.

Contributing

Contributions are welcome. See AGENTS.md for detailed contributor instructions and open TODO items. Please follow the project's coding guidelines, include appropriate logging and error handling, and update documentation as needed.

Single-Responsibility Functions

Keep functions and packages focused on a single task to simplify maintenance and testing. Break up large components when behavior grows to preserve clarity.

Dependency Injection

Decouple modules by injecting dependencies through interfaces or constructor parameters. This approach makes components easier to test and swap during refactoring.

For example, the privilege package exposes an Escalator interface so tests can stub command execution:

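// Assumes imports of context, time, go.uber.org/zap, and this repository's privilege package.
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

// Build an Escalator; zap.NewNop() discards log output.
esc, err := privilege.New(ctx, zap.NewNop())
if err != nil {
    // handle error
}
// Check that the required capabilities or sudo are available.
if err := esc.Ensure(ctx); err != nil {
    // handle missing capabilities or sudo
}
// Request an lvs invocation through the escalator.
cmd := esc.Command(ctx, "lvs", "--version")
_ = cmd // placeholder so the unused variable compiles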

Test Coverage

Every change should include unit tests. Run go test -cover ./... to ensure coverage remains high and regressions are caught early.

Compression detection uses benchmark-driven selection between LZ4 and Zstd and now includes dedicated tests verifying algorithm choice and cache resets.

Production readiness

  • Structured logging uses zap; always defer a logger sync to flush entries.
  • Configuration is parsed with pflag and viper. Every option can be set via CLI flags, LVMSYNC_* environment variables, or the config.yaml file.
  • Related flags are organized into thematic FlagSets for concise help output.
  • Each function in the codebase includes unit tests covering both success and failure paths.
  • Before sending patches, run go build ./..., go test -cover ./..., and golangci-lint run.
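
Because every option has three sources, it helps to be explicit about which one wins. The sketch below uses only the standard library and assumes LVMSync keeps viper's default precedence (flag, then environment variable, then config file, then default); this README does not state the order, so treat it as an assumption. The option type and resolve function are hypothetical, and the values 30% and 10% are illustrative.

```go
package main

import "fmt"

// option names the three spellings of one setting, as listed in this README.
type option struct {
	Flag string // CLI flag name without leading dashes
	Env  string // LVMSYNC_* environment variable
	Key  string // config.yaml key
}

// resolve returns the effective value and the source it came from, checking
// flags first, then the environment, then the config file, then the default.
func resolve(opt option, flags, env, file map[string]string, def string) (value, source string) {
	if v, ok := flags[opt.Flag]; ok {
		return v, "flag"
	}
	if v, ok := env[opt.Env]; ok {
		return v, "env"
	}
	if v, ok := file[opt.Key]; ok {
		return v, "file"
	}
	return def, "default"
}

func main() {
	opt := option{Flag: "snapshot-size", Env: "LVMSYNC_SNAPSHOT_SIZE", Key: "snapshot_size"}
	flags := map[string]string{} // no flag given on the command line
	env := map[string]string{"LVMSYNC_SNAPSHOT_SIZE": "30%"}
	file := map[string]string{"snapshot_size": "25%"}

	v, src := resolve(opt, flags, env, file, "10%")
	fmt.Printf("%s=%s (from %s)\n", opt.Key, v, src)
}
```

Passing the maps in explicitly, instead of reading os.Getenv inside resolve, keeps the function easy to test.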

Development

Prerequisites

LVMSync requires Go 1.24.3 or later. Verify your installation with:

go version

Development Setup

The Super-Linter workflow validates the entire repository.

Linting

The .golangci.yml config uses standard Go formatters such as gci, gofmt, gofumpt, goimports, and golines. A misconfigured swaggo formatter entry was removed. Run the linter locally to mirror CI:

golangci-lint run ./...

Testing

Run unit tests with coverage:

go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out

Some privileged tests are skipped when run without root access.

Integration tests manipulate LVM volumes and loop devices. Running them requires root privileges and utilities such as lvremove, pvremove, and losetup. These tests are skipped when prerequisites are missing.

The workflow enforces a minimum of 50% total coverage.
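
To check the same threshold locally, the total line printed by go tool cover -func can be parsed and compared against 50%. The sketch below is an assumption about how such a gate could look, not the workflow's actual script; totalCoverage and meetsMinimum are hypothetical names, and the report text in main is illustrative sample data.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

const minCoverage = 50.0 // minimum total coverage, in percent

// totalCoverage scans `go tool cover -func` output for the line that starts
// with "total:" and returns its percentage as a float.
func totalCoverage(report string) (float64, error) {
	sc := bufio.NewScanner(strings.NewReader(report))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 3 && fields[0] == "total:" {
			return strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("no total line found in coverage report")
}

// meetsMinimum reports whether total coverage reaches the threshold.
func meetsMinimum(total float64) bool { return total >= minCoverage }

func main() {
	// Illustrative report text; in practice, use the real output of
	// go tool cover -func=coverage.out.
	report := "example.com/pkg/a.go:10:\tDoThing\t75.0%\n" +
		"total:\t\t\t(statements)\t62.5%\n"

	total, err := totalCoverage(report)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("total coverage %.1f%%, meets minimum: %t\n", total, meetsMinimum(total))
}
```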

License

GPLv3 License. See the LICENSE file for details.
