VolMemLyzer is a modular memory forensics toolkit that wraps Volatility 3 with three complementary workflows:
- Run mode – ergonomic “Volatility-as-a-service”: run plugins in parallel, cache outputs, and keep artifact naming/dirs predictable for downstream code.
- Extract mode – registry-driven feature extraction from plugin outputs, flattened and stable (CSV/JSON) for ML pipelines.
- Analyze mode – a stepwise DFIR triage workflow (bearings → processes → injections → network → persistence) with clear, Rich-rendered tables.
VolMemLyzer aims to unlock Volatility’s full potential for researchers and analysts who want frictionless runs inside their own codebases—not just from Volatility’s CLI.
- Quickstart (Compatibility Shim)
- Why v3 (at a glance)
- Key capabilities
- How it fits together
- Requirements
- Installation
- CLI usage (volmemlyzer)
- Python API
- Artifacts, formats & caching
- Performance tips
- Troubleshooting
- Roadmap
- License
- Team Members
- Acknowledgement
Heads up about main.py (compatibility shim): A small main.py is included only for backward compatibility with older docs/scripts. It accepts the legacy flags and produces a single aggregated features file per run.
If you don't need this legacy entry point, you can remove main.py. Keeping it won't cause drift because it calls the library directly.
Preferred interface: use the packaged CLI command volmemlyzer (see below).
- CSV → <outdir>/features/output.csv (one row per image)
- JSON → <outdir>/features/output.json
Process a single dump
python main.py \
-f /path/to/images/IMAGE.mem \
-o ./out \
-V /path/to/volatility3/vol.py
Batch a folder of dumps (recursive)
python main.py \
-f /path/to/images/ \
-o ./out \
-V /path/to/volatility3/vol.py
The tool writes out/features/output.csv with one row per image.
Speed up triage by skipping heavy plugins with -D/--drop. If neither --drop nor --plugins is given, all plugins defined in plugins.py are run. Example:
python main.py -f ./mem -o ./out -V ./volatility3/vol.py \
-D "dumpfiles,filescan,mftscan,driverscan,mutantscan,modscan,netscan,poolscanner,symlinkscan,callbacks,deskscan,devicetree,driverirp,drivermodule,windowstations"
Legacy options (main.py)
-f, --memdump Path to a memory image OR a folder of images (required)
-o, --output Output directory for artifacts & features (required)
-V, --volatility Path to Volatility3's vol.py (required)
-D, --drop Comma-separated plugin list to skip (e.g., "filescan,modscan")
-P, --plugins Comma-separated plugin list to include
-F, --format csv|json (default: csv)
-j, --jobs Parallel workers
--no-cache Ignore cached plugin outputs
- Modular architecture (Runner → Registry → Pipeline → Analysis/TUI → CLI) so you can import only the pieces you need.
- Research-friendly UX: parallel plugin execution, caching & (where possible) output conversion to avoid reruns, stable artifact naming, and one-row-per-image FeatureRow for ML.
- DFIR triage workflow: opinionated but explainable steps with clean tables (thanks to Rich).
- Stable, flat features: consistent columns across images, robust null handling, clear plugin.metric naming.
- Run Volatility 3 plugins with:
  - parallelism (-j/--jobs),
  - per-plugin timeouts,
  - per-run renderer choice,
  - cache reuse with optional conversion to the needed format (see notes below),
  - a complete end-to-end pipeline that can resolve the Volatility path automatically (runs as a service).
- Extract features from selected plugins via a registry of extractor functions:
  - flatten to a single CSV/JSON file per run (one row per image),
  - ML-ready features with consistent naming,
  - dependency-aware scheduling of plugins.
- Perform an analysis as a multi-step DFIR overview (optionally written as JSON):
  - 0 Bearings (windows.info)
  - 1 Processes (pslist + psscan + psxview + pstree + cross-checks)
  - 2 Injections (malfind)
  - 3 Network (netscan)
  - 4 Persistence (registry/tasks: scheduled_tasks + userassist + hivelist/hivescan)
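The "one row per image" flattening in the extraction bullets can be illustrated with a small sketch. It is not VolMemLyzer's extractor code: the helper names flatten and to_csv, the image column, and the sample values are all hypothetical. It only shows how nested per-plugin metrics could become stable plugin.metric columns with empty cells for missing values.

```python
import csv
import io
from typing import Any, Dict, List


def flatten(image: str, per_plugin: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten {plugin: {metric: value}} into one row keyed by 'plugin.metric'."""
    row: Dict[str, Any] = {"image": image}  # "image" is a hypothetical column name
    for plugin, metrics in per_plugin.items():
        for metric, value in metrics.items():
            row[f"{plugin}.{metric}"] = value
    return row


def to_csv(rows: List[Dict[str, Any]]) -> str:
    """Write rows using the union of all columns; missing values become empty cells."""
    columns = ["image"] + sorted({k for r in rows for k in r} - {"image"})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up metric values for two images; the second has no malfind output.
rows = [
    flatten("a.raw", {"pslist": {"nproc": 87}, "malfind": {"ninjections": 2}}),
    flatten("b.raw", {"pslist": {"nproc": 64}}),
]
text = to_csv(rows)
print(text)
```

Sorting the column names keeps the header identical across runs, which is the property a downstream ML pipeline usually depends on.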
CLI ──► Pipeline ──► VolRunner (vol.py) ──► artifacts/*.json|jsonl|csv|txt
│
├─► Extractors (registry) ──► FeatureRow rows → output.csv
│
└─► OverviewAnalysis (steps) ──► Rich tables / JSON summary
- VolRunner builds/executes vol.py commands and names outputs predictably: <outdir>/<imagebase>_<plugin>.<ext> plus <…>.stderr.txt on errors.
- Pipeline orchestrates parallel runs, caching, and (when supported) format conversion to avoid re-running a plugin just to change formats.
- ExtractorRegistry binds a plugin spec (windows.pslist, deps, default renderer/timeout) to a Python extractor function. All available Volatility plugins are added by default.
- OverviewAnalysis implements the triage steps; TerminalUI (Rich) prints academic-style tables with a tasteful left accent bar.
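The naming pattern <outdir>/<imagebase>_<plugin>.<ext> can be sketched as a small helper. This is an illustration under assumptions, not VolRunner's actual code: artifact_path and RENDERER_EXT are hypothetical names, the renderer-to-extension mapping is a guess based on the json|jsonl|csv|txt extensions in the diagram, and using only the last dotted component of the plugin name is inferred from the example outputs.

```python
from pathlib import Path

# Assumed mapping from renderer name to file extension; the text-style
# renderers are guessed to share ".txt".
RENDERER_EXT = {"json": "json", "jsonl": "jsonl", "csv": "csv",
                "pretty": "txt", "quick": "txt"}


def artifact_path(outdir: str, image_path: str, plugin: str,
                  renderer: str = "json") -> Path:
    """Build <outdir>/<imagebase>_<plugin>.<ext> for one plugin run."""
    imagebase = Path(image_path).name   # file name with its extension, e.g. "win10.raw"
    short = plugin.split(".")[-1]       # "windows.pslist" -> "pslist" (assumption)
    return Path(outdir) / f"{imagebase}_{short}.{RENDERER_EXT[renderer]}"


print(artifact_path("/cases/.volmemlyzer", "/cases/win10.raw", "windows.pslist"))
```

Predictable names like this let downstream code locate an artifact without parsing the CLI output.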
- Python 3.9+
- A local checkout/installation of Volatility 3; you can point VolMemLyzer at vol.py using --vol-path (or the VOL_PATH env var). VolMemLyzer does not import Volatility; it invokes it as a subprocess.
- Python packages (installed automatically if you use pip install): pandas, numpy, python-dateutil, tqdm, rich
Tested primarily with Windows images (e.g., .vmem, .raw, .dmp, .bin). Other OSes may work where Volatility supports them.
Zero-friction Volatility path (no --vol-path needed): In v3, VolMemLyzer resolves Volatility 3 automatically, trying the following in order: (1) any explicit hint you provided (--vol-path or the VOL_PATH env var); (2) an importable volatility3 module in the current environment, launched as python -m volatility3; (3) the vol console script on your PATH; and (4) a few common local vol.py locations. This removes path-hunting and venv confusion, so most users can run one-line commands with sane defaults, while power users can still pin a specific checkout by supplying --vol-path. The CLI therefore does not rely on hard-coded filesystem paths.
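The resolution order described above could look roughly like the following. This is a minimal sketch, not the project's implementation: resolve_volatility is a hypothetical name, the fallback locations are placeholders, and treating any hint ending in .py as a script to run with the current interpreter is an assumption.

```python
import importlib.util
import os
import shutil
import sys
from pathlib import Path
from typing import List, Optional, Sequence


def resolve_volatility(
    explicit: Optional[str] = None,
    fallbacks: Sequence[str] = ("./volatility3/vol.py", "./vol.py"),  # placeholder paths
) -> Optional[List[str]]:
    """Return an argv prefix that launches Volatility 3, or None if nothing is found."""
    # (1) Explicit hint: the --vol-path value, else the VOL_PATH environment variable.
    hint = explicit or os.environ.get("VOL_PATH")
    if hint:
        # A .py file is run with the current interpreter; anything else is
        # treated as an executable (assumption).
        return [sys.executable, hint] if hint.endswith(".py") else [hint]
    # (2) Importable module in the current environment.
    if importlib.util.find_spec("volatility3") is not None:
        return [sys.executable, "-m", "volatility3"]
    # (3) The `vol` console script on PATH.
    vol = shutil.which("vol")
    if vol:
        return [vol]
    # (4) A few common local vol.py locations.
    for candidate in fallbacks:
        if Path(candidate).is_file():
            return [sys.executable, candidate]
    return None


print(resolve_volatility("/opt/volatility3/vol.py"))
```

Returning an argv prefix rather than a single path lets the caller append plugin arguments the same way regardless of which step matched.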
pip install volmemlyzer
# then the CLI is available as:
volmemlyzer --help
# Or install from a source checkout:
git clone https://github.com/<you>/volmemlyzer.git
cd volmemlyzer
pip install -e .
# or, without packaging:
pip install -r requirements.txt
python -m volmemlyzer.cli --help   # or: volmemlyzer --help
To make the CLI easy to grasp, each mode below shows two examples:
- Simple — the cleanest command that relies on smart defaults (no friction).
- Complete — the fully flexible form with all commonly used arguments shown.
The packaged CLI exposes analyze, run, extract, and list subcommands.
--vol-path PATH Path to 'vol' or 'vol.py' (optional; auto-detected if omitted, env: VOL_PATH)
--renderer NAME Volatility renderer: json | jsonl | csv | pretty | quick | none (default: json)
--timeout SECONDS Per-plugin timeout in seconds (default: 0 = disabled)
-j, --jobs N Parallel workers (default: 1, i.e. no parallelism)
--log-level LEVEL CRITICAL | ERROR | WARNING | INFO | DEBUG
- Volatility: auto-detected (prefer python -m volatility3, then vol on PATH).
- Outdir: <image_dir>/.volmemlyzer (created if missing).
- Renderer: json.
- Timeout: 0 (disabled).
- Jobs: 1 (no parallelism unless the user increases the job count).
- Caching: enabled (artifacts are reused unless you pass --no-cache).
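As a small illustration of the default outdir rule, the sketch below derives the .volmemlyzer directory from the input path. The function name default_outdir is hypothetical, and the handling of a directory input (placing .volmemlyzer inside it) is inferred from the /cases/ examples rather than confirmed by the source.

```python
from pathlib import Path


def default_outdir(input_path: str) -> Path:
    """Return <image_dir>/.volmemlyzer for a file, or <dir>/.volmemlyzer for a directory."""
    p = Path(input_path)
    base = p if p.is_dir() else p.parent
    return base / ".volmemlyzer"


print(default_outdir("/cases/win10.raw"))
```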
Run DFIR triage steps over one image.
Simple (all defaults)
volmemlyzer analyze -i "D:\dumps\host.vmem"
Runs steps 0–4 (bearings → persistence), writes artifacts to D:\dumps\.volmemlyzer, renderer json, no parallelization in plugin runs, caching on.
Complete (all key flags)
# Windows cmd.exe line continuations shown with ^ (in PowerShell, use a backtick instead)
volmemlyzer ^
  --vol-path "C:\tools\volatility3\vol.py" --renderer json --timeout 600 -j 4 ^
  analyze -i "D:\dumps\host.vmem" -o "D:\dumps\.volmemlyzer" ^
  --steps 0,1,2 --json ^
  --no-cache
Runs steps 0–2 (bearings → injections), applies a 10-minute timeout to every plugin run, uses four parallel jobs, writes artifacts to D:\dumps\.volmemlyzer, and writes the step summary to a JSON file (for an image at /cases/win10.raw with an inferred outdir, that file is /cases/.volmemlyzer/analysis/win10.raw.json), caching off.
Options:
- -i/--image (required): memory image file
- -o/--outdir: artifacts directory (default near the image, as shown above)
- --steps: comma list of step numbers or aliases (bearings|info, processes|proc|ps, injections|malfind, network|net|netscan, persistence|reg|tasks, kernel, report)
- --json: write the step summary to a JSON file
- --high-level: when supported, show only the highest-risk findings
- --no-cache: ignore cached plugin outputs
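To make the --steps syntax concrete, here is a sketch of how a mixed list of numbers and aliases might be normalised. The alias names come from the option description; the function parse_steps is hypothetical, and the numbers 5 for kernel and 6 for report are assumptions inferred from the step order, not confirmed values.

```python
from typing import List

# Aliases from the --steps option; 5 (kernel) and 6 (report) are assumed.
STEP_ALIASES = {
    "bearings": 0, "info": 0,
    "processes": 1, "proc": 1, "ps": 1,
    "injections": 2, "malfind": 2,
    "network": 3, "net": 3, "netscan": 3,
    "persistence": 4, "reg": 4, "tasks": 4,
    "kernel": 5,
    "report": 6,
}


def parse_steps(spec: str) -> List[int]:
    """Turn '0,proc,malfind' into a sorted, de-duplicated list of step numbers."""
    steps = set()
    for token in spec.split(","):
        token = token.strip().lower()
        if not token:
            continue  # tolerate stray commas
        if token.isdigit():
            steps.add(int(token))
        elif token in STEP_ALIASES:
            steps.add(STEP_ALIASES[token])
        else:
            raise ValueError(f"unknown step: {token!r}")
    return sorted(steps)


print(parse_steps("0,proc,malfind"))
```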
Run raw Volatility plugins (parallel, cached, selected renderer).
Simple (defaults + pick plugins to run)
volmemlyzer run -i /cases/win10.raw --plugins pslist,pstree,psscan
Writes artifacts to /cases/.volmemlyzer, renderer json, no parallelization in plugin runs, caching on.
Complete (all key flags)
volmemlyzer --vol-path /opt/volatility3/vol.py --renderer json -j 6 run -i /cases/win10.raw -o /cases/.volmemlyzer --plugins pslist,pstree,psscan --no-cache
Use either --plugins (include list) or --drop (exclude list, e.g. --drop netscan); the two should not be used together.
Options:
- -i/--image (required): memory image file
- -o/--outdir: artifacts directory (default near the image)
- --renderer: renderer for this run (json|jsonl|csv|pretty|quick|none)
- --plugins: comma list to include
- --drop: comma list to exclude
- --no-cache: ignore cached outputs
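The include/exclude semantics of --plugins and --drop can be sketched as follows. This is an illustration only: select_plugins is a hypothetical helper, and raising an error when both lists are given is one possible way to enforce the rule that they should not be combined.

```python
from typing import Iterable, List, Optional


def select_plugins(available: Iterable[str],
                   plugins: Optional[str] = None,
                   drop: Optional[str] = None) -> List[str]:
    """Apply an include list or an exclude list (never both) to the available plugins."""
    if plugins and drop:
        raise ValueError("use either --plugins or --drop, not both")
    names = list(available)
    if plugins:
        wanted = {p.strip() for p in plugins.split(",") if p.strip()}
        return [n for n in names if n in wanted]
    if drop:
        skipped = {p.strip() for p in drop.split(",") if p.strip()}
        return [n for n in names if n not in skipped]
    return names  # neither flag given: run everything


print(select_plugins(["pslist", "pstree", "netscan"], drop="netscan"))
```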
Output:
The CLI prints the artifacts directory and each plugin’s output path:
[+] raw artifacts directory: /cases/.volmemlyzer
- pslist → /cases/.volmemlyzer/win10.raw_pslist.json
- pstree → /cases/.volmemlyzer/win10.raw_pstree.json
...
Extract ML-ready features for a single image or an entire directory (recursive).
Simple (single file; choose output format)
volmemlyzer extract -i /cases/win10.raw -f csv
Writes to /cases/.volmemlyzer/features/win10.raw.csv (outdir inferred), no parallelization in plugin runs, caching on.
Simple (directory, recursive)
volmemlyzer extract -i /cases/ -f csv
Writes one features file per dump under /cases/.volmemlyzer/features/.
Prefer a single aggregated file (one row per image)? Use the legacy compatibility shim, for example python main.py -f /cases -o ./out -V /path/to/volatility3/vol.py -F csv, which produces <outdir>/features/output.csv.
Complete (all key flags)
volmemlyzer --vol-path /opt/volatility3/vol.py -j 4 extract -i /cases/ -o /cases/.volmemlyzer -f csv --plugins pslist,malfind --no-cache
Use either --plugins (include list) or --drop (exclude list, e.g. --drop netscan); the two should not be used together.
Options:
- -i/--image (required): file or directory. Directories are scanned for *.vmem, *.raw, *.dmp, *.bin recursively.
- -o/--outdir: artifacts directory (default near the image or directory)
- -f/--format: json or csv (required)
- --plugins: comma list to restrict extraction
- --drop: comma list to exclude
- --no-cache: ignore cached outputs
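The recursive directory scan mentioned for -i/--image can be approximated with pathlib. This is a sketch, not the tool's own discovery code: find_images is a hypothetical name, and matching suffixes case-insensitively is an assumption.

```python
import tempfile
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".vmem", ".raw", ".dmp", ".bin"}


def find_images(path: str) -> List[Path]:
    """Return the image itself for a file, or every matching dump under a directory."""
    root = Path(path)
    if root.is_file():
        return [root]
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "hostA").mkdir()
    (Path(tmp) / "hostA" / "host.vmem").touch()
    (Path(tmp) / "notes.txt").touch()
    found = [p.name for p in find_images(tmp)]
print(found)
```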
The full enumerated list is long. Below is a compact index by category with representative examples. Each bullet maps to multiple concrete CSV columns.
System & OS
- info.Is64, info.winBuild, info.IsPAE, info.SystemTime
Processes, Threads & Trees
- pslist.nproc, pslist.avg_threads, pslist.wow64_ratio, pslist.zombie_count
- pstree.max_depth, pstree.avg_branching_factor, pstree.cross_session_edges
- threads.nThreads, threads.kernel_startaddr_ratio
Modules & DLL Loading
- dlllist.ndlls, dlllist.avg_dllPerProc, dlllist.maxLoadDelaySec
- ldrmodules.not_in_load, ldrmodules.memOnlyRatio
- modules.nModules, modules.largeModuleRatio
Handles (type mix & access patterns)
- handles.nHandles, handles.nTypeToken, handles.privHighAccessPct, handles.maxHandlesOneProc
Code Injection & VADs
- malfind.ninjections, malfind.RWXratio, malfind.maxVADsize
- vadinfo.exec_ratio, vadinfo.large_commit_count, vadwalk.max_vad_size
Kernel Callbacks, Drivers & Pools
- callbacks.ncallbacks, callbacks.distinctModules, callbacks.noSymbol
- bigpools.nAllocs, bigpools.nonPagedRatio, bigpools.tagEntropyMean
- unloaded.n_entries, unloaded.repeated_driver_ratio
Network & Sockets
- netscan.nConn, netscan.publicEstablished, netscan.duplicateListen
- netstat.nConn, netstat.nEstablished
Registry & Services
- registry.hivescan.orphan_offset_count, registry.hivelist.user_hive_count
- registry.certificates.disallowed_count, registry.userassist.avg_focus_count
- svclist.running_services_count, svcscan.Start_Auto
Command History & Consoles
- cmdline.urlInArgs, cmdline.scriptExec, cmdscan.maxCmds
- consoles.nConhost, consoles.histBufOverflow, consoles.dumpIoC
GUI Objects (Windows, Desktops, WindowStations)
- windows.total_window_objs, windows.null_title_ratio, windows.station_mismatch_count
- deskscan.uniqueDesktops, deskscan.session0GuiCount
- winsta.custom_station_count, winsta.service_station_ratio
Memory Mapping & Statistics
- virtmap.unused_size_ratio, virtmap.max_region_size_mb, virtmap.pagedpool_fragmentation
- statistics.invalid_page_ratio, statistics.swapped_page_count
Version Info / PE Metadata
- verinfo.valid_version_ratio, verinfo.dup_base_count
- amcache.future_compile_ratio, amcache.nonMicrosoftRatio
Need a full column list? See FEATURES.md.
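Because every column follows the plugin.metric pattern, a features file can be regrouped by plugin with the standard library alone. The sketch below uses an in-memory CSV whose column names come from the index above and whose values are made up; splitting at the last dot is an assumption that keeps prefixes such as registry.hivelist together.

```python
import csv
import io
from collections import defaultdict

# Illustrative one-row features file; the values are invented for the example.
sample = io.StringIO(
    "pslist.nproc,pslist.avg_threads,malfind.ninjections,registry.hivelist.user_hive_count\n"
    "87,11.4,2,3\n"
)

row = next(csv.DictReader(sample))

by_plugin = defaultdict(dict)
for column, value in row.items():
    plugin, metric = column.rsplit(".", 1)   # split at the last dot
    by_plugin[plugin][metric] = value        # csv yields strings; cast as needed

print(sorted(by_plugin))
```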
List available building blocks: Volatility 3 plugins and extractor-backed plugins.
Synopsis
volmemlyzer list [--vol]/[--registry] [--grep STR] [--max N] [--json]
What it does
- --vol: Show Volatility 3 plugin names (parsed from vol.py -h or python -m volatility3 -h).
- --registry: Show extractor-backed plugins in VolMemLyzer’s registry.
- If neither flag is given, shows both.
Options
- --grep STR: Case-insensitive substring filter.
- --max N: Limit items per list (0 = unlimited).
Notes
- Pass --vol-path globally if vol.py isn’t on PATH. Exits 0 even when lists are empty.
Examples
# Show both sources (default)
volmemlyzer list
# Only registry extractors
volmemlyzer list --registry
# All available volatility plugins
volmemlyzer list --vol
# Only Volatility plugins whose names contain "windows.registry", capped at 50 items
volmemlyzer list --vol --grep "windows.registry" --max 50
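The --grep and --max behaviour of the list subcommand amounts to a substring filter followed by a cap. A minimal sketch, with filter_names as a hypothetical helper and a made-up plugin list:

```python
from typing import Iterable, List


def filter_names(names: Iterable[str], grep: str = "", max_items: int = 0) -> List[str]:
    """Case-insensitive substring filter, then an optional cap (0 = unlimited)."""
    needle = grep.lower()
    matched = [n for n in names if needle in n.lower()]
    return matched if max_items == 0 else matched[:max_items]


plugins = ["windows.pslist", "windows.registry.hivelist",
           "windows.registry.userassist", "windows.netscan"]
print(filter_names(plugins, grep="Windows.Registry", max_items=1))
```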
You can import individual layers in notebooks or other tools.
from dataclasses import asdict

from volmemlyzer.runner import VolRunner
from volmemlyzer.extractor_registry import ExtractorRegistry
from volmemlyzer.plugins import build_registry
from volmemlyzer.pipeline import Pipeline
from volmemlyzer.analysis import OverviewAnalysis

# Build the layers once and reuse them across calls
runner = VolRunner(vol_path="/opt/volatility3/vol.py",
                   default_renderer="json", default_timeout_s=600)
registry: ExtractorRegistry = build_registry()
pipe = Pipeline(runner, registry)

# Run raw plugins
res = pipe.run_plugin_raw(
    image_path="/cases/win10.raw",
    enable={"pslist", "pstree"},
    renderer="json",
    outdir="/cases/.volmemlyzer",
    concurrency=4,
    use_cache=True,
)
print(res.artifacts["plugins"]["pslist"])  # → path to JSON output

# Extract features for one image
row = pipe.run_extract_features(
    image_path="/cases/win10.raw",
    enable={"pslist", "malfind"},
    concurrency=4,
    artifacts_dir="/cases/.volmemlyzer",
    use_cache=True,
)
print(asdict(row))

# Run the triage analysis
analysis = OverviewAnalysis()
summary = analysis.run_steps(
    pipe=pipe,
    image_path="/cases/win10.raw",
    artifacts_dir="/cases/.volmemlyzer",
    steps=[0, 1, 2, 3, 4, 6],  # bearings → report
    use_cache=True,
    high_level=False,
)
# 'summary' is a dict you can also persist as JSON

- Artifacts directory: defaults to <imagebase>.artifacts/ created next to the image (or pass -o/--outdir).
- File naming: <outdir>/<imagebase>_<plugin>.<ext> and <…>.stderr.txt on errors.
- Renderers: json | jsonl | csv | pretty | quick | none.
- Caching: If an artifact already exists, VolMemLyzer will reuse it. Where supported, it will convert to the desired format to avoid rerunning the plugin; if not convertible, it will rerun with the requested renderer.
- Features: aggregated into one file per run under <outdir>/features/ → output.csv or output.json (one row per image).
Tip: Prefer json for downstream extractors; it is the renderer they support most consistently.
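The caching rule (reuse, else convert, else rerun) can be written as a small decision function. This sketch is not the Pipeline's logic: plan_run is a hypothetical name, and the CONVERTIBLE set of renderer pairs is invented purely to make the example runnable; the source does not say which conversions are supported.

```python
from pathlib import Path
from typing import Dict, Set, Tuple

# Hypothetical (cached, wanted) renderer pairs that could be converted
# without rerunning the plugin.
CONVERTIBLE: Set[Tuple[str, str]] = {("json", "jsonl"), ("jsonl", "json"), ("json", "csv")}


def plan_run(cached: Dict[str, Path], wanted: str, use_cache: bool = True) -> str:
    """Decide between 'reuse', 'convert', and 'rerun' for one plugin.

    `cached` maps a renderer name to the existing artifact for that plugin.
    """
    if not use_cache:
        return "rerun"      # --no-cache: always start fresh
    if wanted in cached:
        return "reuse"      # artifact already exists in the requested format
    if any((have, wanted) in CONVERTIBLE for have in cached):
        return "convert"    # cheaper than invoking Volatility again
    return "rerun"


have = {"json": Path("/cases/.volmemlyzer/win10.raw_pslist.json")}
print(plan_run(have, "json"), plan_run(have, "csv"), plan_run(have, "pretty"))
```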
- Increase -j/--jobs to match your CPU cores.
- Use --no-cache only when you genuinely need fresh plugin runs.
- Set --timeout to bound misbehaving plugins.
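The -j/--jobs and --timeout tips combine naturally, since each plugin run is a separate subprocess. The sketch below shows that general pattern with a thread pool and a per-command timeout. It is not VolRunner's code: run_one and run_all are hypothetical names, and the demo uses stand-in Python commands so it runs without Volatility installed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


def run_one(cmd: List[str], timeout_s: Optional[float]) -> str:
    """Run one command and report 'ok', 'failed', or 'timeout'."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"  # subprocess.run kills the child before raising
    return "ok" if done.returncode == 0 else "failed"


def run_all(cmds: Dict[str, List[str]], jobs: int = 1, timeout_s: float = 0) -> Dict[str, str]:
    """Run named commands on `jobs` worker threads; a timeout of 0 means disabled."""
    limit = timeout_s or None
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = {name: pool.submit(run_one, cmd, limit) for name, cmd in cmds.items()}
        return {name: fut.result() for name, fut in futures.items()}


# Stand-in commands: one finishes at once, one would hang without a timeout.
demo = {
    "fast": [sys.executable, "-c", "print('done')"],
    "slow": [sys.executable, "-c", "import time; time.sleep(60)"],
}
results = run_all(demo, jobs=2, timeout_s=3)
print(results)
```

Threads are sufficient here because the workers spend their time waiting on child processes rather than running Python code.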
- “Permission denied” writing artifacts
  Ensure -o/--outdir points to a directory (not an existing file). VolMemLyzer will create <outdir> and <outdir>/features/ as needed.
- vol.py not found / wrong Python
  Use --vol-path (or VOL_PATH) to point at the actual vol.py. The Runner uses your current interpreter (sys.executable) to launch it.
- Weird paths on Windows
  Quote paths with spaces. Both C:\ and \\?\ long paths work if your Python is configured for them.
- Renderer conversion
  If a cached artifact isn’t convertible to the required format, VolMemLyzer will rerun that plugin with the requested renderer.
- More extractors (registry/artifacts/GUI objects)
- Additional kernel and persistence checks in analysis
- Optional HTML report backend
- Unit tests and example datasets
This package uses Volatility and follows its LICENSE.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (VolMemLyzer), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
To cite VolMemLyzer V1.0.0, V2.0.0, or V3.0.0 in your work, or to understand it in more depth, see the published paper below:
@INPROCEEDINGS{9452028,
author={Lashkari, Arash Habibi and Li, Beiqi and Carrier, Tristan Lucas and Kaur, Gurdip},
booktitle={2021 Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS)},
title={VolMemLyzer: Volatile Memory Analyzer for Malware Classification using Feature Engineering},
year={2021},
volume={},
number={},
pages={1-8},
doi={10.1109/RDAAPS48126.2021.9452028}}
- Arash Habibi Lashkari: Founder and Project Owner
- Yasin Dehfouli: Master Student, Researcher and Developer (Python 3.0 - VolMemLyzer-V3.0.0)
- Abhay Pratap Singh: Undergraduate Student, Researcher and Developer (Python 3.0 - VolMemLyzer-V2.0.0)
- Beiqi Li: Undergraduate Student, Developer (Python 2.7 - VolMemLyzer V1.0.0)
- Tristan Carrier: Master Student, Researcher, and Developer (Python 2.7 - VolMemLyzer V1.0.0)
- Gurdip Kaur: Postdoctoral Fellow Researcher (Python 2.0 - VolMemLyzer V1.0.0)
This project has been made possible through funding from a Natural Sciences and Engineering Research Council of Canada (NSERC) grant (#RGPIN-2020-04701) to Arash Habibi Lashkari and the Mitacs Global Research Internship (GRI) for the researchers.