DEZ VM is a lightweight 32-bit virtual machine designed for educational purposes and simple program execution. It features a complete assembler toolchain with modern language features including labels, comments, and advanced instruction encoding.
- 32-bit RISC-like instruction set with 16 general-purpose registers
- Complete assembler toolchain with lexer, parser, and symbol table
- Professional-grade disassembler with multiple output formats (objdump, hexdump, detailed)
- Column headers and line numbers for clear disassembly output
- Symbol table integration with address-to-symbol mapping
- Label and comment support for readable assembly code
- Advanced instruction encoding supporting both register-to-register and register-to-immediate operations
- System calls for I/O operations including string printing
- Memory management with code, data, and stack segments
- Binary file format for compiled programs
- Comprehensive test suite with CTest integration
- Enhanced syntax highlighting with support for binary and hexadecimal numbers
- Total Memory: 16KB (4096 words)
- Code Segment: 0x0000 - 0x0FFF (4KB, read-only)
- Data Segment: 0x1000 - 0x1FFF (4KB, writable)
- Stack Segment: 0x2000 - 0x3FFF (8KB, writable)
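The segment boundaries above can be expressed as a small lookup table. This is a minimal sketch in Python, not the VM's C implementation; the names `SEGMENTS` and `segment_for` are hypothetical, and the addresses are assumed to be byte addresses exactly as listed.

```python
# Hypothetical sketch of the memory map listed above (not the VM's real code).
SEGMENTS = [
    # (name, first address, last address, writable)
    ("code", 0x0000, 0x0FFF, False),
    ("data", 0x1000, 0x1FFF, True),
    ("stack", 0x2000, 0x3FFF, True),
]

def segment_for(addr):
    """Return (name, writable) for an address, or raise if it is out of range."""
    for name, first, last, writable in SEGMENTS:
        if first <= addr <= last:
            return name, writable
    raise ValueError(f"address {addr:#06x} is outside the 16KB address space")

print(segment_for(0x1004))
```

A store handler could call such a lookup first and reject writes when the second element is false.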
- R0 - R15: 16 general-purpose 32-bit registers
- PC: Program Counter
- SP: Stack Pointer
- Flags: Status flags (zero, carry, overflow)
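Since the registers are 32 bits wide, results that fall outside that range have to be truncated somehow. The sketch below assumes conventional two's-complement wraparound, which this document does not state explicitly; `wrap32` and `to_signed` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF

def wrap32(value):
    """Keep only the low 32 bits, as a 32-bit register would."""
    return value & MASK32

def to_signed(value):
    """Interpret a 32-bit register value as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

r3 = wrap32(10 - 20)
print(hex(r3), to_signed(r3))  # 0xfffffff6 -10
```

Under this assumption, a subtraction such as 10 - 20 is stored as 0xFFFFFFF6 and reads back as -10 when treated as signed.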
# Create build directory
mkdir build && cd build
# Build the project
cmake ..
make
# All executables will be created in the bin/ directory
ls bin/
# asm dez_vm disasm_tool test_assembly test_core test_memory test_performance test_strings
# Run all tests
ctest --output-on-failure

Create a .s file with your program:
; examples/hello.s
start: ; Entry point label
MOV R0, 42 ; Load value 42 into R0
SYS R0, PRINT ; Print the value
HALT ; Stop execution

Use the assembler to compile your assembly code:
# From the build directory
./bin/asm ../examples/hello.s hello.bin

Execute the compiled binary with the VM:
# From the build directory
./bin/dez_vm hello.bin

Run the comprehensive test suite:
# Run all tests
ctest --output-on-failure
# Or run specific tests
./bin/test_assembly
./bin/test_core

DEZ assembly supports multiple number formats for immediate values:
- Decimal: 42, 255, 2047
- Hexadecimal: 0x2A, 0xFF, 0x7FF (prefix with 0x)
- Binary: 0b101010, 0b11111111, 0b11111111111 (prefix with 0b)
All formats are equivalent and can be used interchangeably:
MOV R1, 42 ; Decimal
MOV R2, 0x2A ; Hexadecimal (same as 42)
MOV R3, 0b101010 ; Binary (same as 42)

Note: Immediate values are limited to 11 bits (0-2047) because bit 11 of the instruction word is reserved as the register/immediate mode flag.
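To illustrate how the three literal forms reduce to the same value, here is a minimal Python sketch of a literal parser with a range check. It is not the assembler's actual lexer; `parse_immediate` and `MAX_IMMEDIATE` are hypothetical names, and the limit used is the 11-bit maximum (2047) given in the encoding and limitations notes of this document.

```python
MAX_IMMEDIATE = 2047  # 11-bit limit from the encoding notes in this document

def parse_immediate(text):
    """Parse a decimal, 0x-prefixed, or 0b-prefixed literal and range-check it."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        value = int(lowered[2:], 16)
    elif lowered.startswith("0b"):
        value = int(lowered[2:], 2)
    else:
        value = int(lowered, 10)
    if not 0 <= value <= MAX_IMMEDIATE:
        raise ValueError(f"immediate {text} does not fit in 11 bits")
    return value

print(parse_immediate("42"), parse_immediate("0x2A"), parse_immediate("0b101010"))
```

All three calls return 42, and a literal such as 2048 is rejected with a `ValueError`.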
- MOV R1, 42 - Load immediate value 42 into R1
- MOV R1, R2 - Copy R2 to R1

- ADD R1, R2, R3 - R1 = R2 + R3 (register-to-register)
- ADD R1, R2, 5 - R1 = R2 + 5 (register-to-immediate)
- SUB R1, R2, R3 - R1 = R2 - R3 (register-to-register)
- SUB R1, R2, 3 - R1 = R2 - 3 (register-to-immediate)
- MUL R1, R2, R3 - R1 = R2 * R3
- DIV R1, R2, R3 - R1 = R2 / R3

- STORE R1, 256 - Store R1 at memory address 256

- JMP label - Jump to label
- JZ label - Jump if zero flag set
- JNZ label - Jump if zero flag not set

- SYS R1, PRINT - Print value in R1
- SYS R1, PRINT_CHAR - Print character in R1

- CMP R1, R2 - Compare R1 and R2 (sets flags)
- CMP R1, 5 - Compare R1 with immediate value 5 (sets flags)
- HALT - Halt execution
- NOP - No operation

- label: - Define a label
- ; This is a comment - Full-line comment
- MOV R0, 42 ; Inline comment - Inline comment
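The label and comment syntax above can be separated from the instruction text with two string splits. The following is a rough Python sketch rather than the project's lexer; `split_line` is a hypothetical name, and the sketch assumes the line contains no string literal with `;` or `:` inside it.

```python
def split_line(line):
    """Split one source line into (label, instruction, comment); missing parts are None."""
    code, sep, comment = line.partition(";")
    comment = comment.strip() if sep else None
    code = code.strip()
    label = None
    if ":" in code:
        name, _, rest = code.partition(":")
        label, code = name.strip(), rest.strip()
    return label, code or None, comment

print(split_line("start: ; Entry point"))
print(split_line("MOV R0, 42 ; Inline comment"))
```

The first call yields a label with no instruction, the second an instruction with no label; a full-line comment yields neither.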
; Simple hello world program
start: ; Entry point
MOV R0, 42 ; Load value 42 (decimal)
SYS R0, PRINT ; Print the value
HALT ; Stop execution

; Demonstrating different number formats
start:
MOV R0, 42 ; Decimal
MOV R1, 0x2A ; Hexadecimal (same as 42)
MOV R2, 0b101010 ; Binary (same as 42)
ADD R3, R0, R1 ; 42 + 42 = 84
SYS R3, PRINT ; Print result
HALT

; Basic arithmetic with immediate values
start:
MOV R0, 10 ; Load first operand
MOV R1, 20 ; Load second operand
ADD R2, R0, R1 ; Add: R2 = R0 + R1 = 30
SUB R3, R0, R1 ; Subtract: R3 = R0 - R1 = -10
HALT

; Loop to sum numbers 1 to 5
start:
MOV R0, 5 ; Initialize counter
MOV R1, 0 ; Initialize sum
loop: ; Loop label
ADD R1, R1, R0 ; Add counter to sum
SUB R0, R0, 1 ; Decrement counter
CMP R0, 0 ; Check if counter is zero
JNZ loop ; Continue loop if not zero
done: ; End of loop
HALT ; Stop execution

; Conditional jump example
start:
MOV R0, 5 ; Set R0 to 5
MOV R1, 5 ; Set R1 to 5 (equal to R0)
CMP R0, R1 ; Compare R0 and R1
JZ equal ; Jump if equal
MOV R2, 0 ; This should be skipped
JMP end ; Jump to end
equal: ; Jump target for equal case
MOV R2, 1 ; Set R2 to 1
end: ; End label
HALT ; Stop execution

; Store and retrieve values from memory
start:
MOV R0, 123 ; Load value 123
STORE R0, 256 ; Store R0 at address 256
MOV R1, 456 ; Load value 456
STORE R1, 257 ; Store R1 at address 257
HALT ; Stop execution

Compiles assembly source files to binary format:
# From the build directory
./bin/asm ../examples/hello.s hello.bin

Professional-grade disassembler with multiple output formats, column headers, and symbol table support:
# From the build directory
./bin/disasm_tool hello.bin

# objdump-like format with proper columns
./bin/disasm_tool -f objdump hello.bin
# hexdump-like format with ASCII representation
./bin/disasm_tool -f hexdump hello.bin
# Detailed format with instruction breakdown
./bin/disasm_tool -f detailed hello.bin
# Simple format (default)
./bin/disasm_tool -f simple hello.bin

# Show column headers explaining each column
./bin/disasm_tool -f objdump -H hello.bin
# Show line numbers for better readability
./bin/disasm_tool -f hexdump -l hello.bin
# Combine both features
./bin/disasm_tool -f objdump -H -l hello.bin

# Show symbol information in disassembly
./bin/disasm_tool -f objdump -S hello.bin
# Display complete symbol table
./bin/disasm_tool -S
# Load symbol table from file
./bin/disasm_tool --symbol-file symbols.txt hello.bin

# Disassemble specific address range
./bin/disasm_tool -s 0x100 -c 50 hello.bin
# Disassemble individual instructions
./bin/disasm_tool -f objdump 0x0000001d 0x10000400
# Show help for all options
./bin/disasm_tool --help

objdump format with headers and symbols:
Address Instruction Disassembly
----------------------------------------------------
0x00000000: 0000001d HALT <main>
0x00000004: 10000400 MOV R0, #1024
0x00000008: 1010000a MOV R1, #10
0x0000000C: 10200000 MOV R2, #0
0x00000010: 06320804 MUL R3, R2, R0 <loop_start>
hexdump format with headers:
Address Hex Bytes ASCII Disassembly
--------------------------------------------------------
00000000 00 00 00 1d |....| HALT <main>
00000004 10 00 04 00 |....| MOV R0, #1024
00000008 10 10 00 0a |....| MOV R1, #10
0000000c 10 20 00 00 |. ..| MOV R2, #0
00000010 06 32 08 04 |.2..| MUL R3, R2, R0 <loop_start>
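The hexdump rows above follow a simple pattern: address, the four bytes of the word with the most significant byte first, an ASCII column where non-printable bytes become dots, and the disassembly text. A minimal Python sketch of that row layout follows; `hexdump_row` is a hypothetical name, the real tool is written in C, and the column padding of the actual output is not reproduced here.

```python
def hexdump_row(address, word, disassembly):
    """Format one 32-bit word as address, bytes (most significant first), ASCII, text."""
    raw = word.to_bytes(4, "big")
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    # Printable ASCII is shown as-is; everything else is replaced by a dot.
    ascii_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)
    return f"{address:08x} {hex_bytes} |{ascii_col}| {disassembly}"

print(hexdump_row(0x10, 0x06320804, "MUL R3, R2, R0 <loop_start>"))
```

For the word 0x06320804 this produces the same bytes and ASCII column as the last sample row, since only 0x32 is a printable character ("2").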
Symbol table output:
=== Symbol Table ===
Name Type Address Value String
------------------------------------------------------------
main LABEL 0x00000000 0x00000000
loop_start LABEL 0x00000010 0x00000000
loop_end LABEL 0x00000020 0x00000000
ARRAY_SIZE CONSTANT 0x0000000A 0x0000000A
msg_hello STRING 0x00000100 0x00000000 Hello, World!
Executes compiled binary files:
# From the build directory
./bin/dez_vm hello.bin

Comprehensive test suite with multiple test categories:
# Run all tests
ctest --output-on-failure
# Individual test executables
./bin/test_assembly # Assembly integration tests
./bin/test_core # Core VM functionality
./bin/test_memory # Memory operations
./bin/test_performance # Performance benchmarks
./bin/test_strings # String handling

./bin/disasm_tool [OPTIONS] <file.bin|instruction...>
Options:
-f, --format FORMAT Output format: simple, detailed, assembly, hex, objdump, hexdump
-s, --start ADDR Start address (hex, default: 0)
-c, --count NUM Number of instructions to disassemble
-a, --addresses Show addresses in output
-l, --line-numbers Show line numbers
-H, --headers Show column headers
-S, --symbols Show symbol information
--symbol-file FILE Load symbol table from file
-h, --help Show this help message

# Basic disassembly
./bin/disasm_tool program.bin
# Professional output with headers and symbols
./bin/disasm_tool -f objdump -H -S program.bin
# Hex dump format for binary analysis
./bin/disasm_tool -f hexdump -H program.bin
# Detailed analysis with instruction breakdown
./bin/disasm_tool -f detailed -H program.bin
# Show only symbol table
./bin/disasm_tool -S
# Disassemble specific address range
./bin/disasm_tool -s 0x100 -c 20 program.bin

The binary format consists of:
- Header: 4-byte instruction count
- Instructions: 32-bit encoded instructions with advanced encoding:
  - Register-to-register: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (reg3 << 12)
  - Register-to-immediate: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (1 << 11) | immediate
  - Bit 11 flag: Distinguishes between register and immediate modes
- String Data: Raw string data (null-terminated, word-aligned)
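As a rough illustration of the layout just described (count header, instruction words, trailing string data), the Python sketch below packs and unpacks such an image. The function names are hypothetical, and the byte order is an assumption: this document does not say whether the file is little-endian or big-endian, so it is left as a parameter.

```python
import struct

def pack_image(instructions, string_data=b"", byte_order="<"):
    """Build an image: 4-byte count, then 32-bit words, then raw string bytes."""
    header = struct.pack(f"{byte_order}I", len(instructions))
    body = struct.pack(f"{byte_order}{len(instructions)}I", *instructions)
    return header + body + string_data

def unpack_image(image, byte_order="<"):
    """Split an image back into (instructions, trailing string bytes)."""
    (count,) = struct.unpack_from(f"{byte_order}I", image, 0)
    words = struct.unpack_from(f"{byte_order}{count}I", image, 4)
    return list(words), image[4 + 4 * count:]

# Byte order "<" (little-endian) is assumed here purely for the example.
image = pack_image([0x10000400, 0x0000001D], b"Hi\x00\x00")
print(unpack_image(image))
```

The string bytes in the example are already null-terminated and padded to a word boundary, as the format requires; the sketch does not add that padding itself.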
The disassembler provides professional-grade output similar to standard Unix tools:
- objdump format: Clean three-column layout with address, hex instruction, and disassembly
- hexdump format: Shows address, individual hex bytes, ASCII representation, and disassembly
- detailed format: Includes instruction breakdown with opcode analysis
- simple format: Basic disassembly output
- Dynamic headers: Automatically adjust based on format and options
- Line numbers: Optional line numbering for better readability
- Clear column descriptions: Headers explain what each column represents
- Address-to-symbol mapping: Shows symbol names in disassembly (e.g., <main>, <loop_start>)
- Symbol table display: Complete symbol table with names, types, addresses, and values
- Symbol file support: Load symbol tables from external files (placeholder for future implementation)
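Address-to-symbol mapping amounts to a dictionary lookup keyed by address. Here is a minimal Python sketch, using the label addresses from the sample symbol table; `SYMBOLS` and `annotate` are hypothetical names and do not reflect the disassembler's internal data structures.

```python
# Hypothetical symbol table: address -> label name, taken from the sample output.
SYMBOLS = {0x00: "main", 0x10: "loop_start", 0x20: "loop_end"}

def annotate(address, disassembly, symbols=SYMBOLS):
    """Append <name> to a disassembly line when the address has a symbol."""
    name = symbols.get(address)
    return f"{disassembly} <{name}>" if name else disassembly

print(annotate(0x10, "MUL R3, R2, R0"))
print(annotate(0x04, "MOV R0, #1024"))
```

Lines at addresses without a symbol are returned unchanged.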
The assembler uses a two-pass approach for proper label resolution:
- First pass: Collect all label definitions and their addresses
- Second pass: Parse instructions and resolve label references
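The two passes can be sketched in a few lines of Python. This is an illustration of the general technique, not the code in dez_parser.c: `resolve_labels` is a hypothetical name, labels are assumed to sit on their own line, and label addresses are counted in instructions, whereas the real assembler may use a different address unit.

```python
def resolve_labels(lines):
    """Two-pass sketch: map labels to instruction indexes, then substitute them."""
    labels, instructions = {}, []
    # First pass: record each label at the index of the next instruction.
    for line in lines:
        code = line.split(";", 1)[0].strip()
        if not code:
            continue
        if code.endswith(":"):
            labels[code[:-1].strip()] = len(instructions)
        else:
            instructions.append(code)
    # Second pass: replace label operands of jump instructions with addresses.
    resolved = []
    for code in instructions:
        mnemonic, _, operand = code.partition(" ")
        if mnemonic in ("JMP", "JZ", "JNZ"):
            code = f"{mnemonic} {labels[operand.strip()]}"
        resolved.append(code)
    return labels, resolved

program = ["start:", "MOV R0, 5", "loop:", "SUB R0, R0, 1", "CMP R0, 0", "JNZ loop", "HALT"]
print(resolve_labels(program))
```

Because every label is known before the second pass starts, forward references such as a jump to a label defined later in the file resolve the same way as backward ones.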
Advanced encoding scheme supports both register-to-register and register-to-immediate operations:
- Register R0 handling: Correctly distinguishes between register R0 and immediate value 0
- Immediate values: Supports values up to 2047 (11-bit range)
- Flag-based encoding: Uses bit 11 as a mode flag for unambiguous instruction interpretation
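The following Python sketch applies the two encoding formulas given for the binary format and shows why the bit 11 flag keeps register R0 and immediate 0 apart. The function names are hypothetical, and the opcode number is a placeholder rather than a value taken from dez_vm_types.h.

```python
IMMEDIATE_FLAG = 1 << 11   # bit 11 marks register-to-immediate mode
IMMEDIATE_MASK = 0x7FF     # low 11 bits hold the immediate (0-2047)

def encode_rrr(opcode, reg1, reg2, reg3):
    """Register-to-register form."""
    return (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (reg3 << 12)

def encode_rri(opcode, reg1, reg2, immediate):
    """Register-to-immediate form; bit 11 is set as the mode flag."""
    if not 0 <= immediate <= IMMEDIATE_MASK:
        raise ValueError("immediate must fit in 11 bits")
    return (opcode << 24) | (reg1 << 20) | (reg2 << 16) | IMMEDIATE_FLAG | immediate

def decode(word):
    """Return (opcode, reg1, reg2, kind, operand) where kind is 'reg' or 'imm'."""
    opcode, reg1, reg2 = word >> 24, (word >> 20) & 0xF, (word >> 16) & 0xF
    if word & IMMEDIATE_FLAG:
        return opcode, reg1, reg2, "imm", word & IMMEDIATE_MASK
    return opcode, reg1, reg2, "reg", (word >> 12) & 0xF

ADD = 0x04  # placeholder opcode number, not taken from dez_vm_types.h
print(decode(encode_rrr(ADD, 1, 2, 0)))   # third operand is register R0
print(decode(encode_rri(ADD, 1, 2, 0)))   # third operand is immediate 0
```

Both words carry a zero operand, yet they decode to different kinds because only the second has bit 11 set.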
src/
├── core/ # VM core implementation
│ ├── dez_vm.c # Main VM logic with PC management
│ ├── dez_memory.c # Memory management
│ ├── dez_instruction_table.c # Instruction execution with advanced encoding
│ ├── dez_disasm.c # Disassembler core
│ └── dez_disasm.h # Disassembler header
├── assembler/ # Assembler toolchain
│ ├── dez_lexer.c # Tokenizer with comment support
│ ├── dez_parser.c # Two-pass parser with label resolution
│ ├── dez_symbol_table.c # Symbol management for labels/constants
│ ├── dez_assembler.c # Main assembler interface
│ └── asm.c # Assembler command-line tool
├── main.c # VM command-line interface
└── tools/
└── disasm_tool.c # Enhanced disassembler with multiple formats and symbol support
tests/
├── asm/ # Assembly test files
│ ├── test_basic.s # Basic arithmetic operations
│ ├── test_memory.s # Memory operations
│ ├── test_jumps.s # Jump instructions
│ ├── test_conditional.s # Conditional jumps
│ ├── test_system.s # System calls
│ ├── test_loop.s # Loop constructs
│ ├── test_labels.s # Label functionality
│ ├── test_comments.s # Comment support
│ └── test_mixed.s # Mixed labels and comments
├── test_assembly.c # Assembly integration tests
├── test_core.c # Core VM tests
├── test_memory.c # Memory operation tests
├── test_performance.c # Performance tests
└── test_strings.c # String handling tests
include/
└── dez_vm_types.h # VM type definitions and opcodes
- Add opcode to dez_instruction_type_t in include/dez_vm_types.h
- Add instruction metadata to instruction_table in src/core/dez_instruction_table.c
- Implement execution logic in the appropriate handler function
- Update the parser to recognize the new instruction syntax
- Consider instruction encoding for register vs immediate modes

- Add syscall number to dez_syscall_t in include/dez_vm_types.h
- Add handling in execute_sys() function in src/core/dez_instruction_table.c
- Update the parser to recognize the new syscall name

- Add assembly test files in tests/asm/
- Create corresponding C tests in tests/test_assembly.c
- Run the full test suite with ctest --output-on-failure
- Ensure all existing tests continue to pass
- Maximum program size: 1024 instructions (4KB code segment)
- Maximum immediate value: 2047 (11-bit range due to encoding scheme)
- No floating-point arithmetic
- No dynamic memory allocation
- Limited error handling for some edge cases
DEZ VM includes comprehensive syntax highlighting support for VS Code with enhanced support for binary and hexadecimal numbers.
- Multiple number formats: Decimal, hexadecimal (0x), and binary (0b) with distinct colors
- Two built-in themes: Dark and light themes optimized for different environments
- Comprehensive highlighting: Instructions, registers, labels, system calls, strings, and more
- Easy installation: Available as a VS Code extension package
- Install the VS Code extension from syntax-highlighting/dez-assembly-1.0.0.vsix
- Or install from the syntax-highlighting/vscode-extension/ directory in development mode
See syntax-highlighting/syntax-preview.html for a visual demonstration of the syntax highlighting.
This project is open source and available under the MIT License.
Contributions are welcome! Please feel free to submit issues and pull requests.
This VM was designed as an educational project to demonstrate virtual machine concepts and assembly language implementation.