UlizesR/Dez

DEZ VM - A Modern Virtual Machine

DEZ VM is a lightweight 32-bit virtual machine designed for educational purposes and simple program execution. It includes a complete assembler toolchain that supports labels, comments, and an instruction encoding with both register and immediate operand modes.

Features

  • 32-bit RISC-like instruction set with 16 general-purpose registers
  • Complete assembler toolchain with lexer, parser, and symbol table
  • Disassembler with multiple output formats (objdump, hexdump, detailed)
  • Column headers and line numbers for clear disassembly output
  • Symbol table integration with address-to-symbol mapping
  • Label and comment support for readable assembly code
  • Advanced instruction encoding supporting both register-to-register and register-to-immediate operations
  • System calls for I/O operations including string printing
  • Memory management with code, data, and stack segments
  • Binary file format for compiled programs
  • Comprehensive test suite with CTest integration
  • Enhanced syntax highlighting with support for binary and hexadecimal numbers

Architecture

Memory Layout

  • Total Memory: 16KB (4096 32-bit words)
  • Code Segment: 0x0000 - 0x0FFF (4KB, read-only)
  • Data Segment: 0x1000 - 0x1FFF (4KB, writable)
  • Stack Segment: 0x2000 - 0x3FFF (8KB, writable)
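
The segment boundaries above can be turned into a small address check. The following is a minimal sketch, not the VM's actual code: it assumes the listed ranges are byte addresses, and the names `segment_t`, `segment_of`, and `is_writable` are hypothetical.

```c
#include <stdint.h>

/* Segment boundaries taken from the memory layout above (assumed byte addresses). */
enum { DATA_START = 0x1000, STACK_START = 0x2000, MEM_SIZE = 0x4000 };

/* Hypothetical type for this sketch; not taken from the DEZ sources. */
typedef enum { SEG_CODE, SEG_DATA, SEG_STACK, SEG_INVALID } segment_t;

/* Return the segment that contains addr, or SEG_INVALID if it is out of range. */
static segment_t segment_of(uint32_t addr) {
    if (addr < DATA_START)  return SEG_CODE;
    if (addr < STACK_START) return SEG_DATA;
    if (addr < MEM_SIZE)    return SEG_STACK;
    return SEG_INVALID;
}

/* Only the data and stack segments are writable; the code segment is read-only. */
static int is_writable(uint32_t addr) {
    segment_t s = segment_of(addr);
    return s == SEG_DATA || s == SEG_STACK;
}
```

For example, `is_writable(0x0FFF)` is false because the address is still in the code segment, while `is_writable(0x1000)` is true.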

Registers

  • R0 - R15: 16 general-purpose 32-bit registers
  • PC: Program Counter
  • SP: Stack Pointer
  • Flags: Status flags (zero, carry, overflow)

Building

# Create build directory
mkdir build && cd build

# Build the project
cmake ..
make

# All executables will be created in the bin/ directory
ls bin/
# asm dez_vm disasm_tool test_assembly test_core test_memory test_performance test_strings

# Run all tests
ctest --output-on-failure

Usage

1. Writing Assembly Code

Create a .s file with your program:

; examples/hello.s
start: ; Entry point label
MOV R0, 42 ; Load value 42 into R0
SYS R0, PRINT ; Print the value
HALT ; Stop execution

2. Assembling

Use the assembler to compile your assembly code:

# From the build directory
./bin/asm ../examples/hello.s hello.bin

3. Running

Execute the compiled binary with the VM:

# From the build directory
./bin/dez_vm hello.bin

4. Testing

Run the comprehensive test suite:

# Run all tests
ctest --output-on-failure

# Or run specific tests
./bin/test_assembly
./bin/test_core

Instruction Set

Number Formats

DEZ assembly supports multiple number formats for immediate values:

  • Decimal: 42, 255, 4095
  • Hexadecimal: 0x2A, 0xFF, 0xFFF (prefix with 0x)
  • Binary: 0b101010, 0b11111111, 0b111111111111 (prefix with 0b)

All formats are equivalent and can be used interchangeably:

MOV R1, 42          ; Decimal
MOV R2, 0x2A        ; Hexadecimal (same as 42)
MOV R3, 0b101010    ; Binary (same as 42)

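Note: Immediate values are limited by the instruction encoding. This section allows 12 bits (0-4095), but register-to-immediate instructions reserve bit 11 as a mode flag (see Instruction Encoding), leaving 11 bits (0-2047); values of 2047 or less are safe in either case.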
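
One way an assembler could accept the three number formats is sketched below. This is an illustration only, not the DEZ lexer: `parse_immediate` and `digit_value` are hypothetical names, and the upper bound is a parameter because this README gives both 4095 and 2047 as limits.

```c
#include <stdint.h>

/* Map one character to its digit value, or -1 if it is not a hex digit. */
static int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse a decimal, 0x-prefixed hexadecimal, or 0b-prefixed binary literal.
 * Returns 0 and stores the value in *out on success; returns -1 on malformed
 * input or when the value exceeds max_value. Intended for small limits such
 * as 2047 or 4095, so the running value cannot overflow 32 bits. */
static int parse_immediate(const char *text, uint32_t max_value, uint32_t *out) {
    uint32_t base = 10;
    if (text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) { base = 16; text += 2; }
    else if (text[0] == '0' && (text[1] == 'b' || text[1] == 'B')) { base = 2; text += 2; }
    if (*text == '\0') return -1;              /* empty literal or bare prefix */

    uint32_t value = 0;
    for (; *text != '\0'; ++text) {
        int d = digit_value(*text);
        if (d < 0 || (uint32_t)d >= base) return -1;   /* digit not valid in this base */
        value = value * base + (uint32_t)d;
        if (value > max_value) return -1;              /* out of encodable range */
    }
    *out = value;
    return 0;
}
```

With a limit of 4095, the literals `42`, `0x2A`, and `0b101010` all parse to the same value, while `4096` is rejected.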

Data Movement

  • MOV R1, 42 - Load immediate value 42 into R1
  • MOV R1, R2 - Copy R2 to R1

Arithmetic

  • ADD R1, R2, R3 - R1 = R2 + R3 (register-to-register)
  • ADD R1, R2, 5 - R1 = R2 + 5 (register-to-immediate)
  • SUB R1, R2, R3 - R1 = R2 - R3 (register-to-register)
  • SUB R1, R2, 3 - R1 = R2 - 3 (register-to-immediate)
  • MUL R1, R2, R3 - R1 = R2 * R3
  • DIV R1, R2, R3 - R1 = R2 / R3

Memory Operations

  • STORE R1, 256 - Store R1 at memory address 256

Control Flow

  • JMP label - Jump to label
  • JZ label - Jump if zero flag set
  • JNZ label - Jump if zero flag not set

System Calls

  • SYS R1, PRINT - Print value in R1
  • SYS R1, PRINT_CHAR - Print character in R1

Other

  • CMP R1, R2 - Compare R1 and R2 (sets flags)
  • CMP R1, 5 - Compare R1 with immediate value 5 (sets flags)
  • HALT - Halt execution
  • NOP - No operation
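
CMP drives the conditional jumps through the status flags. The sketch below shows one plausible way to derive the zero, carry, and overflow flags from a comparison. It is an assumption-laden illustration, not the VM's implementation: `flags_t` and `compare` are hypothetical names, and treating carry as an unsigned borrow is a guess at the convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for this sketch. */
typedef struct { bool zero, carry, overflow; } flags_t;

/* Compute flags for "CMP a, b" by evaluating a - b without storing the result. */
static flags_t compare(uint32_t a, uint32_t b) {
    uint32_t diff = a - b;                 /* wraps modulo 2^32 */
    flags_t f;
    f.zero = (diff == 0);                  /* JZ jumps when this is set, JNZ when clear */
    f.carry = (a < b);                     /* assumption: carry signals an unsigned borrow */
    /* Signed overflow: operands differ in sign and the result's sign differs from a's. */
    f.overflow = ((((a ^ b) & (a ^ diff)) >> 31) != 0);
    return f;
}
```

Under these assumptions, `CMP R0, R1` with equal registers sets the zero flag, so a following `JZ` is taken and a following `JNZ` is not.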

Labels and Comments

  • label: - Define a label
  • ; This is a comment - Full-line comment
  • MOV R0, 42 ; comment text - Inline comment

Examples

Hello World

; Simple hello world program
start: ; Entry point
MOV R0, 42 ; Load value 42 (decimal)
SYS R0, PRINT ; Print the value
HALT ; Stop execution

Number Format Examples

; Demonstrating different number formats
start:
MOV R0, 42          ; Decimal
MOV R1, 0x2A        ; Hexadecimal (same as 42)
MOV R2, 0b101010    ; Binary (same as 42)
ADD R3, R0, R1      ; 42 + 42 = 84
SYS R3, PRINT       ; Print result
HALT

Arithmetic Operations

; Basic arithmetic on registers loaded with immediate values
start:
MOV R0, 10 ; Load first operand
MOV R1, 20 ; Load second operand
ADD R2, R0, R1 ; Add: R2 = R0 + R1 = 30
SUB R3, R0, R1 ; Subtract: R3 = R0 - R1 = -10
HALT

Loops and Labels

; Loop to sum numbers 1 to 5
start:
MOV R0, 5 ; Initialize counter
MOV R1, 0 ; Initialize sum

loop: ; Loop label
ADD R1, R1, R0 ; Add counter to sum
SUB R0, R0, 1  ; Decrement counter
CMP R0, 0      ; Check if counter is zero
JNZ loop       ; Continue loop if not zero

done: ; End of loop
HALT ; Stop execution
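
To check what the loop computes, it can help to trace it in C. The function below is a hand translation for illustration (the name `sum_down_from` is hypothetical); each statement mirrors one assembly instruction from the listing above.

```c
#include <stdint.h>

/* Mirror of the assembly loop: add the counter to the sum, then count down to zero. */
static uint32_t sum_down_from(uint32_t n) {
    uint32_t r0 = n;       /* MOV R0, n   ; counter */
    uint32_t r1 = 0;       /* MOV R1, 0   ; sum */
    do {
        r1 += r0;          /* ADD R1, R1, R0 */
        r0 -= 1;           /* SUB R0, R0, 1 */
    } while (r0 != 0);     /* CMP R0, 0 followed by JNZ loop */
    return r1;
}
```

Starting from 5, the body runs five times and R1 ends at 5 + 4 + 3 + 2 + 1 = 15. Like the assembly, the loop body runs before the test, so a starting value of 0 would wrap around rather than exit immediately.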

Conditional Logic

; Conditional jump example
start:
MOV R0, 5 ; Set R0 to 5
MOV R1, 5 ; Set R1 to 5 (equal to R0)
CMP R0, R1 ; Compare R0 and R1
JZ equal ; Jump if equal
MOV R2, 0 ; Skipped when R0 equals R1
JMP end ; Jump to end
equal: ; Jump target for equal case
MOV R2, 1 ; Set R2 to 1
end: ; End label
HALT ; Stop execution

Memory Operations

; Store values to memory
start:
MOV R0, 123 ; Load value 123
STORE R0, 256 ; Store R0 at address 256
MOV R1, 456 ; Load value 456
STORE R1, 257 ; Store R1 at address 257
HALT ; Stop execution

Tools

Assembler (asm)

Compiles assembly source files to binary format:

# From the build directory
./bin/asm ../examples/hello.s hello.bin

Enhanced Disassembler (disasm_tool)

Disassembles binaries in several output formats, with optional column headers, line numbers, and symbol names:

Basic Usage

# From the build directory
./bin/disasm_tool hello.bin

Output Formats

# objdump-like format with proper columns
./bin/disasm_tool -f objdump hello.bin

# hexdump-like format with ASCII representation
./bin/disasm_tool -f hexdump hello.bin

# Detailed format with instruction breakdown
./bin/disasm_tool -f detailed hello.bin

# Simple format (default)
./bin/disasm_tool -f simple hello.bin

Column Headers and Line Numbers

# Show column headers explaining each column
./bin/disasm_tool -f objdump -H hello.bin

# Show line numbers for better readability
./bin/disasm_tool -f hexdump -l hello.bin

# Combine both features
./bin/disasm_tool -f objdump -H -l hello.bin

Symbol Table Integration

# Show symbol information in disassembly
./bin/disasm_tool -f objdump -S hello.bin

# Display complete symbol table
./bin/disasm_tool -S

# Load symbol table from file
./bin/disasm_tool --symbol-file symbols.txt hello.bin

Advanced Options

# Disassemble specific address range
./bin/disasm_tool -s 0x100 -c 50 hello.bin

# Disassemble individual instructions
./bin/disasm_tool -f objdump 0x0000001d 0x10000400

# Show help for all options
./bin/disasm_tool --help

Example Output

objdump format with headers and symbols:

Address    Instruction  Disassembly
----------------------------------------------------
0x00000000: 0000001d  HALT <main>
0x00000004: 10000400  MOV R0, #1024
0x00000008: 1010000a  MOV R1, #10
0x0000000C: 10200000  MOV R2, #0
0x00000010: 06320804  MUL R3, R2, R0 <loop_start>

hexdump format with headers:

Address    Hex Bytes            ASCII    Disassembly
--------------------------------------------------------
00000000  00 00 00 1d  |....|  HALT <main>
00000004  10 00 04 00  |....|  MOV R0, #1024
00000008  10 10 00 0a  |....|  MOV R1, #10
0000000c  10 20 00 00  |. ..|  MOV R2, #0
00000010  06 32 08 04  |.2..|  MUL R3, R2, R0 <loop_start>

Symbol table output:

=== Symbol Table ===
Name                 Type       Address    Value      String
------------------------------------------------------------
main                 LABEL      0x00000000 0x00000000 
loop_start           LABEL      0x00000010 0x00000000 
loop_end             LABEL      0x00000020 0x00000000 
ARRAY_SIZE           CONSTANT   0x0000000A 0x0000000A 
msg_hello            STRING     0x00000100 0x00000000 Hello, World!

VM (dez_vm)

Executes compiled binary files:

# From the build directory
./bin/dez_vm hello.bin

Test Suite

Comprehensive test suite with multiple test categories:

# Run all tests
ctest --output-on-failure

# Individual test executables
./bin/test_assembly  # Assembly integration tests
./bin/test_core      # Core VM functionality
./bin/test_memory    # Memory operations
./bin/test_performance # Performance benchmarks
./bin/test_strings   # String handling

Disassembler Quick Reference

Command Line Options

./bin/disasm_tool [OPTIONS] <file.bin|instruction...>

Options:
  -f, --format FORMAT    Output format: simple, detailed, assembly, hex, objdump, hexdump
  -s, --start ADDR       Start address (hex, default: 0)
  -c, --count NUM        Number of instructions to disassemble
  -a, --addresses        Show addresses in output
  -l, --line-numbers     Show line numbers
  -H, --headers          Show column headers
  -S, --symbols          Show symbol information
  --symbol-file FILE     Load symbol table from file
  -h, --help             Show this help message

Common Usage Patterns

# Basic disassembly
./bin/disasm_tool program.bin

# Professional output with headers and symbols
./bin/disasm_tool -f objdump -H -S program.bin

# Hex dump format for binary analysis
./bin/disasm_tool -f hexdump -H program.bin

# Detailed analysis with instruction breakdown
./bin/disasm_tool -f detailed -H program.bin

# Show only symbol table
./bin/disasm_tool -S

# Disassemble specific address range
./bin/disasm_tool -s 0x100 -c 20 program.bin

File Format

The binary format consists of:

  1. Header: 4-byte instruction count
  2. Instructions: 32-bit encoded instructions with advanced encoding:
    • Register-to-register: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (reg3 << 12)
    • Register-to-immediate: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (1 << 11) | immediate
    • Bit 11 flag: Set to 1 for immediate mode and 0 for register mode
  3. String Data: Raw string data (null-terminated, word-aligned)
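
The two instruction layouts listed above translate directly into shift-and-or expressions. The following sketch follows those formulas; the function names and the masking of out-of-range fields are choices made for this example, and opcode values are supplied by the caller rather than taken from the DEZ opcode table.

```c
#include <stdint.h>

#define IMM_FLAG (1u << 11)   /* mode flag: set for register-to-immediate */
#define IMM_MASK 0x7FFu       /* low 11 bits hold the immediate value */

/* Register-to-register: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (reg3 << 12) */
static uint32_t encode_rrr(uint8_t opcode, uint8_t reg1, uint8_t reg2, uint8_t reg3) {
    return ((uint32_t)opcode << 24) |
           ((uint32_t)(reg1 & 0xF) << 20) |
           ((uint32_t)(reg2 & 0xF) << 16) |
           ((uint32_t)(reg3 & 0xF) << 12);
}

/* Register-to-immediate: (opcode << 24) | (reg1 << 20) | (reg2 << 16) | (1 << 11) | immediate */
static uint32_t encode_rri(uint8_t opcode, uint8_t reg1, uint8_t reg2, uint16_t immediate) {
    return ((uint32_t)opcode << 24) |
           ((uint32_t)(reg1 & 0xF) << 20) |
           ((uint32_t)(reg2 & 0xF) << 16) |
           IMM_FLAG |
           ((uint32_t)immediate & IMM_MASK);
}
```

With a placeholder opcode of 0x06, `encode_rrr(0x06, 3, 2, 0)` yields 0x06320000 and `encode_rri(0x06, 3, 2, 4)` yields 0x06320804. A real assembler would report an error for an immediate above 2047 instead of masking it silently.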

Advanced Features

Enhanced Disassembler

The disassembler's output is modeled on standard Unix tools such as objdump and hexdump:

Multiple Output Formats

  • objdump format: Clean three-column layout with address, hex instruction, and disassembly
  • hexdump format: Shows address, individual hex bytes, ASCII representation, and disassembly
  • detailed format: Includes instruction breakdown with opcode analysis
  • simple format: Basic disassembly output
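
As an illustration of the hexdump layout, the sketch below formats one instruction word as an address, four hex bytes, an ASCII column, and the disassembly text. It reproduces the column layout shown under Example Output, but it is not the tool's code; `format_hexdump_line` is a hypothetical helper, and the most-significant-byte-first order is inferred from that example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write one hexdump-style line into buf; returns what snprintf returns. */
static int format_hexdump_line(char *buf, size_t size, uint32_t address,
                               uint32_t word, const char *disasm) {
    unsigned int bytes[4];
    char ascii[5];
    for (int i = 0; i < 4; ++i) {
        bytes[i] = (word >> (24 - 8 * i)) & 0xFFu;   /* most significant byte first */
        /* Printable ASCII is shown as-is; everything else becomes '.'. */
        ascii[i] = (bytes[i] >= 0x20 && bytes[i] <= 0x7E) ? (char)bytes[i] : '.';
    }
    ascii[4] = '\0';
    return snprintf(buf, size, "%08" PRIx32 "  %02x %02x %02x %02x  |%s|  %s",
                    address, bytes[0], bytes[1], bytes[2], bytes[3], ascii, disasm);
}
```

Calling it with address 0x10, word 0x06320804, and the text `MUL R3, R2, R0` produces `00000010  06 32 08 04  |.2..|  MUL R3, R2, R0`.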

Column Headers and Line Numbers

  • Dynamic headers: Automatically adjust based on format and options
  • Line numbers: Optional line numbering for better readability
  • Clear column descriptions: Headers explain what each column represents

Symbol Table Integration

  • Address-to-symbol mapping: Shows symbol names in disassembly (e.g., <main>, <loop_start>)
  • Symbol table display: Complete symbol table with names, types, addresses, and values
  • Symbol file support: Load symbol tables from external files (placeholder for future implementation)
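
Address-to-symbol mapping can be as simple as a table lookup before each line is printed. The sketch below is a hypothetical illustration: `symbol_t`, `symbol_at`, and `format_objdump_line` are not names from the DEZ sources, and a linear search is used only to keep the example short.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical symbol record: a name and the address it labels. */
typedef struct { const char *name; uint32_t address; } symbol_t;

/* Return the symbol name defined at address, or NULL if there is none. */
static const char *symbol_at(const symbol_t *table, size_t count, uint32_t address) {
    for (size_t i = 0; i < count; ++i)
        if (table[i].address == address) return table[i].name;
    return NULL;
}

/* Format an objdump-style line, appending <symbol> when the address has one. */
static int format_objdump_line(char *buf, size_t size, const symbol_t *table, size_t count,
                               uint32_t address, uint32_t word, const char *disasm) {
    const char *name = symbol_at(table, count, address);
    if (name != NULL)
        return snprintf(buf, size, "0x%08" PRIX32 ": %08" PRIx32 "  %s <%s>",
                        address, word, disasm, name);
    return snprintf(buf, size, "0x%08" PRIX32 ": %08" PRIx32 "  %s",
                    address, word, disasm);
}
```

Given a table that maps `loop_start` to 0x10, the line for that address ends in `<loop_start>`, while addresses without a symbol are printed unchanged.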

Two-Pass Assembly

The assembler uses a two-pass approach for proper label resolution:

  1. First pass: Collect all label definitions and their addresses
  2. Second pass: Parse instructions and resolve label references
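
The two passes can be sketched in a few lines of C. This is a simplified illustration rather than the DEZ parser: it assumes each source line is either a bare label ending in `:` or exactly one instruction, it counts addresses in instructions rather than bytes, and `label_t`, `collect_labels`, and `resolve_label` are hypothetical names.

```c
#include <string.h>

/* Hypothetical label record for this sketch. */
typedef struct { char name[32]; int address; } label_t;

/* Pass 1: record every label with the index of the instruction that follows it.
 * Returns the number of labels found, or -1 if the table or a name is too small. */
static int collect_labels(const char *const lines[], int line_count,
                          label_t labels[], int max_labels) {
    int count = 0;
    int address = 0;   /* index of the next instruction */
    for (int i = 0; i < line_count; ++i) {
        size_t len = strlen(lines[i]);
        if (len > 0 && lines[i][len - 1] == ':') {
            if (count == max_labels || len - 1 >= sizeof labels[count].name) return -1;
            memcpy(labels[count].name, lines[i], len - 1);   /* copy without the ':' */
            labels[count].name[len - 1] = '\0';
            labels[count].address = address;
            ++count;
        } else {
            ++address;   /* every non-label line is one instruction */
        }
    }
    return count;
}

/* Pass 2 helper: resolve a label reference; returns -1 if the label is undefined. */
static int resolve_label(const label_t labels[], int count, const char *name) {
    for (int i = 0; i < count; ++i)
        if (strcmp(labels[i].name, name) == 0) return labels[i].address;
    return -1;
}
```

Because all labels are known before any instruction is encoded, a jump to a label defined later in the file resolves just like a backward jump.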

Instruction Encoding

Advanced encoding scheme supports both register-to-register and register-to-immediate operations:

  • Register R0 handling: Correctly distinguishes between register R0 and immediate value 0
  • Immediate values: Supports values up to 2047 (11-bit range)
  • Flag-based encoding: Uses bit 11 as a mode flag for unambiguous instruction interpretation
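
The role of the mode flag is easiest to see from the decoding side. The sketch below unpacks a word according to the layouts given under File Format; `decoded_t` and `decode` are hypothetical names, not the VM's own types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of one instruction word. */
typedef struct {
    uint8_t  opcode;
    uint8_t  reg1;
    uint8_t  reg2;
    bool     is_immediate;   /* true when bit 11 is set */
    uint8_t  reg3;           /* valid only when is_immediate is false */
    uint16_t immediate;      /* valid only when is_immediate is true */
} decoded_t;

/* Split a 32-bit word into its fields, using bit 11 to pick the operand mode. */
static decoded_t decode(uint32_t word) {
    decoded_t d;
    d.opcode = (uint8_t)(word >> 24);
    d.reg1 = (uint8_t)((word >> 20) & 0xF);
    d.reg2 = (uint8_t)((word >> 16) & 0xF);
    d.is_immediate = ((word >> 11) & 1u) != 0;
    d.reg3 = d.is_immediate ? 0 : (uint8_t)((word >> 12) & 0xF);
    d.immediate = d.is_immediate ? (uint16_t)(word & 0x7FFu) : 0;
    return d;
}
```

The words 0x06320000 and 0x06320800 differ only in bit 11: the first decodes with register R0 as the third operand, the second with the immediate value 0. Without the flag the two cases would be indistinguishable.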

Development

Project Structure

src/
├── core/                    # VM core implementation
│   ├── dez_vm.c            # Main VM logic with PC management
│   ├── dez_memory.c        # Memory management
│   ├── dez_instruction_table.c # Instruction execution with advanced encoding
│   ├── dez_disasm.c        # Disassembler core
│   └── dez_disasm.h        # Disassembler header
├── assembler/              # Assembler toolchain
│   ├── dez_lexer.c         # Tokenizer with comment support
│   ├── dez_parser.c        # Two-pass parser with label resolution
│   ├── dez_symbol_table.c  # Symbol management for labels/constants
│   ├── dez_assembler.c     # Main assembler interface
│   └── asm.c              # Assembler command-line tool
├── main.c                  # VM command-line interface
└── tools/
    └── disasm_tool.c       # Enhanced disassembler with multiple formats and symbol support

tests/
├── asm/                    # Assembly test files
│   ├── test_basic.s        # Basic arithmetic operations
│   ├── test_memory.s       # Memory operations
│   ├── test_jumps.s        # Jump instructions
│   ├── test_conditional.s  # Conditional jumps
│   ├── test_system.s       # System calls
│   ├── test_loop.s         # Loop constructs
│   ├── test_labels.s       # Label functionality
│   ├── test_comments.s     # Comment support
│   └── test_mixed.s        # Mixed labels and comments
├── test_assembly.c         # Assembly integration tests
├── test_core.c            # Core VM tests
├── test_memory.c          # Memory operation tests
├── test_performance.c     # Performance tests
└── test_strings.c         # String handling tests

include/
└── dez_vm_types.h         # VM type definitions and opcodes

Adding New Instructions

  1. Add opcode to dez_instruction_type_t in include/dez_vm_types.h
  2. Add instruction metadata to instruction_table in src/core/dez_instruction_table.c
  3. Implement execution logic in the appropriate handler function
  4. Update the parser to recognize the new instruction syntax
  5. Consider instruction encoding for register vs immediate modes
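
The general shape of a table-driven instruction set is sketched below. Everything in it is hypothetical and prefixed with `sketch_` to avoid confusion with the real `dez_instruction_type_t` and `instruction_table`, whose actual definitions are not shown in this README. The XOR instruction and its opcode 0x20 are invented purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VM state and table types. */
typedef struct { uint32_t regs[16]; } sketch_vm_t;
typedef void (*sketch_handler_t)(sketch_vm_t *vm, uint32_t word);
typedef struct { uint8_t opcode; const char *mnemonic; sketch_handler_t execute; } sketch_entry_t;

enum { SKETCH_OP_XOR = 0x20 };   /* step 1: a new opcode (invented value) */

/* Step 3: the handler, using the register-to-register field layout. */
static void sketch_exec_xor(sketch_vm_t *vm, uint32_t word) {
    uint32_t r1 = (word >> 20) & 0xF;
    uint32_t r2 = (word >> 16) & 0xF;
    uint32_t r3 = (word >> 12) & 0xF;
    vm->regs[r1] = vm->regs[r2] ^ vm->regs[r3];
}

/* Step 2: metadata that ties the opcode to its mnemonic and handler. */
static const sketch_entry_t sketch_table[] = {
    { SKETCH_OP_XOR, "XOR", sketch_exec_xor },
};

/* Look up the opcode and run its handler; returns -1 for an unknown opcode. */
static int sketch_dispatch(sketch_vm_t *vm, uint32_t word) {
    uint8_t opcode = (uint8_t)(word >> 24);
    for (size_t i = 0; i < sizeof sketch_table / sizeof sketch_table[0]; ++i) {
        if (sketch_table[i].opcode == opcode) {
            sketch_table[i].execute(vm, word);
            return 0;
        }
    }
    return -1;
}
```

The parser change in step 4 then only needs to map the mnemonic to the new opcode, since execution is driven entirely by the table.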

Adding New System Calls

  1. Add syscall number to dez_syscall_t in include/dez_vm_types.h
  2. Add handling in execute_sys() function in src/core/dez_instruction_table.c
  3. Update the parser to recognize the new syscall name

Testing New Features

  1. Add assembly test files in tests/asm/
  2. Create corresponding C tests in tests/test_assembly.c
  3. Run the full test suite with ctest --output-on-failure
  4. Ensure all existing tests continue to pass

Limitations

  • Maximum program size: 1024 instructions (4KB code segment)
  • Maximum immediate value: 2047 (11-bit range due to encoding scheme)
  • No floating-point arithmetic
  • No dynamic memory allocation
  • Limited error handling for some edge cases

Syntax Highlighting

DEZ VM includes a VS Code syntax highlighting extension that distinguishes decimal, hexadecimal, and binary number literals.

Features

  • Multiple number formats: Decimal, hexadecimal (0x), and binary (0b) with distinct colors
  • Two built-in themes: Dark and light themes optimized for different environments
  • Comprehensive highlighting: Instructions, registers, labels, system calls, strings, and more
  • Easy installation: Available as a VS Code extension package

Installation

  1. Install the VS Code extension from syntax-highlighting/dez-assembly-1.0.0.vsix
  2. Or install from the syntax-highlighting/vscode-extension/ directory in development mode

Preview

See syntax-highlighting/syntax-preview.html for a visual demonstration of the syntax highlighting.

License

This project is open source and available under the MIT License.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

Acknowledgments

This VM was designed as an educational project to demonstrate virtual machine concepts and assembly language implementation.

About

A C VM that runs dez assembly
