voidstella/moongen: moonsharp bytecode types, assembler, disassembler, and static analyzer

moongen provides both

a command line program for assembling/disassembling/analyzing moonsharp bytecode
a library for interacting with it

installing

as a CLI

any of the following:

Nix:
- nix run git+https://code.dolls.today/voidstella/moongen -- asm test.txt to run directly
- nix shell git+https://code.dolls.today/voidstella/moongen to get in a shell
NixOS:
- add moongen.url = "git+https://code.dolls.today/voidstella/moongen" to flake inputs
- then get the package from inputs.moongen.packages.${system}.default
cargo:
- cargo install moongen

as a library

cargo add moongen

CLI usage

there are three commands

moongen asm <path> assembles the assembly format into a bytecode dump
moongen disasm <path> disassembles a bytecode dump into the assembly format
moongen analyze <path> analyzes a bytecode dump and prints a diagnostic for each rule it violates, along with the full path taken to reach the offending instruction

all three

accept - as their path, indicating they should read data from stdin
emit their results to stdout
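
since every command reads from `-` and writes to stdout, they chain together without temporary files. below is a minimal sketch of driving the CLI from rust with `std::process::Command`, assuming a `moongen` binary is on `PATH`. the helper names (`moongen`, `round_trip`) are made up for illustration and are not part of the moongen library.

```rust
use std::io;
use std::process::{Command, Output, Stdio};

/// Build a `moongen <subcommand> <path>` invocation.
/// Assumes a `moongen` binary is on PATH (hypothetical helper).
fn moongen(subcommand: &str, path: &str) -> Command {
    let mut cmd = Command::new("moongen");
    cmd.args([subcommand, path]);
    cmd
}

/// Assemble `source_path`, then pipe the resulting bytecode dump
/// straight into `moongen disasm -`, which reads it from stdin.
fn round_trip(source_path: &str) -> io::Result<Output> {
    // `asm` emits the dump on stdout, so capture that as a pipe
    let mut asm = moongen("asm", source_path)
        .stdout(Stdio::piped())
        .spawn()?;
    let dump = asm.stdout.take().expect("stdout was requested as a pipe");

    // `-` as the path tells `disasm` to read from stdin
    let output = moongen("disasm", "-")
        .stdin(Stdio::from(dump))
        .output()?;

    asm.wait()?;
    Ok(output)
}
```

calling `round_trip("test.txt")` would return the disassembly in `output.stdout`, and `output.status` tells you whether `disasm` succeeded.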

assembly format

for an instruction reference, review the Inst documentation

syntax is defined by grammar.pest and has the following format

each line may start with a label definition: @ident:
each line may have one instruction
- an instruction name (ident)
- if the instruction takes addr, one of the following:
  - an integer specifying the instruction address relative to the start of the chunk
  - ~, followed by an integer specifying the instruction address relative to the current instruction
  - @, followed by an ident referring to a label
- if the instruction takes arg1, an integer
- if the instruction takes arg2, an integer
- if the instruction takes name, a string
- if the instruction takes value, an =, followed by one of the following:
  - null
  - nil
  - void
  - true
  - false
  - a float
  - a string
  - {} (creates an empty table)
- if the instruction takes symbol, a symbol
- if the instruction takes symbol_list, [, comma-separated symbols, ]
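
the three addr forms above can be sketched as a small enum plus a resolver. this is an illustration only, not moongen's parser: the names `Addr`, `parse_addr` and `resolve` are hypothetical, and it assumes a relative offset is added to the index of the instruction that holds the operand.

```rust
use std::collections::HashMap;

/// The three operand forms an `addr` may take.
#[derive(Debug, PartialEq)]
enum Addr {
    /// plain integer: relative to the start of the chunk
    Absolute(i64),
    /// `~` + integer: relative to the current instruction
    Relative(i64),
    /// `@` + ident: refers to a label
    Label(String),
}

/// Parse one `addr` operand (hypothetical helper).
/// Note: `str::parse` is slightly more lenient than the integer regex
/// in the terminology section (it also accepts `+5` and `007`).
fn parse_addr(text: &str) -> Option<Addr> {
    if let Some(rest) = text.strip_prefix('~') {
        rest.parse::<i64>().ok().map(Addr::Relative)
    } else if let Some(rest) = text.strip_prefix('@') {
        if rest.is_empty() {
            None
        } else {
            Some(Addr::Label(rest.to_string()))
        }
    } else {
        text.parse::<i64>().ok().map(Addr::Absolute)
    }
}

/// Resolve an operand to an index counted from the start of the chunk.
/// `current` is the index of the instruction holding the operand;
/// `labels` maps each label name to the index it was defined at.
fn resolve(addr: &Addr, current: i64, labels: &HashMap<String, i64>) -> Option<i64> {
    match addr {
        Addr::Absolute(index) => Some(*index),
        Addr::Relative(offset) => current.checked_add(*offset),
        Addr::Label(name) => labels.get(name).copied(),
    }
}
```

under these assumptions, `jmp ~2` at instruction 10 and `jmp 12` point at the same place, and `jmp @over_greet` resolves to wherever `@over_greet:` was defined.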

terminology

idents follow the regex /[a-zA-Z_][a-zA-Z0-9_]*/
integers follow the regex /-?(?:0|[1-9][0-9]*)/
floats follow the regex /-?(?:0|[1-9][0-9]*)(?:\.[0-9]*)/
strings are either
- JSON-escaped content wrapped in quotes ("this is a string with \"embedded\" quotes")
- base64-encoded content wrapped in quotes and prefixed with b (b"dGhpcyBpcyBhIHN0cmluZyB3aXRoICJlbWJlZGRlZCIgcXVvdGVz", useful for binary data)
symbols are one of the following:
- &, symbol name (local name), :, integer (local index)
- ^, symbol name (upvalue name), :, integer (upvalue index)
- %, symbol name (global name), :, symbol (global _ENV)
- env (_ENV symbol)
- nullref (null symbol)
symbol names are one of the following:
- an ident (name)
- an ident, @, integer (name + disambiguation)
- ... (vararg)
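
to make the rules above concrete, here is a hand-written sketch that mirrors the ident, integer and float regexes and parses the symbol forms. it is not the grammar.pest implementation; every name in it (`is_ident`, `Symbol`, `parse_symbol`, ...) is hypothetical, and it works on a single token with no surrounding whitespace.

```rust
/// ident: /[a-zA-Z_][a-zA-Z0-9_]*/
fn is_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// integer: /-?(?:0|[1-9][0-9]*)/ (no leading zeros, no `+`)
fn is_integer(s: &str) -> bool {
    let digits = s.strip_prefix('-').unwrap_or(s);
    match digits.as_bytes() {
        [] => false,
        [b'0'] => true,
        [first, rest @ ..] => {
            (b'1'..=b'9').contains(first) && rest.iter().all(u8::is_ascii_digit)
        }
    }
}

/// float: an integer part, a mandatory `.`, then zero or more digits
fn is_float(s: &str) -> bool {
    match s.split_once('.') {
        Some((int_part, frac)) => {
            is_integer(int_part) && frac.bytes().all(|b| b.is_ascii_digit())
        }
        None => false,
    }
}

/// symbol name: an ident, an ident + `@` + integer, or `...`
fn is_symbol_name(s: &str) -> bool {
    if s == "..." {
        return true;
    }
    match s.split_once('@') {
        Some((name, disambiguation)) => is_ident(name) && is_integer(disambiguation),
        None => is_ident(s),
    }
}

#[derive(Debug, PartialEq)]
enum Symbol {
    Local { name: String, index: i64 },
    Upvalue { name: String, index: i64 },
    Global { name: String, env: Box<Symbol> },
    Env,
    NullRef,
}

fn parse_index(s: &str) -> Option<i64> {
    if is_integer(s) { s.parse().ok() } else { None }
}

fn parse_symbol(s: &str) -> Option<Symbol> {
    match s {
        "env" => return Some(Symbol::Env),
        "nullref" => return Some(Symbol::NullRef),
        _ => {}
    }
    let mut chars = s.chars();
    let sigil = chars.next()?;
    // after the sigil comes `name:rest`; names never contain `:`
    let (name, rest) = chars.as_str().split_once(':')?;
    if !is_symbol_name(name) {
        return None;
    }
    let name = name.to_string();
    match sigil {
        '&' => Some(Symbol::Local { name, index: parse_index(rest)? }),
        '^' => Some(Symbol::Upvalue { name, index: parse_index(rest)? }),
        // a global carries a nested symbol for its _ENV
        '%' => Some(Symbol::Global { name, env: Box::new(parse_symbol(rest)?) }),
        _ => None,
    }
}
```

note that the float regex as written requires the `.`, so `1` is an integer and never a float, while `1.` and `1.5` are both floats. a global symbol nests another symbol, which is why `%greet:^_ENV:0` parses as a global named `greet` whose env is the upvalue `^_ENV:0`.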

full demonstration

#![has_env]
// useful for debugging purposes
meta 25 1 "greeter" =null
// does nothing but is in the function header anyways
fn 0 -1 []

closure @greet []
upv.ld ^_ENV:0
// %greet:^_ENV:0 isn't necessary, but moonsharp emits it anyways
// you can use nullref for index.set instead
index.set 0 0 ="greet" %greet:^_ENV:0

// moonsharp likes to generate closures by emitting their instructions and jumping over them
// you don't have to do it this way though (it also saves an instruction to Not Do That)
// but this example will do it moonsharp's way
jmp @over_greet
	@greet:
	meta 9 1 "greet" =null
	fn 1 0 [&who:0]
	args [&who:0]

	lit ="hello "
	loc.ld &who:0
	lit ="!"
	op.concat
	op.concat

	ret 1
	// moonsharp also generates unreachable `ret 0`s even when the last instruction in a function is a `ret 1`...
	ret 0
@over_greet:

// indentation isn't forced either way! lay it out in a way that makes more sense if you'd like
			upv.ld ^_ENV:0
		index ="print"
				upv.ld ^_ENV:0
			index ="greet"
			lit ="dolly"
		call 1 "calling greet"
	call 1 "calling print"
pop 1

ret 0