Rugo Language Internals

This document describes the design and implementation of the Rugo programming language, a Ruby-inspired language that compiles to native binaries via Go.

Overview

Rugo's compilation pipeline transforms .rugo source files into native binaries through a series of well-defined stages:

.rugo source
   │
   ▼
Strip comments
   │
   ▼
Preprocess (desugar, shell fallback, paren-free calls)
   │
   ▼
Parse (LL(1) grammar → flat AST)
   │
   ▼
Walk (flat AST → typed AST nodes)
   │
   ▼
Resolve (imports & requires)
   │
   ▼
Semantic checks (validate before codegen)
 └─ UndefinedIdentCheck: catch undefined variables and functions
   │
   ▼
Transform chain (immutable AST rewrites)
 ├─ ConcurrencyLowering: desugar spawn/parallel/try into lowered nodes
 └─ ImplicitReturnLowering: convert last-expr-as-return into explicit nodes
   │
   ▼
Type inference (fixed-point analysis)
   │
   ▼
Code generation (AST → Go AST → Go source)
   │
   ▼
go build → native binary

The compiler is orchestrated by compiler.Compiler, which chains these stages together. The run, build, and emit CLI subcommands each exercise different parts of this pipeline.

Semantic Checks

After resolving imports and requires, the AST passes through a chain of semantic checks (ast/check.go). Checks implement the Check interface and are composed via CheckChain, which runs them in order and stops at the first error. Unlike transforms, checks validate the AST without modifying it.
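The "run in order, stop at the first error" contract can be sketched in a few lines of Go. This is a minimal illustration, not the code in ast/check.go: the Node type, the method set of Check, and the two toy checks are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical stand-in for a Rugo AST node.
type Node struct {
	Kind string
	Name string
}

// Check validates a program without modifying it (assumed method set).
type Check interface {
	Name() string
	Run(prog []Node) error
}

// CheckChain runs its checks in order and stops at the first error.
type CheckChain []Check

func (c CheckChain) Run(prog []Node) error {
	for _, chk := range c {
		if err := chk.Run(prog); err != nil {
			return fmt.Errorf("%s: %w", chk.Name(), err)
		}
	}
	return nil
}

// NoEmptyNames is a toy check that rejects identifiers without a name.
type NoEmptyNames struct{}

func (NoEmptyNames) Name() string { return "NoEmptyNames" }

func (NoEmptyNames) Run(prog []Node) error {
	for _, n := range prog {
		if n.Kind == "ident" && n.Name == "" {
			return errors.New("identifier with empty name")
		}
	}
	return nil
}

// CountRuns is a toy check that only records how often it ran.
type CountRuns struct{ Runs *int }

func (CountRuns) Name() string { return "CountRuns" }

func (c CountRuns) Run(prog []Node) error {
	*c.Runs++
	return nil
}

func main() {
	runs := 0
	chain := CheckChain{NoEmptyNames{}, CountRuns{Runs: &runs}}

	// The first check fails, so the second one never runs.
	fmt.Println(chain.Run([]Node{{Kind: "ident", Name: ""}}), runs)

	// A valid program passes through every check.
	fmt.Println(chain.Run([]Node{{Kind: "ident", Name: "x"}}), runs)
}
```

Because checks only read the AST, the order matters for which error the user sees first, not for correctness of later stages.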

UndefinedIdentCheck (compiler/check_idents.go): Catches undefined variable and function references before code generation. It uses a two-pass approach: first collecting all globally visible names (top-level assignments, function definitions, use/import/require namespaces, builtins), then walking the AST with a scope stack to verify that every IdentExpr resolves to a known binding. For namespaced calls (ns.func()), it validates that the function exists in the require namespace, stdlib module, or Go bridge package. Local variables shadow namespaces, matching codegen behavior.

Transform Chain

After semantic checks, the AST passes through a chain of immutable transforms (ast/transform.go). Transforms implement the Transform interface and are composed via Chain(), which runs them left-to-right. Each transform receives the output of the previous one and must not mutate its input; a copy-on-write helper (mapSlice) allocates new slices only when children actually change.
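A copy-on-write slice helper of the kind described above might look like the following sketch. Only the name mapSlice comes from this document; the generic signature, the Node type, and the lower function are assumptions made for illustration.

```go
package main

import "fmt"

// Node is a hypothetical AST node; nodes are compared by pointer identity.
type Node struct{ Name string }

// mapSlice applies f to every element. It returns the input slice itself
// when f changes nothing, and allocates a copy only on the first change.
func mapSlice[T comparable](in []T, f func(T) T) []T {
	var out []T // allocated lazily
	for i, v := range in {
		nv := f(v)
		if out == nil && nv != v {
			out = make([]T, len(in))
			copy(out, in[:i]) // keep the unchanged prefix
		}
		if out != nil {
			out[i] = nv
		}
	}
	if out == nil {
		return in
	}
	return out
}

// lower is a toy transform: it replaces "spawn" nodes with a new node
// and returns every other node untouched.
func lower(n *Node) *Node {
	if n.Name == "spawn" {
		return &Node{Name: "lowered_spawn"}
	}
	return n
}

func main() {
	in := []*Node{{Name: "x"}, {Name: "spawn"}}
	out := mapSlice(in, lower)
	// The input is untouched, and unchanged children are shared.
	fmt.Println(in[1].Name, out[1].Name, in[0] == out[0])
}
```

Sharing unchanged children keeps a pass that rewrites a handful of nodes cheap even on large programs, at the cost of requiring every transform to treat its input as read-only.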

ConcurrencyLowering (ast/lower.go): Replaces high-level concurrency constructs (SpawnExpr, ParallelExpr, TryExpr) with lowered equivalents (LoweredSpawnExpr, LoweredParallelExpr, LoweredTryExpr) that carry pre-processed information, for example by extracting the last expression in a spawn body into a dedicated ResultExpr field, or by pre-categorizing parallel branches as expression vs. statement. This pass also rewrites return statements inside spawn blocks and try handlers into SpawnReturnStmt and TryHandlerReturnStmt respectively.

ImplicitReturnLowering (ast/implicit_return.go): Converts last-expression-as-return-value patterns into explicit AST nodes. A trailing ExprStmt in a FuncDef or FnExpr body becomes an ImplicitReturnStmt; in a try handler it becomes a TryResultStmt. When the last statement is an IfStmt or CaseStmt, the transform recurses into each branch.

The Factory (ast/factory.go) centralizes AST node construction for transform passes, ensuring consistent creation and providing a hook point for future enhancements.

Type Inference

After transforms, compiler.Infer() (compiler/infer.go) runs a fixed-point type inference pass (up to 10 rounds). It walks all expressions and statements to resolve variable and function return types. Anything that can't be proven typed remains TypeDynamic (interface{}). The resulting TypeInfo feeds codegen, allowing it to emit unboxed Go types where possible instead of wrapping everything in interface{}.

Build Cache

During compilation, Rugo creates a temporary directory under ~/.cache/rugo/build/ to hold the generated Go source and go.mod before invoking go build. Each build gets its own uniquely-named subdirectory (rugo-*), which is automatically removed after compilation completes.
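The create, build, remove lifecycle maps naturally onto os.MkdirTemp with a rugo-* pattern plus a deferred os.RemoveAll. The sketch below is an assumption about how such a helper could be written, not Rugo's actual code; withBuildDir, exists, and runDemo are hypothetical names, and the demo uses a directory under the system temp dir rather than ~/.cache/rugo/build/.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// withBuildDir creates a unique rugo-* directory under base, runs build
// inside it, and removes the directory again whether or not build fails.
func withBuildDir(base string, build func(dir string) error) error {
	if err := os.MkdirAll(base, 0o755); err != nil {
		return err
	}
	dir, err := os.MkdirTemp(base, "rugo-*")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return build(dir)
}

// exists reports whether path is present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// runDemo reports whether the build directory existed during the build
// and whether it is still there afterwards.
func runDemo() (during, after bool, err error) {
	base := filepath.Join(os.TempDir(), "rugo-sketch", "build")
	var dir string
	err = withBuildDir(base, func(d string) error {
		dir = d
		during = exists(d)
		// Stand-in for writing the generated source and running go build.
		return os.WriteFile(filepath.Join(d, "main.go"), []byte("package main\n"), 0o644)
	})
	after = exists(dir)
	return during, after, err
}

func main() {
	during, after, err := runDemo()
	fmt.Println(during, after, err)
}
```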

Language Design

Type System

Rugo is dynamically typed. By default, values at runtime are Go interface{}; where type inference or annotations prove a concrete type, codegen can emit unboxed Go types instead. The generated Go code uses a small set of runtime helper functions (rugo_to_bool, rugo_to_int, rugo_to_float, rugo_to_string) to coerce values at the boundaries where Go requires concrete types.

Supported value types:

| Rugo type | Go representation |
| --- | --- |
| Integer | int |
| Float | float64 |
| String | string |
| Bool | bool |
| Nil | nil |
| Array | []interface{} |
| Hash | map[interface{}]interface{} |

Truthiness

Rugo follows Ruby-like truthiness rules: nil and false are falsy, everything else (including 0 and "") is truthy. This is enforced by the rugo_to_bool runtime function, which is used in all conditional contexts (if, while, &&, ||).

Operators

Arithmetic and comparison operators are dispatched dynamically through runtime helpers:

  • Arithmetic: + (rugo_add), - (rugo_sub), * (rugo_mul), / (rugo_div), % (rugo_mod)
  • Comparison: ==, !=, <, >, <=, >= (all via rugo_compare)
  • Logical: &&, || (short-circuit; they return operand values like Ruby, not booleans)
  • Unary: - (rugo_negate), ! (rugo_not)

The + operator supports string concatenation: when the left operand is a string, the right operand is automatically coerced to string. The one exception is nil, which raises an error rather than coercing to "nil" (use string interpolation #{x} if you want a possibly-nil value to render as "nil").
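A dynamic + helper following these rules could be sketched as below. It is an illustration rather than the real rugo_add: the function name add, the error return (the runtime raises instead), the Int/Float promotion, and the use of fmt.Sprint for coercion are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// add sketches dynamic dispatch for +. A string on the left coerces the
// right operand to a string, except for nil, which is rejected.
func add(a, b interface{}) (interface{}, error) {
	switch x := a.(type) {
	case string:
		if b == nil {
			return nil, errors.New("cannot concatenate nil onto a string")
		}
		return x + fmt.Sprint(b), nil
	case int:
		switch y := b.(type) {
		case int:
			return x + y, nil
		case float64:
			return float64(x) + y, nil // assumed promotion to Float
		}
	case float64:
		switch y := b.(type) {
		case int:
			return x + float64(y), nil
		case float64:
			return x + y, nil
		}
	}
	return nil, fmt.Errorf("unsupported operands for +: %T and %T", a, b)
}

func main() {
	fmt.Println(add("n=", 5))  // string on the left: right side is coerced
	fmt.Println(add(1, 2))     // plain integer addition
	fmt.Println(add("x", nil)) // nil is rejected rather than rendered
}
```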

Logical operator semantics (Ruby-like):

  • a || b: returns a if a is truthy, otherwise returns b
  • a && b: returns a if a is falsy, otherwise returns b

This enables the common default-value idiom:

name = input || "default"
config = load_config() || {}
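In Go terms, value-returning short-circuit operators can be modelled with a truthiness test plus a deferred right-hand side. The sketch below is an assumption about the shape of such helpers, not the generated code; toBool, or, and and are hypothetical names, and the right operand is passed as a closure so it is evaluated only when needed.

```go
package main

import "fmt"

// toBool applies Ruby-like truthiness: only nil and false are falsy.
func toBool(v interface{}) bool {
	switch b := v.(type) {
	case nil:
		return false
	case bool:
		return b
	default:
		return true // includes 0 and ""
	}
}

// or returns a when a is truthy; otherwise it evaluates and returns b.
func or(a interface{}, b func() interface{}) interface{} {
	if toBool(a) {
		return a
	}
	return b()
}

// and returns a when a is falsy; otherwise it evaluates and returns b.
func and(a interface{}, b func() interface{}) interface{} {
	if !toBool(a) {
		return a
	}
	return b()
}

func main() {
	fallback := func() interface{} { return "default" }
	fmt.Println(or(nil, fallback)) // falsy left side: the default is used
	fmt.Println(or(0, fallback))   // 0 is truthy, so it is kept
	fmt.Println(and(false, fallback))
}
```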

Gotcha: Don't use &&/|| for flow control with void-returning functions like puts. Since puts returns nil, the || branch always fires:

# BAD: prints both "yes" AND "no" when x is truthy
x && puts("yes") || puts("no")

# GOOD: use if/else instead
if x
  puts "yes"
else
  puts "no"
end

Comparison semantics:

  • Equality (==, !=): Numeric coercion applies, so 1 == 1.0 is true. Non-numeric types use strict equality.
  • Ordering (<, >, <=, >=): Supports both numeric and string operands. Strings are compared lexicographically. Comparing incompatible types (e.g., string vs int) panics.
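These comparison rules can be sketched as two small Go helpers. They are illustrative only: eq, less, toFloat, and panics are hypothetical names, and arrays and hashes are deliberately left out of the equality case.

```go
package main

import "fmt"

// toFloat reports the numeric value of v, if v is an int or a float64.
func toFloat(v interface{}) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}

// eq applies numeric coercion first, then strict same-type equality.
func eq(a, b interface{}) bool {
	if fa, ok := toFloat(a); ok {
		fb, ok := toFloat(b)
		return ok && fa == fb
	}
	switch x := a.(type) {
	case nil:
		return b == nil
	case string:
		y, ok := b.(string)
		return ok && x == y
	case bool:
		y, ok := b.(bool)
		return ok && x == y
	}
	return false // arrays and hashes are out of scope for this sketch
}

// less orders numbers numerically and strings lexicographically, and
// panics on incompatible operands.
func less(a, b interface{}) bool {
	if fa, ok := toFloat(a); ok {
		if fb, ok := toFloat(b); ok {
			return fa < fb
		}
	} else if sa, ok := a.(string); ok {
		if sb, ok := b.(string); ok {
			return sa < sb
		}
	}
	panic(fmt.Sprintf("cannot compare %T with %T", a, b))
}

// panics reports whether f panicked.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(eq(1, 1.0), eq("1", 1))
	fmt.Println(less("apple", "banana"), less(1, 2.5))
	fmt.Println(panics(func() { less("a", 1) }))
}
```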

Variables and Assignment

Variables are implicitly declared on first assignment. The codegen tracks declared variables per scope and emits := for first assignment and = for subsequent ones. Type annotations are optional (see Variable Type Annotations).

x = 42          # declares x
x = x + 1       # reassigns x

Compound assignment operators (+=, -=, *=, /=, %=) are preprocessor sugar:

x += 1          # desugared to: x = x + 1
arr[0] += 5     # desugared to: arr[0] = arr[0] + 5

Bare append is also preprocessor sugar; the assignment is implicit:

append fruits, "date"     # desugared to: fruits = append(fruits, "date")

Array destructuring unpacks an array into multiple variables:

a, b, c = [10, 20, 30]   # desugared to: __destr__ = [10, 20, 30]; a = __destr__[0]; ...

This is preprocessor sugar. The right-hand side must be a single expression that returns an array. It also works with Go bridge functions that return multiple values:

import "strings"
before, after, found = strings.cut("key=value", "=")

Constants

Identifiers starting with an uppercase letter are constants (Ruby convention). They can be assigned once but never reassigned; attempting to do so is a compile-time error.

PI = 3.14           # constant (uppercase)
MAX_RETRIES = 5     # constant
name = "mutable"    # variable (lowercase), can be reassigned

PI = 99             # compile error: cannot reassign constant PI

Constants are scoped: a constant defined inside a function is independent from one with the same name in another function or at the top level.

MAX = 100           # top-level constant

def limit()
  MAX = 50          # separate constant, local to this function
  return MAX
end

Hash and array bindings declared as constants protect the binding (you can't point the name at a different value) but their contents can still be mutated:

Config = {"host" => "localhost"}
Config["port"] = 8080   # OK: mutates contents, not the binding
Config = {}             # compile error: reassigns the binding

Variable Scoping

Different blocks create different scoping boundaries:

| Block | Own scope? | Sees outer vars? | Vars leak out? |
| --- | --- | --- | --- |
| Top-level | Yes (root) | n/a | n/a |
| def function | Yes | Yes (read-only) | No |
| fn lambda | Yes | Yes (captures outer) | No |
| if/elsif/else | No (transparent) | Yes | Yes |
| case/of (statement) | No (transparent) | Yes | Yes |
| case/of (expression) | Yes (IIFE) | Yes | No |
| while loop | Yes | Yes (read + modify) | No |
| for..in loop | Yes | Yes (read + modify) | No |
| spawn block | Yes | Yes (shared) | No |
| rats block | Yes | No (isolated) | No |

Functions can read top-level variables, but assigning inside a function creates a local shadow: the top-level value is not modified. Top-level variables referenced by def functions are promoted to package-level declarations so they are accessible. This is a key difference from lambdas, which capture the surrounding scope by reference.

rats blocks are fully isolated: they cannot see any top-level variables or constants. Use environment variables to share state between setup hooks and test blocks.

if blocks are transparent: they share the parent scope. Variables created inside an if block are accessible after the block ends. Statement-form case/of blocks have the same transparent scoping. However, when case is used as an expression (assigned to a variable), it compiles to an IIFE with its own scope, so variables assigned inside branches do not leak out.

Loops create their own scope: while and for loops can read and modify outer variables, but variables first assigned inside the loop body are local to that iteration scope. The for loop variable is also local.

Lambdas capture outer scope: they can read and modify variables from the enclosing scope. Variables assigned inside the lambda don't leak out.

Control Flow

Control flow uses Ruby-style end-delimited blocks:

if condition
  # body
elsif other_condition
  # body
else
  # body
end

while condition
  # body
end

for item in collection
  # body: item is the value for arrays, the key for hashes
end

for key, value in hash
  # body
end

for index, value in array
  # body
end

# Integer ranges
for i in 10           # i = 0, 1, ..., 9
end

for i in range(5, 10) # i = 5, 6, ..., 9
end

break and next are supported inside loops, compiling directly to Go break and continue.

Postfix if

A statement can be conditionally executed using postfix if (Ruby-style statement modifier):

puts "big" if x > 10
x = 42 if ready
greet "world" if name != nil

This is preprocessor sugar: STMT if COND is rewritten to if COND\n STMT\nend. It only applies when if appears mid-line (not at the start), outside strings and brackets.

Case Expression

The case/of/elsif/else/end construct provides multi-branch matching against a subject expression (similar to switch in other languages, or case in Ruby and Nim):

case status
of "ok"
  puts "all good"
of "error", "fail"
  puts "something went wrong"
else
  puts "unknown"
end

Semantics:

  1. The subject expression is evaluated once into a temporary variable.
  2. Each of branch compares the temp using ==. Multiple comma-separated values are OR'd together.
  3. Optional elsif branches provide boolean conditions (not compared to the subject).
  4. else is a catch-all default.
  5. No match and no else evaluates to nil.
  6. No fallthrough: the first matching branch wins.
  7. of branches must come before elsif; else must be last.

Arrow form. For single-expression branches, use ->:

case status
of "ok" -> "all good"
of "error", "fail" -> "something went wrong"
else -> "unknown"
end

Arrow form takes a single expression (not an assignment). Both forms can be mixed:

case code
of 200 -> "success"
of 404
  log("not found")
  "not found"
else -> "other"
end

Case as expression. case can be used anywhere an expression is expected, including assignment position and function arguments. Each branch's last expression becomes the result:

# Assignment position
label = case status
of "ok" -> "success"
of "error" -> "failure"
else -> "unknown"
end

# Multi-line branches work too
message = case code
of 200
  puts("ok")
  "all good"
of 404
  puts("missing")
  "not found"
else -> "other"
end

When used as an expression (e.g., assigned to a variable), case compiles to a Go IIFE (immediately-invoked function expression) with a named return variable. Variables assigned inside expression branches are local to the IIFE and do not leak to the parent scope, unlike statement-form case, which has transparent scoping.

Implicit return. Inside functions, a trailing case expression is implicitly returned:

def grade(letter)
  case letter
  of "A" -> "excellent"
  of "B" -> "good"
  of "C" -> "average"
  else -> "unknown"
  end
end

Elsif integration. Boolean conditions can follow of branches for Nim-style flexibility:

case score
of 100 -> "perfect"
of 0 -> "zero"
elsif score >= 90
  "A"
elsif score >= 80
  "B"
else
  "C"
end

Scoping. Statement-form case blocks are transparent, like if. Variables assigned inside branches leak to the parent scope. Expression-form case (assigned to a variable) uses an IIFE, so branch variables are local.

Codegen note: Statement-form case compiles to a Go if/else chain (not a Go switch). The subject is stored in a temp variable (__case_N). Each of branch becomes one rugo_to_bool(rugo_eq(__case_N, value)) condition per value, OR'd together. Expression-form case compiles to a Go IIFE with a named return (r interface{}); each branch assigns its result to r.
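To make the expression form concrete, the following hand-written Go approximates what the generated code could look like for a case with two of branches and no else. It is a sketch, not actual compiler output: rugoEq and rugoToBool are simplified stand-ins for the runtime helpers, and the temp variable is named case0 here instead of __case_N.

```go
package main

import "fmt"

// rugoEq is a simplified stand-in for the runtime equality helper.
// It relies on Go interface equality, so it only suits scalar values.
func rugoEq(a, b interface{}) interface{} { return a == b }

// rugoToBool is a simplified stand-in: only nil and false are falsy.
func rugoToBool(v interface{}) bool {
	b, isBool := v.(bool)
	return v != nil && (!isBool || b)
}

// label approximates the lowering of:
//
//	label = case status
//	of "ok" -> "success"
//	of "error", "fail" -> "failure"
//	end
func label(status interface{}) interface{} {
	return func() (r interface{}) {
		case0 := status // the subject is evaluated once
		if rugoToBool(rugoEq(case0, "ok")) {
			r = "success"
		} else if rugoToBool(rugoEq(case0, "error")) || rugoToBool(rugoEq(case0, "fail")) {
			r = "failure"
		}
		return r // no branch matched and no else: r is still nil
	}()
}

func main() {
	fmt.Println(label("ok"), label("fail"), label("other"))
}
```

The named return gives the "no match and no else evaluates to nil" rule for free, since an unassigned interface{} result is nil.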

Functions

Functions are defined with def/end. In the generated Go they return interface{} unless primitive type annotations produce a typed signature (see Optional Type Annotations). The last expression in a function body is implicitly returned (like lambdas). Use explicit return for early exits:

def greet(name)
  puts "Hello, #{name}!"
end

def add(a, b)
  a + b
end

def classify(x)
  if x > 10
    return "big"
  end
  "small"
end

For functions with no parameters, the parentheses are optional:

def say_hello
  puts "Hello!"
end

Default Parameter Values

Parameters can have default values using = expr syntax. Parameters with defaults must come after all required parameters. When a caller omits trailing arguments, the defaults are evaluated at call time:

def greet(name, greeting = "Hello")
  puts "#{greeting}, #{name}!"
end

greet("Alice")            # Hello, Alice!
greet("Alice", "Hey")     # Hey, Alice!

Multiple defaults are allowed, and any expression (including nil, booleans, arithmetic) can be used as a default:

def connect(host, port = 8080, tls = true)
  # port defaults to 8080, tls defaults to true
end

connect("example.com")              # port=8080, tls=true
connect("example.com", 443)         # port=443, tls=true
connect("example.com", 443, false)  # port=443, tls=false

A function with all-optional parameters can be called with zero arguments:

def label(text = "default", color = nil)
  # ...
end

label()                  # both default
label("hello")           # color defaults to nil
label("hello", "red")    # no defaults used

Codegen note: Functions with default parameters compile to a variadic Go signature (_args ...interface{}). A preamble unpacks arguments and fills defaults for any omitted parameters. Functions without defaults are unchanged. Arity is checked as a range: min_required..max_total. A required parameter after a default parameter is a compile error.
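A variadic function with an unpacking preamble might look like the sketch below, written by hand for def connect(host, port = 8080, tls = true). The function body, the panic message, and the returned string are assumptions for illustration; only the _args ...interface{} shape and the arity range come from the note above.

```go
package main

import "fmt"

// connect approximates: def connect(host, port = 8080, tls = true)
func connect(_args ...interface{}) interface{} {
	// Arity is checked as a range: 1 required, 3 in total.
	if len(_args) < 1 || len(_args) > 3 {
		panic(fmt.Sprintf("connect: wrong number of arguments (%d for 1..3)", len(_args)))
	}

	// Preamble: unpack what was passed, evaluate defaults for the rest.
	host := _args[0]
	var port, tls interface{}
	if len(_args) > 1 {
		port = _args[1]
	} else {
		port = 8080 // default evaluated at call time
	}
	if len(_args) > 2 {
		tls = _args[2]
	} else {
		tls = true
	}

	return fmt.Sprintf("%v:%v tls=%v", host, port, tls)
}

func main() {
	fmt.Println(connect("example.com"))
	fmt.Println(connect("example.com", 443))
	fmt.Println(connect("example.com", 443, false))
}
```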

Functions are hoisted to the Go package level during codegen. Inside function bodies, all function names are visible (forward references work). At the top level, function names are only recognized after their def line (positional resolution).

Optional Type Annotations

Function parameters and return types can carry optional type annotations using the form name : type (note the space before the colon; it is required so the preprocessor's hash-colon sugar doesn't rewrite the line) and : type after the parameter list:

def add(a : Integer, b : Integer) : Integer
  return a + b
end

def greet(name : String) : String
  return "hello, " + name
end

# Mix annotated and unannotated freely
def scale(factor : Float, x)
  return factor * x
end

# Return-only annotation
def label(x) : String
  return "value: " + x
end

Annotations are optional everywhere: adding them is purely additive and never required. Lambdas use the same syntax:

square = fn(n : Integer) : Integer
  return n * n
end

The recognised type names mirror exactly what type_of() returns at runtime, so annotations and runtime types share one vocabulary:

| Annotation | Meaning |
| --- | --- |
| Integer | 64-bit integer |
| Float | 64-bit float |
| String | Go string |
| Bool | Go bool |
| Array | Rugo array ([]interface{}) |
| Hash | Rugo hash (map[interface{}]interface{}) |
| Nil | Always nil |
| Any | Explicit dynamic (interface{}) |

Annotations are case-sensitive. Unknown names produce a compile-time error pointing at the offending position. The v0.29.0/v0.29.1 lowercase forms (int, float, …) are no longer accepted. The compiler suggests the canonical capitalised name with a "did you mean Integer?" hint to make migration painless.

Annotations have five effects:

  1. Compile-time validation. The annotation name must be recognised; misspellings (integer, Boolean) and the legacy lowercase forms (int, bool, …) fail at compile time with a targeted hint.
  2. Seeded type inference. Infer() plants the annotated types into FuncTypeInfo.ParamTypes/ReturnType before walking the body. The inferrer treats annotated params as ground truth and will not widen them to interface{} if a later assignment is dynamic. Annotated returns are not overwritten by the inferred return type.
  3. Typed Go signatures. When the annotated type is a primitive (Integer, Float, String, Bool), codegen emits a typed Go signature (func rugofn_add(a int, b int) int) instead of the default func(... interface{}) interface{}. The return path inserts a rugo_to_* coercion if the body produced a dynamic value, so calls into runtime helpers (e.g. math.sqrt) still work without manual casts.
  4. Body/annotation mismatch detection. After inference, the compiler walks every annotated function body and flags two patterns the inferrer can prove are wrong: reassigning an annotated parameter to a value of a concretely conflicting type (e.g. a = "hello" inside def f(a : Integer)) and returning a value whose inferred type conflicts with the annotated return type. Errors point at the rugo source line with a structured message instead of a Go-level compiler error. Assignment is strict (the generated Go has a concrete variable, no coercion at the reassignment site); returns are permissive in the numeric family and for String/Bool/Any (the codegen inserts coercion on the return path). Pass --no-infer to skip the check.
  5. Call-site and return-site flow validation. Beyond literal arguments, the compiler also flags variable arguments and variable return values when the inferrer can prove a concrete type conflict at that program point.
    • Literal arguments. When a literal (number, string, bool, nil, array, hash, or -N/!b over a literal) is passed to an annotated parameter with a concretely-conflicting type, the call is rejected (e.g. f("oops") where f is def f(a : Integer)).
    • Variable arguments (Tier 3 flow-sensitive). Each identifier read site carries a flow-sensitive per-use type (TypeInfo.VarUseTypes) that reflects the variable's type at exactly that program point, not the conservative storage union of every value it ever held. Sequential reassignments narrow the per-use type (y = "h"; y = 42; f(y) passes because at the call site y is provably Integer); union outcomes from if without else, loops that may not run, or case without else keep the union and produce a precise error message.
    • Variable returns (Tier 3 flow-sensitive). Symmetrically, return x is checked against the flow-sensitive type of x at the return site, not the storage union. Returning a variable that was reassigned back to a compatible type is permitted even when its history includes incompatible values.
    • fn lambda call sites (Tier 4 flow-sensitive). When an annotated fn lambda is bound to a variable (f = fn(n : Integer) ... end), every call through that variable (f(...)) is checked against the lambda's annotated parameters using the same compatibility rules as a def call. The binding is tracked flow-sensitively: aliasing (g = f) propagates the signature, reassigning to a different annotated lambda uses the new signature, reassigning to a non-fn value clears the binding, and any merge across branches with different bindings drops the binding. Higher-order use (passing a lambda to another function, storing it in an array/hash, returning it from a function) is silent: Tier 4 only fires for direct identifier-named calls of variables that hold an annotated lambda in the current scope.
    • The compatibility rule at call sites and for parameter defaults is strict-with-numeric-carve-out: same-type matches are accepted, and values in the numeric family (Integer, Float, Bool) flow freely between numeric annotations because codegen inserts rugo_to_int / rugo_to_float wrappers at the call boundary. String, Bool, Array, Hash, and Nil annotations only accept their own type (this matches the strict variable-annotation rule: x : String = 42 and f(x : String); f(42) both error). Any accepts anything. Module-style ns.f(...) calls are skipped. Dynamic or unresolved expressions are silent; annotations stay user assertions where inference cannot decide. Pass --no-infer to disable.
    • Returns remain permissive (numeric family mutually compatible, String/Bool/Any accept anything) because codegen inserts coercion on the return path.

Variable Type Annotations

Local variables can also carry type annotations on their first assignment using the same name : type = expr shape:

x : Integer = 42
name : String = "world"
items : Array = [1, 2, 3]

Variable annotations are sticky: once x is bound as Integer, every later assignment to x in the same scope is checked against that annotation. Reassigning to a concretely-conflicting type fails at compile time:

x : Integer = 42
x = "oops"      # compile error: cannot assign String value to variable 'x' declared as Integer

Re-annotating the same name in the same scope (x : Integer = ...; x : Integer = ...) is also rejected. The annotation lives until the enclosing function or block returns, so two functions can each declare their own x : T independently.

Use : Any to opt out of the check while keeping the annotation as documentation:

x : Any = 0
x = "h"         # allowed
x = [1, 2]      # allowed

Coverage is reported by rugo emit --stats:

Params:    13   typed: 10 (76.9%)   dynamic: 3   annotated: 9 (69.2%)
Returns:    7   typed: 4 (57.1%)    dynamic: 3   annotated: 5 (71.4%)

The typed column reflects what inference resolved (with or without annotations); annotated counts only positions where the user wrote an explicit annotation.

Limitations and caveats:

  • Local variable annotations apply only to first assignment (x : T = expr). There is no syntax for re-annotating, nor for annotating index assignments (a[i] : T = ...) or dot assignments (o.f : T = ...).
  • Functions with default parameter values compile to a variadic shape; on such functions the annotations act as documentation only, since the runtime signature is dynamic.
  • Array, Hash, Nil, Any are accepted but do not produce typed Go signatures (the corresponding runtime shapes are already interface{}-typed).
  • Mismatch detection only fires when the inferrer can prove a conflict (the value's type is concrete and not in the compatibility set). Dynamic / unknown values are silent; annotations stay user assertions where inference cannot decide.
  • A space is required before : (x : Integer, not x:Integer) because the preprocessor would otherwise interpret x: as the start of a hash literal.

Lambdas (First-Class Functions)

Rugo supports anonymous functions (lambdas) using fn(params) body end syntax. Lambdas are first-class values: they can be stored in variables, passed as arguments, returned from functions, and stored in data structures.

# Basic lambda
double = fn(x) x * 2 end
puts double(5)   # 10

# Multi-line lambda
classify = fn(x)
  if x > 0
    return "positive"
  end
  "non-positive"
end

# Pass lambda to function
def my_map(f, arr)
  result = []
  for item in arr
    result = append(result, f(item))
  end
  return result
end
my_map(fn(x) x * 2 end, [1, 2, 3])

# Return lambda from function (closure)
def make_adder(n)
  return fn(x) x + n end
end
add5 = make_adder(5)
puts add5(10)   # 15

# Lambdas in data structures
ops = {"add" => fn(a, b) a + b end}
puts ops["add"](2, 3)   # 5

Lambdas compile to Go variadic anonymous functions: func(_args ...interface{}) interface{} { ... }. Parameters are unpacked from the variadic args. The last expression in a lambda body is implicitly returned. Closures capture variables by reference, so mutations to captured variables are visible outside the lambda.

Lambdas also support default parameter values, with the same semantics as def functions:

transform = fn(x, factor = 2) x * factor end
puts transform(5)      # 10
puts transform(5, 3)   # 15

When a variable holding a lambda is called, the codegen emits a runtime type assertion: variable.(func(...interface{}) interface{})(args...). Calling a non-function variable produces a friendly compile error: cannot call x — not a function.
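The lambda representation and the call-through-a-variable pattern can be sketched together. The helper names (lambda, call, makeCounter) and the panic text are hypothetical; the sketch only assumes what this section states, namely that a lambda is a func(...interface{}) interface{} stored in an interface{} and that closures share captured variables with the enclosing scope.

```go
package main

import "fmt"

// lambda is the assumed Go shape of every Rugo lambda.
type lambda = func(...interface{}) interface{}

// call sketches invoking a variable that should hold a lambda: assert the
// function type first, then forward the arguments.
func call(name string, v interface{}, args ...interface{}) interface{} {
	fn, ok := v.(lambda)
	if !ok {
		panic(fmt.Sprintf("cannot call %s: not a function", name))
	}
	return fn(args...)
}

// makeCounter returns a lambda that mutates a captured variable, plus a
// reader that observes the same variable from outside the lambda.
func makeCounter() (incr interface{}, read func() int) {
	count := 0
	incr = lambda(func(_args ...interface{}) interface{} {
		step := _args[0].(int) // parameters are unpacked from _args
		count += step          // captured by reference
		return count
	})
	return incr, func() int { return count }
}

func main() {
	incr, read := makeCounter()
	call("incr", incr, 2)
	call("incr", incr, 3)
	fmt.Println(read()) // the mutation is visible outside the lambda
}
```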

Lambdas stored as hash values can be called via dot access, just like index access:

ops = {
  add: fn(a, b) a + b end,
  mul: fn(a, b) a * b end
}
puts ops["add"](2, 3)   # 5 (index access)
puts ops.add(2, 3)      # 5 (dot access)

At runtime, rugo_dot_call looks up the key in the hash, type-asserts the value to a callable lambda, and invokes it. If the key doesn't exist or the value isn't a function, a friendly error is produced.

Trailing Block Syntax (do...end)

The do...end syntax provides a concise way to pass a no-argument lambda as the last argument to a function call. It is preprocessor sugar:

# These are equivalent:
vbox(fn()
  label("Hello")
end)

vbox do
  label("Hello")
end

The preprocessor rewrites CALL do BODY end to CALL(fn() BODY end) (or appends fn() as the last argument if the call already has arguments):

# Bare call
vbox do ... end              # → vbox(fn() ... end)

# Call with existing args
button("Click") do ... end   # → button("Click", fn() ... end)

# Paren-free with args
styled "bold" do ... end     # → styled("bold", fn() ... end)

# Assignment
result = make do ... end     # → result = make(fn() ... end)

Nesting works naturally, with each end matching its closest do:

outer do
  inner("hello") do
    puts "deep"
  end
end

Key rules:

  • do must appear at the end of a line, separated from the preceding expression by whitespace.
  • do inside strings (e.g., "I do this") is not affected.
  • do...end blocks always create a parameterless fn(). For lambdas that need parameters, use fn(params) ... end directly.
  • do is a reserved keyword; it cannot be used as a variable or function name.

Error Handling

Rugo provides three levels of error handling via try/or:

# Level 1: Silent recovery (returns nil on failure)
result = try some_expression

# Level 2: Default value on failure
result = try some_expression or "default"

# Level 3: Handler block with error variable
result = try some_expression or err
  puts "caught: " + err
  "fallback"
end

Under the hood, try compiles to a Go IIFE (immediately invoked function expression) with defer/recover. The error is caught by Go's panic/recover mechanism, and the error message is made available as a string in the handler block.
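The defer/recover pattern behind try can be sketched as a small helper. This illustrates the mechanism only; the generated code inlines an IIFE rather than calling a helper, and tryOr is a hypothetical name.

```go
package main

import "fmt"

// tryOr runs body. If body panics, the panic value is turned into a
// string and passed to handler, whose result becomes the overall result.
func tryOr(body func() interface{}, handler func(err string) interface{}) (result interface{}) {
	defer func() {
		if r := recover(); r != nil {
			result = handler(fmt.Sprint(r))
		}
	}()
	return body()
}

func main() {
	fail := func() interface{} { panic("disk full") }

	// Level 1: silent recovery yields nil.
	fmt.Println(tryOr(fail, func(string) interface{} { return nil }))

	// Level 2: a default value on failure.
	fmt.Println(tryOr(fail, func(string) interface{} { return "default" }))

	// Level 3: a handler block that sees the error message.
	fmt.Println(tryOr(fail, func(err string) interface{} { return "caught: " + err }))
}
```

The named result is what lets the deferred function replace the return value after the panic has unwound body.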

Shell Fallback

One of Rugo's distinctive features is shell fallback: unknown identifiers at the top level are treated as shell commands rather than producing compile errors.

ls -la              # runs as: sh -c "ls -la"
echo "hello"        # runs as: sh -c "echo hello"
uname -a            # runs as: sh -c "uname -a"

The preprocessor rewrites these to __shell__("...") calls, which the codegen translates to exec.Command("sh", "-c", ...). Shell commands inherit stdin/stdout/stderr from the parent process. Non-zero exit codes cause a panic with rugoShellError.

Backtick expressions capture command output instead of printing it:

name = `whoami`     # captures output, strips trailing newline

These are rewritten to __capture__("...") calls. String interpolation works inside backticks:

name = "world"
greeting = `echo hello #{name}`   # captures "hello world"
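Both helpers are thin wrappers over os/exec. The sketch below shows one plausible shape, assuming a POSIX sh is on the PATH; shell and capture are hypothetical names for what __shell__ and __capture__ lower to, and they return errors where the real runtime panics with rugoShellError.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// shell runs cmd through sh -c with the parent's stdin/stdout/stderr.
// A non-zero exit status is reported as a non-nil error.
func shell(cmd string) error {
	c := exec.Command("sh", "-c", cmd)
	c.Stdin, c.Stdout, c.Stderr = os.Stdin, os.Stdout, os.Stderr
	return c.Run()
}

// capture runs cmd through sh -c and returns its stdout with one
// trailing newline removed, like a backtick expression.
func capture(cmd string) (string, error) {
	out, err := exec.Command("sh", "-c", cmd).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(string(out), "\n"), nil
}

func main() {
	name, err := capture("echo hello world")
	fmt.Printf("%q %v\n", name, err)

	// A failing command surfaces as an error instead of being ignored.
	fmt.Println(shell("exit 3") != nil)
}
```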

Pipe Operator

The pipe operator | connects expressions left-to-right, passing the output of the left side to the right side:

  • Shell command on left → stdout is captured (like backticks)
  • Function/expression on left → return value is used
  • Function on right → piped value becomes the first argument
  • Shell command on right → piped value is fed to stdin

# Shell output → function
echo "hello world" | puts           # puts receives "hello world"

# Chaining: shell → module → builtin
echo "hello" | str.upper | puts     # prints "HELLO"

# Expression → function
len("hello") | puts                 # prints 5

# Value → shell stdin → function
"hello" | tr a-z A-Z | puts         # prints "HELLO"

# Assignment with pipe
name = echo "rugo" | str.upper      # name = "RUGO"

# Piped value prepended before existing args
echo "world" | puts "hello"         # prints "world hello"

Key rules:

  • When all segments are shell commands (e.g. ls | grep foo), the line is left as a native shell pipe (backward compatible).
  • Only when at least one segment is a Rugo construct (builtin, user function, module function, or expression) does pipe expansion activate.
  • The || logical OR operator is never confused with the pipe |.
  • Pipes inside strings ("a | b") are not expanded.
  • The pipe passes return values, not stdout output. puts and print return nil, so using them as a non-final segment in a pipe chain is a compile-time error:

ls | puts | head        # ✗ compile error: puts returns nil, breaks the chain
ls | head | puts        # ✓ puts at the end, receives head's captured output

The preprocessor rewrites pipe expressions before parsing. For example, echo "hello" | str.upper | puts becomes puts(str.upper(__capture__("echo \"hello\""))).

String Interpolation

String interpolation uses #{expr} syntax inside double-quoted strings:

name = "World"
puts "Hello, #{name}!"
puts "1 + 2 = #{1 + 2}"

The preprocessor handles the #{...} extraction, and the codegen compiles interpolated strings to fmt.Sprintf calls. Interpolated expressions are fully parsed through the Rugo parser to support arbitrary expressions.
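One way to picture the lowering is as a split into a format string and a list of expression sources, which codegen can then parse and pass to fmt.Sprintf. The sketch below is an assumption about that split, not the actual preprocessor: splitInterpolation is a hypothetical name, %v is an assumed verb, and string literals containing braces inside #{...} are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// splitInterpolation turns the body of a double-quoted string into a
// Sprintf format plus the source text of each #{...} expression.
func splitInterpolation(s string) (format string, exprs []string) {
	var b strings.Builder
	for i := 0; i < len(s); {
		if strings.HasPrefix(s[i:], "#{") {
			// Find the matching closing brace, allowing nested braces.
			depth, j := 1, i+2
			for j < len(s) && depth > 0 {
				switch s[j] {
				case '{':
					depth++
				case '}':
					depth--
				}
				j++
			}
			if depth == 0 {
				exprs = append(exprs, s[i+2:j-1])
				b.WriteString("%v")
				i = j
				continue
			}
			// Unterminated "#{": fall through and keep it as literal text.
		}
		if s[i] == '%' {
			b.WriteString("%%") // escape literal percent signs
		} else {
			b.WriteByte(s[i])
		}
		i++
	}
	return b.String(), exprs
}

func main() {
	format, exprs := splitInterpolation("1 + 2 = #{1 + 2}")
	fmt.Printf("%q %q\n", format, exprs)
}
```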

Limitation: Nested double quotes inside interpolation are not supported. Use a variable instead:

# This will NOT work:
# puts "#{h["foo"]}"

# Use a variable instead:
x = h["foo"]
puts "#{x}"

Raw Strings

Single-quoted strings are raw literals where no escape processing or interpolation happens (like Ruby's single-quoted strings):

puts 'hello\nworld'        # prints: hello\nworld (literal backslash-n)
puts '\x1b[32mgreen'       # prints: \x1b[32mgreen (no ANSI processing)
puts 'no #{interpolation}'  # prints: no #{interpolation} (no interpolation)

Only two escape sequences are recognized in raw strings: \\ (literal backslash) and \' (literal single quote). All other backslash sequences are kept as-is.
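The two-escape rule is small enough to sketch directly. The function below is a hypothetical helper (unescapeRaw is not a name from the Rugo source) that takes the text between the single quotes and returns the string value it denotes.

```go
package main

import (
	"fmt"
	"strings"
)

// unescapeRaw processes the body of a single-quoted string. Only \\ and
// \' are treated as escapes; every other backslash is kept as written.
func unescapeRaw(body string) string {
	var b strings.Builder
	for i := 0; i < len(body); i++ {
		if body[i] == '\\' && i+1 < len(body) && (body[i+1] == '\\' || body[i+1] == '\'') {
			b.WriteByte(body[i+1]) // drop the backslash, keep the escaped char
			i++
			continue
		}
		b.WriteByte(body[i])
	}
	return b.String()
}

func main() {
	fmt.Println(unescapeRaw(`hello\nworld`)) // backslash-n stays literal
	fmt.Println(unescapeRaw(`it\'s`))
	fmt.Println(unescapeRaw(`C:\\temp`))
}
```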

Raw strings are parsed by a separate raw_str_lit lexer rule in the grammar and produce StringLiteral nodes with Raw: true. The codegen emits these strings directly to Go string literals with appropriate escaping, bypassing the interpolation pipeline.

Preprocessor

The preprocessor (ast/preprocess.go) runs before parsing and performs line-level source transformations. It operates in multiple passes:

Pass 1: Compound Assignment Expansion

Desugars +=, -=, *=, /=, %= for both simple variables and index targets:

x += 1        →  x = x + 1
arr[0] -= 3   →  arr[0] = arr[0] - 3
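A line-level desugaring of this kind could look like the Go sketch below. The regular expression and the `expandCompound` name are assumptions made for illustration; the real pass may tokenize differently, and a production version would also need to preserve precedence when the right-hand side is itself a compound expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// compoundRe matches "<target> <op>= <value>", where target is a simple
// variable or an index expression such as arr[0]. Hypothetical pattern.
var compoundRe = regexp.MustCompile(`^(\s*)([A-Za-z_]\w*(?:\[[^\]]*\])?)\s*([-+*/%])=\s*(.+)$`)

// expandCompound rewrites a compound assignment into a plain assignment.
// Lines that are not compound assignments are returned unchanged.
func expandCompound(line string) string {
	m := compoundRe.FindStringSubmatch(line)
	if m == nil {
		return line
	}
	indent, target, op, value := m[1], m[2], m[3], m[4]
	return fmt.Sprintf("%s%s = %s %s %s", indent, target, target, op, value)
}

func main() {
	fmt.Println(expandCompound("x += 1"))
	fmt.Println(expandCompound("arr[0] -= 3"))
}
```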

Pass 1b: Bare Append Expansion

Desugars bare append statements into explicit assignments. Only applies when append( starts the line and the first argument is a valid assignment target:

append(arr, val)  →  arr = append(arr, val)

This pass runs after paren-free call expansion, so append arr, val is first converted to append(arr, val), then desugared to arr = append(arr, val).

Pass 2: Backtick Expansion

Converts backtick expressions to capture calls:

`hostname`    →  __capture__("hostname")
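As a rough sketch, the rewrite can be expressed as a single regular-expression substitution in Go. The `expandBackticks` name is hypothetical, and the sketch assumes backticks never appear inside string literals or nest, which the real pass would have to handle.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// backtickRe matches one backtick-delimited command on a line.
var backtickRe = regexp.MustCompile("`([^`]*)`")

// expandBackticks replaces each `cmd` with __capture__("cmd"),
// quoting the command so embedded double quotes are escaped.
func expandBackticks(line string) string {
	return backtickRe.ReplaceAllStringFunc(line, func(m string) string {
		cmd := m[1 : len(m)-1] // strip the surrounding backticks
		return "__capture__(" + strconv.Quote(cmd) + ")"
	})
}

func main() {
	fmt.Println(expandBackticks("host = `hostname`"))
}
```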

Pass 3: Try Sugar Expansion

Expands single-line try forms into multi-line block form that the parser understands:

# try EXPR or DEFAULT expands to:
try
  EXPR
or _err
  DEFAULT
end

# try EXPR (no or) expands to:
try
  EXPR
or _err
  nil
end

This expansion also tracks a line map so error messages reference the original source line.
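To make the interaction between expansion and the line map concrete, here is a minimal Go sketch. It is not the actual implementation: `expandTry` is a hypothetical helper that only handles lines beginning with try, splits naively on the first " or ", and records for every output line the 1-based source line it came from.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTry rewrites single-line try forms into block form.
// lineMap[i] holds the 1-based original line of output line i.
func expandTry(src []string) (out []string, lineMap []int) {
	for i, line := range src {
		trimmed := strings.TrimSpace(line)
		if !strings.HasPrefix(trimmed, "try ") {
			out = append(out, line)
			lineMap = append(lineMap, i+1)
			continue
		}
		// Without an explicit "or", the fallback is nil.
		expr, fallback := strings.TrimPrefix(trimmed, "try "), "nil"
		if before, after, found := strings.Cut(expr, " or "); found {
			expr, fallback = before, after
		}
		for _, b := range []string{"try", "  " + expr, "or _err", "  " + fallback, "end"} {
			out = append(out, b)
			lineMap = append(lineMap, i+1) // all five lines map to one source line
		}
	}
	return out, lineMap
}

func main() {
	out, lineMap := expandTry([]string{"a = 1", "try risky() or 0", "b = 2"})
	for i, l := range out {
		fmt.Printf("out %d <- src %d: %s\n", i+1, lineMap[i], l)
	}
}
```

Because one source line becomes five, every later line shifts by four in the preprocessed text; the map is what lets an error on the final output line still be reported against source line 3.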

Pass 4: Line-by-Line Processing

Each line is classified and transformed:

  1. Pipe expansion: lines with top-level | (not ||) are split into segments. If at least one segment is a Rugo construct (function/builtin/dotted ident/expression), the pipe is expanded into nested calls. All-shell pipes are left for the shell to handle natively.
  2. Keywords (if, def, while, etc.): left untouched.
  3. Assignments (x = ...): left untouched.
  4. Parenthesized calls (func(...)): left untouched.
  5. Known function, paren-free (puts "hi"): rewritten to puts("hi").
  6. Unknown identifier: rewritten to shell fallback: __shell__("...").

Function name resolution is positional at the top level: a def must appear before its paren-free usage. Inside function bodies, all function names are visible (allowing forward references).
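Steps 2 through 6 of the classification above can be approximated with the following Go sketch. It is a simplification under stated assumptions: `rewriteLine`, the `keywords` set, and the `known` map are hypothetical, pipe expansion is omitted, indentation is not preserved on rewritten lines, and the checks are plain prefix tests rather than real tokenization.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// keywords is a partial, illustrative set of statement keywords.
var keywords = map[string]bool{"if": true, "def": true, "while": true, "for": true, "end": true, "return": true}

// rewriteLine classifies one line and rewrites paren-free calls and
// unknown identifiers. known holds the function names visible so far.
func rewriteLine(line string, known map[string]bool) string {
	trimmed := strings.TrimSpace(line)
	if trimmed == "" {
		return line
	}
	first, rest, _ := strings.Cut(trimmed, " ")
	rest = strings.TrimSpace(rest)
	switch {
	case keywords[first]:
		return line // keyword: left untouched
	case strings.HasPrefix(rest, "=") && !strings.HasPrefix(rest, "=="):
		return line // assignment: left untouched
	case strings.Contains(first, "("):
		return line // parenthesized call: left untouched
	case known[first]:
		return first + "(" + rest + ")" // paren-free call
	default:
		return "__shell__(" + strconv.Quote(trimmed) + ")" // shell fallback
	}
}

func main() {
	known := map[string]bool{"puts": true}
	for _, l := range []string{`puts "hi"`, "ls -la", "x = 1", "puts(1)"} {
		fmt.Println(rewriteLine(l, known))
	}
}
```

Passing `known` in explicitly mirrors the positional rule: at the top level the set would grow as each def is encountered, while inside function bodies it would already contain every function name.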

Line Map

The preprocessor produces a line map that tracks the correspondence between preprocessed line numbers and original source line numbers. This is threaded through the walker and codegen so that //line directives and error messages reference the correct .rugo source location.

Parser

The parser is generated from an LL(1) grammar defined in parser/rugo.ebnf using the egg parser generator tool:

egg -o parser.go -package parser -start Program -type Parser -constprefix Rugo rugo.ebnf

Important: parser/parser.go is generated code and must never be hand-edited. All grammar changes go through rugo.ebnf.

Grammar Structure

The grammar defines a standard expression language with precedence levels:

Program     = { Statement }

Statement   = UseStmt | ImportStmt | RequireStmt | SandboxStmt | FuncDef | TestDef
            | IfStmt | WhileStmt | ForStmt
            | BreakStmt | NextStmt | ReturnStmt
            | AssignOrExpr

Expr        = OrExpr
OrExpr      = AndExpr { "||" AndExpr }
AndExpr     = CompExpr { "&&" CompExpr }
CompExpr    = AddExpr [ comp_op AddExpr ]
AddExpr     = MulExpr { ('+' | '-') MulExpr }
MulExpr     = UnaryExpr { ('*' | '/' | '%') UnaryExpr }
UnaryExpr   = '!' Postfix | '-' Postfix | Postfix
Postfix     = Primary { Suffix }
Suffix      = '(' [ ArgList ] ')' | '[' Expr [ ',' Expr ] ']' | '.' ident
Primary     = ... | CaseExpr | ...

CaseExpr lives in Primary rather than Statement to avoid an LL(1) conflict: both assignment and standalone case start with the "case" token. Standalone case (not assigned to a variable) flows through AssignOrExpr → Expr → Primary → CaseExpr, and the walker converts it to a CaseStmt for efficient codegen (no IIFE overhead).

Operator precedence (lowest to highest):

Level  Operators
1      ||
2      &&
3      == != < > <= >=
4      + -
5      * / %
6      ! (unary) - (unary)
7      () [] . (postfix)

Parser Output

The parser produces a flat []int32 array encoding the parse tree. Non-terminal nodes are encoded as (-symbol, childCount, children...) and terminal tokens as positive indices into the token stream. This compact representation is then walked by the AST walker.

AST

The typed AST is defined in ast/nodes.go. It uses Go interfaces with marker methods for type safety:

Node (interface)
├── Statement (interface)
│   ├── Program           - root node, contains []Statement
│   ├── UseStmt           - use "module" (Rugo stdlib)
│   ├── ImportStmt        - import "go/pkg" [as alias] (Go bridge)
│   ├── RequireStmt       - require "path" [as alias | with mod1, mod2, ...]
│   ├── SandboxStmt       - sandbox [ro: [...], rw: [...], env: [...], ...] (Landlock + env)
│   ├── FuncDef           - def name(params) body end
│   ├── TestDef           - rats "name" body end
│   ├── IfStmt            - if/elsif/else/end
│   ├── CaseStmt          - case/of/elsif/else/end (contains []OfClause)
│   ├── WhileStmt         - while cond body end
│   ├── ForStmt           - for var [, var2] in expr body end
│   ├── BreakStmt         - break
│   ├── NextStmt          - next
│   ├── ReturnStmt        - return [expr]
│   ├── ExprStmt          - expression as statement
│   ├── AssignStmt        - target = value
│   ├── IndexAssignStmt   - obj[index] = value
│   │
│   │   (produced by transforms, not in the parse tree)
│   ├── ImplicitReturnStmt   - last expr converted to return (from ImplicitReturnLowering)
│   ├── TryResultStmt        - last expr in try handler (from ImplicitReturnLowering)
│   ├── SpawnReturnStmt      - return inside spawn body (from ConcurrencyLowering)
│   └── TryHandlerReturnStmt - return inside try handler (from ConcurrencyLowering)
│
└── Expr (interface)
    ├── BinaryExpr        - left op right
    ├── UnaryExpr         - op operand
    ├── CallExpr          - func(args...)
    ├── IndexExpr         - obj[index]
    ├── SliceExpr         - obj[start, length]
    ├── DotExpr           - obj.field
    ├── IdentExpr         - variable/function reference
    ├── IntLiteral        - integer
    ├── FloatLiteral      - float
    ├── StringLiteral     - string (Raw: true for single-quoted)
    ├── BoolLiteral       - true/false
    ├── NilLiteral        - nil
    ├── ArrayLiteral      - [elem, ...]
    ├── HashLiteral       - {key: value, ...} or {expr => value, ...}
    ├── TryExpr           - try expr or err handler end
    ├── SpawnExpr         - spawn body end
    ├── ParallelExpr      - parallel body end
    ├── FnExpr            - fn(params) body end (lambda)
    ├── CaseExpr          - case/of/elsif/else/end as expression (IIFE codegen)
    │
    │   (produced by ConcurrencyLowering, replacing their non-lowered counterparts)
    ├── LoweredTryExpr      - try with extracted result expr and handler body
    ├── LoweredSpawnExpr    - spawn with extracted result expr
    └── LoweredParallelExpr - parallel with pre-categorized branches (ParallelBranch)

Every statement node embeds BaseStmt, which carries a SourceLine field mapping back to the original .rugo source. This is populated by the walker using the line map from the preprocessor.

The Factory (ast/factory.go) centralizes AST node creation for transform passes, providing copy-on-write helpers like ProgramFrom, FuncDefWithBody, and IfStmtWithBranches to ensure consistent construction without mutating the original tree.

AST Walker

The walker (ast/walker.go) transforms the parser's flat []int32 encoding into the typed AST. It reads the flat array sequentially, matching non-terminal symbols to construct the appropriate node types. The walker also applies the preprocessor's line map to set accurate source line numbers on each statement.

Code Generation

The code generator (compiler/codegen.go) traverses the typed AST and produces Go source via a two-stage process: first building a Go AST (compiler/goast.go), then serializing it to source (compiler/goprint.go).

Before codegen begins, the Compile() function runs semantic checks (UndefinedIdentCheck), and the generate() function runs the transform chain (ConcurrencyLowering + ImplicitReturnLowering) and type inference. The codegen is split across several files:

File                Responsibility
codegen.go          Orchestration, codeGen struct, generate() entry point
check_idents.go     Semantic check: undefined variable and function detection
codegen_expr.go     Expression compilation: exprString() converts Rugo expressions to Go source strings
codegen_stmt.go     Statement compilation: buildStmt() converts statements to GoStmt nodes
codegen_func.go     Function and lambda codegen, including closure variable capture
codegen_scope.go    Variable scope tracking and management
codegen_runtime.go  Runtime helper injection: sandbox, spawn/parallel templates, Go bridge stubs
codegen_build.go    Test and benchmark harness generation

Go AST Middle Layer

Rather than emitting raw strings, codegen builds a GoFile tree (compiler/goast.go) composed of GoDecl, GoStmt, and GoExpr nodes. The GoFile contains the package name, imports, top-level declarations (functions, variables, runtime code), and the main() body. The printer (compiler/goprint.go, PrintGoFile()) then serializes this tree to properly formatted Go source with correct indentation. A GoRawDecl escape hatch allows injecting pre-formatted code for runtime templates and complex generated blocks.

The generated file includes:

  1. Imports: standard library imports plus any module-specific Go imports.
  2. Runtime helpers: type conversion, arithmetic, comparison, shell execution, iteration, and panic handling functions.
  3. Module runtimes: Go struct and method implementations for imported stdlib modules, plus auto-generated wrapper functions.
  4. User functions: each def compiles to a Go function with signature func rugofn_NAME(params ...interface{}) interface{}.
  5. Main function: top-level statements wrapped in func main() with a defer/recover for panic handling.

Key Code Generation Patterns

Variable scoping: The codegen maintains a scope stack. First assignment in a scope uses :=, subsequent assignments use =. Every assigned variable gets a _ = varname line to suppress Go's "declared but not used" errors.
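A scope stack that picks between := and = might be sketched in Go as below. The `scopeStack` type and `emitAssign` method are hypothetical, and the sketch assumes that a name declared in an enclosing scope is reassigned with = rather than redeclared, which may not match how Rugo treats function boundaries.

```go
package main

import "fmt"

// scopeStack tracks which names are declared in each nested scope.
type scopeStack struct{ frames []map[string]bool }

func (s *scopeStack) push() { s.frames = append(s.frames, map[string]bool{}) }
func (s *scopeStack) pop()  { s.frames = s.frames[:len(s.frames)-1] }

// declared reports whether name is visible in any active scope.
func (s *scopeStack) declared(name string) bool {
	for i := len(s.frames) - 1; i >= 0; i-- {
		if s.frames[i][name] {
			return true
		}
	}
	return false
}

// emitAssign returns the Go lines for assigning value to name.
// A first assignment declares the variable and adds a blank use so
// Go does not reject it as "declared but not used".
func (s *scopeStack) emitAssign(name, value string) []string {
	if s.declared(name) {
		return []string{name + " = " + value}
	}
	s.frames[len(s.frames)-1][name] = true
	return []string{name + " := " + value, "_ = " + name}
}

func main() {
	s := &scopeStack{}
	s.push()
	fmt.Println(s.emitAssign("x", "1"))
	fmt.Println(s.emitAssign("x", "2"))
	s.pop()
}
```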

for..in loops: The single-variable form (for x in coll) uses rugo_iterable_default() which returns values for arrays and keys for hashes (Python-style). The two-variable form (for k, v in coll) uses rugo_iterable() which returns []rugo_kv (key-value pairs) for uniform array/hash iteration. Arrays produce {index, value} pairs; hashes produce {key, value} pairs. Integer collections iterate from 0 to N-1. The range(start, end) builtin generates efficient Go for loops when used in for-loop collections (no slice allocation); outside for-loops it returns an array.

Index assignment: arr[0] = x and hash["key"] = y compile to rugo_index_set(obj, idx, val), which type-switches on the target. Negative indices are supported for arrays (e.g., arr[-1] = x sets the last element).

Negative array indexing: Array access supports negative indices (Ruby behavior). arr[-1] returns the last element, arr[-2] the second-to-last, etc. This is handled by the rugo_array_index runtime helper, which normalizes negative indices by adding len(arr).
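The normalization itself is small. The Go sketch below uses a hypothetical `arrayIndex` function rather than the real rugo_array_index, and returning an error for out-of-range indices is an assumption; the actual helper may report the failure differently.

```go
package main

import "fmt"

// arrayIndex returns arr[idx], treating negative idx as an offset
// from the end of the array.
func arrayIndex(arr []interface{}, idx int) (interface{}, error) {
	n := idx
	if n < 0 {
		n += len(arr) // -1 becomes len(arr)-1
	}
	if n < 0 || n >= len(arr) {
		return nil, fmt.Errorf("index %d out of range for array of length %d", idx, len(arr))
	}
	return arr[n], nil
}

func main() {
	arr := []interface{}{"a", "b", "c"}
	last, _ := arrayIndex(arr, -1)
	fmt.Println(last)
}
```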

Slicing: obj[start, length] compiles to rugo_slice(obj, start, length), which supports both arrays and strings. For arrays it returns a new array; for strings it returns a substring. Out-of-bounds indices are clamped silently (Ruby behavior) rather than panicking. Slicing unsupported types (int, bool, hash, etc.) produces a developer-friendly error like cannot slice hash (expected string or array).
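Clamping for the array case can be illustrated with this Go sketch. `sliceArray` is a hypothetical stand-in for the array branch of rugo_slice, and treating a negative start as 0 is a simplification chosen for the example, not a documented behavior.

```go
package main

import "fmt"

// sliceArray returns a new array holding up to length elements of arr
// beginning at start. Out-of-bounds values are clamped, never panic.
func sliceArray(arr []interface{}, start, length int) []interface{} {
	if start < 0 {
		start = 0 // simplification: negative start is clamped to the front
	}
	if start > len(arr) {
		start = len(arr)
	}
	if length < 0 {
		length = 0
	}
	end := start + length
	if end > len(arr) {
		end = len(arr)
	}
	out := make([]interface{}, end-start)
	copy(out, arr[start:end]) // copy so the result does not alias arr
	return out
}

func main() {
	arr := []interface{}{1, 2, 3, 4, 5}
	fmt.Println(sliceArray(arr, 1, 3))
	fmt.Println(sliceArray(arr, 3, 10))
	fmt.Println(sliceArray(arr, 10, 2))
}
```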

Argument count validation: User-defined function calls are validated during code generation. If the number of arguments doesn't match the function's parameter count, a Rugo-specific error is emitted (e.g., wrong number of arguments for greet (2 for 1)) instead of exposing internal Go compiler errors.

try/or expressions: Compile to a Go IIFE with defer/recover. The tried expression is the return value; if it panics, the recovery handler runs and produces the fallback value.
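The shape of that IIFE can be shown with a small, self-contained Go example. The `safeDivide` wrapper, the -1 fallback, and the variable names are illustrative only and do not reflect the exact code the compiler emits.

```go
package main

import "fmt"

// safeDivide mirrors the lowering of a try/or expression: the tried
// expression is the IIFE's return value, and a deferred recover
// replaces it with the fallback if evaluation panics.
func safeDivide(a, b int) interface{} {
	return func() (result interface{}) {
		defer func() {
			if e := recover(); e != nil {
				_err := fmt.Sprint(e) // error text, available to the handler
				_ = _err
				result = -1 // fallback value from the "or" branch
			}
		}()
		return a / b // panics when b is zero
	}()
}

func main() {
	fmt.Println(safeDivide(10, 2))
	fmt.Println(safeDivide(1, 0))
}
```

The named result is what makes this work: a deferred function can only change what the IIFE returns if the return value has a name it can assign to.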

//line directives: The codegen emits //line file.rugo:N directives before each statement so that Go runtime panics show .rugo source locations instead of generated Go line numbers.

Test harness: When rats blocks are present, the codegen generates a TAP-compliant test runner instead of a regular main(). Each test block becomes a separate function, with optional setup/teardown (per-test) and setup_file/teardown_file (per-file) hooks.

Function Naming Conventions

Rugo construct                 Go function name
def greet(...)                 rugofn_greet(...)
ns.func(...) (user module)     rugons_ns_func(...)
mod.func(...) (stdlib module)  rugo_mod_func(...)
puts(...)                      rugo_puts(...)
__shell__(...)                 rugo_shell(...)
__capture__(...)               rugo_capture(...)

Module System

Rugo has three ways to bring in external functionality:

Keyword  Purpose                              Example
use      Load Rugo stdlib modules             use "http"
import   Bridge to Go stdlib packages         import "strings"
require  Load user .rugo files or Go modules  require "helpers"

Rugo Stdlib Modules (use)

Modules provide namespaced standard library functionality. Each module self-registers via Go init() using modules.Register().

Prefer use modules for standard operations. They provide a curated, Ruby-inspired API covering math, file paths, encoding, crypto, time, and more. The import keyword gives direct access to Go's stdlib for advanced needs, but use modules are the idiomatic approach.

A module consists of:

  • runtime.go: A Go source file with a struct type and methods, tagged with //go:build ignore so it's not compiled directly. It's embedded as a string and emitted into the generated program.
  • Registration file: Declares the module name, type, function signatures with typed args, required Go imports, and embeds the runtime source.

How Modules Work at Compile Time

  1. User writes use "http" in their .rugo script.
  2. The codegen looks up the module in the registry and collects its Go imports.
  3. The module's FullRuntime() method generates:
    • The cleaned runtime source (struct + methods)
    • A module instance variable (var _http = &HTTP{})
    • Wrapper functions for each declared function that convert interface{} args to typed parameters

Available Argument Types

ArgType  Go type      Runtime converter
String   string       rugo_to_string
Int      int          rugo_to_int
Float    float64      rugo_to_float
Bool     bool         rugo_to_bool
Any      interface{}  none (passed through)

Go Bridge (import)

The import keyword provides direct access to whitelisted Go standard library packages. The compiler maintains a static registry of bridgeable Go functions and auto-generates type conversions between Rugo's interface{} values and Go's typed parameters.

import "strings"
import "math"

puts strings.contains("hello world", "world")  # true
puts math.sqrt(144.0)                           # 12

Function names use snake_case in Rugo and are auto-converted to Go's PascalCase. Go functions returning (T, error) auto-panic on error, integrating with try/or. The as keyword provides aliasing: import "os" as go_os.
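A purely mechanical version of the name conversion is sketched below in Go. `toPascal` is a hypothetical helper that assumes ASCII names; since the compiler keeps a static registry of bridgeable functions, the real mapping may well be table-driven, particularly for Go names containing acronyms.

```go
package main

import (
	"fmt"
	"strings"
)

// toPascal converts a snake_case name to PascalCase by upper-casing
// the first letter of each underscore-separated part.
func toPascal(name string) string {
	var sb strings.Builder
	for _, part := range strings.Split(name, "_") {
		if part == "" {
			continue // skip empty parts from doubled or leading underscores
		}
		sb.WriteString(strings.ToUpper(part[:1]) + part[1:])
	}
	return sb.String()
}

func main() {
	fmt.Println(toPascal("contains"))
	fmt.Println(toPascal("has_prefix"))
}
```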

User Modules (require)

User modules use require:

require "helpers"            # loads helpers.rugo, namespace: helpers
require "lib/utils" as u    # loads lib/utils.rugo, namespace: u
require "lib/utils" as "u"  # quoted form also accepted

helpers.greet("World")
u.compute(42)

Paths are resolved relative to the calling file. The .rugo extension is added automatically if missing. Requires are resolved recursively and deduplicated. If the path points to a directory, Rugo resolves an entry point: <dirname>.rugo → main.rugo → sole .rugo file (file takes precedence over directory when both exist).

The with clause selectively loads specific .rugo files from a directory (local or remote):

# Local directory
require "mylib" with client, helpers
client.connect()

# Remote repository
require "github.com/user/rugo-utils@v1.0.0" with client, helpers

Each name loads <name>.rugo from the directory or repository root (falling back to lib/<name>.rugo), using the filename as the namespace.

Remote git repositories can also be required as a single module:

require "github.com/user/rugo-utils@v1.0.0" as "utils"
utils.slugify("Hello World")

Remote modules are shallow-cloned and cached in ~/.rugo/modules/. Tagged versions (@v1.0.0) and commit SHAs are cached forever; branch refs (@main) are locked to their resolved SHA on first fetch. Use @latest to automatically resolve to the highest stable semver tag.

Use rugo mod tidy to generate a rugo.lock file that records the exact commit SHA for every remote module, making builds reproducible. Use rugo mod update to re-resolve mutable dependencies, or rugo build --frozen to fail if the lock file is stale.

Go Modules via require

require also supports Go packages with exported functions. When a required path resolves to a directory containing go.mod and .go files (instead of .rugo files), the compiler introspects the Go source, classifies exported functions, and bridges them automatically, with no manifest or registration needed:

require "path/to/my_go_module"
my_go_module.greet("world")

require "github.com/user/rugo-slug@v1.0.0" as slug
slug.make("Hello World!")

The Go module author writes a standard Go package with exported functions using bridgeable types (string, int, float64, bool, error, []string, []byte). Functions with non-bridgeable signatures (interfaces, channels, generics) are automatically excluded with clear compile-time warnings.

Exported structs with bridgeable field types are also supported: the compiler generates wrapper types so struct values can be created, have fields read/set via dot syntax, and be passed to Go functions:

require "mymod"

c = mymod.config()             # zero-value constructor
c.name = "app"                 # field set
c.port = 8080
c2 = mymod.new_config("x", 3) # Go constructor returning *Config
puts(mymod.describe(c2))       # pass struct to Go function

See Go Modules for full details on struct support.

See External Modules for details on creating Go modules.

There is no implicit search path: the require string tells you exactly where the code comes from. A relative path is local; a URL-shaped path is remote.

File Embedding (embed)

The embed keyword embeds file contents into the compiled binary at build time. The file is read during compilation and baked into the executable, so no external files are needed at runtime.

embed "config.yaml" as config
embed "assets/template.html" as template

puts config
puts len(template)

Syntax: embed "path" as name

  • path: file path relative to the source file
  • name: variable name that holds the file content as a string

Path restriction: Embedded file paths must resolve to the same directory or a subdirectory of the .rugo source file that declares them. This mirrors Go's embed restriction and prevents libraries from accessing files outside their own tree:

embed "data/config.txt" as cfg       # OK: subdirectory
embed "sibling.txt" as sib           # OK: same directory
embed "../secret.txt" as secret      # ERROR: escapes source directory
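A lexical version of this containment check can be written in Go with path/filepath. The sketch is an assumption about how such a check could work, not the compiler's code: `checkEmbedPath` is a hypothetical name, and it does not resolve symlinks or handle absolute embed paths specially.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// checkEmbedPath reports an error when embedPath, resolved relative to
// the directory of sourceFile, points outside that directory.
func checkEmbedPath(sourceFile, embedPath string) error {
	base := filepath.Dir(sourceFile)
	target := filepath.Join(base, embedPath) // Join also cleans ".." segments
	rel, err := filepath.Rel(base, target)
	if err != nil {
		return err
	}
	// A relative path that starts with ".." has left the base directory.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return fmt.Errorf("embed path %q escapes source directory", embedPath)
	}
	return nil
}

func main() {
	for _, p := range []string{"data/config.txt", "sibling.txt", "../secret.txt"} {
		fmt.Println(p, "->", checkEmbedPath("app/main.rugo", p))
	}
}
```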

How it works: The compiler uses Go's //go:embed under the hood. Files are copied into the build directory and linked directly into the binary's data section, which is efficient even for large files.

Note: embed cannot be used with eval.run() because eval.run() compiles from an ephemeral temp directory with no files to embed. Use eval.file() instead when embedding is needed.

Module Visibility

Functions prefixed with _ are private to their module. The compiler rejects any attempt to call them from outside:

# mylib.rugo
def _helper()       # private: only callable within mylib
  return "internal"
end

def greet()          # public: callable from anywhere
  return _helper()   # OK: same module
end

# main.rugo
require "mylib"
puts mylib.greet()     # OK
puts mylib._helper()   # compile error: '_helper' is private to module 'mylib'

Functions without the _ prefix are public. This applies to all require forms: plain, as, and with.

Built-in Functions

These functions are always available without any use or import:

Function          Description
puts(args...)     Print args separated by spaces, followed by newline
print(args...)    Print args separated by spaces, no trailing newline
len(v)            Length of string (character count), array, or hash
append(arr, val)  Append value to array, returns new array. Can be used as a bare statement: append arr, val
raise(msg)        Raise a runtime error with the given message
type_of(v)        Returns the type name of a value as a string
exit(code?)       Terminate the program with optional exit code (default: 0)

Built-in Collection Methods

Arrays and hashes have built-in methods dispatched via rugo_dot_call. These are always available without imports. Built-in methods take priority over hash key lookup, so use hash["key"] for key access when a key name collides with a method.

Array Methods

Method             Returns  Description
.map(fn)           Array    Transform each element
.filter(fn)        Array    Keep elements where fn returns truthy
.reject(fn)        Array    Remove elements where fn returns truthy
.each(fn)          nil      Iterate with side effects
.reduce(init, fn)  Any      Accumulate: fn(acc, val)
.find(fn)          Any/nil  First matching element
.any(fn)           Bool     True if any element matches
.all(fn)           Bool     True if all elements match
.count(fn)         Int      Count matching elements
.join(sep)         String   Join elements with separator
.first()           Any/nil  First element
.last()            Any/nil  Last element
.min()             Any/nil  Minimum value (numeric or string)
.max()             Any/nil  Maximum value (numeric or string)
.sum()             Number   Sum of numeric elements
.flatten()         Array    Flatten one level of nesting
.uniq()            Array    Remove duplicates (preserving order)
.sort_by(fn)       Array    Sort by lambda result (non-mutating)
.flat_map(fn)      Array    Map then flatten
.take(n)           Array    First n elements
.drop(n)           Array    All but first n elements
.zip(other)        Array    Pair elements from two arrays
.chunk(n)          Array    Split into groups of n

Hash Methods

Hash method lambdas receive (key, value):

Method             Returns    Description
.map(fn)           Array      Transform each pair: fn(k, v)
.filter(fn)        Hash       Keep pairs where fn(k, v) returns truthy
.reject(fn)        Hash       Remove pairs where fn(k, v) returns truthy
.each(fn)          nil        Iterate pairs: fn(k, v)
.reduce(init, fn)  Any        Accumulate: fn(acc, k, v)
.find(fn)          Array/nil  First matching [key, value] pair
.any(fn)           Bool       True if any pair matches
.all(fn)           Bool       True if all pairs match
.count(fn)         Int        Count matching pairs
.keys()            Array      All keys
.values()          Array      All values
.merge(other)      Hash       Combine hashes (other wins conflicts)

Testing

Rugo includes a built-in test framework using rats/end blocks:

use "test"

rats "arithmetic works"
  test.assert_eq(1 + 1, 2)
end

rats "string interpolation"
  name = "World"
  test.assert_eq("Hello, #{name}!", "Hello, World!")
end

Test files use the _test.rugo extension and produce TAP (Test Anything Protocol) output. The test harness supports:

  • setup / teardown functions called before/after each test
  • setup_file / teardown_file functions called once before/after all tests
  • test.assert_eq, test.assert, test.skip from the test module
  • Exit code 1 on any test failure

Doc Comments

Rugo uses position-based # comment attachment for documentation:

# File-level documentation goes here.

# Calculates the factorial of n.
# Returns 1 when n <= 1.
def factorial(n)
  # This is a regular comment, not shown by rugo doc
  if n <= 1
    return 1
  end
  return n * factorial(n - 1)
end

# A Dog with a name and breed.
struct Dog
  name
  breed
end

Rules:

  • Consecutive # lines immediately before def/struct (no blank line gap) = doc comment
  • First # block at top of file before any code = file-level doc
  • # inside function bodies, after a blank line gap, or inline = regular comment
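The attachment rule for def/struct can be sketched in Go as a backward scan from the declaration line. `docFor` is a hypothetical function written for this example; it covers only the "consecutive # lines immediately before the declaration" rule and ignores file-level docs.

```go
package main

import (
	"fmt"
	"strings"
)

// docFor collects the consecutive # lines directly above the
// declaration at lines[declLine] (0-based). A blank line or any
// non-comment line ends the doc comment.
func docFor(lines []string, declLine int) string {
	var doc []string
	for i := declLine - 1; i >= 0; i-- {
		t := strings.TrimSpace(lines[i])
		if !strings.HasPrefix(t, "#") {
			break
		}
		text := strings.TrimSpace(strings.TrimPrefix(t, "#"))
		doc = append([]string{text}, doc...) // prepend to keep source order
	}
	return strings.Join(doc, "\n")
}

func main() {
	src := []string{
		"# File-level documentation goes here.",
		"",
		"# Calculates the factorial of n.",
		"# Returns 1 when n <= 1.",
		"def factorial(n)",
	}
	fmt.Println(docFor(src, 4))
}
```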

Use rugo doc to view documentation for files, modules, and bridge packages:

rugo doc file.rugo           # all docs in a file
rugo doc file.rugo factorial # specific symbol
rugo doc http                # stdlib module
rugo doc strings             # bridge package
rugo doc use:os              # force stdlib module (when name is ambiguous)
rugo doc import:os           # force bridge package (when name is ambiguous)
rugo doc --all               # list everything