Sai Sasank Y

Writing my own interpreter for Lox, Part 8 - Classes

This is the next part in writing my own interpreter for Lox in Python. Previously, we implemented static variable resolution to fix closure bugs. In this post, we add classes, bringing object-oriented programming to Lox. Here is the corresponding chapter from Crafting Interpreters.

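Classes are the cornerstone of object-oriented programming. They bundle data (properties) and behavior (methods) into cohesive units. Lox's classes are dynamically typed and support instantiation, fields that can be added to an instance at runtime, methods with access to this, and init initializers.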

Updated Grammar

The grammar now includes class declarations, property access, and the this keyword. Here's the updated grammar:

program        → declaration* EOF ;

declaration    → classDecl
               | funDecl
               | varDecl
               | statement ;

classDecl      → "class" IDENTIFIER "{" function* "}" ;

funDecl        → "fun" function ;
function       → IDENTIFIER "(" parameters? ")" block ;

statement      → exprStmt
               | ifStmt
               | printStmt
               | returnStmt
               | whileStmt
               | block ;

expression     → assignment ;
assignment     → ( call "." )? IDENTIFIER "=" assignment
               | logic_or ;

call           → primary ( "(" arguments? ")" | "." IDENTIFIER )* ;

primary        → "true" | "false" | "nil" | "this"
               | NUMBER | STRING | IDENTIFIER
               | "(" expression ")" ;

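Key additions: the new classDecl rule, a "." IDENTIFIER suffix in call for property access, an optional call "." prefix in assignment for property setters, and "this" as a primary expression.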

Class Declarations

Classes in Lox are declared with the class keyword followed by a name and a body containing method definitions:

class Breakfast {
  cook() {
    print "Eggs a-fryin'!";
  }

  serve(who) {
    print "Enjoy your breakfast, " + who + ".";
  }
}

Class Statement Syntax Tree

I added a ClassStmt to app/stmt.py:

class ClassStmt(Stmt):
    def __init__(self, name: Token, methods: list[FunctionStmt]):
        self.name = name
        self.methods = methods

    def accept(self, visitor: StmtVisitor):
        return visitor.visit_class_stmt(self)

The class statement holds the class's name token and the list of its method declarations.

Parsing Class Declarations

In app/parser.py, I added parsing logic for classes:

def declaration(self):
    try:
        if (self.match(TokenType.CLASS)):
            return self.class_declaration()
        if (self.match(TokenType.FUN)):
            return self.function_declaration("function")
        # ... rest of declarations

def class_declaration(self):
    name = self.consume(TokenType.IDENTIFIER, "Expect class name.")
    self.consume(TokenType.LEFT_BRACE, "Expect '{' before class body.")
    methods = []
    while not self.check(TokenType.RIGHT_BRACE) and not self.is_at_end():
        methods.append(self.function_declaration("method"))
    self.consume(TokenType.RIGHT_BRACE, "Expect '}' after class body.")
    return ClassStmt(name, methods)

The parser reuses function_declaration() for methods: a method has the same syntax as a function declaration, just without the leading fun keyword.
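To make the shape of the class-body loop concrete, here is a minimal, self-contained sketch. It is not the parser from this post: parse_class, the (type, lexeme) token pairs, and the restriction to parameterless methods with empty bodies are all simplifying assumptions.

```python
# Hypothetical stand-in for the real parser. Tokens are (type, lexeme) pairs,
# and methods are assumed to have no parameters and an empty body.
def parse_class(tokens):
    pos = 0

    def check(token_type):
        return pos < len(tokens) and tokens[pos][0] == token_type

    def consume(token_type, message):
        nonlocal pos
        if not check(token_type):
            raise SyntaxError(message)
        pos += 1
        return tokens[pos - 1][1]

    consume("CLASS", "Expect 'class'.")
    name = consume("IDENTIFIER", "Expect class name.")
    consume("LEFT_BRACE", "Expect '{' before class body.")
    methods = []
    while not check("RIGHT_BRACE") and pos < len(tokens):
        # A method starts directly with its name: no 'fun' keyword.
        methods.append(consume("IDENTIFIER", "Expect method name."))
        consume("LEFT_PAREN", "Expect '(' after method name.")
        consume("RIGHT_PAREN", "Expect ')' after parameters.")
        consume("LEFT_BRACE", "Expect '{' before method body.")
        consume("RIGHT_BRACE", "Expect '}' after method body.")
    consume("RIGHT_BRACE", "Expect '}' after class body.")
    return name, methods


def method_tokens(name):
    return [("IDENTIFIER", name), ("LEFT_PAREN", "("), ("RIGHT_PAREN", ")"),
            ("LEFT_BRACE", "{"), ("RIGHT_BRACE", "}")]


tokens = ([("CLASS", "class"), ("IDENTIFIER", "Breakfast"), ("LEFT_BRACE", "{")]
          + method_tokens("cook") + method_tokens("serve")
          + [("RIGHT_BRACE", "}")])
parsed = parse_class(tokens)
```

The loop stops at the closing brace of the class, so an empty class body simply yields an empty method list.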

Creating Instances

Classes are callable and create new instances:

var breakfast = Breakfast();
breakfast.cook();

The LoxClass Class

I created LoxClass in app/lox_class.py that implements the LoxCallable interface:

class LoxClass(LoxCallable):
    def __init__(self, name: str, methods):
        self.name = name
        self.methods = methods

    def __repr__(self):
        return self.name

    def arity(self):
        initializer = self.find_method("init")
        if initializer is not None:
            return initializer.arity()
        return 0

    def call(self, interpreter, arguments):
        instance = LoxInstance(self)
        initializer = self.find_method("init")
        if initializer is not None:
            initializer.bind(instance).call(interpreter, arguments)
        return instance

    def find_method(self, name):
        if name in self.methods:
            return self.methods[name]
        return None

When called, a class:

  1. Creates a new LoxInstance
  2. Finds and calls the init initializer if present
  3. Returns the instance
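The three steps can be tried in isolation with a stripped-down stand-in. MiniClass and MiniInstance below are hypothetical simplifications, and binding is reduced to passing the instance as the first argument of a plain Python function.

```python
# Hypothetical stand-ins, not the LoxClass/LoxInstance from this post.
class MiniInstance:
    def __init__(self, klass):
        self.klass = klass
        self.fields = {}


class MiniClass:
    def __init__(self, name, methods):
        self.name = name
        self.methods = methods  # maps method name -> Python function

    def call(self, arguments):
        instance = MiniInstance(self)           # 1. create the instance
        initializer = self.methods.get("init")
        if initializer is not None:             # 2. run init if present
            initializer(instance, *arguments)
        return instance                         # 3. always return the instance


def breakfast_init(this, meat, bread):
    this.fields["meat"] = meat
    this.fields["bread"] = bread


breakfast_class = MiniClass("Breakfast", {"init": breakfast_init})
order = breakfast_class.call(["bacon", "toast"])
plain = MiniClass("Empty", {}).call([])
```

A class without init still produces an instance; it just starts with no fields.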

The LoxInstance Class

I created LoxInstance in app/lox_instance.py to represent instances:

class LoxInstance:
    def __init__(self, klass):
        self.klass = klass
        self.fields = {}

    def __repr__(self):
        return self.klass.name + " instance"

    def get(self, name):
        if name.lexeme in self.fields:
            return self.fields[name.lexeme]

        method = self.klass.find_method(name.lexeme)
        if method is not None:
            return method.bind(self)

        raise LoxRuntimeError(name, "Undefined property '" + name.lexeme + "'.")

    def set(self, name, value):
        self.fields[name.lexeme] = value

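Instances store a reference to their class (klass) and a dictionary of fields. Methods live on the class, not on the instance.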

Executing Class Declarations

In app/interpreter.py, class declarations create class objects:

def visit_class_stmt(self, stmt: ClassStmt):
    self.environment.define(stmt.name.lexeme, None)
    methods = {}
    for method in stmt.methods:
        is_initializer = method.name.lexeme == "init"
        function = LoxFunction(method, self.environment, is_initializer)
        methods[method.name.lexeme] = function
    klass = LoxClass(stmt.name.lexeme, methods)
    self.environment.assign(stmt.name, klass)
    return None

The two-step process (define then assign) allows methods to reference the class itself.

Properties

Lox supports dynamic properties on instances. Properties can be read and assigned using dot notation:

breakfast.meat = "sausage";
breakfast.bread = "sourdough";
print breakfast.meat;  // "sausage"

Get Expressions

I added a Get expression to app/expr.py:

class Get(Expr):
    def __init__(self, obj: Expr, name: Token):
        self.obj = obj
        self.name = name

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_get(self)

The get expression holds the expression for the object being accessed (obj) and the property name token (name).

Set Expressions

I added a Set expression to app/expr.py:

class Set(Expr):
    def __init__(self, obj: Expr, name: Token, value: Expr):
        self.obj = obj
        self.name = name
        self.value = value

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_set(self)

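The set expression holds the object expression (obj), the property name token (name), and the expression for the value being assigned (value).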

Parsing Property Access

In app/parser.py, I extended the call() method to handle property access:

def call(self):
    expr = self.primary()

    while True:
        if self.match(TokenType.LEFT_PAREN):
            expr = self.finish_call(expr)
        elif self.match(TokenType.DOT):
            name = self.consume(TokenType.IDENTIFIER, "Expect property name after '.'.")
            expr = Get(expr, name)
        else:
            break
    return expr

Property access has the same precedence as function calls, allowing chains like object.property.method().

I also updated assignment parsing to handle property setters:

def assignment(self):
    expr = self.logic_or()

    if self.match(TokenType.EQUAL):
        equals = self.previous()
        rvalue = self.assignment()

        if isinstance(expr, Variable):
            name = expr.name
            return Assign(name, rvalue)
        elif isinstance(expr, Get):
            return Set(expr.obj, expr.name, rvalue)
        ParseError.error(self, equals, "Invalid assignment target.")

    return expr

If the left side of an assignment is a Get expression, we convert it to a Set expression.
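The conversion can be sketched on its own. The node classes here are hypothetical dataclass stand-ins that use plain strings where the real syntax tree uses tokens, and to_assignment is an illustrative helper, not a function from the parser.

```python
from dataclasses import dataclass


# Hypothetical, simplified syntax tree nodes.
@dataclass
class Variable:
    name: str


@dataclass
class Get:
    obj: object
    name: str


@dataclass
class Assign:
    name: str
    value: object


@dataclass
class Set:
    obj: object
    name: str
    value: object


def to_assignment(target, value):
    """Turn an already-parsed left-hand side into an assignment node."""
    if isinstance(target, Variable):
        return Assign(target.name, value)
    if isinstance(target, Get):
        # Reuse the object expression and property name from the Get node.
        return Set(target.obj, target.name, value)
    raise SyntaxError("Invalid assignment target.")


# Models: breakfast.omelette.filling = "cheese";
lhs = Get(Get(Variable("breakfast"), "omelette"), "filling")
node = to_assignment(lhs, "cheese")
```

Only the outermost Get becomes a Set. The inner breakfast.omelette stays a Get, because it is still read in order to find the object being assigned to.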

Executing Property Access

In app/interpreter.py, property get and set are straightforward:

def visit_get(self, expr: Get):
    obj = self.evaluate(expr.obj)
    if isinstance(obj, LoxInstance):
        return obj.get(expr.name)
    raise LoxRuntimeError(expr.name, "Only instances have properties.")

def visit_set(self, expr: Set):
    obj = self.evaluate(expr.obj)
    if not isinstance(obj, LoxInstance):
        raise LoxRuntimeError(expr.name, "Only instances have fields.")
    value = self.evaluate(expr.value)
    obj.set(expr.name, value)
    return value

The actual get/set logic lives in the LoxInstance class. Note that property lookup first checks fields, then falls back to methods.
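One consequence of that lookup order is that a field shadows a method of the same name. A minimal sketch, with hypothetical MiniClass and MiniInstance stand-ins, string names instead of tokens, and no method binding:

```python
# Hypothetical stand-ins that only model the lookup order.
class MiniClass:
    def __init__(self, methods):
        self.methods = methods


class MiniInstance:
    def __init__(self, klass):
        self.klass = klass
        self.fields = {}

    def get(self, name):
        if name in self.fields:           # fields are checked first
            return self.fields[name]
        if name in self.klass.methods:    # then methods on the class
            return self.klass.methods[name]
        raise AttributeError(f"Undefined property '{name}'.")


instance = MiniInstance(MiniClass({"cook": "<method cook>"}))
before = instance.get("cook")
instance.fields["cook"] = "a field named cook"
after = instance.get("cook")
```

Here before is the method and after is the field, even though the class was never modified.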

Methods

Methods are functions associated with a class. They have access to the instance through the special this keyword:

class Breakfast {
  serve(who) {
    print "Enjoy your " + this.meat + ", " + who + ".";
  }
}

var breakfast = Breakfast();
breakfast.meat = "sausage";
breakfast.serve("Noble Reader");  // "Enjoy your sausage, Noble Reader."

Methods are stored in the class definition and bound to instances when accessed.

The this Keyword

The this keyword provides access to the current instance within methods:

class Cake {
  taste() {
    var adjective = "delicious";
    print "The " + this.flavor + " cake is " + adjective + "!";
  }
}

var cake = Cake();
cake.flavor = "chocolate";
cake.taste();  // "The chocolate cake is delicious!"

This Expression

I added a This expression to app/expr.py:

class This(Expr):
    def __init__(self, keyword: Token):
        self.keyword = keyword

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_this(self)

Parsing this

In app/parser.py, I added this to primary expressions:

def primary(self):
    # ... other cases
    if self.match(TokenType.THIS):
        return This(self.previous())
    # ... rest of primary

Evaluating this

In app/interpreter.py, this is resolved like any other variable:

def visit_this(self, expr: This):
    return self.look_up_variable(expr.keyword, expr)

The magic happens in how this gets bound to the instance, which we'll see in the binding section.

Initializers

Initializers are special methods that run when an instance is created. In Lox, the initializer is named init:

class Breakfast {
  init(meat, bread) {
    this.meat = meat;
    this.bread = bread;
  }
}

var baconAndToast = Breakfast("bacon", "toast");

The init Method

When a class is instantiated, if it has an init method, it's automatically called with the arguments passed to the class:

def call(self, interpreter, arguments):
    instance = LoxInstance(self)
    initializer = self.find_method("init")
    if initializer is not None:
        initializer.bind(instance).call(interpreter, arguments)
    return instance

The class's arity is determined by the initializer:

def arity(self):
    initializer = self.find_method("init")
    if initializer is not None:
        return initializer.arity()
    return 0

Return from Initializer

Initializers have special return semantics. You can use return; to exit early, but you can't return a value:

class Foo {
  init() {
    return "oops";  // Error!
  }
}

I updated app/lox_function.py to handle initializers specially:

from typing import Any

class LoxFunction(LoxCallable):
    def __init__(self, declaration, closure, is_initializer):
        self.declaration = declaration
        self.closure = closure
        self.is_initializer = is_initializer

    def call(self, interpreter, arguments: list[Any]):
        environment = Environment(self.closure)
        for param, argument in zip(self.declaration.params, arguments):
            environment.define(param.lexeme, argument)
        try:
            interpreter.execute_block(self.declaration.body, environment)
        except Return as err:
            if self.is_initializer:
                return self.closure.get_at(0, "this")
            return err.value
        if self.is_initializer:
            return self.closure.get_at(0, "this")
        return None

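For initializers, call() always returns the instance bound to this, whether the body runs to completion or exits early with a bare return;. The instance is read with get_at(0, "this") because bind() defines this directly in the environment that becomes the bound function's closure.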
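A minimal sketch of this return rule, separated from the interpreter. ReturnSignal and call_function are hypothetical stand-ins, and the function body is modeled as a Python callable that receives the instance.

```python
class ReturnSignal(Exception):
    """Hypothetical stand-in for the interpreter's Return exception."""

    def __init__(self, value):
        super().__init__()
        self.value = value


def call_function(body, this, is_initializer):
    try:
        body(this)
    except ReturnSignal as signal:
        # An early 'return;' inside init still yields the instance.
        return this if is_initializer else signal.value
    # Falling off the end: init yields the instance, anything else nil.
    return this if is_initializer else None


def early_return(this):
    raise ReturnSignal(None)   # models a bare 'return;'


def returns_value(this):
    raise ReturnSignal(42)     # models 'return 42;'


def falls_through(this):
    pass


instance = object()
from_init_early = call_function(early_return, instance, True)
from_init_end = call_function(falls_through, instance, True)
from_method = call_function(returns_value, instance, False)
```

Both initializer calls give back the instance itself, while the ordinary method returns its value.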

Resolver Updates

The resolver needs to handle class-specific scoping for this and detect errors like returning values from initializers.

Tracking Class Context

I added a ClassType enum in app/resolver.py:

ClassType = Enum(
    'ClassType',
    [
        'NONE', 'CLASS'
    ]
)

And updated the resolver to track class context:

class Resolver(ExprVisitor, StmtVisitor):
    def __init__(self, interpreter):
        self.interpreter = interpreter
        self.scopes = deque()
        self.current_function = FunctionType.NONE
        self.current_class = ClassType.NONE  # New
        self.had_error = False

Resolving Classes

Class declarations create a scope with this bound:

def visit_class_stmt(self, stmt: ClassStmt):
    enclosing_class = self.current_class
    self.current_class = ClassType.CLASS
    self.declare(stmt.name)
    self.define(stmt.name)
    self.begin_scope()
    self.scopes[-1]["this"] = True
    for method in stmt.methods:
        if method.name.lexeme == "init":
            self.resolve_function(method, FunctionType.INITIALIZER)
        else:
            self.resolve_function(method, FunctionType.METHOD)
    self.end_scope()
    self.current_class = enclosing_class
    return None

Key points:

  1. Track that we're inside a class
  2. Declare and define the class name
  3. Create a new scope with this bound
  4. Resolve each method (marking initializers specially)
  5. Restore previous class context

Resolving this

When the resolver visits a this expression, it first checks that we're inside a class:

def visit_this(self, expr):
    if self.current_class == ClassType.NONE:
        ResolveError.error(self, expr.keyword, "Can't use 'this' outside of a class.")
        return None
    self.resolve_local(expr, expr.keyword)

This catches errors like:

print this;  // Error: Can't use 'this' outside of a class.
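The save-and-restore of the class context can be sketched without the rest of the resolver. MiniResolver is a hypothetical stand-in that collects error messages in a list, and method bodies are modeled as callables that receive the resolver.

```python
from enum import Enum, auto


class ClassType(Enum):
    NONE = auto()
    CLASS = auto()


# Hypothetical stand-in: only tracks class context and 'this' errors.
class MiniResolver:
    def __init__(self):
        self.current_class = ClassType.NONE
        self.errors = []

    def resolve_class(self, method_bodies):
        enclosing_class = self.current_class
        self.current_class = ClassType.CLASS
        for body in method_bodies:
            body(self)
        # Restore the outer context so nested declarations unwind correctly.
        self.current_class = enclosing_class

    def resolve_this(self):
        if self.current_class == ClassType.NONE:
            self.errors.append("Can't use 'this' outside of a class.")


resolver = MiniResolver()
resolver.resolve_class([lambda r: r.resolve_this()])  # 'this' inside a method
errors_inside = list(resolver.errors)
resolver.resolve_this()                               # 'this' at top level
```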

Preventing Value Returns from Initializers

I updated the FunctionType enum to distinguish initializers:

FunctionType = Enum(
    'FunctionType',
    [
        'NONE', 'FUNCTION', 'METHOD', 'INITIALIZER'
    ]
)

And added validation in return statement resolution:

def visit_return_stmt(self, stmt):
    if self.current_function == FunctionType.NONE:
        ResolveError.error(self, stmt.keyword, "Can't return from top-level code.")
    if stmt.value is not None:
        if self.current_function == FunctionType.INITIALIZER:
            ResolveError.error(self, stmt.keyword, "Can't return a value from an initializer.")
        self.resolve(stmt.value)

This prevents:

class Foo {
  init() {
    return "oops";  // Error: Can't return a value from an initializer.
  }
}
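The two checks can be written as a small pure function to see which combinations are rejected. check_return is a hypothetical helper for illustration; the real resolver reports errors through ResolveError instead of returning a list.

```python
from enum import Enum, auto


class FunctionType(Enum):
    NONE = auto()
    FUNCTION = auto()
    METHOD = auto()
    INITIALIZER = auto()


def check_return(current_function, has_value):
    """Hypothetical helper: list the resolver errors for one return statement."""
    errors = []
    if current_function == FunctionType.NONE:
        errors.append("Can't return from top-level code.")
    if has_value and current_function == FunctionType.INITIALIZER:
        errors.append("Can't return a value from an initializer.")
    return errors


value_from_init = check_return(FunctionType.INITIALIZER, has_value=True)
bare_from_init = check_return(FunctionType.INITIALIZER, has_value=False)
value_from_method = check_return(FunctionType.METHOD, has_value=True)
```

Only the first call reports an error: a bare return; in init and a value return in an ordinary method are both allowed.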

Resolving Property Access

Get and set expressions just need to resolve their operands:

def visit_get(self, expr):
    self.resolve(expr.obj)

def visit_set(self, expr):
    self.resolve(expr.value)
    self.resolve(expr.obj)

Property names are looked up dynamically at runtime, so the resolver doesn't validate them; it only resolves the subexpressions.

Binding Methods

When you access a method through an instance, we need to bind this to that instance. This is done through method binding in app/lox_function.py:

def bind(self, lox_instance):
    env = Environment(self.closure)
    env.define("this", lox_instance)
    return LoxFunction(self.declaration, env, self.is_initializer)

The bind() method:

  1. Creates a new environment with the method's closure as parent
  2. Defines this in that environment to be the instance
  3. Returns a new function with this environment as its closure
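Here is a self-contained sketch of those three steps. The Environment and MiniFunction classes are simplified, hypothetical versions (no resolver distances, and the body is a Python callable), and the instances are plain strings.

```python
# Hypothetical, simplified environment with a parent chain.
class Environment:
    def __init__(self, enclosing=None):
        self.enclosing = enclosing
        self.values = {}

    def define(self, name, value):
        self.values[name] = value

    def get(self, name):
        if name in self.values:
            return self.values[name]
        if self.enclosing is not None:
            return self.enclosing.get(name)
        raise NameError(name)


class MiniFunction:
    def __init__(self, body, closure):
        self.body = body          # a Python callable taking an Environment
        self.closure = closure

    def bind(self, instance):
        env = Environment(self.closure)       # 1. child of the method's closure
        env.define("this", instance)          # 2. 'this' names the instance
        return MiniFunction(self.body, env)   # 3. same body, new closure

    def call(self):
        return self.body(Environment(self.closure))


globals_env = Environment()
globals_env.define("greeting", "Hello, ")
method = MiniFunction(lambda env: env.get("greeting") + env.get("this"), globals_env)

alice_greet = method.bind("Alice")
bob_greet = method.bind("Bob")
```

Binding never mutates the original function, so one method declaration can be bound to any number of instances, each with its own this.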

This happens when getting a method from an instance in app/lox_instance.py:

def get(self, name):
    if name.lexeme in self.fields:
        return self.fields[name.lexeme]

    method = self.klass.find_method(name.lexeme)
    if method is not None:
        return method.bind(self)  # Bind method to this instance

    raise LoxRuntimeError(name, "Undefined property '" + name.lexeme + "'.")

Each time you access a method, you get a freshly bound copy, so a method remembers its instance even when it is stored in a variable and called later:

var breakfast = Breakfast();
var cook = breakfast.cook;
cook();  // Still works! 'this' is bound to breakfast
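Python's own bound methods behave in a comparable way, which may be a useful mental model. The snippet below is plain Python for comparison, not code from the interpreter.

```python
class Breakfast:
    def __init__(self, meat):
        self.meat = meat

    def cook(self):
        return "Cooking " + self.meat


breakfast = Breakfast("sausage")
cook = breakfast.cook      # pull the method off the instance
result = cook()            # 'self' is still breakfast

# Each attribute access builds a new bound-method object.
first, second = breakfast.cook, breakfast.cook
```

In Python, first and second are distinct objects that compare equal, and both keep a reference to breakfast in their __self__ attribute.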

Putting It All Together

Let's see a complete example demonstrating all the features:

class Person {
  init(name, age) {
    this.name = name;
    this.age = age;
  }

  greet() {
    print "Hello, I'm " + this.name + "!";
  }

  birthday() {
    this.age = this.age + 1;
    print this.name + " had a birthday!";
  }
}

var alice = Person("Alice", 30);
alice.greet();      // "Hello, I'm Alice!"
alice.birthday();   // "Alice had a birthday!"

var bob = Person("Bob", 25);
bob.greet();        // "Hello, I'm Bob!"

Each instance has its own fields but shares the class's methods. Inside a method, this refers to the instance the method was accessed on, since binding happens at property access.

Conclusion

With classes, our Lox interpreter now supports object-oriented programming. We've implemented:

  1. Class declaration and instantiation: Define classes and create instances
  2. Properties and methods: Dynamic fields with get/set operations and behavior associated with a class
  3. this keyword and binding: Self-reference within methods
  4. Initializers: Automatic construction via init methods

While we haven't implemented inheritance yet, classes already enable powerful abstraction and code organization.

#compilers #lox-interpreter #programming-languages #software-engineering