Writing my own interpreter for Lox, Part 8 - Classes
This is the next part in writing my own interpreter for Lox in Python. Previously, we implemented static variable resolution to fix closure bugs. In this post, we add classes, bringing object-oriented programming to Lox. Here is the corresponding chapter from Crafting Interpreters.
Classes are the cornerstone of object-oriented programming. They bundle data (properties) and behavior (methods) into cohesive units. Lox's classes are dynamically typed and support:
- Class declarations and instantiation
- Instance properties with dynamic get/set
- Methods with implicit this binding
- Initializers for object construction
Updated Grammar
The grammar now includes class declarations, property access, and the this keyword. Here's the updated grammar:
program     → declaration* EOF ;
declaration → classDecl
            | funDecl
            | varDecl
            | statement ;
classDecl   → "class" IDENTIFIER "{" function* "}" ;
funDecl     → "fun" function ;
function    → IDENTIFIER "(" parameters? ")" block ;
statement   → exprStmt
            | ifStmt
            | printStmt
            | returnStmt
            | whileStmt
            | block ;
expression  → assignment ;
assignment  → ( call "." )? IDENTIFIER "=" assignment
            | logic_or ;
call        → primary ( "(" arguments? ")" | "." IDENTIFIER )* ;
primary     → "true" | "false" | "nil" | "this"
            | NUMBER | STRING | IDENTIFIER
            | "(" expression ")" ;
Key additions:
- Class declarations: class keyword with method definitions
- Property access: Dot notation for getting and setting properties
- this keyword: Self-reference within methods
Class Declarations
Classes in Lox are declared with the class keyword followed by a name and a body containing method definitions:
class Breakfast {
    cook() {
        print "Eggs a-fryin'!";
    }
    serve(who) {
        print "Enjoy your breakfast, " + who + ".";
    }
}
Class Statement Syntax Tree
I added a ClassStmt to app/stmt.py:
class ClassStmt(Stmt):
    def __init__(self, name: Token, methods: list[FunctionStmt]):
        self.name = name
        self.methods = methods

    def accept(self, visitor: StmtVisitor):
        return visitor.visit_class_stmt(self)
The class statement holds:
- name: The class identifier token
- methods: List of function statements representing methods
Parsing Class Declarations
In app/parser.py, I added parsing logic for classes:
def declaration(self):
    try:
        if self.match(TokenType.CLASS):
            return self.class_declaration()
        if self.match(TokenType.FUN):
            return self.function_declaration("function")
        # ... rest of declarations

def class_declaration(self):
    name = self.consume(TokenType.IDENTIFIER, "Expect class name.")
    self.consume(TokenType.LEFT_BRACE, "Expect '{' before class body.")
    methods = []
    while not self.check(TokenType.RIGHT_BRACE) and not self.is_at_end():
        methods.append(self.function_declaration("method"))
    self.consume(TokenType.RIGHT_BRACE, "Expect '}' after class body.")
    return ClassStmt(name, methods)
The parser reuses function_declaration() for methods, since a method is written exactly like a function declaration minus the leading fun keyword.
Creating Instances
Classes are callable and create new instances:
var breakfast = Breakfast();
breakfast.cook();
The LoxClass Class
I created LoxClass in app/lox_class.py that implements the LoxCallable interface:
class LoxClass(LoxCallable):
    def __init__(self, name: str, methods):
        self.name = name
        self.methods = methods

    def __repr__(self):
        return self.name

    def arity(self):
        initializer = self.find_method("init")
        if initializer is not None:
            return initializer.arity()
        return 0

    def call(self, interpreter, arguments):
        instance = LoxInstance(self)
        initializer = self.find_method("init")
        if initializer is not None:
            initializer.bind(instance).call(interpreter, arguments)
        return instance

    def find_method(self, name):
        if name in self.methods:
            return self.methods[name]
        return None
When called, a class:
- Creates a new LoxInstance
- Finds and calls the init initializer if present
- Returns the instance
The LoxInstance Class
I created LoxInstance in app/lox_instance.py to represent instances:
class LoxInstance:
    def __init__(self, klass):
        self.klass = klass
        self.fields = {}

    def __repr__(self):
        return self.klass.name + " instance"

    def get(self, name):
        if name.lexeme in self.fields:
            return self.fields[name.lexeme]
        method = self.klass.find_method(name.lexeme)
        if method is not None:
            return method.bind(self)
        raise LoxRuntimeError(name, "Undefined property '" + name.lexeme + "'.")

    def set(self, name, value):
        self.fields[name.lexeme] = value
Instances store:
- klass: Reference to the class definition
- fields: Dictionary of property values
Executing Class Declarations
In app/interpreter.py, class declarations create class objects:
def visit_class_stmt(self, stmt: ClassStmt):
    self.environment.define(stmt.name.lexeme, None)
    methods = {}
    for method in stmt.methods:
        is_initializer = method.name.lexeme == "init"
        function = LoxFunction(method, self.environment, is_initializer)
        methods[method.name.lexeme] = function
    klass = LoxClass(stmt.name.lexeme, methods)
    self.environment.assign(stmt.name, klass)
    return None
The two-step process (define then assign) allows methods to reference the class itself.
Properties
Lox supports dynamic properties on instances. Properties can be get and set using dot notation:
breakfast.meat = "sausage";
breakfast.bread = "sourdough";
print breakfast.meat; // "sausage"
Get Expressions
I added a Get expression to app/expr.py:
class Get(Expr):
    def __init__(self, obj: Expr, name: Token):
        self.obj = obj
        self.name = name

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_get(self)
The get expression holds:
- obj: Expression that evaluates to an instance
- name: Property name token
Set Expressions
I added a Set expression to app/expr.py:
class Set(Expr):
    def __init__(self, obj: Expr, name: Token, value: Expr):
        self.obj = obj
        self.name = name
        self.value = value

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_set(self)
The set expression holds:
- obj: Expression that evaluates to an instance
- name: Property name token
- value: Expression to assign to the property
Parsing Property Access
In app/parser.py, I extended the call() method to handle property access:
def call(self):
    expr = self.primary()
    while True:
        if self.match(TokenType.LEFT_PAREN):
            expr = self.finish_call(expr)
        elif self.match(TokenType.DOT):
            name = self.consume(TokenType.IDENTIFIER, "Expect property name after '.'.")
            expr = Get(expr, name)
        else:
            break
    return expr
Property access has the same precedence as function calls, allowing chains like object.property.method().
I also updated assignment parsing to handle property setters:
def assignment(self):
    expr = self.logic_or()
    if self.match(TokenType.EQUAL):
        equals = self.previous()
        rvalue = self.assignment()
        if isinstance(expr, Variable):
            name = expr.name
            return Assign(name, rvalue)
        elif isinstance(expr, Get):
            return Set(expr.obj, expr.name, rvalue)
        ParseError.error(self, equals, "Invalid assignment target.")
    return expr
If the left side of an assignment is a Get expression, we convert it to a Set expression.
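To make the conversion concrete, here is a minimal sketch that assumes simplified dataclass nodes with plain strings in place of tokens. The to_assignment helper is hypothetical and only mirrors the branch inside assignment(); it is not part of app/parser.py. It shows that in a chain such as breakfast.omelette.filling = "cheese", only the outermost Get becomes a Set, while the inner Get stays as the object expression.

```python
from dataclasses import dataclass


# Simplified stand-ins for the nodes in app/expr.py (names are strings here).
@dataclass
class Variable:
    name: str


@dataclass
class Get:
    obj: object
    name: str


@dataclass
class Assign:
    name: str
    value: object


@dataclass
class Set:
    obj: object
    name: str
    value: object


def to_assignment(target, value):
    """Hypothetical helper: turn an already-parsed left side into an assignment node."""
    if isinstance(target, Variable):
        return Assign(target.name, value)
    if isinstance(target, Get):
        # Reuse the Get's object and name; only the node type changes.
        return Set(target.obj, target.name, value)
    raise SyntaxError("Invalid assignment target.")


# breakfast.omelette.filling = "cheese"
target = Get(Get(Variable("breakfast"), "omelette"), "filling")
node = to_assignment(target, "cheese")
```

Because the left side is parsed as an ordinary expression first, the parser needs no lookahead to decide whether it is reading a getter or a setter.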
Executing Property Access
In app/interpreter.py, property get and set are straightforward:
def visit_get(self, expr: Get):
    obj = self.evaluate(expr.obj)
    if isinstance(obj, LoxInstance):
        return obj.get(expr.name)
    raise LoxRuntimeError(expr.name, "Only instances have properties.")

def visit_set(self, expr: Set):
    obj = self.evaluate(expr.obj)
    if not isinstance(obj, LoxInstance):
        raise LoxRuntimeError(expr.name, "Only instances have fields.")
    value = self.evaluate(expr.value)
    obj.set(expr.name, value)
    return value
The actual get/set logic lives in the LoxInstance class. Note that property lookup first checks fields, then falls back to methods, so a field shadows a method of the same name.
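The lookup order can be sketched in isolation. The snippet below is a simplified model, not the code from app/lox_instance.py: MiniClass and MiniInstance are hypothetical names, property names are plain strings instead of tokens, methods are Python callables that take the instance as their first argument, and AttributeError stands in for LoxRuntimeError.

```python
class MiniClass:
    def __init__(self, name, methods):
        self.name = name
        self.methods = methods  # maps method name -> callable(instance, *args)

    def find_method(self, name):
        return self.methods.get(name)


class MiniInstance:
    def __init__(self, klass):
        self.klass = klass
        self.fields = {}

    def get(self, name):
        # 1. Fields win over methods.
        if name in self.fields:
            return self.fields[name]
        # 2. Otherwise fall back to a method, bound to this instance.
        method = self.klass.find_method(name)
        if method is not None:
            return lambda *args: method(self, *args)
        raise AttributeError(f"Undefined property '{name}'.")

    def set(self, name, value):
        self.fields[name] = value


klass = MiniClass("Breakfast", {"cook": lambda this: "Eggs a-fryin'!"})
breakfast = MiniInstance(klass)

method_result = breakfast.get("cook")()  # resolved as a method
breakfast.set("cook", "shadowed")        # a field with the same name
field_result = breakfast.get("cook")     # now resolved as the field
```

After the set, the method is still stored on the class; it is simply hidden for this one instance.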
Methods
Methods are functions associated with a class. They have access to the instance through the special this keyword:
class Breakfast {
    serve(who) {
        print "Enjoy your " + this.meat + ", " + who + ".";
    }
}
var breakfast = Breakfast();
breakfast.meat = "sausage";
breakfast.serve("Noble Reader"); // "Enjoy your sausage, Noble Reader."
Methods are stored in the class definition and bound to instances when accessed.
The this Keyword
The this keyword provides access to the current instance within methods:
class Cake {
    taste() {
        var adjective = "delicious";
        print "The " + this.flavor + " cake is " + adjective + "!";
    }
}
var cake = Cake();
cake.flavor = "chocolate";
cake.taste(); // "The chocolate cake is delicious!"
This Expression
I added a This expression to app/expr.py:
class This(Expr):
    def __init__(self, keyword: Token):
        self.keyword = keyword

    def accept(self, visitor: ExprVisitor):
        return visitor.visit_this(self)
Parsing this
In app/parser.py, I added this to primary expressions:
def primary(self):
    # ... other cases
    if self.match(TokenType.THIS):
        return This(self.previous())
    # ... rest of primary
Evaluating this
In app/interpreter.py, this is looked up like any other variable:
def visit_this(self, expr: This):
    return self.look_up_variable(expr.keyword, expr)
The magic happens in how this gets bound to the instance, which we'll see in the binding section.
Initializers
Initializers are special methods that run when an instance is created. In Lox, the initializer is named init:
class Breakfast {
    init(meat, bread) {
        this.meat = meat;
        this.bread = bread;
    }
}
var baconAndToast = Breakfast("bacon", "toast");
The init Method
When a class is instantiated, if it has an init method, it's automatically called with the arguments passed to the class:
def call(self, interpreter, arguments):
    instance = LoxInstance(self)
    initializer = self.find_method("init")
    if initializer is not None:
        initializer.bind(instance).call(interpreter, arguments)
    return instance
The class's arity is determined by the initializer:
def arity(self):
    initializer = self.find_method("init")
    if initializer is not None:
        return initializer.arity()
    return 0
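As a rough illustration of how the initializer drives the expected argument count, the sketch below uses hypothetical stand-ins (MiniFunction, MiniClass, check_arity) instead of the real LoxFunction, LoxClass, and interpreter call logic. It assumes the call site compares the number of arguments with arity() before invoking the callee, and it raises TypeError where the interpreter would raise a Lox runtime error.

```python
class MiniFunction:
    def __init__(self, params):
        self.params = params

    def arity(self):
        return len(self.params)


class MiniClass:
    def __init__(self, name, methods):
        self.name = name
        self.methods = methods

    def find_method(self, name):
        return self.methods.get(name)

    def arity(self):
        # A class takes as many arguments as its initializer, or none without one.
        initializer = self.find_method("init")
        if initializer is not None:
            return initializer.arity()
        return 0


def check_arity(callee, arguments):
    """Hypothetical call-site check, assumed to run before callee.call()."""
    expected = callee.arity()
    if len(arguments) != expected:
        raise TypeError(f"Expected {expected} arguments but got {len(arguments)}.")


breakfast = MiniClass("Breakfast", {"init": MiniFunction(["meat", "bread"])})
empty = MiniClass("Empty", {})

check_arity(breakfast, ["bacon", "toast"])  # matches init(meat, bread)
check_arity(empty, [])                      # no init, so no arguments allowed
```

Under these assumptions, Breakfast("bacon") would be rejected before any instance is created.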
Return from Initializer
Initializers have special return semantics. You can use return; to exit early, but you can't return a value:
class Foo {
    init() {
        return "oops"; // Error!
    }
}
I updated app/lox_function.py to handle initializers specially:
from typing import Any

class LoxFunction(LoxCallable):
    def __init__(self, declaration, closure, is_initializer):
        self.declaration = declaration
        self.closure = closure
        self.is_initializer = is_initializer

    def call(self, interpreter, arguments: list[Any]):
        environment = Environment(self.closure)
        for param, argument in zip(self.declaration.params, arguments):
            environment.define(param.lexeme, argument)
        try:
            interpreter.execute_block(self.declaration.body, environment)
        except Return as err:
            # An early "return;" inside init still yields the instance.
            if self.is_initializer:
                return self.closure.get_at(0, "this")
            return err.value
        # Falling off the end of init also yields the instance.
        if self.is_initializer:
            return self.closure.get_at(0, "this")
        return None
For initializers:
- Early returns still return this (not the explicit return value)
- Implicit return at the end returns this
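The two rules can be checked with a small standalone sketch. It is a simplification under stated assumptions: call_function is a hypothetical helper, the function body is a plain Python callable, and the instance is passed in directly instead of being fetched with closure.get_at(0, "this").

```python
class Return(Exception):
    """Stand-in for the interpreter's Return exception carrying a value."""

    def __init__(self, value):
        super().__init__()
        self.value = value


def call_function(body, this, is_initializer):
    """Hypothetical helper mirroring the control flow of LoxFunction.call()."""
    try:
        body()
    except Return as err:
        if is_initializer:
            return this  # early return from init yields the instance
        return err.value
    if is_initializer:
        return this      # falling off the end of init yields the instance
    return None


def early_return():
    raise Return(None)   # models a bare "return;"


def returns_42():
    raise Return(42)     # models "return 42;"


def falls_off_end():
    pass                 # models a body with no return statement


instance = object()
from_early_init = call_function(early_return, instance, True)
from_plain_init = call_function(falls_off_end, instance, True)
from_function = call_function(returns_42, instance, False)
```

Only ordinary functions and methods ever hand back the value carried by Return; an initializer always produces the instance.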
Resolver Updates
The resolver needs to handle class-specific scoping for this and detect errors like returning values from initializers.
Tracking Class Context
I added a ClassType enum in app/resolver.py:
ClassType = Enum('ClassType', ['NONE', 'CLASS'])
And updated the resolver to track class context:
class Resolver(ExprVisitor, StmtVisitor):
    def __init__(self, interpreter):
        self.interpreter = interpreter
        self.scopes = deque()
        self.current_function = FunctionType.NONE
        self.current_class = ClassType.NONE  # New
        self.had_error = False
Resolving Classes
Class declarations create a scope with this bound:
def visit_class_stmt(self, stmt: ClassStmt):
    enclosing_class = self.current_class
    self.current_class = ClassType.CLASS
    self.declare(stmt.name)
    self.define(stmt.name)
    self.begin_scope()
    self.scopes[-1]["this"] = True
    for method in stmt.methods:
        if method.name.lexeme == "init":
            self.resolve_function(method, FunctionType.INITIALIZER)
        else:
            self.resolve_function(method, FunctionType.METHOD)
    self.end_scope()
    self.current_class = enclosing_class
    return None
Key points:
- Track that we're inside a class
- Declare and define the class name
- Create a new scope with this bound
- Resolve each method (marking initializers specially)
- Restore previous class context
Resolving this
The this expression checks that we're inside a class:
def visit_this(self, expr):
    if self.current_class == ClassType.NONE:
        ResolveError.error(self, expr.keyword, "Can't use 'this' outside of a class.")
        return None
    self.resolve_local(expr, expr.keyword)
This catches errors like:
print this; // Error: Can't use 'this' outside of a class.
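The save-and-restore pattern behind this check can be sketched on its own. MiniResolver below is a hypothetical reduction of the resolver: it collects error messages in a list instead of calling ResolveError.error, and resolve_class takes a callback that stands in for resolving the class body.

```python
from enum import Enum, auto


class ClassType(Enum):
    NONE = auto()
    CLASS = auto()


class MiniResolver:
    def __init__(self):
        self.current_class = ClassType.NONE
        self.errors = []

    def resolve_class(self, resolve_body):
        # Remember the outer context so it can be restored afterwards.
        enclosing_class = self.current_class
        self.current_class = ClassType.CLASS
        try:
            resolve_body()
        finally:
            self.current_class = enclosing_class

    def resolve_this(self):
        if self.current_class == ClassType.NONE:
            self.errors.append("Can't use 'this' outside of a class.")


resolver = MiniResolver()
resolver.resolve_this()                        # top level: reported
resolver.resolve_class(resolver.resolve_this)  # inside a class: accepted
resolver.resolve_this()                        # after the class: reported again
```

Restoring the saved value, rather than resetting to NONE, is what keeps the check correct once a class declaration appears inside a method of another class.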
Preventing Value Returns from Initializers
I updated the FunctionType enum to distinguish initializers:
FunctionType = Enum('FunctionType', ['NONE', 'FUNCTION', 'METHOD', 'INITIALIZER'])
And added validation in return statement resolution:
def visit_return_stmt(self, stmt):
    if self.current_function == FunctionType.NONE:
        ResolveError.error(self, stmt.keyword, "Can't return from top-level code.")
    if stmt.value is not None:
        if self.current_function == FunctionType.INITIALIZER:
            ResolveError.error(self, stmt.keyword, "Can't return a value from an initializer.")
        self.resolve(stmt.value)
This prevents:
class Foo {
    init() {
        return "oops"; // Error: Can't return a value from an initializer.
    }
}
Resolving Property Access
Get and set expressions just need to resolve their operands:
def visit_get(self, expr):
    self.resolve(expr.obj)

def visit_set(self, expr):
    self.resolve(expr.value)
    self.resolve(expr.obj)
Properties are looked up dynamically at runtime, so the resolver does not validate property names.
Binding Methods
When you access a method through an instance, we need to bind this to that instance. This is done through method binding in app/lox_function.py:
def bind(self, lox_instance):
    env = Environment(self.closure)
    env.define("this", lox_instance)
    return LoxFunction(self.declaration, env, self.is_initializer)
The bind() method:
- Creates a new environment with the method's closure as parent
- Defines this in that environment to be the instance
- Returns a new function with this environment as its closure
This happens when getting a method from an instance in app/lox_instance.py:
def get(self, name):
    if name.lexeme in self.fields:
        return self.fields[name.lexeme]
    method = self.klass.find_method(name.lexeme)
    if method is not None:
        return method.bind(self)  # Bind method to this instance
    raise LoxRuntimeError(name, "Undefined property '" + name.lexeme + "'.")
Each time you access a method, you get a fresh bound copy. This means:
var breakfast = Breakfast();
var cook = breakfast.cook;
cook(); // Still works! 'this' is bound to breakfast
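Here is a minimal, self-contained sketch of the same idea, assuming a stripped-down environment and function type. Env and BoundableFunction are hypothetical stand-ins for Environment and LoxFunction, the body is a Python callable that receives the call environment, and instances are modelled as plain dictionaries.

```python
class Env:
    def __init__(self, enclosing=None):
        self.values = {}
        self.enclosing = enclosing

    def define(self, name, value):
        self.values[name] = value

    def get(self, name):
        if name in self.values:
            return self.values[name]
        if self.enclosing is not None:
            return self.enclosing.get(name)
        raise NameError(f"Undefined variable '{name}'.")


class BoundableFunction:
    def __init__(self, body, closure):
        self.body = body        # callable taking the environment of the call
        self.closure = closure

    def bind(self, instance):
        # Wrap the original closure in a new environment that defines "this".
        env = Env(self.closure)
        env.define("this", instance)
        return BoundableFunction(self.body, env)

    def call(self):
        return self.body(Env(self.closure))


globals_env = Env()
who_am_i = BoundableFunction(lambda env: env.get("this"), globals_env)

alice = {"name": "Alice"}
bob = {"name": "Bob"}
bound_to_alice = who_am_i.bind(alice)
bound_to_bob = who_am_i.bind(bob)
```

Each bound copy carries its own this, and the unbound method stored on the class is left untouched, which is why one method can be shared by many instances.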
Putting It All Together
Let's see a complete example demonstrating all the features:
class Person {
    init(name, age) {
        this.name = name;
        this.age = age;
    }
    greet() {
        print "Hello, I'm " + this.name + "!";
    }
    birthday() {
        this.age = this.age + 1;
        print this.name + " had a birthday!";
    }
}
var alice = Person("Alice", 30);
alice.greet(); // "Hello, I'm Alice!"
alice.birthday(); // "Alice had a birthday!"
var bob = Person("Bob", 25);
bob.greet(); // "Hello, I'm Bob!"
Each instance has its own fields but shares the class's methods. The this keyword in a method always refers to the instance the method was accessed on, even if the bound method is stored in a variable and called later.
Conclusion
With classes, our Lox interpreter now supports object-oriented programming. We've implemented:
- Class declaration and instantiation: Define classes and create instances
- Properties and methods: Dynamic fields with get/set operations and behavior associated with a class
- this keyword and binding: Self-reference within methods
- Initializers: Automatic construction via init methods
While we haven't implemented inheritance yet, classes already enable powerful abstraction and code organization.