Of Rabbits and Holes

Safe Made Easy Pt.1: Single Ownership is (Not) Optional

Intro

This post introduces an approach to memory safety that I believe is more practical and more ergonomic than the available alternatives.

It all started way back when, and was inspired by things I read and wrote:

Three years of development later, I believe I finally got it. The proposal is complete.

Moreover, I have implemented it in my own programming language I intend to release soon-ish, and I want to share the design decisions and the entire path from “huh, why not” to “omg it’s live”.

So, TL;DR: linear types (which are dropped exactly once) + abstract interpretation + a bunch of tricks allows us to eliminate the same classes of bugs as Rust does (at least in non-concurrent environments) plus memory leaks, and we can extend the approach to also cover concurrent environments, all the while being more ergonomic and less restrictive.

Sounds fun? Let’s dig in.

What it promises and what it doesn’t

Motivating example

Consider this pseudocode:

var x: T = new T;
if random() > 0.5 {
    drop x;
}
print(x);

What this code does is it conditionally consumes a value.

There are two ways this could go in a real language. C++ doesn’t particularly care and will happily compile this code:

#include <cstdlib>
#include <cstdio>
int main() {
    int *i = new int(42);

    if ((double)rand() / RAND_MAX > 0.5) {
        delete i;
    }

    printf("i=%d\n", *i);
    return 0;
}

Which will then proceed to invoke UB in about 50% of runs. A modern C++ developer would reach for std::unique_ptr and std::optional here - and they would help, partially. RAII via smart pointers eliminates the manual delete, and optional gives you a way to represent “maybe moved.” But unique_ptr only manages heap-allocated objects, and the type system does not enforce the optional check - operator* on an empty optional is undefined behavior, and even .value() only gives you a runtime exception instead of a compile-time error. It is still on you to remember.

In Rust, though, this code does not compile at all:

fn main() {
    let x = Box::new(42);
    if rand::random::<f64>() > 0.5 {
        drop(x);
    }
    println!("{}", x); // error[E0382]: borrow of moved value: `x`
}

Rust takes a very different approach. The compiler tracks moves through control flow - it sees that x might have been moved in the if branch, and rejects the program outright. Rust’s ownership model requires that every variable’s move state is statically known at every point in the program - a conditionally-moved value violates that requirement, so the program is rejected. You can wrap the value in Option<T> yourself and .take() it manually, but Rust won’t do that for you - the burden is on the developer to restructure the code upfront.

So, what if there was a third way between these two?

The proposal

The proposed solution is straightforward:

var x: T = new T;
if rand() > 0.5f {
    drop x;
}
// <- At this point, typeof(x) is Option<T>

The type of the value is now control-flow-dependent - the compiler evaluates it as it goes through the program, widening it each time control flow diverges to accommodate for both possibilities. Then it becomes the developer responsibility to narrow it down when they want to use it:

var x: T = new T;
if rand() > 0.5f {
    drop x;
}   // x is _widened_ to an Option<T>
if x {
    // x is definitely available in this branch and can be used
} else {
    // x is definitely not available
}   // x is _widened_ to an Option<T> again

One way to view this is to consider which information is available at the compiler at various points:

Compared to C++ approach, we now force the developer to consider the state space explicitly and avoid the crash, because the typechecker will catch all attempts to use an Option<T> where a T should be used, or to use a definitely non-available value.

Compared to Rust approach, we gain flexibility at a cost of a runtime check - a single null/tag comparison at the point of refinement.

It should be noted that this is not a new idea. If anything, one of the most popular languages in the world, TypeScript, does exactly that. However, TypeScript compiles to JavaScript - a language with garbage collection and shared ownership, which does not concern itself with lifetimes, memory or resource management issues, or concurrency, which are all something I need to cover.

This is just the tip of the iceberg, the very beginning of the system. When we cover more ground - aggregates, references, function calls, dynamic dispatch, lambdas and closures - it will grow to accommodate new requirements.

Linear drops

One more thing that I also wanted is that each value is guaranteed to be dropped exactly once. This is known as linear typing. When linear typing is discussed, the guarantee is usually formulated as “used exactly once”, but what constitutes a use can vary. In my case, use == drop.

With the proposed approach it becomes trivially simple:

This raises a question about the refined branch:

var x: T = new T;
if rand() > 0.5f {
    drop x;
}   // x is _widened_ to an Option<T>
if x {
    // x is definitely available in this branch and can be used,
    // but what type is it?
} else {
    // x is None
}   // x is _widened_ to an Option<T> again

Which type, exactly, will x have when it is refined to something available in the second conditional statement?

If it’s refined to T, it will be dropped at the end of the scope of the if branch. That would be silly - a given value could only ever be refined once, used, then immediately dropped when the branch ends. Safe, but draconian.

Instead, we define a new type - Some<T> - whose only purpose is to avoid being dropped, or to serve as proof of availability without taking ownership.

This is one of the many types that are handled by the type checker in a special way; namely:

As I said - definitely not simple. But not that complicated either.

The formal background

The approach described above has roots in several well-established areas of programming language theory.

Flow-sensitive typing allows the type of a variable to change as the program executes. Most type systems are flow-insensitive - a variable declared as T stays T for its entire scope. Flow-sensitive systems, such as TypeScript’s control flow narrowing, track how types evolve along different execution paths. What we add is applying this to ownership - the availability of a value is part of its type, and that availability changes as the program flows through moves, drops, and conditional branches.

Refinement types allow types to be narrowed by predicates. When we write if x { ... }, we are refining the type of x from Option<T> to Some<T> (or to None in the else branch). This is a direct application of refinement typing - the conditional acts as a proof that the value is available, and the type system reflects that proof.

There are several parts of the system that link values and their types during compilation - Some<T> depends on the Option<T> it was refined from, and as we will see in later posts, references depend on the values they point to. This is related to dependent typing in the limited sense that type validity depends on specific program values and ownership relationships. The system does not attempt full dependent typing in the Idris or Agda sense, but it does track value-to-type dependencies across function boundaries and through control flow.

Abstract interpretation provides the unifying framework. What the compiler does is interpret the program abstractly in the availability domain - instead of computing actual values, it computes whether each variable is definitely available, definitely unavailable, or indeterminate. Branch joins widen the state, and conditionals narrow it. This is a standard abstract interpretation over a simple lattice: T (available) and None (unavailable) are the precise states, Option<T> is their join.

It is worth noting that availability is not the only domain the compiler interprets in. Later posts will introduce additional domains - each with its own lattice - interpreted over the same control flow structure. Dependency tracking, reference validity, and ownership of aggregates all follow the same abstract-interpretation approach.

The rules so far

I will maintain a running list of the rules, invariants, and behaviors of the system as we go. Each post will add to it. This list may seem ad-hoc and chaotic because there is no single syntactic trick that everything falls out of - there are several underlying principles playing together, which generate many small rules.

I warned you at the beginning that this is not simple.

Here’s the thing, though: you have to uphold these rules for safety anyway - in C++ you do it in your head, in Rust you do it by fighting the borrow checker. All we do here is mechanically shift the responsibility from the developer to the compiler. Each rule is grounded in well-established theory - we just apply it in a way that doesn’t require the developer to think about it.

  1. Types are partitioned into owned and plain. Owned types require destruction; plain types (integers, booleans, floats) do not
  2. Each owned value has exactly one owner
  3. Each owned value is dropped exactly once - when it or its owner goes out of scope, when it is explicitly dropped, or conditionally if its availability is indeterminate
  4. A conditionally-dropped owned value is widened to Option<T>
  5. Option<T> is itself an owned type when T is owned
  6. Option<T> can be refined to Some<T> or None by a conditional check
  7. Some<T> is non-owning - refinement does not transfer ownership
  8. Some<T> obtained by refining Option<T> and the source Option<T> depend on each other
  9. Refinement is not sticky - it expires when control flow rejoins

Conclusion

So this is the end of the beginning, and there’s much, much more left to cover.

In the next post I will generalize this approach to aggregates such as records and arrays, and introduce references to allow sharing values without transferring ownership.