Rust API Bindings: Core Foundation Memory Management and Mutability
The design patterns used by Core Foundation for memory management and mutability fit surprisingly well in idiomatic Rust. This post shares an overview of how I reached this conclusion the hard way.
As I’m designing Rust API bindings for Core Foundation, I want the user-facing API to match The Rust Standard Library as closely as possible, and memory management is a crucial area whose design significantly impacts the API surface. There are (at least) two critical differences between Core Foundation and The Rust Standard Library in their approach to memory management:
- All Core Foundation objects are allocated on the heap and are reference counted. Generally, Rust types can be stack-allocated, heap-allocated and uniquely owned, or heap-allocated with shared ownership.
- Core Foundation uses different types for immutable and mutable objects, while Rust expresses mutability through bindings and references (mut, &mut T) rather than through separate types.
The following summarizes my exploration in this space, my design goals for memory management, and how I achieved them.
Ad Hoc Approach
For many C APIs, wrapping a pointer in a tuple struct and implementing Drop is sufficient to provide an idiomatic Rust API for a foreign interface. Consider the following example of this approach for CFString:
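// Owning wrapper around the raw Core Foundation string pointer.
struct String(*const __CFString);

impl String {
    fn len(&self) -> CFIndex {
        unsafe { CFStringGetLength(self.0) }
    }
}

impl Drop for String {
    fn drop(&mut self) {
        // Release the reference owned by this wrapper.
        unsafe { CFRelease(self.0.cast()) }
    }
}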
This approach is straightforward, but it is not a zero-cost abstraction. Each time the Core Foundation object pointer is required, for example, in the len method, the compiler must emit a dereference of the tuple struct &self to load the Core Foundation pointer value. This indirection is unavoidable because we must define a type to implement Drop, though it is negligible in practice.
Although Core Foundation is a C-based API, many types have logical subclasses. If we were to add Rust API bindings for CFMutableString with this approach, it would require defining a new, independent type. Adding an implementation of Deref would enable the logical subclass to gain all the methods of its logical superclass through deref coercion, and the resulting Rust API would still be reasonably idiomatic.
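To make the Deref idea concrete, here is a minimal sketch of the ad hoc approach with a logical subclass. It does not link Core Foundation: MockCFString and mock_string_get_length are hypothetical stand-ins for __CFString and CFStringGetLength, and the names CfString, CfMutableString, with_len, and length_of are mine, chosen so the sketch does not shadow the standard library's String.

```rust
use std::ops::Deref;

// Hypothetical stand-in for the opaque `__CFString`, so the sketch runs
// without linking Core Foundation.
struct MockCFString {
    len: isize,
}

// Hypothetical stand-in for `CFStringGetLength`.
unsafe fn mock_string_get_length(s: *const MockCFString) -> isize {
    unsafe { (*s).len }
}

// Immutable wrapper, following the ad hoc approach.
struct CfString(*const MockCFString);

impl CfString {
    fn len(&self) -> isize {
        unsafe { mock_string_get_length(self.0) }
    }
}

impl Drop for CfString {
    fn drop(&mut self) {
        // A real binding would call `CFRelease` here; the mock frees the allocation.
        unsafe { drop(Box::from_raw(self.0 as *mut MockCFString)) }
    }
}

// The logical subclass is a new, independent type wrapping its superclass.
struct CfMutableString(CfString);

impl CfMutableString {
    // Hypothetical constructor that allocates a mock object of the given length.
    fn with_len(len: isize) -> Self {
        let raw = Box::into_raw(Box::new(MockCFString { len }));
        CfMutableString(CfString(raw))
    }
}

impl Deref for CfMutableString {
    type Target = CfString;

    fn deref(&self) -> &CfString {
        &self.0
    }
}

// Deref coercion lets `&CfMutableString` be passed where `&CfString` is expected.
fn length_of(s: &CfString) -> isize {
    s.len()
}
```

With this in place, both `s.len()` and `length_of(&s)` work on a `CfMutableString`, even though `len` is only defined on the superclass wrapper.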
While this is a well-trodden path[1], I wanted to find a design that:
- Is a true zero-cost abstraction.
- Shows Core Foundation objects are heap-allocated through the type system (e.g., Box).
- Combines Rust’s mutable references with Core Foundation’s mutable types.
Box for Core Foundation
I started exploring the design space by building equivalents of the standard library’s Box and Arc types, knowing unique ownership (à la Box) would be part of the mutability story and that shared pointers (like Arc) are a fundamental part of programming on Apple platforms.
Through this exercise, I immediately achieved my first two design goals. Consider the following sample code illustrating the approach for CFString:
use std::ops::Deref;
use std::ptr::NonNull;

struct Box<T>(NonNull<T>);

impl<T> Deref for Box<T> {
    type Target = T;

    #[inline]
    fn deref(&self) -> &Self::Target {
        unsafe { self.0.as_ref() }
    }
}

// Type bindings for CFString.
struct String;

impl String {
    fn new() -> Box<Self> { /* ... */ }

    fn len(&self) -> CFIndex {
        // `&self` carries the Core Foundation pointer value.
        let cf: *const _ = self;
        unsafe { CFStringGetLength(cf.cast()) }
    }
}
Like the approach in the previous section, a tuple struct wraps the raw Core Foundation object instance pointer. But this wrapper has three essential differences.
First, the type name, Box<T>, signals to the reader that T is heap-allocated and that the instance of T is uniquely owned.
Second, it implements Deref to T, the Rust type implementing the API bindings, which is crucial in making the abstraction zero-cost. When the box is dereferenced by the compiler, for example, to call the len method, the box returns the Core Foundation pointer value as a reference to T. The reference value (i.e., &self) is bitwise identical to the Core Foundation pointer value and can be passed directly through to the C API.
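The pointer-identity claim can be checked with a small sketch. It assumes a simplified box named CfBox (renamed to avoid shadowing the standard library's Box, with ownership and Drop left out) and a hypothetical Object type standing in for a Core Foundation object. The #[repr(transparent)] attribute is my addition so the size checks rest on documented layout guarantees.

```rust
use std::mem::size_of;
use std::ops::Deref;
use std::ptr::NonNull;

// Simplified box: ownership and `Drop` are omitted to focus on layout.
#[repr(transparent)]
struct CfBox<T>(NonNull<T>);

impl<T> Deref for CfBox<T> {
    type Target = T;

    #[inline]
    fn deref(&self) -> &T {
        unsafe { self.0.as_ref() }
    }
}

// Hypothetical stand-in for a Core Foundation object.
struct Object {
    value: u32,
}

impl Object {
    // `&self` is the object pointer itself, so it can be handed to C unchanged.
    fn as_raw(&self) -> *const Object {
        self
    }
}

// Returns true when the reference reached through the box has the same
// address as the raw pointer the box was built from.
fn deref_is_identity(raw: NonNull<Object>) -> bool {
    let boxed = CfBox(raw);
    boxed.as_raw() == raw.as_ptr() as *const Object
}

// Returns true when the box, and an `Option` of it, are exactly pointer-sized.
fn is_pointer_sized() -> bool {
    size_of::<CfBox<Object>>() == size_of::<*const Object>()
        && size_of::<Option<CfBox<Object>>>() == size_of::<*const Object>()
}
```

Because the box is a single NonNull<T>, dereferencing it is a plain pointer copy, and Option<CfBox<T>> can represent a nullable Core Foundation pointer without extra storage.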
Finally, the separation of the type bindings (e.g., String) from the memory management facility (e.g., Box<T>) enables idiomatic, zero-cost use of references to the Core Foundation type bindings. Consider potential bindings for CFArrayGetValueAtIndex. With the approach in this section, the function binding can simply cast the pointer into a reference with the array’s lifetime.
With the approach in the previous section, the bindings for this function could:
- Return a new binding instance for the value, retaining and releasing the object. The new binding instance does not have a lifetime associated with the array, so the retain is necessary to guarantee that the object lives at least as long as the binding instance. In many cases, however, the retain/release is unnecessary overhead.
- Use an intermediate type to associate a lifetime with the binding instance and sidestep its retain/release, which is effectively the same as the function binding for the approach in this section but requires more ceremony to eliminate the retain/release correctly.
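As a rough illustration of the reference-returning binding, the sketch below borrows an element from an array without any retain or release. It runs against a mock rather than Core Foundation: CfString, CfArray, from_lens, and mock_array_get_value_at_index are hypothetical stand-ins, and the bounds check returning Option is my own choice rather than something the bindings are known to do.

```rust
// Hypothetical stand-in for a CFString object.
struct CfString {
    len: isize,
}

// Hypothetical stand-in for a CFArray that owns its values.
struct CfArray {
    values: Vec<*const CfString>,
}

// Stand-in for `CFArrayGetValueAtIndex`: returns a borrowed pointer and does
// not retain the value.
unsafe fn mock_array_get_value_at_index(array: *const CfArray, index: isize) -> *const CfString {
    let array = unsafe { &*array };
    array.values[index as usize]
}

impl CfArray {
    // Hypothetical constructor that allocates one mock string per length.
    fn from_lens(lens: &[isize]) -> Self {
        let values = lens
            .iter()
            .map(|&len| Box::into_raw(Box::new(CfString { len })) as *const CfString)
            .collect();
        CfArray { values }
    }

    // Lifetime elision ties the returned reference to `&self`, so the borrow
    // checker keeps the array alive for as long as the element is in use.
    fn get(&self, index: usize) -> Option<&CfString> {
        if index >= self.values.len() {
            return None;
        }
        let value = unsafe { mock_array_get_value_at_index(self, index as isize) };
        Some(unsafe { &*value })
    }
}

impl Drop for CfArray {
    fn drop(&mut self) {
        // The mock array frees its values; a real CFArray would release them.
        for &value in &self.values {
            unsafe { drop(Box::from_raw(value as *mut CfString)) }
        }
    }
}
```

Dropping the array while an element reference from get is still in use is rejected at compile time, which is what makes the retain unnecessary.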
Arc for Core Foundation
It took more exploration and trial and error to identify an approach to achieve my third design goal of combining Rust’s mutable references with Core Foundation’s mutable types.
At some point, I asked, "Why do immutable objects need exclusive ownership?" I was eventually able to convince myself that "They don’t!" Looking back, I don’t know why this wasn’t more obvious. Rust’s documentation for its Arc type clearly states:
You cannot generally obtain a mutable reference to something inside an Arc.
With that insight, developing the guidance to identify the appropriate smart pointer type was reasonably straightforward: Is the Core Foundation object instance a mutable type uniquely owned by the raw pointer (i.e., did a Create or Copy function return the pointer)? If yes, use Box<T>; otherwise, use Arc<T>.
My implementations of Box<T> and Arc<T> for Core Foundation are virtually identical, with the primary difference being Box<T> also implements DerefMut, AsMut, and BorrowMut.
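The split between the two smart pointers can be sketched as follows. This is not the actual implementation: it is a non-generic simplification over a mock Object with an atomic retain count, where mock_retain and mock_release stand in for CFRetain and CFRelease, and create, share, and demo are hypothetical helpers. The names CfBox and CfArc avoid shadowing the standard library types.

```rust
use std::ops::{Deref, DerefMut};
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a mutable Core Foundation object.
struct Object {
    retain_count: AtomicUsize,
    value: i32,
}

// Stand-in for `CFRetain`.
unsafe fn mock_retain(ptr: NonNull<Object>) {
    unsafe { ptr.as_ref() }.retain_count.fetch_add(1, Ordering::Relaxed);
}

// Stand-in for `CFRelease`: frees the object when the count reaches zero.
unsafe fn mock_release(ptr: NonNull<Object>) {
    if unsafe { ptr.as_ref() }.retain_count.fetch_sub(1, Ordering::AcqRel) == 1 {
        unsafe { drop(Box::from_raw(ptr.as_ptr())) }
    }
}

// Unique owner: hands out both shared and mutable references.
struct CfBox(NonNull<Object>);

impl Deref for CfBox {
    type Target = Object;
    fn deref(&self) -> &Object {
        unsafe { self.0.as_ref() }
    }
}

impl DerefMut for CfBox {
    fn deref_mut(&mut self) -> &mut Object {
        unsafe { self.0.as_mut() }
    }
}

impl Drop for CfBox {
    fn drop(&mut self) {
        unsafe { mock_release(self.0) }
    }
}

// Shared owner: `Deref` only, and cloning retains the object.
struct CfArc(NonNull<Object>);

impl Deref for CfArc {
    type Target = Object;
    fn deref(&self) -> &Object {
        unsafe { self.0.as_ref() }
    }
}

impl Clone for CfArc {
    fn clone(&self) -> Self {
        unsafe { mock_retain(self.0) };
        CfArc(self.0)
    }
}

impl Drop for CfArc {
    fn drop(&mut self) {
        unsafe { mock_release(self.0) }
    }
}

// Mimics a Create function: the caller receives the only reference.
fn create(value: i32) -> CfBox {
    let object = Object { retain_count: AtomicUsize::new(1), value };
    CfBox(NonNull::from(Box::leak(Box::new(object))))
}

// Hypothetical helper: gives up unique ownership without touching the count.
fn share(unique: CfBox) -> CfArc {
    let ptr = unique.0;
    std::mem::forget(unique);
    CfArc(ptr)
}

fn demo() -> (i32, usize) {
    let mut unique = create(1);
    unique.value += 41; // allowed because `CfBox` implements `DerefMut`
    let shared = share(unique);
    let other = shared.clone(); // retains the object
    // `shared.value += 1;` would not compile: `CfArc` has no `DerefMut`.
    let count = other.retain_count.load(Ordering::Relaxed);
    (shared.value, count)
}
```

Mutation is only reachable through the unique owner, so the borrow checker enforces the same rule that Core Foundation's separate mutable types express by convention.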
The combination of reference counting and mutability in the smart pointer types[2] fulfilled my design goals and resulted in surprisingly idiomatic Rust code.
impl String {
    fn append(&mut self, s: &String) {
        let cf: *mut _ = self;
        let s: *const _ = s;
        unsafe { CFStringAppend(cf.cast(), s.cast()) };
    }
}

fn main() {
    let mut s: Box<String> = String::new();
    s.append(cfstr!("Hello, World!"));
    println!("{s}");
}