Objective-C Internals: Class Graph Implementation
A brief look at the Objective-C runtime source code, focusing on the definition of object and class types, highlighting how inheritance is implemented, special cases for the root class, and quirks related to metaclass lookup.
The previous post explored the Objective-C class architecture and illustrated an object graph for a class hierarchy. Here, we’ll build on those concepts by examining the class object graph implementation (classes, superclasses, and metaclasses).
Let’s start with the public definitions of some key types. In Objective-C, the Class type represents any class type, and the id type represents an instance of any class. The Objective-C runtime header objc.h defines these types:
/// An opaque type that represents an Objective-C class.
typedef struct objc_class *Class;
/// Represents an instance of a class.
struct objc_object {
    Class _Nonnull isa OBJC_ISA_AVAILABILITY;
};
/// A pointer to an instance of a class.
typedef struct objc_object *id;
As mentioned in the previous post, Objective-C classes are also objects, but this relationship is not present in the public type definitions. We do see it, however, if we take a look at the internal type definitions.
First, objc-private.h contains the actual objc_object definition. While the internal definition has many non-virtual C++ member functions, its only member variable, isa_storage, corresponds to the (deprecated) isa instance variable. (I don’t know why the internal type is a char array, but if I had to guess, it prevents accidental direct use given the various overloads of the field. I discuss the isa field in more detail in a separate post.)
struct objc_object {
    char isa_storage[sizeof(isa_t)];
};
Next, objc-runtime-new.h contains the objc_class data structure definition. It has a few member variables of its own, and, like objc_object, it has many non-virtual C++ member functions.
struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};
Here, we see that objc_class derives from objc_object and thus inherits the isa field. So, a class object is implemented just like any other object type. Next is the superclass field, which points to the parent class object, if any. (The cache and bits fields are not part of the class graph construction, so we’ll explore those in the future.)
And that’s all that’s required to construct the Objective-C class graph: two data structures (objc_object and objc_class) and two fields (isa and superclass)!
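To make the "two structures, two fields" idea concrete, here is a minimal C++ sketch of a toy class graph. It is not the runtime's code: ModelObject, ModelClass, ModelGraph, superclassChain, and the name field are hypothetical stand-ins, and the wiring of the root metaclass's superclass to the root class follows the architecture diagram from the previous post rather than anything shown in this section.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ModelClass;

// Stand-in for objc_object: the only state is the isa edge.
struct ModelObject {
    ModelClass *isa = nullptr;
};

// Stand-in for objc_class: inherits isa and adds the superclass edge.
// The real type also has cache and bits, which are omitted here.
struct ModelClass : ModelObject {
    ModelClass *superclass = nullptr;
    std::string name;  // hypothetical, for inspection only
};

// Hypothetical helper: collect class names along the superclass chain.
inline std::vector<std::string> superclassChain(const ModelClass *cls) {
    std::vector<std::string> names;
    for (; cls != nullptr; cls = cls->superclass) {
        names.push_back(cls->name);
    }
    return names;
}

// Hypothetical helper: a four-node graph for NSObject and MyObject.
struct ModelGraph {
    ModelClass root, rootMeta, sub, subMeta;

    ModelGraph() {
        root.name = "NSObject";
        rootMeta.name = "NSObject (meta)";
        sub.name = "MyObject";
        subMeta.name = "MyObject (meta)";

        root.isa = &rootMeta;      root.superclass = nullptr;
        rootMeta.isa = &rootMeta;  rootMeta.superclass = &root;
        sub.isa = &subMeta;        sub.superclass = &root;
        subMeta.isa = &rootMeta;   subMeta.superclass = &rootMeta;
    }

    // The nodes point at each other, so copying would leave dangling edges.
    ModelGraph(const ModelGraph &) = delete;
    ModelGraph &operator=(const ModelGraph &) = delete;
};
```

With a ModelGraph g, superclassChain(&g.sub) walks MyObject and then NSObject, while superclassChain(&g.subMeta) passes through both metaclasses before ending at the root class.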
objc_class Member Functions
Next, let’s examine some of the objc_class member functions to learn about the implementation of the edges in the architecture diagram.
Root Classes
bool isRootClass() {
    return getSuperclass() == nil;
}
A root class, such as NSObject, has no superclass, so the runtime identifies it by a nil superclass pointer.
Root Metaclasses
bool isRootMetaclass() {
    return ISA() == (Class)this;
}
The root metaclass has a self-referential isa pointer, which is how the runtime identifies root metaclasses. As far as I know, this is the only cycle in the class graph.
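The two root checks can be tried out on a toy graph. The sketch below is an assumption-laden model, not the runtime itself: ModelClass, ModelGraph, and isaFixedPoint are hypothetical names, and only the bodies of isRootClass and isRootMetaclass mirror the runtime functions quoted above.

```cpp
#include <cassert>

// Hypothetical stand-in for objc_class with only the two graph edges.
struct ModelClass {
    ModelClass *isa = nullptr;
    ModelClass *superclass = nullptr;

    // Mirrors isRootClass(): a root class has no superclass.
    bool isRootClass() const { return superclass == nullptr; }

    // Mirrors isRootMetaclass(): the root metaclass is its own isa.
    bool isRootMetaclass() const { return isa == this; }
};

// Hypothetical helper: follow isa edges until a node points at itself.
// Returns nullptr if the chain ends before reaching such a node.
inline const ModelClass *isaFixedPoint(const ModelClass *cls) {
    while (cls != nullptr && cls->isa != cls) {
        cls = cls->isa;
    }
    return cls;
}

// Hypothetical helper: the NSObject/MyObject graph from the diagram.
struct ModelGraph {
    ModelClass root, rootMeta, sub, subMeta;

    ModelGraph() {
        root.isa = &rootMeta;      root.superclass = nullptr;
        rootMeta.isa = &rootMeta;  rootMeta.superclass = &root;
        sub.isa = &subMeta;        sub.superclass = &root;
        subMeta.isa = &rootMeta;   subMeta.superclass = &rootMeta;
    }

    ModelGraph(const ModelGraph &) = delete;
    ModelGraph &operator=(const ModelGraph &) = delete;
};
```

In this model, following isa from any node ends at the root metaclass, which is the one node for which isRootMetaclass() holds.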
Metaclass Identity
bool isMetaClass() const {
    return cache.getBit(FAST_CACHE_META);
}
// Like isMetaClass, but also valid on un-realized classes
bool isMetaClassMaybeUnrealized() {
    if (isStubClass())
        return false;
    return bits.flags() & RW_META;
}
A bit flag emitted by the compiler identifies a metaclass instance, which is the primary characteristic distinguishing a metaclass instance from a class instance.
Unrealized classes, which include stub classes, are described in more detail in the Objective-C Internals: Unrealized Classes (and Toll-Free Bridging) post.
Metaclass Retrieval
// NOT identical to this->ISA when this is a metaclass
Class getMeta() {
    if (isMetaClassMaybeUnrealized()) return (Class)this;
    else return this->ISA();
}
When retrieving the metaclass from some class instance, it’s necessary to check whether that instance is itself a metaclass. If it is, getMeta returns the instance itself. Otherwise, it returns the metaclass through the instance’s isa pointer.
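The difference between getMeta and ISA is easiest to see on a metaclass. The following is a rough sketch under simplifying assumptions: ModelClass and ModelGraph are hypothetical, and a plain bool named meta stands in for the RW_META and FAST_CACHE_META flags that the real runtime consults.

```cpp
#include <cassert>

// Hypothetical stand-in for objc_class.
struct ModelClass {
    ModelClass *isa = nullptr;
    ModelClass *superclass = nullptr;
    bool meta = false;  // assumption: replaces the runtime's metaclass flag

    bool isMetaClass() const { return meta; }
    ModelClass *ISA() const { return isa; }

    // Same shape as the runtime's getMeta(): a metaclass returns itself,
    // any other class returns its isa.
    ModelClass *getMeta() { return isMetaClass() ? this : ISA(); }
};

// Hypothetical helper: the NSObject/MyObject graph with flags set.
struct ModelGraph {
    ModelClass root, rootMeta, sub, subMeta;

    ModelGraph() {
        rootMeta.meta = true;
        subMeta.meta = true;

        root.isa = &rootMeta;      root.superclass = nullptr;
        rootMeta.isa = &rootMeta;  rootMeta.superclass = &root;
        sub.isa = &subMeta;        sub.superclass = &root;
        subMeta.isa = &rootMeta;   subMeta.superclass = &rootMeta;
    }

    ModelGraph(const ModelGraph &) = delete;
    ModelGraph &operator=(const ModelGraph &) = delete;
};
```

For the MyObject metaclass in this model, getMeta() returns the metaclass itself while ISA() returns the root metaclass, which is the case the source comment warns about. For the root metaclass the two happen to agree, because its isa points at itself.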
Compiler Output
The objc_class data structure is part of the Objective-C ABI, meaning the details of its size and field layout are known to third-party programs, which encode this information into their executable binaries. We can see this by examining the compiler output of the following trivial class definition.
#import <Foundation/Foundation.h>
@interface MyObject: NSObject
@end
@implementation MyObject
@end
Generating assembly for the above MyObject.m file by running clang -S MyObject.m will produce an assembly file containing the following snippet (and more).
    .section __DATA,__objc_data
_OBJC_CLASS_$_MyObject:
    .quad _OBJC_METACLASS_$_MyObject
    .quad _OBJC_CLASS_$_NSObject
    .quad __objc_empty_cache
    .quad 0
    .quad __OBJC_CLASS_RO_$_MyObject
_OBJC_METACLASS_$_MyObject:
    .quad _OBJC_METACLASS_$_NSObject
    .quad _OBJC_METACLASS_$_NSObject
    .quad __objc_empty_cache
    .quad 0
    .quad __OBJC_METACLASS_RO_$_MyObject
Here, we see that the code generated by the compiler aligns with the observations we drew from the architecture diagram in the previous post:
- The MyObject class object has:
  - An isa variable that points to the MyObject metaclass.
  - A superclass variable that points to the NSObject class object.
- The MyObject metaclass has:
  - An isa variable that points to the NSObject (root object) metaclass.
  - A superclass variable that points to the NSObject metaclass.
(As mentioned above, the cache and bits fields will be the subject of a future post.)
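One way to read the five .quad directives is as a flat record of five pointer-sized words. The sketch below encodes that reading in a plain C++ struct; it is my interpretation of the assembly, not a definition taken from the runtime. ClassImage and wordIndex are hypothetical names, and treating the cache as two words is an assumption based on the two values (__objc_empty_cache and 0) that sit between superclass and the read-only data pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of one emitted class record, one member per .quad.
struct ClassImage {
    const void *isa;         // .quad _OBJC_METACLASS_$_MyObject
    const void *superclass;  // .quad _OBJC_CLASS_$_NSObject
    const void *cache[2];    // .quad __objc_empty_cache, .quad 0 (assumed)
    const void *bits;        // .quad __OBJC_CLASS_RO_$_MyObject
};

// Hypothetical helper: convert a byte offset into a word index.
inline std::size_t wordIndex(std::size_t byteOffset) {
    return byteOffset / sizeof(void *);
}

// Compile-time checks that the model matches the order of the directives.
static_assert(offsetof(ClassImage, isa) == 0,
              "isa is the first word");
static_assert(offsetof(ClassImage, superclass) == sizeof(void *),
              "superclass is the second word");
static_assert(offsetof(ClassImage, bits) == 4 * sizeof(void *),
              "bits is the fifth word");
static_assert(sizeof(ClassImage) == 5 * sizeof(void *),
              "the record is five words long");
```

On a 64-bit target each word is 8 bytes, which matches the .quad directive size. Because this layout is baked into every compiled binary, the runtime cannot reorder these fields without breaking existing programs.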