Archaeology
UIKit NIB Archives
Much of this discussion is based on reverse-engineering of file formats and frameworks, but we haven't bothered to pepper it with qualifiers. Since our reverse-engineering skills are not beyond reproach, and macOS is always changing, a grain of salt is advised. If you have corrections to any details, please do get in touch.
What Is A UIKit NIB Archive?
From the Xcode 13 Release Notes,
“Storyboards and XIBs for macOS compile using UINibEncoder to reduce file sizes and improve runtime performance. When deploying an App before macOS 10.13, Xcode generates a backward compatible nib for the older OSes.”
This alternate compiled NIB format, which clearly comes from iOS, serves the same purpose as a Cocoa Keyed Archive: to serialize an object graph. In other words, it is an NSCoding-conforming archiver, akin to NSKeyedArchiver.
UINibEncoder isn't public API. Nor is the UINibDecoder unarchiver public, but it is clearly present on macOS going back to macOS 10.13 (High Sierra). (The last we checked, both were implemented by the private UIFoundation.framework.) The decoder is presumably used by AppKit, somewhere underneath the NIB loading mechanism.
Overview Of UIKit NIB Archives
At the core of the UIKit NIB Archive format are 4 tables:
- The class names table defines the classes in the archive, with each entry giving the name of the class, along with any fallback classes used for the +classFallbacksForKeyedArchiver decode mechanism.
- The key names table defines the strings that are used as keys for encoding values across the archive.
- The coder values table defines the encoded values across the archive. Each entry contains a single value, along with a reference to the associated key, so this is more accurately a single key-value pair. There are different types of values that get encoded in different ways.
- Finally, the objects table defines the individual object instances. Each entry references the class that the object is an instance of, along with the coder values (key-value pairs) that the instance encoded.
Each of these tables is an ordered array, and references between tables use the index of the target entry. For example, a coder value's key is defined by the appropriate index in the key names table. Likewise, an object's key-value pairs are defined by the index and length of the appropriate entries in the coder values table (which is arranged to make the key-value pairs for a given object contiguous).
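To make the index-based references concrete, here is a minimal C sketch of looking up one of an object's values by key name once the tables have been parsed into memory. The NibObject and NibValue types, the nib_find_value function, and the use of NUL-terminated copies of the key names are all our own hypothetical choices for illustration, not anything taken from UIFoundation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory forms of two table entries. */
typedef struct { uint32_t classIndex, valueStart, valueCount; } NibObject;
typedef struct { uint32_t keyIndex; uint8_t type; } NibValue;

/* Search the contiguous run of coder values owned by `obj` for the key
   `wanted`. `keys` is assumed to hold NUL-terminated copies of the key
   names table. Returns the index into `values`, or -1 if not found. */
static long nib_find_value(const NibObject *obj,
                           const NibValue *values, size_t valueTotal,
                           const char *const *keys, size_t keyTotal,
                           const char *wanted)
{
    assert(obj != NULL && values != NULL && keys != NULL && wanted != NULL);

    /* Reject a range that runs off the end of the coder values table. */
    if (obj->valueStart > valueTotal ||
        obj->valueCount > valueTotal - obj->valueStart)
        return -1;

    for (uint32_t i = 0; i < obj->valueCount; i++) {
        const NibValue *v = &values[obj->valueStart + i];
        if (v->keyIndex < keyTotal && strcmp(keys[v->keyIndex], wanted) == 0)
            return (long)(obj->valueStart + i);
    }
    return -1;
}
```

Because each object's key-value pairs are contiguous, the lookup only ever scans the object's own slice of the coder values table.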
About VInt32 Variable-Sized Integer Encoding
The UIKit NIB Archive makes heavy use of integers encoded in a “VInt32” format. This is an integer of up to 32 bits in size, which is encoded using 1 to 5 bytes, depending on the magnitude of the value.
More specifically, a “VInt32” is encoded using 7 bits per byte, in Little Endian order, with the high bit set only on the most significant byte of the value, which is the last byte of the encoding.
This encoding of integers is used for inter-table references, as well as various lengths, as discussed below.
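As a rough illustration of the encoding just described, a decoder might look like the following C sketch. The function name vint32_decode and its return convention (bytes consumed, or 0 on failure) are our own inventions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one VInt32 from buf[0..len). Each byte contributes its low 7 bits,
   least significant group first; the byte with its high bit set is the last.
   Returns the number of bytes consumed, or 0 if the data is truncated or no
   terminating byte appears within 5 bytes. */
static size_t vint32_decode(const uint8_t *buf, size_t len, uint32_t *out)
{
    assert(buf != NULL && out != NULL);

    uint32_t value = 0;
    for (size_t i = 0; i < len && i < 5; i++) {
        value |= (uint32_t)(buf[i] & 0x7f) << (7 * i);
        if (buf[i] & 0x80) {        /* high bit marks the final byte */
            *out = value;
            return i + 1;
        }
    }
    return 0;
}
```

For example, the value 300 occupies two bytes under this scheme: 0x2c (the low 7 bits) followed by 0x82 (the remaining bits, with the high bit set to end the integer).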
The UIKit NIB Archive Binary Format
Based on our reverse-engineering, the UIKit NIB Archive binary format has the following structure.
We inferred this by examining compiled NIB files and by some amount of reversing of UIFoundation, mostly on macOS 10.15. The implementation may have changed since then, but as far as we know, this is still accurate.
The archive data starts with a fixed-length header in this form:
struct UINibArchiveHeader {
    char     _magic[ 0xa ];      // "NIBArchive"
    uint32_t _formatVersion;     // UIMaximumCompatibleFormatVersion == 0x1
    uint32_t _coderVersion;      // UICurrentCoderVersion == 0xa (as of macOS 10.15)
    uint32_t _objectCount;       // this many UINibArchiveObject objects ...
    uint32_t _objectOffset;      // ... at this offset into archive data
    uint32_t _keyStringCount;    // this many UINibArchiveKey objects ...
    uint32_t _keyStringOffset;   // ... at this offset into archive data
    uint32_t _coderValueCount;   // this many UINibArchiveCoderValue* objects ...
    uint32_t _coderValueOffset;  // ... at this offset into archive data
    uint32_t _classNameCount;    // this many UINibArchiveClassName objects ...
    uint32_t _classNameOffset;   // ... at this offset into archive data
} __attribute__((__packed__));
After this 50-byte header come the 4 tables, which are conceptually something like this:
struct UINibArchiveObject     _objects[ objectCount ];
struct UINibArchiveKey        _keyStrings[ keyStringCount ];
struct UINibArchiveCoderValue _coderValues[ coderValueCount ];
struct UINibArchiveClassName  _classNames[ classNameCount ];
But note that all of these structures are variable in size, so these are not simple C arrays.
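For illustration, reading the fixed-length header from a byte buffer could be sketched as follows. The NibHeader type and the nib_header_parse and read_le32 helpers are hypothetical names of ours, and the sketch assumes the header integers are Little Endian, like the other multi-byte values in the archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { kNibHeaderSize = 50 };   /* 10-byte magic + ten 32-bit fields */

/* Hypothetical in-memory copy of the header fields. */
typedef struct {
    uint32_t formatVersion, coderVersion;
    uint32_t objectCount, objectOffset;
    uint32_t keyStringCount, keyStringOffset;
    uint32_t coderValueCount, coderValueOffset;
    uint32_t classNameCount, classNameOffset;
} NibHeader;

/* Read a 32-bit integer byte by byte (assumed Little Endian). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 on success, 0 if the data is too short or the magic is wrong. */
static int nib_header_parse(const uint8_t *data, size_t len, NibHeader *h)
{
    assert(data != NULL && h != NULL);

    if (len < kNibHeaderSize || memcmp(data, "NIBArchive", 10) != 0)
        return 0;

    uint32_t *fields[10] = {
        &h->formatVersion,   &h->coderVersion,
        &h->objectCount,     &h->objectOffset,
        &h->keyStringCount,  &h->keyStringOffset,
        &h->coderValueCount, &h->coderValueOffset,
        &h->classNameCount,  &h->classNameOffset,
    };
    for (int i = 0; i < 10; i++)
        *fields[i] = read_le32(data + 10 + 4 * i);
    return 1;
}
```

Reading the fields one at a time, rather than casting the buffer to a packed struct, avoids alignment and host byte-order concerns. A real parser would also want to check each offset against the archive length before following it.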
The Objects Table
Looking at each of the tables in turn, we start with the object, which points to the defining class name, and to the range of key-value pairs encoded by this object instance:
struct UINibArchiveObject {
    vint32_t _classIndex;   // i.e. classNames[ _classIndex ] is the class of this object
    vint32_t _valueStart;   // i.e. index of first UINibArchiveCoderValue contained by this object (if count>0)
    vint32_t _valueCount;   // number of UINibArchiveCoderValues contained (can be zero though)
};
Here, vint32_t is our notation for the “VInt32” variable-sized integer described above, and is a primary reason why this and the other table elements are not fixed sizes.
The root object of the UIKit NIB Archive is always at index zero of this table.
The Key Names Table
Next are the key names, which are simple character buffers with a length:
struct UINibArchiveKey {
    vint32_t _len;
    char     _name[ _len ];   // _len characters, but NOT NULL-terminated here
};
The Coder Values Table
Next, we have the coder values, which is where things get more complicated. All coder values contain at least a key name reference, and declare a value type:
struct UINibArchiveCoderValue {
    vint32_t _keyID;   // i.e. keyStrings[ keyID ] is the key for this value
    uint8_t  _type;    // e.g. UINibCoderValueType
};
where the _type is one of the following:
typedef NS_ENUM( uint8_t, UINibCoderValueType ) {
    UINibCoderValueTypeInt8 = 0,    // UINibArchiveCoderValueFixed [1]
    UINibCoderValueTypeInt16,       // UINibArchiveCoderValueFixed [2]
    UINibCoderValueTypeInt32,       // UINibArchiveCoderValueFixed [4]
    UINibCoderValueTypeInt64,       // UINibArchiveCoderValueFixed [8]
    UINibCoderValueTypeFalse,       // UINibArchiveCoderValue
    UINibCoderValueTypeTrue,        // UINibArchiveCoderValue
    UINibCoderValueTypeFloat,       // UINibArchiveCoderValueFixed [4]
    UINibCoderValueTypeDouble,      // UINibArchiveCoderValueFixed [8]
    UINibCoderValueTypeBytes,       // UINibArchiveCoderValueVariable
    UINibCoderValueTypeNil,         // UINibArchiveCoderValue
    UINibCoderValueTypeReference,   // UINibArchiveCoderValueFixed [4]
};
Whether anything comes after the _type depends on that _type value.
The Boolean types and the nil type have no additional value information: these can be represented by the “base” UINibArchiveCoderValue shown above.
The numeric types have a coder value that looks more like this:
struct UINibArchiveCoderValueFixed {
    vint32_t _keyID;     // same variable-sized key reference as the base UINibArchiveCoderValue
    uint8_t  _type;      // e.g. UINibCoderValueType
    uint8_t  _bytes[];   // from 1 to 8 bytes of data per _type
};
The size of the additional _bytes is determined from the _type, as shown in the enumeration above. Multi-byte values are always Little Endian.
Note that UINibCoderValueTypeReference is a 32-bit integer that is used for inter-object references. This is the very common case where a key points to another object instance. As elsewhere, this value is interpreted as an index into the objects table; however, in this context, the value is a fixed-size integer, rather than being a “VInt32.”
Finally, UINibCoderValueTypeBytes is an arbitrary data buffer and looks like this:
struct UINibArchiveCoderValueVariable {
    vint32_t _keyID;              // same variable-sized key reference as the base UINibArchiveCoderValue
    uint8_t  _type;               // e.g. UINibCoderValueType
    vint32_t _length;
    uint8_t  _bytes[ _length ];
};
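To summarize the three layouts, a parser walking the coder values table needs to know how many bytes follow each _type. The following C sketch captures that mapping using the enumeration values listed above; the function name and the negative return codes are our own conventions for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C mirror of the UINibCoderValueType enumeration above. */
enum {
    UINibCoderValueTypeInt8 = 0,
    UINibCoderValueTypeInt16,
    UINibCoderValueTypeInt32,
    UINibCoderValueTypeInt64,
    UINibCoderValueTypeFalse,
    UINibCoderValueTypeTrue,
    UINibCoderValueTypeFloat,
    UINibCoderValueTypeDouble,
    UINibCoderValueTypeBytes,
    UINibCoderValueTypeNil,
    UINibCoderValueTypeReference
};

/* Number of payload bytes that follow the _type byte.
   Returns -1 for the variable-length Bytes type (a VInt32 length comes
   first, then that many bytes) and -2 for a type we do not recognize. */
static int coder_value_payload_size(uint8_t type)
{
    switch (type) {
    case UINibCoderValueTypeInt8:      return 1;
    case UINibCoderValueTypeInt16:     return 2;
    case UINibCoderValueTypeInt32:     return 4;
    case UINibCoderValueTypeInt64:     return 8;
    case UINibCoderValueTypeFalse:     return 0;
    case UINibCoderValueTypeTrue:      return 0;
    case UINibCoderValueTypeFloat:     return 4;
    case UINibCoderValueTypeDouble:    return 8;
    case UINibCoderValueTypeBytes:     return -1;
    case UINibCoderValueTypeNil:       return 0;
    case UINibCoderValueTypeReference: return 4;
    default:                           return -2;
    }
}
```

Since every entry is variable in size, a parser has to step through the table entry by entry in this way; there is no shortcut to the Nth coder value.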
The Class Names Table
Turning to the last of the 4 tables, the class names look like this:
struct UINibArchiveClassName {
    vint32_t _nameLen;                                        // including NULL termination for whatever reason
    vint32_t _numberOfFallbackClasses;
    uint32_t _fallbackClassIndex[ _numberOfFallbackClasses ];
    char     _className[ _nameLen ];                          // _nameLen characters (which includes NULL terminator)
};
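As a hedged sketch of how one such entry might be read, the following C code parses a single class name entry from a buffer with bounds checks. The NibClassName type and both function names are hypothetical; the fallback indices are left in place as raw Little Endian bytes rather than decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal VInt32 reader: returns bytes consumed, or 0 on failure. */
static size_t read_vint32(const uint8_t *p, size_t len, uint32_t *out)
{
    uint32_t value = 0;
    for (size_t i = 0; i < len && i < 5; i++) {
        value |= (uint32_t)(p[i] & 0x7f) << (7 * i);
        if (p[i] & 0x80) { *out = value; return i + 1; }
    }
    return 0;
}

/* Hypothetical parsed view of one entry; pointers refer into the archive. */
typedef struct {
    const char    *name;           /* NUL-terminated class name */
    uint32_t       fallbackCount;
    const uint8_t *fallbacks;      /* fallbackCount 32-bit indices, raw bytes */
} NibClassName;

/* Returns the total size of the entry in bytes, or 0 if it is malformed. */
static size_t nib_read_class_name(const uint8_t *p, size_t len, NibClassName *out)
{
    assert(p != NULL && out != NULL);

    uint32_t nameLen, fallbackCount;
    size_t pos = 0, n;

    if ((n = read_vint32(p, len, &nameLen)) == 0) return 0;
    pos += n;
    if ((n = read_vint32(p + pos, len - pos, &fallbackCount)) == 0) return 0;
    pos += n;

    size_t remaining = len - pos;
    if (fallbackCount > remaining / 4) return 0;
    size_t fallbackBytes = 4 * (size_t)fallbackCount;

    /* The name length includes the terminator, so it can never be zero. */
    if (nameLen == 0 || nameLen > remaining - fallbackBytes) return 0;
    if (p[pos + fallbackBytes + nameLen - 1] != '\0') return 0;

    out->fallbackCount = fallbackCount;
    out->fallbacks = p + pos;
    out->name = (const char *)(p + pos + fallbackBytes);
    return pos + fallbackBytes + nameLen;
}
```

Because the stored name includes its terminator, the name pointer can be used directly as a C string, unlike the entries in the key names table.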
Special Handling of Foundation Collections
One other point worth mentioning is that UIKit NIB Archives do not use the standard encoding of standard Foundation collections. In other words, UINibEncoder does not send -encodeWithCoder: to NSArray, NSDictionary or NSSet, so these don't wind up encoded with NS.objects and NS.keys keys as in Cocoa Keyed Archives. (It wouldn't really make sense to do that, since it would create a mixture of the two serialization formats.)
Instead, UINibEncoder “inlines” the values in these collections, using the same coder value scheme as above. The result will be a series of coder values, all with the same special key name UINibEncoderEmptyKey. For an array or set, these are the objects in the collection. For a dictionary, these are the keys and values of the dictionary, in alternating key-value-key-value sequence. In all cases, the first coder value in the object (before any UINibEncoderEmptyKey entries) will be the special key NSInlinedValue with a Boolean true value.
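A tool reading the archive could recognize an inlined collection with a check along these lines. This is only a sketch: the ResolvedValue type (a coder value whose key name has already been looked up) and the function name are hypothetical, and the type code 5 for Boolean true comes from the enumeration earlier in this discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical coder value with its key name already resolved. */
typedef struct {
    const char *key;    /* NUL-terminated copy of the key name */
    uint8_t     type;   /* UINibCoderValueType */
} ResolvedValue;

enum { kTypeTrue = 5 };   /* UINibCoderValueTypeTrue */

/* Given the n coder values of one object, returns the number of inlined
   elements if the object looks like an inlined collection, or -1 if not.
   For a dictionary, the element count is twice the number of pairs. */
static long inlined_element_count(const ResolvedValue *v, size_t n)
{
    assert(v != NULL || n == 0);

    if (n == 0 ||
        strcmp(v[0].key, "NSInlinedValue") != 0 ||
        v[0].type != kTypeTrue)
        return -1;

    for (size_t i = 1; i < n; i++)
        if (strcmp(v[i].key, "UINibEncoderEmptyKey") != 0)
            return -1;

    return (long)(n - 1);
}
```

Whether the elements form an array, a set or a dictionary is not visible at this level; presumably that comes from the object's class.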
If you use Archaeology to inspect a UIKit NIB Archive, it will simplify these standard collections, just as it does for Cocoa Keyed Archives. But if you want to see the actual archived representation, as described above, you can disable this simplification by unchecking File > Re-open With Options > Simplify collections in archives. You can also change this option directly from the File > Open dialog.