Archaeology
Cocoa Keyed Archives
Much of this discussion is based on reverse-engineering of file formats and frameworks, but we haven't bothered to pepper it with qualifiers. Since our reverse-engineering skills are not beyond reproach, and macOS is always changing, a grain of salt is advised. If you have corrections to any details, please do get in touch.
What Is A Cocoa Keyed Archive?
A Cocoa Keyed Archive is the serialized data created by the Foundation NSKeyedArchiver class:
“NSKeyedArchiver, a concrete subclass of NSCoder, provides a way to encode objects (and scalar values) into an architecture-independent format suitable for storage in a file. When you archive a set of objects, the archiver writes the class information and instance variables for each object to the archive.”
Keyed archives superseded non-keyed archives (a completely different format) way back in Mac OS X 10.2 (Jaguar).
Arguably, keyed archives have since been superseded by Swift Codable and related APIs, but that doesn't mean that keyed archives are gone.
Decoding a keyed archive properly, with NSKeyedUnarchiver, (almost always) requires the app-specific code that created it. But you can actually tell quite a bit from class and key names, and by recursively decoding the other chunks of data that it references.
Many macOS apps and services use Cocoa Keyed Archives for persistence. In addition, until Xcode 13, compiled NIB files were also Cocoa Keyed Archives; Xcode still creates this format for deployment targets of macOS 10.12 (Sierra) or before.
A Cocoa Keyed Archive Is A Property List — Mostly
A Cocoa Keyed Archive is a binary serialization of a macOS Property List, and you can certainly inspect it using (some) common property list tools. We'll discuss the semantics of this property list below, but first we need to examine what is “special” about these files.
Although a Cocoa Keyed Archive uses all the standard property list value types (strings, dates, data and so on), it also leverages a special value type, the CFKeyedArchiverUID. This private Core Foundation type is really just an integer, a uint32_t, but is used to reference an object with that unique ID. (We presume that the reason for the special type is to efficiently distinguish between primitive integer values and object reference values.)
Since CFKeyedArchiverUID is a private type, and should only appear in Cocoa Keyed Archives (which usually aren't interpreted as property lists), whether and how these values appear in property list-capable tools will vary. For example, Xcode's property list editor will simply discard them.
Tools that use the XML serialization of property lists will show the CFKeyedArchiverUIDs, but not quite as they really exist. NSPropertyListSerialization knows about CFKeyedArchiverUIDs, and even preserves them through the binary-to-XML-to-binary cycle, but in a way that doesn't change the XML schema.
So, if you convert a Cocoa Keyed Archive to an XML property list, you might find something like this:
<key>$class</key>
<dict>
    <key>CF$UID</key>
    <integer>78</integer>
</dict>
This looks like a really inefficient use of a dictionary to store a single value. But this is actually the special XML representation of a key named $class whose value is a single CFKeyedArchiverUID with an integer value of 78. If you subsequently convert this XML back into a binary property list, NSPropertyListSerialization recognizes this special structure, and re-creates the single CFKeyedArchiverUID.
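The same convention can be imitated outside of Foundation. The following is a minimal Python sketch, not a description of how NSPropertyListSerialization is implemented: it assumes Python 3.8 or later, where the standard plistlib module offers a UID type that plays the role of CFKeyedArchiverUID in binary property lists. The helper name revive_uids is made up for this illustration.

```python
import plistlib

# The XML fragment from the text, wrapped in a minimal plist document.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>$class</key>
    <dict>
        <key>CF$UID</key>
        <integer>78</integer>
    </dict>
</dict>
</plist>
"""


def revive_uids(node):
    """Recursively turn {'CF$UID': n} dictionaries into plistlib.UID(n).

    Hypothetical helper: it mirrors the special-case described in the
    text, and leaves values that are already UIDs untouched.
    """
    if isinstance(node, dict):
        if set(node) == {"CF$UID"} and isinstance(node["CF$UID"], int):
            return plistlib.UID(node["CF$UID"])
        return {key: revive_uids(value) for key, value in node.items()}
    if isinstance(node, list):
        return [revive_uids(item) for item in node]
    return node


# XML -> in-memory plist with real UID values -> binary plist -> back again.
parsed = revive_uids(plistlib.loads(XML))
binary = plistlib.dumps(parsed, fmt=plistlib.FMT_BINARY)
round_tripped = plistlib.loads(binary)
```

After the round trip, round_tripped["$class"] is a plistlib.UID whose data attribute holds the integer, rather than a one-entry dictionary.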
In Archaeology, if you decode a Cocoa Keyed Archive as a macOS Property List (e.g. using File > Re-open With Options > Decode as), you'll see the actual CFKeyedArchiverUIDs listed as type “special” with the value returned by CFCopyDescription(). Of course, Archaeology can also show the data more usefully as a Cocoa Keyed Archive (which it does by default).
The Property List Object Graph
The Cocoa Keyed Archive property list represents an object graph using a dictionary with four top-level keys ($archiver, $version, $objects and $top), the most important of which is the $objects array.
A very simplified archive might look something like this:
                                         ┌───────────────────────────────────┐
━━━$archiver━━━━━━━━━━━━━━━━━━━━━━━━━━━━▶│          NSKeyedArchiver          │
                                         └───────────────────────────────────┘
                                         ┌───────────────────────────────────┐
━━━$version━━━━━━━━━━━━━━━━━━━━━━━━━━━━━▶│              100000               │
                                         └───────────────────────────────────┘
                         ┌──────────────────────────────────────────────────────────────────┐
                         │        ┌─────┐┌───────────────────────────────────┐              │
━━━$objects━━━━━━━━━━━━━━│━━━━━━━▶│  0  ││               $null               │              │
                         │        ├─────┤├───────────────────────────────────┤              │
                         │        │  1  ││              "Stuff"              │              │
                         │        ├─────┤├───────────────────────────────────┤              │
                         └───────▶│  2  ││          0xcafedeadbeef           │              │
                                  ├─────┤├───┬───────────────────────────────┤              │
                                  │     ││   │      $class : UID[5]          │              │
                                  │     ││ D ├───────────────────────────────┤              │
                                  │  3  ││ I │  someString : UID[1]          │    CFKeyedArchiverUID[2]
                                  │     ││ C ├───────────────────────────────┤              │
                                  │     ││ T │  someObject : UID[4]          │              │
                                  │     ││   │                               │              │
                                  ├─────┤├───┼───────────────────────────────┤              │
                                  │     ││   │      $class : UID[9]          │              │
                                  │     ││ D ├───────────────────────────────┤              │
                                  │  4  ││ I │    someData : UID[2]          ───────────────┘
                                  │     ││ C ├───────────────────────────────┤
                                  │     ││ T │ someInteger : @(42)           │
                                  │     ││   │                               │
                                  ├─────┤├───┼───────────────────────────────┤
                                  │     ││ D │  $classname : @"SpecialClass" │
                                  │  5  ││ I ├───────────────────────────────┤    ┌───┬───────────────────────────────┐
                                  │     ││ C │    $classes : @[ ... ] ───────┼───▶│ A │ @"SpecialClass"               │
                                  │     ││ T │                               │    │ R ├───────────────────────────────┤
                                  └─────┘└───┴───────────────────────────────┘    │ R │ @"GenericClass"               │
                                                      ...                         │ A ├───────────────────────────────┤
                                  ┌─────┐┌───┬───────────────────────────────┐    │ Y │ @"NSObject"                   │
                                  │     ││ D │  $classname : @"Other"        │    └───┴───────────────────────────────┘
                                  │  9  ││ I ├───────────────────────────────┤
                                  │     ││ C │    $classes : @[ ... ]        │
                                  │     ││ T │                               │
                                  └─────┘└───┴───────────────────────────────┘
                                         ┌───┬───────────────────────────────┐
━━━$top━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━▶│ D │        root : UID[3]          │
                                         │ I ├───────────────────────────────┤
                                         │ C │                               │
                                         │ T │                               │
                                         └───┴───────────────────────────────┘
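The top-level shape in the diagram can be checked before attempting any further interpretation. This is a rough Python sketch, assuming that the four keys $archiver, $version, $objects and $top are always present in a keyed archive. The function name looks_like_keyed_archive and the sample data are invented for illustration, and plistlib.UID (Python 3.8 or later) stands in for CFKeyedArchiverUID.

```python
import plistlib

# The four top-level keys shown in the diagram.
REQUIRED_KEYS = {"$archiver", "$version", "$objects", "$top"}


def looks_like_keyed_archive(plist):
    """Return True if a decoded property list has the keyed-archive shape.

    Only the structure is checked; the values of $archiver and $version
    are deliberately ignored in this sketch.
    """
    return (
        isinstance(plist, dict)
        and REQUIRED_KEYS.issubset(plist)
        and isinstance(plist["$objects"], list)
        and isinstance(plist["$top"], dict)
    )


# A tiny, hand-built archive (illustrative data only).
sample = {
    "$archiver": "NSKeyedArchiver",
    "$version": 100000,
    "$objects": ["$null", "Stuff"],
    "$top": {"root": plistlib.UID(1)},
}

# Serialize to a binary property list and read it back, as a file would be.
data = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)
decoded = plistlib.loads(data)
is_archive = looks_like_keyed_archive(decoded)
```

An ordinary property list, such as a preferences dictionary, fails the check because it lacks the $-prefixed keys.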
Each entry in $objects will be one of the following:
- A primitive type, such as a string or a chunk of data.
- A dictionary that represents a class definition. This will have a $classname key that gives the name of the class, and a $classes key that gives the inheritance hierarchy of the class, as an array of class names leading up to (usually) NSObject.
- A dictionary that represents an object instance. This will have a $class key that points to the class definition, and some number of other keys that point to primitive types and/or other object instances.
- A special value like $null (which represents NSNull).
When we say that a key “points to” a primitive or object instance, that means the value is a CFKeyedArchiverUID whose integer value is the index of the target entry in the $objects array. In other words, an object's UID is simply its index in the $objects array.
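The four kinds of entry, and the “points to” rule, can be sketched in a few lines of Python. This is an illustration under assumptions, not Foundation's implementation: the $objects array below is hand-built and only loosely follows the diagram (the second class definition sits at index 6 rather than 9, and its $classes list is invented), and the helper names classify, deref and class_name are hypothetical.

```python
import plistlib

UID = plistlib.UID  # Python's stand-in for CFKeyedArchiverUID

# Hypothetical $objects array; the list index of each entry is its UID.
objects = [
    "$null",                                                          # 0
    "Stuff",                                                          # 1
    bytes.fromhex("cafedeadbeef"),                                    # 2
    {"$class": UID(5), "someString": UID(1), "someObject": UID(4)},   # 3
    {"$class": UID(6), "someData": UID(2), "someInteger": 42},        # 4
    {"$classname": "SpecialClass",
     "$classes": ["SpecialClass", "GenericClass", "NSObject"]},       # 5
    {"$classname": "Other", "$classes": ["Other", "NSObject"]},       # 6
]


def classify(entry):
    """Label an $objects entry with one of the four kinds listed above."""
    if isinstance(entry, str) and entry == "$null":
        return "null"
    if isinstance(entry, dict):
        if "$classname" in entry:
            return "class"
        if "$class" in entry:
            return "instance"
    return "primitive"


def deref(objects, value):
    """Follow a UID to its entry in $objects; pass other values through."""
    if isinstance(value, UID):
        return objects[value.data]
    return value


def class_name(objects, instance):
    """Look up the class name of an instance dictionary via its $class UID."""
    return deref(objects, instance["$class"])["$classname"]


kinds = [classify(entry) for entry in objects]
```

For example, deref(objects, objects[3]["someString"]) follows UID 1 to the string "Stuff", while a plain value such as the integer 42 passes through unchanged.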
The $top key defines (unsurprisingly) the topmost object of the archive. Its dictionary usually has only a single key, root (a.k.a. NSKeyedArchiveRootObjectKey), whose CFKeyedArchiverUID points to the root object in the $objects array. Everything else in the graph is connected from there.
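Walking the graph from the root can be sketched as a small recursive function. The Python below is a hedged illustration, not a replacement for NSKeyedUnarchiver: the archive is hand-built, the function name describe is made up, $null is mapped to None as a simplification, and a back-reference (the hypothetical parent key) is included to show why a cycle guard is needed.

```python
import plistlib

UID = plistlib.UID  # Python's stand-in for CFKeyedArchiverUID

# Hand-built archive for illustration; entry 3 refers back to entry 1.
archive = {
    "$archiver": "NSKeyedArchiver",
    "$version": 100000,
    "$top": {"root": UID(1)},
    "$objects": [
        "$null",                                                         # 0
        {"$class": UID(4), "someString": UID(2), "someObject": UID(3)},  # 1
        "Stuff",                                                         # 2
        {"$class": UID(4), "someInteger": 42, "parent": UID(1)},         # 3
        {"$classname": "SpecialClass",
         "$classes": ["SpecialClass", "NSObject"]},                      # 4
    ],
}


def describe(archive, value, seen=()):
    """Resolve a value into plain Python data, following UIDs recursively.

    `seen` holds the UIDs on the current path so that cycles terminate.
    """
    if not isinstance(value, UID):
        return value  # inline primitive, such as an integer
    objects = archive["$objects"]
    index = value.data
    if index in seen:
        return f"<cycle to UID[{index}]>"
    entry = objects[index]
    if isinstance(entry, str) and entry == "$null":
        return None
    if isinstance(entry, dict) and "$class" in entry:
        # Replace the $class reference with the class name it points to.
        result = {"$class": objects[entry["$class"].data]["$classname"]}
        for key, child in entry.items():
            if key != "$class":
                result[key] = describe(archive, child, seen + (index,))
        return result
    return entry  # primitive stored in $objects


root = describe(archive, archive["$top"]["root"])
```

The result is a nested dictionary keyed by the archive's own key names, which is roughly the level of detail you can recover without the app-specific classes.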