Archaeology
URL Bookmarks and Security-scoping
Much of this discussion is based on reverse-engineering of file formats and frameworks, but we haven't bothered to pepper it with qualifiers. Since our reverse-engineering skills are not beyond reproach, and macOS is always changing, a grain of salt is advised. If you have corrections to any details, please do get in touch.
What Is A URL Bookmark?
As introduced in Mac OS X 10.6 (Snow Leopard), a URL bookmark is a serialization of a file: URL, together with additional data that improves the chances of that URL being usefully rebuilt later, even if the actual file has been renamed or moved in the interim.
In addition to the path itself, a bookmark contains inode and volume information, for example.
In Mac OS X 10.7 (Lion), to support the App Sandbox, security-scoped URL bookmarks were introduced. But in order to understand these, we need to take a detour into security-scoped URLs, which requires another detour into sandbox extensions.
Before diving into this, note that security-scoped bookmarks and security-scoped URLs are not the same thing — they are related and you can make one from the other, but there are valid reasons to, say, make a non-security-scoped bookmark from a security-scoped URL. So don't let the overuse of the term security-scoping trip you up.
A Detour Into Sandbox Extensions
A sandboxed process has a detailed list of capabilities that it is allowed or denied, such as being able to open specific files for reading and/or writing. Broadly, sandboxed processes are allowed to read system files (e.g. the root-level System or Library folders) but not files under your home folder (except for being able to read and write files inside their own container).
Typically, a sandboxed app gains access to a specific user file by the user selecting it in a standard macOS Open dialog. The Open dialog is controlled by a macOS service (com.apple.appkit.xpc.openAndSavePanelService.xpc), which is not sandboxed and has full access to user files. In order to transfer that access to the requesting sandboxed app — for the selected user file only — it uses a sandbox extension.
More generally, any process that has the ability to read or write a specific file might need to transfer that ability to a related (sandboxed) process, such as an XPC service that it uses to process the file in some way. The original process might've acquired that ability in various ways — whether through the Open dialog, or simply by virtue of not being sandboxed itself — but as long as it can access the file, it can transfer that ability to another process using a sandbox extension.
Essentially, a sandbox extension is a token, vended by the kernel, which allows any process possessing it to acquire a specific capability — such as being able to open a specific path as read-only or read-write. (Actually, extensions can be limited to a specific process, but these are less interesting to this discussion.) The original process asks the kernel to issue an extension, and hands the resulting token to some other (sandboxed) process, which asks the kernel to consume the extension, granting it the encapsulated capability.
All of this happens underneath the public APIs for security-scoped URLs, as we'll discuss below. The private API involved here is mainly in libsystem_sandbox.dylib, which has functions like sandbox_extension_issue_file() and sandbox_extension_consume(). There are also extensions that are not file-related, such as sandbox_extension_issue_mach(), which extends the capability to look up a Mach service by name; but only file-related extensions are relevant to this discussion. These sandbox functions are basically shims around a system call, which goes into the kernel and gets handled by Sandbox.kext.
The sandbox extension token itself is actually formatted as a string, which might look something like this:
1bfe955dde5d40a9395dd9f9687c9aabff654f7f3cb99b71b24357557f1e3377;00;00000000;00000000;00000000;0000000000000020;com.apple.app-sandbox.read-write;01;01000005;0000000000c23f8e;23;/users/randy/desktop/todo.txt
You can see that it has a capability (com.apple.app-sandbox.read-write) and a (downcased) file path (/users/randy/desktop/todo.txt). The other semicolon-delimited fields contain various information about the extension and the specific file (such as the volume and inode of the file), but that's all beyond the scope of this discussion.
The first hex-encoded value is worth mentioning, though: this is a message authentication code that authenticates the extension as valid. Specifically, it is an HMAC-SHA256, which is calculated on the remainder of the token string (from the first semicolon), using a 64-byte secret key that is randomly chosen by Sandbox.kext after startup. Obviously, the kernel will refuse to consume an extension unless this HMAC is deemed correct, per the private-to-the-kernel secret key.
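To make the token layout concrete, here is a minimal C sketch that splits a token into the leading MAC and the remainder that the MAC covers, and picks out the capability and path. It is based only on the single sample above: the ExtensionToken type and parse_extension_token() are hypothetical names, and the sketch assumes that the path is the final field, that the path contains no semicolon, and that the capability is the first field beginning with "com.apple.".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not an Apple API. */
typedef struct {
    char mac_hex[65];        /* leading HMAC-SHA256, as 64 hex digits + NUL */
    const char *signed_part; /* remainder of the token, from the first ';' */
    char capability[128];    /* e.g. "com.apple.app-sandbox.read-write" */
    const char *path;        /* assumed to be the final field */
} ExtensionToken;

/* Returns 0 on success, -1 if the token does not look like the sample. */
static int parse_extension_token(const char *token, ExtensionToken *out)
{
    const char *first = strchr(token, ';');
    const char *last = strrchr(token, ';');
    if (first == NULL || first - token != 64)
        return -1; /* the MAC must be exactly 64 hex digits */

    memcpy(out->mac_hex, token, 64);
    out->mac_hex[64] = '\0';
    out->signed_part = first; /* the kernel's HMAC covers this part */
    out->path = last + 1;
    out->capability[0] = '\0';

    /* Scan the middle fields for the first one starting with "com.apple.". */
    for (const char *p = first; p < last; ) {
        const char *start = p + 1;
        const char *end = strchr(start, ';'); /* never NULL: *last is ';' */
        size_t len = (size_t)(end - start);
        if (out->capability[0] == '\0' && len < sizeof out->capability &&
            strncmp(start, "com.apple.", 10) == 0) {
            memcpy(out->capability, start, len);
            out->capability[len] = '\0';
        }
        p = end;
    }
    return out->capability[0] != '\0' ? 0 : -1;
}
```

Verifying the MAC itself is not possible from user space, since the key never leaves the kernel; the sketch only shows which bytes are authenticated.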
The implication here is that sandbox extensions are transient: they survive the process that issued them (at least, extensions of the non-process-specific variety), but will be useless after system restart. Which is why Security-scoped bookmarks are a thing...
What Is A Security-scoped URL?
Now that we understand sandbox extensions, we can say that a security-scoped URL is simply an NSURL (or CFURL) that also carries a sandbox extension token, which grants access (read-only or read-write) to the named file path.
The sandbox extension is carried as a URL resource property named _NSURLSecuritySandboxExtensionKey, which, as you might guess from the leading underscore, is strictly private. But you can (if you're not worried about App Store review or other Apple validations) query it like this:
    NSURL* theURL;
    id value;
    [theURL getResourceValue:&value forKey:@"_NSURLSecuritySandboxExtensionKey" error:NULL];
    // value will be an NSData, which is just a UTF-8 encoded string
The public API for a security-scoped URL consists of two methods that wrap actual access to the file, like so:
    NSURL* theURL;
    if ( [theURL startAccessingSecurityScopedResource] )
    {
        // access the file here
        [theURL stopAccessingSecurityScopedResource];
    }
Basically, -startAccessingSecurityScopedResource fetches the sandbox extension from the resource property, and asks the kernel to consume it. If that works, the process uses the granted capability to read or write the file. Then -stopAccessingSecurityScopedResource is used to relinquish the capability (which is tracked in the kernel and would otherwise cause a memory leak).
Returning to Security-scoped Bookmarks
So with all that backstory, what actually is a security-scoped bookmark? You might think it is simply a URL bookmark in which the sandbox extension is saved, but it is absolutely not that, because that would only be useful until the system is restarted, at which point the extension becomes useless.
A security-scoped bookmark is a way for an app that has access to a specific file — such as by virtue of a security-scoped URL — to save that access and regain it again later, even if the app has been quit and reopened — or the system has been restarted — in the interim.
In order to create a security-scoped bookmark, an app starts with an NSURL that gives it access to the file of interest, either because that URL is security-scoped (and -startAccessingSecurityScopedResource has been sent), or because the app is not sandboxed at all.
When the app asks Foundation to make a security-scoped bookmark for the URL (using the NSURLBookmarkCreationWithSecurityScope option), the app's access is first validated, and then the request is sent to ScopedBookmarkAgent, which is the (unsandboxed) macOS service that is responsible for creating and resolving these bookmarks.
The ScopedBookmarkAgent creates a normal bookmark for the URL, but also calculates a security scope cookie, which is a SHA-256 digest that identifies the “scope” for which the bookmark should later be resolved (thus granting access to the file). We'll return to what constitutes a scope momentarily.
Later, the app holding the bookmark data asks Foundation to resolve it back into an NSURL (using the NSURLBookmarkResolutionWithSecurityScope option). This request also gets sent over to ScopedBookmarkAgent, which validates that the security scope cookie in the bookmark matches the scope of the resolution request. If the scope is valid, the agent issues a new sandbox extension for the file, and adds that as a resource property in the new NSURL. This now security-scoped URL is sent back to the app, which can use it to access the file.
The above is vague about the nature of the security scope cookie, because there are actually two forms of scoping, although one is way more common than the other...
Security Scope Cookie for App-scoped Bookmarks
Almost every security-scoped bookmark we've ever seen is of the app-scope type. These are scoped to a specific app as run by a specific user. (An app run by user X might have access to a file on that user's desktop, but this doesn't give even the same app access to that file when run by user Y.)
For an app-scoped bookmark, ScopedBookmarkAgent first calculates a crypto key from two pieces of data:
- The code signing identifier of the requesting app. This is almost always the same as the app's bundle identifier, but is fetched from the code signature directly.
- A user-specific 32-byte secret key, which is randomly chosen by ScopedBookmarkAgent and stored in your keychain. (You can find this key in Keychain Access, by searching for an item named com.apple.scopedbookmarksagent.xpc. The key is chosen the first time that a scoped bookmark is created for the user, so is quite long-lasting.)
An HMAC-SHA256 is made of the code signing identifier, using the user-specific secret as the key, to create the 32-byte crypto key.
Then, the actual security scope cookie is calculated as an HMAC-SHA256 of the bookmark data, using the above crypto key. The resulting 32-byte value is the security scope cookie, which is added to the bookmark data. (These 32 bytes are always present in the bookmark data, but they are zeroed out before calculating the HMAC to avoid any circularity.)
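The two-step derivation can be sketched in C as follows. This is our reading of the description above, not the agent's actual code. To keep the sketch runnable without a crypto library, the HMAC is passed in as a function pointer (the HmacSha256Fn typedef is hypothetical); on macOS one would presumably wrap CommonCrypto's CCHmac() with kCCHmacAlgSHA256. The placeholder_mac() function is NOT an HMAC and exists only to exercise the flow. We also assume that the identifier is hashed without its NUL terminator, and that the cookie sits at byte offset 16 of the bookmark data, per the prolog layout described under the binary format.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { kCookieOffset = 16, kCookieLength = 32 }; /* per the prolog layout */

/* Caller-supplied HMAC-SHA256 (hypothetical typedef): writes 32 bytes to mac. */
typedef void (*HmacSha256Fn)(const void *key, size_t key_len,
                             const void *data, size_t data_len,
                             uint8_t mac[32]);

/* NOT cryptographic: a stand-in so the sketch runs without a crypto library. */
static void placeholder_mac(const void *key, size_t key_len,
                            const void *data, size_t data_len, uint8_t mac[32])
{
    const uint8_t *k = key, *d = data;
    memset(mac, 0, 32);
    for (size_t i = 0; i < key_len; i++) mac[i % 32] ^= k[i];
    for (size_t i = 0; i < data_len; i++)
        mac[i % 32] = (uint8_t)(mac[i % 32] * 31u + d[i]);
}

/* Returns 0 on success, -1 on short input or allocation failure. */
static int app_scope_cookie(HmacSha256Fn hmac, const uint8_t user_secret[32],
                            const char *signing_identifier,
                            const uint8_t *bookmark, size_t bookmark_len,
                            uint8_t cookie[32])
{
    if (bookmark_len < (size_t)(kCookieOffset + kCookieLength))
        return -1;

    /* Step 1: crypto key = HMAC(key: user secret, data: signing identifier).
       Assumption: the identifier is hashed without its NUL terminator. */
    uint8_t crypto_key[32];
    hmac(user_secret, 32, signing_identifier, strlen(signing_identifier),
         crypto_key);

    /* Step 2: zero the cookie field in a copy, then MAC the whole bookmark. */
    uint8_t *copy = malloc(bookmark_len);
    if (copy == NULL)
        return -1;
    memcpy(copy, bookmark, bookmark_len);
    memset(copy + kCookieOffset, 0, kCookieLength);
    hmac(crypto_key, 32, copy, bookmark_len, cookie);
    free(copy);
    return 0;
}
```

Because the cookie field is zeroed before the second step, recomputing the cookie for a bookmark that already carries one gives the same value, which is presumably how the agent validates a bookmark at resolution time.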
Security Scope Cookie for Document-scoped Bookmarks
The other kind of security-scoped bookmark is the document-scope type. In theory, this is supposed to grant access to any process (and any user) that can access a specific document. For example, perhaps a document references some external media file, and you want that media file to be accessible to any user (and any app) that can access the document itself.
We're honestly not sure how or if this is actually used, but for completeness, we'll mention that, in this case, the equivalent crypto key is randomly chosen and attached to the document file as an extended attribute, with the name com.apple.security.private.scoped-bookmark-key. As above, this crypto key is used in an HMAC-SHA256 of the bookmark data, to yield the security scope cookie.
Assuming that the document can be shipped to another user, and that the extended attribute gets preserved, and that the referenced file can still be found by that user, the security-scoped bookmark can be resolved to provide access.
What About Non-security-scoped Bookmarks for Security-scoped URLs?
As mentioned above, a sandbox extension can be used to transfer a capability to another process, such as an XPC service. But how does one do this with the public API? This is where non-security-scoped bookmarks come in handy.
When you ask NSURL to create a non-security-scoped bookmark (i.e. omitting the NSURLBookmarkCreationWithSecurityScope option), the bookmark will contain a sandbox extension for the file, with whatever access the calling process has (i.e. read-write or read-only). This sandbox extension is newly issued at bookmark creation time, and is non-process-specific, regardless of whether the URL itself is security-scoped or if the calling process is simply not sandboxed.
When this bookmark data is sent to (say) an XPC service, and that process goes to resolve it, the sandbox extension will be preserved in the resulting NSURL, and the service can now use it like any other security-scoped URL. Of course, if the service needs to persist access past restart, it would need to make a new, security-scoped bookmark, but that's no different from any other sandboxed process.
Note that it won't work to simply use NSKeyedArchiver on the NSURL, because the sandbox extension resource property will not be preserved. Nor will it work to send a security-scoped bookmark, because the receiving process will have a different code signing identifier and thus won't be allowed to resolve the bookmark, even as the same user. A non-security-scoped bookmark for a security-scoped URL is the right way to do this, even though the overuse of the term “security-scoped” makes it sound dubious.
The Bookmark Binary Format
Based on our reverse-engineering, the bookmark binary format has the following structure.
We inferred this by examining bookmark files and by some amount of reversing of CoreFoundation, CoreServicesInternal and /System/Library/CoreServices/ScopedBookmarkAgent, mostly on macOS 10.15. The implementation may have changed since then, but as far as we know, this is still accurate.
The bookmark data starts with a fixed-length prolog in this form:
    struct CFBookmarkProlog
    {
        uint32_t _magic;           // "book" as char[4] or 0x6b6f6f62 as Little Endian uint32
        uint32_t _bookmarkLength;  // total length of the bookmark data, including prolog
        uint32_t _version;         // 0x10040000, at least as of macOS 10.15.7
        uint32_t _prologLength;    // size of entire prolog, including cookie, currently 0x30
        uint8_t  _securityScopeCookie[ CC_SHA256_DIGEST_LENGTH ];
    };
All of the integers here appear to be strictly Little Endian.
The _securityScopeCookie field is used as discussed above; if the bookmark is not security-scoped, this will be all zeroes.
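As a rough illustration of reading the prolog, the following C sketch decodes the four Little Endian integers byte by byte (so it does not depend on host endianness or struct packing) and reports whether the cookie is non-zero. The BookmarkProlog type and the function names are ours, and the bounds checks assume the 0x30-byte prolog described above; a different bookmark version might need different limits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of CFBookmarkProlog, for this sketch only. */
typedef struct {
    uint32_t bookmark_length;
    uint32_t version;
    uint32_t prolog_length;
    uint8_t  cookie[32];
    int      is_security_scoped; /* 1 if any cookie byte is non-zero */
} BookmarkProlog;

/* Read a Little Endian uint32 regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the data does not start with a usable prolog. */
static int parse_bookmark_prolog(const uint8_t *data, size_t len,
                                 BookmarkProlog *out)
{
    if (len < 48 || memcmp(data, "book", 4) != 0)
        return -1;

    out->bookmark_length = read_le32(data + 4);
    out->version = read_le32(data + 8);
    out->prolog_length = read_le32(data + 12);
    if (out->prolog_length < 48 || out->prolog_length > len)
        return -1;

    memcpy(out->cookie, data + 16, 32);
    out->is_security_scoped = 0;
    for (size_t i = 0; i < sizeof out->cookie; i++) {
        if (out->cookie[i] != 0)
            out->is_security_scoped = 1;
    }
    return 0;
}
```

The payload then begins at data + prolog_length, which is the base for every payload-relative offset.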
The prolog is followed by an offset (in bytes from the end of the prolog) to the first CFBookmarkTOC. A number of other references are encoded as payload-relative offsets, which also means a number of bytes from the end of the prolog, so we call this point the CFBookmarkPayload:
    struct CFBookmarkPayload
    {
        uint32_t _offsetOfFirstTOC;  // payload-relative offset to first CFBookmarkTOC
    };
Next come a variable number of CFBookmarkDataItems, each with a type and size:
    struct CFBookmarkDataItem
    {
        uint32_t _dataSize;            // i.e. byte length of _data[]
        CFBookmarkDataType _dataType;  // see below
        uint8_t _data[ _dataSize ];    // the data (but it can be zero in size for some types)
    } __attribute__( ( aligned( 4 ) ) );  // plus zero padding (not included in _dataSize) to dword-align the next data item
Note that the _dataSize can be zero for certain types. Each CFBookmarkDataItem is padded to 32-bit alignment, but the specified _dataSize does not include any such padding.
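The padding rule is easiest to see in code. This C sketch reads one item header at a payload-relative offset and computes where the next item would start, rounding the data size up to a multiple of four. The BookmarkItem type and read_data_item() are hypothetical names; whether items are always laid out back to back is our assumption, since in practice they are reached through offsets stored elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of one CFBookmarkDataItem. */
typedef struct {
    uint32_t data_size;  /* unpadded byte length of the data */
    uint32_t data_type;  /* one of the CFBookmarkDataType values */
    const uint8_t *data; /* points into the payload buffer */
} BookmarkItem;

/* Read a Little Endian uint32 regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads the item at a payload-relative offset. On success returns 0 and sets
   *next_offset to the first 4-byte-aligned offset after the item's data. */
static int read_data_item(const uint8_t *payload, size_t payload_len,
                          size_t offset, BookmarkItem *item,
                          size_t *next_offset)
{
    if (offset > payload_len || payload_len - offset < 8)
        return -1; /* no room for the size and type fields */

    uint32_t size = read_le32(payload + offset);
    if (size > payload_len - offset - 8)
        return -1; /* data would run past the end of the payload */

    item->data_size = size;
    item->data_type = read_le32(payload + offset + 4);
    item->data = payload + offset + 8;

    /* 8-byte header, then the data rounded up to a multiple of 4 bytes. */
    *next_offset = offset + 8 + (((size_t)size + 3) & ~(size_t)3);
    return 0;
}
```

For example, a 5-byte string item at offset 4 occupies 8 header bytes plus 8 padded data bytes, so the following item would start at offset 20.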
The _dataType will be one of the following, with the implied contents of _data for each shown below:
    typedef enum : uint32_t
    {
                                                // CFBookmarkDataItem->_data will be:
        CFBookmarkDataTypeString      = 0x101,  // UTF-8 string (not NULL-terminated but length is _dataSize)
        CFBookmarkDataTypeData        = 0x201,  // simple data buffer, e.g. becomes a CFData
        CFBookmarkDataTypeNumber      = 0x300,  // general numeric type, where subtype corresponds to the CFNumberGetType(), e.g.:
        CFBookmarkDataTypeUInt32      = 0x303,  // _data from CFNumberGetValue() with kCFNumberSInt32Type
        CFBookmarkDataTypeUInt64      = 0x304,  // _data from CFNumberGetValue() with kCFNumberSInt64Type
        CFBookmarkDataTypeDate        = 0x400,  // CFDateGetAbsoluteTime(), swapped with CFConvertDoubleHostToSwapped()
        CFBookmarkDataTypeBoolFalse   = 0x500,  // nothing (_dataSize==0)
        CFBookmarkDataTypeBoolTrue    = 0x501,  // nothing (_dataSize==0)
        CFBookmarkDataTypeArray       = 0x601,  // ( _dataSize / sizeof( uint32_t ) ) payload-relative offsets to CFBookmarkDataItems
        CFBookmarkDataTypeDictionary  = 0x701,  // ( _dataSize / ( 2 * sizeof( uint32_t ) ) ) pairs of payload-relative offsets to CFBookmarkDataItems,
                                                // with keys and values alternating
        CFBookmarkDataTypeUUID        = 0x801,  // bytes of a UUID (probably, not seen in practice)
        CFBookmarkDataTypeURL         = 0x901,  // the URL as a UTF-8 string
        CFBookmarkDataTypeRelativeURL = 0x902,  // 2 payload-relative offsets to CFBookmarkDataItems, first a CFBookmarkDataTypeURL for the base URL,
                                                // second a CFBookmarkDataTypeString for the relative path (but not seen in practice)
    } CFBookmarkDataType;
Some types are defined such that byte 1 is a primary type (e.g. number or URL) and byte 0 is a subtype (e.g. number type, absolute or relative URL).
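In C terms, that split might be expressed as below. The function names are ours, and the interpretation of the two bytes is only the pattern we observed in the values listed above.

```c
#include <stdint.h>

/* Byte 1 of the type: the primary type (e.g. 0x03 = number, 0x09 = URL). */
static uint32_t bookmark_primary_type(uint32_t data_type)
{
    return (data_type >> 8) & 0xffu;
}

/* Byte 0 of the type: the subtype (e.g. the CFNumberType, or 1 = absolute
   and 2 = relative for URLs). */
static uint32_t bookmark_subtype(uint32_t data_type)
{
    return data_type & 0xffu;
}
```

So 0x304 splits into primary type 3 (number) and subtype 4, which matches kCFNumberSInt64Type.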
These CFBookmarkDataItems constitute the values of the bookmark data. These are then referenced by a table of contents (or possibly multiple TOCs). The TOC is essentially a set of key-value pairs, with the values (and possibly some keys) being defined in terms of CFBookmarkDataItems.
The first CFBookmarkTOC is found via CFBookmarkPayload->_offsetOfFirstTOC, as noted above.
Each TOC starts with this header:
    struct CFBookmarkTOC
    {
        uint32_t _unknown1;
        uint32_t _sentinel;         // always 0xfffffffe
        uint32_t _unknown2;
        uint32_t _offsetOfNextTOC;  // payload-relative offset of next TOC, or zero if none
        uint32_t _tocItemCount;     // number of CFBookmarkTOCItems that follow
    };
The CFBookmarkTOC is followed by _tocItemCount of CFBookmarkTOCItems, each of which is basically a key-value pair:
    struct CFBookmarkTOCItem
    {
        uint32_t _itemKey;          // see below
        uint32_t _itemValueOffset;  // payload-relative offset to the CFBookmarkDataItem for this value
        uint32_t _unknown;          // possibly flags? generally zero
    };
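Putting the TOC header and items together, a lookup might be sketched in C like this. The function find_toc_value() is a hypothetical helper: it follows the chain of TOCs via _offsetOfNextTOC and returns the value offset for the first item whose key matches. The 20-byte header and 12-byte item sizes come from the structs above; the cap of 64 TOCs is an arbitrary guard of ours against malformed data that loops.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a Little Endian uint32 regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Searches every TOC in the chain for `key`.
   Returns 0 and sets *value_offset if found, 1 if not found,
   and -1 if the data looks malformed. */
static int find_toc_value(const uint8_t *payload, size_t len,
                          uint32_t toc_offset, uint32_t key,
                          uint32_t *value_offset)
{
    for (int hops = 0; toc_offset != 0 && hops < 64; hops++) {
        if (toc_offset > len || len - toc_offset < 20)
            return -1; /* no room for the TOC header */

        const uint8_t *toc = payload + toc_offset;
        if (read_le32(toc + 4) != 0xfffffffeu)
            return -1; /* _sentinel mismatch */

        uint32_t next = read_le32(toc + 12);  /* _offsetOfNextTOC */
        uint32_t count = read_le32(toc + 16); /* _tocItemCount */
        if (count > (len - toc_offset - 20) / 12)
            return -1; /* items would run past the end */

        for (uint32_t i = 0; i < count; i++) {
            const uint8_t *item = toc + 20 + (size_t)i * 12;
            if (read_le32(item) == key) {
                *value_offset = read_le32(item + 4); /* _itemValueOffset */
                return 0;
            }
        }
        toc_offset = next;
    }
    return 1;
}
```

A caller would pass the payload's _offsetOfFirstTOC as toc_offset, then read the CFBookmarkDataItem found at the returned value offset.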
The _itemKey here can take one of two forms. If the high bit is clear, the key is an enumerated value: we've deduced a subset of these values as the CFBookmarkTOCItemType below. Alternatively, if the high bit is set, ( _itemKey & 0x7fffffff ) is a payload-relative offset to a CFBookmarkDataItem of type CFBookmarkDataTypeString. This string key form seems to be used for arbitrary CFURL properties of one kind or another.
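A small C sketch of telling the two key forms apart, and of fetching the string for the second form, follows. The helper toc_string_key() is a hypothetical name; it checks that the referenced item really has the string type (0x101) before handing back a pointer, and the returned string is not NUL-terminated, so the length must be used.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a Little Endian uint32 regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if `key` is an enumerated value (high bit clear),
   0 if it is a string key (and sets *str and *str_len),
   and -1 if the referenced item is missing or not a string. */
static int toc_string_key(const uint8_t *payload, size_t len, uint32_t key,
                          const char **str, uint32_t *str_len)
{
    if ((key & 0x80000000u) == 0)
        return 1;

    uint32_t offset = key & 0x7fffffffu; /* payload-relative item offset */
    if (offset > len || len - offset < 8)
        return -1;

    uint32_t size = read_le32(payload + offset);
    uint32_t type = read_le32(payload + offset + 4);
    if (type != 0x101u || size > len - offset - 8)
        return -1; /* must be a CFBookmarkDataTypeString that fits */

    *str = (const char *)(payload + offset + 8);
    *str_len = size;
    return 0;
}
```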
Finally, here is an undoubtedly incomplete sample of enumerated _itemKey values:
    typedef enum : uint32_t
    {
        // Attributes of the referenced file itself
        CFBookmarkTOCItemTypePathComponents     = 0x1004,  // array of strings for each component of the URL path
        CFBookmarkTOCItemTypeInodeComponents    = 0x1005,  // array of integers for the inodes corresponding to each path component
        CFBookmarkTOCItemTypePropFlags          = 0x1010,  // data from _CFURLGetResourcePropertyFlags()
        CFBookmarkTOCItemTypeCreateDate         = 0x1040,  // date file at URL created

        // Attributes of the volume that the file was on at bookmark creation time
        CFBookmarkTOCItemTypeVolumePath         = 0x2002,  // path from CFURLCopyFileSystemPath() on kCFURLVolumeURLKey, e.g. "/"
        CFBookmarkTOCItemTypeVolumeURL          = 0x2005,  // kCFURLVolumeURLKey
        CFBookmarkTOCItemTypeVolumeName         = 0x2010,  // kCFURLVolumeNameKey (the visible one, not the APFS Data volume name)
        CFBookmarkTOCItemTypeVolumeUUID         = 0x2011,  // kCFURLVolumeUUIDStringKey (but as a string, *not* as a UUID type)
        CFBookmarkTOCItemTypeVolumeCapacity     = 0x2012,  // kCFURLVolumeTotalCapacityKey as integer
        CFBookmarkTOCItemTypeVolumeCreateDate   = 0x2013,  // creation date of the *volume*, kCFURLCreationDateKey
        CFBookmarkTOCItemTypeVolumePropFlags    = 0x2020,  // data from _CFURLGetVolumePropertyFlags()
        CFBookmarkTOCItemTypeVolumeStartup      = 0x2030,  // true if boot volume (at least at bookmark creation time)

        // Attributes of the user for whom the bookmark was created
        CFBookmarkTOCItemTypeUserHomeDepth      = 0xc001,  // count of path components under home directory
        CFBookmarkTOCItemTypeUserName           = 0xc011,  // CFCopyUserName()
        CFBookmarkTOCItemTypeUserID             = 0xc012,  // _CFGetEUID(), so really the euid, but mostly the same

        // Attributes of bookmark creation itself
        CFBookmarkTOCItemTypeCreateOptions      = 0xd010,  // the original CFURLBookmarkCreationOptions

        // Other attributes
        CFBookmarkTOCItemTypeRWSandboxExtension = 0xf080,  // a re-issued non-pid-specific com.apple.app-sandbox.read-write
        CFBookmarkTOCItemTypeROSandboxExtension = 0xf081,  // a re-issued non-pid-specific com.apple.app-sandbox.read
    } CFBookmarkTOCItemType;
What About macOS Alias Files?
As far as we know, macOS aliases are not a form of URL bookmark. They also have book as their first 4 bytes, but the rest of the prolog doesn't match (though, oddly, the third group of 4 bytes is mark). Perhaps there is some relationship here, but we haven't found it, and it definitely doesn't match the above binary format.
Of course, you can use a bookmark to create an alias: use the NSURLBookmarkCreationSuitableForBookmarkFile option to create a bookmark, and then feed that into +[NSURL writeBookmarkData:toURL:options:error:].