Bringing Explicit Pipeline Caching Control to Vulkan

August 27, 2024 by Shahbaz Youssefi - Google

The Vulkan® Working Group has released the VK_KHR_pipeline_binary extension, enabling direct retrieval of binary data associated with individual pipelines, bypassing the VkPipelineCache mechanism, and enabling applications to explicitly manage pipeline caching.

VkPipelineCache objects were designed to enable a Vulkan driver to reuse blobs of state or shader code between different pipelines. Originally, the idea was that the driver would know best which parts of state could be reused, and applications only needed to manage storage and threading, simplifying developer code.

Over time, however, VkPipelineCache objects proved to be too opaque, prompting the Vulkan Working Group to release a number of extensions that give applications more control over them. The current capabilities of VkPipelineCache objects satisfy many applications, but fall short in more advanced use cases.

In particular, VkPipelineCache objects can be difficult to work with in the following scenarios:

Bounded cache size and trimming: The VkPipelineCache API provides no control over the lifetime of the binary objects that it contains. An application wanting to implement an LRU cache, for example, has a hard time using VkPipelineCache objects.
Integration with the application cache: Some applications maintain a cache of VkPipeline objects. The VkPipelineCache API makes it impossible to efficiently associate the cached binary objects within a VkPipelineCache object with the application’s own cache entries.

What’s more, most drivers maintain an internal cache of pipeline-derived binary objects. In some cases, it would be beneficial for the application to directly interact with that internal cache, especially on some specialized platforms as explained below.

When considering how to address these advanced needs, the Vulkan Working Group decided that making the existing VkPipelineCache API more complex by adding many more knobs is not a sustainable solution. Instead, the new VK_KHR_pipeline_binary extension introduces a clean new approach that provides applications with access to binary blobs and the information necessary for optimal caching, while smoothly integrating with the application’s own caching mechanisms.

It’s worth noting that the VK_EXT_shader_object extension already includes analogous functionality to VK_KHR_pipeline_binary. The two extensions were worked on concurrently to provide a universally available solution, including devices where the VK_EXT_shader_object extension cannot yet be supported.

Applications that do not need the advanced functionality of the new VK_KHR_pipeline_binary extension can continue to use VkPipelineCache objects for their simplicity and optimized implementation. But developers who are not satisfied with the VkPipelineCache API should read on to learn more about this powerful new approach.

Caching With VK_KHR_pipeline_binary

To understand VK_KHR_pipeline_binary, let’s first see how drivers deal with caching. There is the VkPipelineCache object, of course, but some Vulkan implementations (whether driver or layer) also maintain an internal cache of blobs generated from pipelines!

This internal cache increases performance for applications that may not be efficiently using VkPipelineCache objects, allows the driver to share blobs between seemingly unrelated pipelines (even if the same VkPipelineCache object is not provided when creating them), and enables pre-population of the cache on some specialized platforms.

Either way, there are two parts to caching: generating blobs and caching them! Let’s investigate blob generation first.

When the driver creates a pipeline, it may generate multiple binary blobs. Some are a function of a stage’s shader code and pipeline static state, some additionally take the interface of the surrounding shader stages into account, while others are purely a function of the pipeline static state. For example, in a pipeline with only vertex and fragment shaders, with static vertex input and blend states, the following blobs may be derived.

Diagram illustrating derivation of objects

Note that when creating complete pipelines, some of these blobs may be merged (for example by using state blobs to patch the shader blobs), but that is not relevant to this discussion.

Many pipelines may be using the same vertex shader for example, so Blob2 in the example above could easily be shared between many pipelines. Many pipelines most likely share the same vertex input and fragment output state, so Blob1 and Blob4 are even more likely to be shared between many pipelines. The above is the reason why the VK_EXT_graphics_pipeline_library and VK_EXT_shader_object extensions work as well as they do.

Now generating the blobs is one thing, but for caching to be effective, the driver has to be able to look blobs up in the cache before generating them! For that, a key needs to be generated for each blob. As can be observed in the above, the key for each blob depends on different bits of the pipeline CreateInfo struct.
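To make the idea of per-blob keys concrete, here is a toy sketch in C. It is not how any driver computes keys: the PipelineDesc struct, its four fields, the blobN_key() helpers and the choice of FNV-1a hashing are all assumptions made for illustration, and the mapping of Blob3 to the fragment shader is inferred from the surrounding text. The point is only that when each key hashes just the state its blob depends on, two pipelines that differ in their fragment shader still produce identical keys for the other three blobs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified pipeline description. Each field stands
 * in for a whole chunk of the real pipeline CreateInfo. */
typedef struct PipelineDesc {
    uint32_t vertex_input_state;  /* feeds Blob1 */
    uint32_t vertex_shader;       /* feeds Blob2 */
    uint32_t fragment_shader;     /* feeds Blob3 */
    uint32_t blend_state;         /* feeds Blob4 */
} PipelineDesc;

#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

/* 64-bit FNV-1a over a byte range, continuing from hash `h`. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
    assert(data != NULL || len == 0);
    const unsigned char *p = data;
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Hashes a blob tag plus only the state that this blob depends on. */
static uint64_t blob_key(uint8_t blob_tag, uint32_t state)
{
    uint64_t h = fnv1a(FNV_OFFSET, &blob_tag, sizeof blob_tag);
    return fnv1a(h, &state, sizeof state);
}

uint64_t blob1_key(const PipelineDesc *d) { return blob_key(1, d->vertex_input_state); }
uint64_t blob2_key(const PipelineDesc *d) { return blob_key(2, d->vertex_shader); }
uint64_t blob3_key(const PipelineDesc *d) { return blob_key(3, d->fragment_shader); }
uint64_t blob4_key(const PipelineDesc *d) { return blob_key(4, d->blend_state); }
```

A key computed from the entire description would differ between the two pipelines, and every blob would miss in the cache even though three of the four are shareable.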

So, the most important aspect of this extension is in fact retrieving the right keys as efficiently as possible. This blog post gives an overview of the API, but please refer to the original proposal document for more details.

To obtain the key and data for each blob, a pipeline should be created with the VK_PIPELINE_CREATE_2_CAPTURE_DATA_BIT_KHR flag. Then, vkCreatePipelineBinariesKHR() can be used to create VkPipelineBinaryKHR objects (encapsulating a blob) out of the pipeline, and vkGetPipelineBinaryDataKHR() can be used to retrieve their keys and contents. In this case, VkPipelineBinaryCreateInfoKHR::pipeline is used when creating VkPipelineBinaryKHR objects.

VkPipelineBinaryKHR diagram illustrating key and data from blob 1

The application can then store these key/data pairs in any way it prefers. Some applications may already create a cache that’s keyed by the pipeline CreateInfo (or information from which the CreateInfo is derived), so they could associate the retrieved blobs with that same info. In that case, it is important to note that if different pipelines generate the same blob, those blobs may be associated with the same key and the application should deduplicate the blobs accordingly.
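As an illustration of that deduplication, here is one possible application-side store, sketched in C without any Vulkan dependency. Everything in it is an assumption: BlobKey simply mirrors the keySize and key fields of VkPipelineBinaryKeyKHR, and the BlobStore type, its linear search and its reference count are hypothetical design choices that a real application would likely replace with a hash map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KEY_MAX 32  /* matches VK_MAX_PIPELINE_BINARY_KEY_SIZE_KHR */

typedef struct BlobKey {
    uint32_t size;
    uint8_t  bytes[KEY_MAX];
} BlobKey;

typedef struct BlobEntry {
    BlobKey  key;
    size_t   data_size;
    void    *data;   /* owned copy of the blob contents */
    uint32_t refs;   /* number of pipelines that reported this blob */
} BlobEntry;

typedef struct BlobStore {
    BlobEntry *entries;
    size_t     count;
    size_t     capacity;
} BlobStore;

static bool key_equal(const BlobKey *a, const BlobKey *b)
{
    return a->size == b->size && memcmp(a->bytes, b->bytes, a->size) == 0;
}

/* Stores a key/data pair unless the key is already present, in which case
 * only the reference count is bumped. Returns the entry index, or -1 if
 * memory allocation fails. */
long blob_store_put(BlobStore *s, const BlobKey *key,
                    const void *data, size_t size)
{
    assert(key->size <= KEY_MAX);
    for (size_t i = 0; i < s->count; ++i) {
        if (key_equal(&s->entries[i].key, key)) {
            s->entries[i].refs++;
            return (long)i;
        }
    }
    if (s->count == s->capacity) {
        size_t cap = s->capacity ? s->capacity * 2 : 8;
        BlobEntry *grown = realloc(s->entries, cap * sizeof *grown);
        if (!grown)
            return -1;
        s->entries = grown;
        s->capacity = cap;
    }
    void *copy = malloc(size ? size : 1);
    if (!copy)
        return -1;
    if (size)
        memcpy(copy, data, size);

    BlobEntry *e = &s->entries[s->count];
    e->key = *key;
    e->data_size = size;
    e->data = copy;
    e->refs = 1;
    return (long)s->count++;
}
```

An application cache entry for a pipeline would then hold a list of entry indices (or keys) rather than its own copies of the blobs.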

The VK_KHR_pipeline_binary extension provides a convenience function for applications that don’t already have their own cache based on the pipeline CreateInfo. The vkGetPipelineKeyKHR() function will generate a key for the pipeline based on its CreateInfo, so the blobs can be associated with that. This is completely optional, but bear in mind that the driver-generated key would skip state that does not affect the binary blobs and so may be a better choice than an application-generated key.

On the next run of the application, where the cache is warm and blobs are retrieved from persistent storage, the application can attempt to create the pipeline purely from its blobs. That can be done by chaining VkPipelineBinaryInfoKHR to the pipeline CreateInfo, in which case the shader modules can be omitted from the CreateInfo (which implies that the application wouldn’t need to load the SPIR-V at all). In this case, VkPipelineBinaryCreateInfoKHR::pKeysAndDataInfo is used when creating VkPipelineBinaryKHR objects.

VkPipelineBinaryKHR diagram illustrating key and data into blob 1

And that’s about it: the application is free to manage these blobs as desired. A summary of the above can be found in the diagrams below.

Creating a pipeline and retrieving binaries:

Complex diagram illustrating creation of pipeline and retrieving binaries

Creating a pipeline from binaries in the cache:

Complex diagram illustrating creation of pipeline from binaries in the cache
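Because the application now owns every key/data pair, the bounded-size LRU trimming mentioned at the start of this post becomes ordinary bookkeeping. The sketch below is one hypothetical way to do it in C: the LruCache type, the fixed slot count and the byte budget are all assumptions, keys shorter than 32 bytes are assumed to be zero-padded, and the blob contents themselves are assumed to live elsewhere (on disk, for instance), with only their sizes tracked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LRU_SLOTS 64
#define KEY_BYTES 32  /* matches VK_MAX_PIPELINE_BINARY_KEY_SIZE_KHR */

typedef struct LruSlot {
    uint8_t  key[KEY_BYTES];
    size_t   data_size;
    uint64_t last_used;  /* 0 marks an empty slot */
} LruSlot;

typedef struct LruCache {
    LruSlot  slots[LRU_SLOTS];
    size_t   bytes_used;
    size_t   byte_budget;
    uint64_t clock;
} LruCache;

/* Returns 1 and refreshes the entry on a hit, 0 on a miss. */
int lru_touch(LruCache *c, const uint8_t *key)
{
    assert(c && key);
    for (size_t i = 0; i < LRU_SLOTS; ++i) {
        LruSlot *s = &c->slots[i];
        if (s->last_used && memcmp(s->key, key, KEY_BYTES) == 0) {
            s->last_used = ++c->clock;
            return 1;
        }
    }
    return 0;
}

/* Removes the least recently used entry, if there is one. */
static void lru_evict_oldest(LruCache *c)
{
    LruSlot *oldest = NULL;
    for (size_t i = 0; i < LRU_SLOTS; ++i) {
        LruSlot *s = &c->slots[i];
        if (s->last_used && (!oldest || s->last_used < oldest->last_used))
            oldest = s;
    }
    if (oldest) {
        c->bytes_used -= oldest->data_size;
        memset(oldest, 0, sizeof *oldest);
    }
}

/* Records a blob of `size` bytes, evicting old entries until it fits.
 * Returns the number of evictions, or -1 if the blob exceeds the budget. */
int lru_insert(LruCache *c, const uint8_t *key, size_t size)
{
    assert(c && key);
    if (size > c->byte_budget)
        return -1;
    if (lru_touch(c, key))
        return 0;  /* already cached */

    int evicted = 0;
    for (;;) {
        LruSlot *slot = NULL;
        for (size_t i = 0; i < LRU_SLOTS && !slot; ++i)
            if (!c->slots[i].last_used)
                slot = &c->slots[i];
        if (slot && c->bytes_used + size <= c->byte_budget) {
            memcpy(slot->key, key, KEY_BYTES);
            slot->data_size = size;
            slot->last_used = ++c->clock;
            c->bytes_used += size;
            return evicted;
        }
        lru_evict_oldest(c);
        ++evicted;
    }
}
```

A real implementation would also delete the evicted blob from storage and would likely use a hash map instead of a linear scan.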

Validity of Blobs

The binary blobs retrieved from the pipeline won’t last forever. Driver updates for example can result in changes to the contents of the blobs, or in some cases significantly change how the pipeline is split into blobs. Additionally, the blob contents may differ based on globally enabled Vulkan features such as robustness or protected memory.

To determine whether a binary blob is valid for the device, a global pipeline binary key can be retrieved. As long as that global key matches the global key of the device from which the pipeline binary was retrieved, that pipeline binary is valid and can be used to create pipelines.

The global key can be retrieved simply by calling vkGetPipelineKeyKHR() without providing a pipeline CreateInfo, that is, with pPipelineCreateInfo set to NULL.
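On the application side, the validity check boils down to a byte comparison at startup. The C sketch below assumes a hypothetical on-disk cache whose header records the global key it was written under. The GlobalKey and CacheFileHeader types are illustrative only, and the current key is assumed to have been obtained from vkGetPipelineKeyKHR() as just described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GLOBAL_KEY_MAX 32  /* matches VK_MAX_PIPELINE_BINARY_KEY_SIZE_KHR */

typedef struct GlobalKey {
    uint32_t size;
    uint8_t  bytes[GLOBAL_KEY_MAX];
} GlobalKey;

/* Hypothetical header of the application's persisted blob cache. */
typedef struct CacheFileHeader {
    GlobalKey global_key;  /* global key of the device that produced the blobs */
    uint32_t  blob_count;
} CacheFileHeader;

/* True if the stored blobs were produced under the current global key. */
bool cache_usable(const CacheFileHeader *h, const GlobalKey *current)
{
    assert(current->size <= GLOBAL_KEY_MAX);
    return h->global_key.size == current->size &&
           memcmp(h->global_key.bytes, current->bytes, current->size) == 0;
}

/* Keeps the cache if it is still valid; otherwise drops every stored blob
 * and re-stamps the header. Returns the number of usable blobs. */
uint32_t validate_cache(CacheFileHeader *h, const GlobalKey *current)
{
    if (!cache_usable(h, current)) {
        h->global_key = *current;
        h->blob_count = 0;
    }
    return h->blob_count;
}
```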

The Driver’s Internal Cache

With the VK_KHR_pipeline_binary extension, an application is able to efficiently reproduce the driver’s internal cache while simultaneously integrating it with its own cache. However, there are certain situations where the existence of an internal cache in an implementation could still be preferred.

One benefit of exposing the internal cache as done by this extension is that applications are able to asynchronously create binary objects ahead of time, effectively pulling in the cache contents from disk, and avoid micro-stutters during pipeline creation which would have otherwise had to perform disk I/O. Additionally, this allows an application to know ahead of time which pipeline binaries are missing and would later require compilation.

Specialized Platforms and Internal Cache Properties

Specialized platforms may provide additional capabilities that make the internal cache compelling. For example, Steam has an awesome feature for Vulkan games where the pipeline cache contents are distributed between clients. That means that even the first run of a game can observe and benefit from a warm cache! In that case, the application saves significant pipeline creation time by using the internal cache of the implementation (which is a Steam-provided Vulkan layer).

A number of properties are exposed by the VK_KHR_pipeline_binary extension for this purpose:

pipelineBinaryInternalCache declares that an internal cache exists. Applications are actually able to retrieve blobs from this internal cache directly, without first creating a pipeline. In this case, VkPipelineBinaryCreateInfoKHR::pPipelineCreateInfo is used when creating VkPipelineBinaryKHR objects. This operation may fail if the needed blobs are not in the cache.
pipelineBinaryInternalCacheControl indicates that the internal cache can be disabled. It is wasteful for both the application and the implementation to maintain effectively the same cache, so the application can disable the driver’s internal cache if this property is true. That can be achieved by chaining VkDevicePipelineBinaryInternalCacheControlKHR to VkDeviceCreateInfo.
pipelineBinaryPrefersInternalCache indicates whether the Vulkan implementation prefers that the application uses the implementation’s internal cache rather than its own cache.
pipelineBinaryPrecompiledInternalCache indicates that the internal cache may contain binaries that the application has never produced before (as described in the Steam feature above). In that case, applications are encouraged to try to retrieve the blobs from the internal cache.

On specialized platforms with an internal binary cache, the application should prefer using the internal cache due to the additional features provided by the platform as indicated by pipelineBinaryPrefersInternalCache. Otherwise, if the application is maintaining the cache, it should disable the implementation’s internal cache if possible to avoid duplication of work. If the application is not maintaining the cache, it may choose to benefit from using the implementation’s internal cache.
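The guidance in the previous paragraph can be written down as a small decision function. The C sketch below is only one reading of that guidance: CacheProps is a plain stand-in for the four properties listed above, and CachePlan and choose_cache_plan() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the properties reported by the implementation. */
typedef struct CacheProps {
    bool internal_cache;              /* pipelineBinaryInternalCache */
    bool internal_cache_control;      /* pipelineBinaryInternalCacheControl */
    bool prefers_internal_cache;      /* pipelineBinaryPrefersInternalCache */
    bool precompiled_internal_cache;  /* pipelineBinaryPrecompiledInternalCache */
} CacheProps;

typedef struct CachePlan {
    bool use_internal_cache;      /* retrieve blobs from the implementation */
    bool use_app_cache;           /* store and load blobs in the application */
    bool disable_internal_cache;  /* request it at device creation */
} CachePlan;

CachePlan choose_cache_plan(const CacheProps *p, bool app_has_cache)
{
    assert(p != NULL);
    CachePlan plan = { false, false, false };

    /* Specialized platform: defer to the implementation's cache. */
    if (p->internal_cache && p->prefers_internal_cache) {
        plan.use_internal_cache = true;
        return plan;
    }
    /* The application maintains the cache: avoid duplicating the work. */
    if (app_has_cache) {
        plan.use_app_cache = true;
        plan.disable_internal_cache = p->internal_cache_control;
        return plan;
    }
    /* No application cache: benefit from the internal one if it exists. */
    plan.use_internal_cache = p->internal_cache;
    return plan;
}
```

The precompiled_internal_cache field is carried along but not consulted here. An application that keeps its own cache might still use it as a hint to try the internal cache first when its own lookup misses.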

Conclusion

This extension was several years in the making, with heroic efforts from multiple Vulkan Working Group members to mold it into a form that empowers developers to solve their problems without sacrificing application performance. We are excited to share the result of that work with you in the form of the VK_KHR_pipeline_binary extension and curious to see what the community builds around it and how applications benefit from it.

We are always seeking and listening to feedback. Please do not hesitate to start a discussion on Vulkan Discord, or in any of the support forums available at vulkan.org, and you’re invited to add your comments below.