It also behaves like a microservice in that it exposes REST endpoints. These endpoints support backend actions that many workflows use, as well as automated use cases in which users and applications do not deal directly with files and folders. The REST endpoints and the POSIX interface can coexist in any Netflix Drive instance; they are not mutually exclusive.
We built Netflix Drive as a generic framework so that users can plug in different types of data and metadata stores. For example, you could have Netflix Drive with DynamoDB as the metadata-store backend and S3 as the data-store backend, or MongoDB as the metadata store and Ceph Storage as the data store. For a more detailed presentation of this framework, you can watch the full video presentation.
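As a rough illustration of what that pluggability implies, the sketch below shows the kind of abstract store interfaces a backend might implement; the class and method names are our own invention, not the actual Netflix Drive interfaces.

```python
from abc import ABC, abstractmethod

class DataStore(ABC):
    """Hypothetical interface a data-store backend (S3, Ceph, ...) would implement."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class MetadataStore(ABC):
    """Hypothetical interface a metadata-store backend (DynamoDB, MongoDB, ...) would implement."""

    @abstractmethod
    def upsert(self, path: str, attrs: dict) -> None: ...

    @abstractmethod
    def lookup(self, path: str) -> dict: ...
```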
Netflix is, in general, pioneering the concept of an entertainment studio in the cloud. The idea is to allow artists to work and collaborate across the world. To do so, Netflix needs to provide a distributed, scalable, and performant platform infrastructure.
From the starting point of ingestion, when cameras record video (produce the data), until the data makes its way to movies and shows, these assets get tagged with a variety of metadata by different systems, based on the workflow of the creative process.
At the edge, where artists work with assets, the artists and their applications expect an interface that allows seamless access to these files and folders. This easy workflow is not restricted only to artists, but extends to the studio. A great example is the asset transformations that happen during the rendering of content, which uses Netflix Drive.
Studio workflows need to move assets across various stages of creative iterations. At each stage, an asset gets tagged with new metadata. We needed a system that could support the addition of different forms of metadata to the data.
We also need levels of dynamic access control that can change in each stage, so that the platform projects only a certain subset of assets to certain applications, users, or workflows. We investigated AWS Storage Gateway, but its performance and security aspects did not meet our requirements.
We came up with the design of Netflix Drive to satisfy all of these considerations in multiple scenarios. The platform can serve as a simple POSIX file system that stores data on and retrieves data from the cloud, but it has a much richer control interface. It is a foundational piece of storage infrastructure that supports the needs of many Netflix studios and platforms.
The POSIX interface (figure 2) allows simple file-system operations on files, such as creation, deletion, opening, renaming, moving, etc. This interface deals with the data and metadata operations on Netflix Drive. Files stored in Netflix Drive receive read, write, create, and other requests from different applications, users, scripts, and workflows, just as in any live file system.
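Because the mount point behaves like a local file system, ordinary file operations are all it takes to exercise this interface. The snippet below is a minimal sketch; the mount path and file names are hypothetical.

```python
import os

MOUNT = "/mnt/netflix-drive"  # hypothetical mount point

# Create a file through the POSIX interface, exactly as on a local disk.
with open(os.path.join(MOUNT, "scene_001.exr"), "wb") as f:
    f.write(b"...")  # placeholder bytes

# Rename/move behaves like any live file system.
os.makedirs(os.path.join(MOUNT, "renders"), exist_ok=True)
os.rename(os.path.join(MOUNT, "scene_001.exr"),
          os.path.join(MOUNT, "renders", "scene_001.exr"))
```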
As mentioned, events (figure 4) are of primary importance in the Netflix Drive architecture, and events contain telemetry information. A great example of this is the use of audit logs that track all actions that different users have performed on a file. We might want services running in the cloud to consume audit logs, metrics, and updates. Our use of a generic framework allows different types of event backends to plug into the Netflix Drive ecosystem.
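To make the idea of pluggable event backends concrete, here is a small sketch of an audit-log event flowing through an abstract backend; the event fields and class names are illustrative, not the actual Netflix Drive schema.

```python
import json
import time
from abc import ABC, abstractmethod

class EventBackend(ABC):
    """Hypothetical pluggable sink for Netflix Drive events (audit logs, metrics, updates)."""

    @abstractmethod
    def publish(self, event: dict) -> None: ...

class StdoutBackend(EventBackend):
    """Trivial backend that writes events to stdout; a real one might push to a queue or log store."""
    def publish(self, event: dict) -> None:
        print(json.dumps(event))

# Example audit-log event recording a user action on a file.
StdoutBackend().publish({
    "type": "file.rename",
    "user": "artist_42",
    "src": "/scene_001.exr",
    "dst": "/renders/scene_001.exr",
    "timestamp": time.time(),
})
```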
For performance reasons, Netflix Drive does not deal with sending the data directly to the cloud. We want Netflix Drive to perform like a local file system as much as possible. So we use local storage, if available, to store the files and then have strategies for moving the data from the local storage to cloud storage.
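A sketch of that local-first strategy, with hypothetical paths and a background uploader, might look like this:

```python
from pathlib import Path
from queue import Queue
from threading import Thread

LOCAL_CACHE = Path("/var/netflix-drive/cache")  # hypothetical local store
upload_queue = Queue()

def write_file(rel_path: str, data: bytes) -> None:
    """Write to local storage first, so the file system stays fast, then queue the cloud upload."""
    target = LOCAL_CACHE / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    upload_queue.put(target)  # the transfer to cloud storage happens off the hot path

def uploader(data_store) -> None:
    """Background worker that drains the queue and pushes bytes to the data-store backend."""
    while True:
        path = upload_queue.get()
        data_store.put(str(path.relative_to(LOCAL_CACHE)), path.read_bytes())
        upload_queue.task_done()

# A real instance would start something like:
# Thread(target=uploader, args=(some_data_store,), daemon=True).start()
```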
Overall, the Netflix Drive architecture has the POSIX interface for data and metadata operations. The API interface deals with different types of control operations. The event interface tracks all the state-change updates. The data-transfer interface abstracts moving the bits in and out of Netflix Drive to the cloud.
Intrepid is the transport layer that transfers the bits to and from Netflix Drive. It is an internally developed high-performance transport protocol used by many Netflix applications and services to transfer data from one service to another. Intrepid not only transports the data but also transfers some aspects of the metadata store; we need this ability to save some state of the metadata store in the cloud.
Because we are using a FUSE-based file system, libfuse handles the different file-system operations. We start Netflix Drive and bootstrap it with a manifest, along with the REST APIs and control interface.
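For a rough picture of what a FUSE-based file system looks like, the sketch below uses the fusepy bindings to serve reads and writes out of a local cache directory. It is deliberately incomplete (a real implementation also supplies getattr, readdir, create, and the other callbacks), and all paths and class names are hypothetical.

```python
from fuse import FUSE, Operations  # fusepy bindings over libfuse

class DriveSketchFS(Operations):
    """Partial sketch: serve reads and writes from a local cache directory."""

    def __init__(self, cache_dir: str):
        self.cache = cache_dir

    def read(self, path, size, offset, fh):
        with open(self.cache + path, "rb") as f:
            f.seek(offset)
            return f.read(size)

    def write(self, path, data, offset, fh):
        with open(self.cache + path, "r+b") as f:
            f.seek(offset)
            return f.write(data)

if __name__ == "__main__":
    FUSE(DriveSketchFS("/var/netflix-drive/cache"), "/mnt/netflix-drive", foreground=True)
```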
The workstation machine has the typical Netflix Drive API and POSIX interface. Netflix Drive on a local workstation will use the transport agent and library to talk to the metadata store and the data store.
Security is a concern in Netflix Drive. Many applications use these cloud services, which front the entire corpus of Netflix assets. It is essential to secure these assets and to allow only users with proper permissions to view the subset of assets they are allowed to access. So, we use two-factor authentication on Netflix Drive.
Security is built as a layer on top of our CockroachDB. Netflix Drive leverages several security services that are built within Netflix at this point. We don't have external security APIs that we can plug in. We plan to abstract them out before we release any open-source version so that anyone can build pluggable modules to handle that.
We initially bootstrap Netflix Drive using a manifest, and that initial manifest could be empty. Workstations or workflows can download assets from the cloud and preload the Netflix Drive mount point with this content. Workflows and artists then modify these assets, and Netflix Drive periodically snapshots them, through explicit APIs or the auto-sync feature, and uploads them back to the cloud.
Different types of applications and workflows use Netflix Drive, and each persona supplies it with its particular flavor. For example, one application may rely specifically on the REST control interface because it is aware of the assets and will explicitly use APIs to upload files to the cloud. Another application may not know when to upload files to the cloud, so it would rely on the auto-sync feature to upload files in the background. These are the sorts of alternatives that each persona of Netflix Drive defines.
Figure 9 shows a sample bootstrap manifest. After defining the local storage, the manifest defines the Netflix Drive instances. Each mount point can have several distinct instances of Netflix Drive, and here we see two in use: a dynamic instance and a user instance, each with different backend data stores and metadata stores. The dynamic instance uses a Redis metadata store and S3 for the data store. The user instance uses CockroachDB as the metadata store and Ceph for the data store. Netflix Drive assigns a unique identity to each workspace for data persistence.
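The exact manifest format is not reproduced here, so the following sketch expresses its shape as a Python dictionary; the field names and backend details are illustrative, mirroring the two instances described above.

```python
# Hypothetical bootstrap manifest, mirroring the dynamic and user instances above;
# the real manifest format and field names may differ.
manifest = {
    "local_store": "/var/netflix-drive/cache",
    "instances": [
        {
            "name": "dynamic",
            "metadata_store": {"type": "redis"},
            "data_store": {"type": "s3", "bucket": "drive-dynamic-assets"},
        },
        {
            "name": "user",
            "metadata_store": {"type": "cockroachdb"},
            "data_store": {"type": "ceph", "pool": "drive-user-assets"},
        },
    ],
}
```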
The namespace of Netflix Drive is all the files that are viewed inside it. Netflix Drive can create the namespace statically or dynamically. The static method (figure 10) specifies at bootstrap time the exact files to pre-download to the current instance. For this, we provide file, session, and container information. Workflows can pre-populate a Netflix Drive mount point with files so that subsequent workflows can build on top of them.
The dynamic way to create a namespace is to call Netflix Drive APIs in the REST interface (figure 11). In this case, we use the stage API to stage the files and pull them from cloud storage, then attach them to specific locations in the namespace. These static and dynamic interfaces are not mutually exclusive.
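A hedged sketch of what such a dynamic call might look like from a workflow, with a made-up endpoint and payload (not the actual Netflix Drive API):

```python
import requests

# Hypothetical staging call: pull files from cloud storage and attach them
# at a location in the namespace. URL, port, and fields are illustrative only.
resp = requests.post(
    "http://localhost:9000/v1/stage",
    json={
        "files": ["s3://assets/show/ep01/scene_001.exr"],
        "attach_at": "/renders/ep01/",
    },
)
resp.raise_for_status()
```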
Figure 12 is an example of how a file is uploaded to the cloud, with the Publish API. We can autosave files, which would periodically checkpoint the files to the cloud, and we have the ability to perform an explicit save. The explicit save would be an API that different workflows invoke to publish content.
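An explicit save from a workflow could then look much like the staging call above; again, the endpoint and payload are assumptions rather than the real API.

```python
import requests

# Hypothetical explicit-save (publish) call; endpoint and fields are illustrative only.
resp = requests.post(
    "http://localhost:9000/v1/publish",
    json={"paths": ["/renders/ep01/scene_001.exr"], "comment": "final render"},
)
resp.raise_for_status()
```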
We always intended Netflix Drive to be a generic framework that could accept any data store and metadata store that someone wanted to plug into it. Designing a generic framework for several operating systems is difficult. After investigating alternatives, we decided to support Netflix Drive on CentOS, macOS, and Windows with a FUSE-based file system. That multiplied our testing matrix and our supportability matrix.
We work with disparate backends and have different layers of caching and tiering, and we rely on cached metadata operations. We built Netflix Drive to serve exabytes of data and billions of assets. Designing for scalability was one of the cornerstones of the architecture. We often think that the bottleneck of scaling a solution on the cloud would be the data store, but we learned that the metadata store is the bottleneck for us. The key to our scalability is handling metadata. We focused a lot on metadata management, on reducing the number of calls to metadata stores. Caching a lot of that data locally improved performance for the studio applications and workflows that are often metadata heavy.
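A toy version of that metadata caching, wrapping the hypothetical metadata-store interface sketched earlier, could look like this:

```python
from functools import lru_cache

class CachedMetadataStore:
    """Illustrative wrapper that serves repeated lookups from memory,
    cutting round-trips to the metadata store for metadata-heavy workflows."""

    def __init__(self, store):
        self._store = store
        self._cached_lookup = lru_cache(maxsize=100_000)(store.lookup)

    def lookup(self, path: str) -> dict:
        return self._cached_lookup(path)

    def upsert(self, path: str, attrs: dict) -> None:
        self._store.upsert(path, attrs)
        self._cached_lookup.cache_clear()  # crude invalidation; a real cache invalidates per entry
```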
Use of objects brings up the issues of deduplication and chunking. Object stores use versioning: every change to an object, no matter how small, creates a new version of the object. Traditionally, a change to one pixel of a file means sending the entire file and rewriting it as an object; you cannot just send the delta and apply it on cloud stores. By chunking one file into many objects, we reduce the size of the object that has to be sent to the cloud. Choosing the appropriate chunk size is more of an art than a science, because many smaller chunks mean managing a lot of chunks and a lot of translation logic, and the amount of metadata increases. Another consideration is encryption. We encrypt each chunk, so more, smaller chunks lead to many more encryption keys and the metadata for them. Chunk size in Netflix Drive is configurable.
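A minimal sketch of that chunking, with a content hash per chunk so unchanged chunks never need to be re-sent, might look like this (the chunk size and helper names are illustrative):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # configurable; 4 MiB is only an example value

def chunk_file(path: str, chunk_size: int = CHUNK_SIZE):
    """Yield (content_hash, bytes) pairs; only chunks whose hash the store
    has not seen before need to be uploaded and encrypted."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield hashlib.sha256(chunk).hexdigest(), chunk

# Sketch of the upload loop (data_store.exists/put and encrypt are hypothetical):
# for key, chunk in chunk_file("scene_001.exr"):
#     if not data_store.exists(key):
#         data_store.put(key, encrypt(chunk))  # each chunk gets its own encryption
```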