Availability: The cloud file system
The key feature of a cloud controller is its file system. A main principle behind the cloud controller's file system is the physical and logical separation between payload data and metadata. In a traditional file system, a snapshot contains both payload data and metadata, and it's managed as a single, large chunk of information. With a cloud controller file system, metadata can be easily extracted from a snapshot and transported separately from the payload data while maintaining file system consistency. When clients interact with a file system, the bulk of their actions are actually metadata operations that do not require access to the payload data.
Navigating through directory structures, opening folders, looking through file lists, and sorting/searching for files based on attributes such as file name, file size, file type, date created, and date modified are all metadata operations. The user experience is greatly impacted by how quickly metadata operations occur, so most file systems cache metadata in RAM to speed response times. Thus, a core design principle in a cloud controller file system is to preserve the response time for metadata operations in a global deployment.
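To illustrate why these operations never need the payload, here is a minimal Python sketch of a directory listing served entirely from an in-memory metadata index. The names `Entry` and `MetadataCache` are hypothetical and chosen only for illustration; they are not taken from any particular cloud controller product.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Entry:
    """Metadata for one file; the file contents are stored elsewhere."""
    path: str
    size: int          # bytes
    modified: float    # modification time, seconds since the epoch


class MetadataCache:
    """Hypothetical RAM-resident index of file metadata."""

    def __init__(self, entries):
        self._entries = {e.path: e for e in entries}

    def list_dir(self, directory, sort_by="path"):
        """List a directory's files, sorted by any metadata attribute."""
        parent = PurePosixPath(directory)
        children = [e for e in self._entries.values()
                    if PurePosixPath(e.path).parent == parent]
        return sorted(children, key=lambda e: getattr(e, sort_by))


cache = MetadataCache([
    Entry("/docs/a.txt", 300, 2.0),
    Entry("/docs/b.txt", 100, 1.0),
    Entry("/img/c.png", 50, 3.0),
])
by_size = cache.list_dir("/docs", sort_by="size")
```

Browsing and sorting touch only the index, so response time does not depend on where the file contents are stored.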
Because the same file system view is presented through every controller, the metadata at each controller must be kept in sync. The file system accomplishes this by taking frequent snapshots, extracting the metadata changes from each snapshot, and rapidly distributing those changes to all controllers that are members of the file system. Each cloud controller receives metadata updates from all other controllers and applies them to its own metadata. In this manner, the file system appears the same regardless of which controller presents it.
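The snapshot-and-distribute cycle can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not any vendor's implementation: the `Controller` class, the delta format, and the `sync` helper are all hypothetical, and the sketch ignores conflicts (two controllers changing the same path between snapshots) and network transport.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileMeta:
    size: int
    modified: float  # modification time, seconds since the epoch


@dataclass
class Controller:
    name: str
    metadata: dict = field(default_factory=dict)       # path -> FileMeta
    last_snapshot: dict = field(default_factory=dict)  # state already shared

    def take_snapshot(self):
        """Return only the metadata changes since the previous snapshot."""
        changed = {p: m for p, m in self.metadata.items()
                   if self.last_snapshot.get(p) != m}
        deleted = [p for p in self.last_snapshot if p not in self.metadata]
        self.last_snapshot = dict(self.metadata)
        return {"changed": changed, "deleted": deleted}

    def apply_delta(self, delta):
        """Merge a peer's changes into the local metadata copy."""
        for path, meta in delta["changed"].items():
            self.metadata[path] = meta
            # Mark as already shared so it is not re-sent as a local change.
            self.last_snapshot[path] = meta
        for path in delta["deleted"]:
            self.metadata.pop(path, None)
            self.last_snapshot.pop(path, None)


def sync(controllers):
    """One round: every controller snapshots, then every peer applies."""
    deltas = [(c, c.take_snapshot()) for c in controllers]
    for source, delta in deltas:
        for peer in controllers:
            if peer is not source:
                peer.apply_delta(delta)


london, tokyo = Controller("london"), Controller("tokyo")
london.metadata["/a.txt"] = FileMeta(10, 1.0)
tokyo.metadata["/b.txt"] = FileMeta(20, 2.0)
sync([london, tokyo])
```

After one round, both controllers hold metadata for both files, although neither has transferred any file contents.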
When clients browse through the cloud controller file system, their user experience is identical to browsing a local file system because they are, in fact, browsing a local copy of the file system metadata. Thus, even if the actual file data exists on a controller in a different site or in the cloud, it is always possible to navigate the file system quickly.
Lock management is another crucial component in a global file system. Because multiple clients share access to a common file repository, there must be a mechanism that locks a file against simultaneous edits by multiple users. Once a user opens a file, it is locked against editing by all other users until the original user closes it. An effective cloud controller includes a lock management system in which lock information is exchanged in real time among all controllers, so that no two users ever contend for editing privileges on the same file.
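The open-locks, close-releases rule can be sketched as below. This is only an in-process stand-in, assuming a single lock table that every controller consults; a real deployment would have to exchange lock state between sites over the network, which this sketch does not attempt. The `LockManager` class and its method names are hypothetical.

```python
import threading


class LockManager:
    """Hypothetical stand-in for lock state shared by all controllers."""

    def __init__(self):
        self._owners = {}                # path -> user holding the edit lock
        self._guard = threading.Lock()   # makes check-and-set atomic

    def open_for_edit(self, path, user):
        """Grant the lock if the file is free or already held by this user."""
        with self._guard:
            owner = self._owners.setdefault(path, user)
            return owner == user

    def close(self, path, user):
        """Release the lock, but only if this user holds it."""
        with self._guard:
            if self._owners.get(path) == user:
                del self._owners[path]
                return True
            return False


locks = LockManager()
alice_has_lock = locks.open_for_edit("/reports/q3.docx", "alice")
bob_has_lock = locks.open_for_edit("/reports/q3.docx", "bob")
```

Here the first user to open the file gets the lock, and the second is refused until the first closes the file.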
As for availability of files in the event of a cloud outage, cloud controllers can be set to automatically synchronize copies of files stored at two or more locations in the cloud, such as different Amazon sites. If one copy of the data becomes unavailable, another copy is still on hand.
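From the reader's side, falling back to a second copy might look like the following sketch. It assumes each cloud location is represented by a fetch function that raises an error when that location is down. The replica names, the `ReplicaUnavailable` exception, and the fetch functions are hypothetical and do not correspond to any real cloud provider API.

```python
class ReplicaUnavailable(Exception):
    """Raised when a storage location cannot serve the request."""


def read_with_failover(path, replicas):
    """Try each (name, fetch) pair in order; return the first success."""
    errors = []
    for name, fetch in replicas:
        try:
            return name, fetch(path)
        except ReplicaUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise ReplicaUnavailable("all replicas failed: " + "; ".join(errors))


def fetch_site_a(path):
    # Simulates an outage at the first location.
    raise ReplicaUnavailable("site outage")


def fetch_site_b(path):
    # Simulates a healthy second location holding a synchronized copy.
    return b"file contents for " + path.encode()


served_by, data = read_with_failover(
    "/docs/a.txt",
    [("site-a", fetch_site_a), ("site-b", fetch_site_b)],
)
```

The read succeeds from the second location even though the first is unavailable. It fails only if every configured copy is out of reach.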