* NAS-like responsiveness: File directory browsing must be as responsive as on a local NAS. To achieve this, it is not enough to cache the active data locally; the metadata of all files, not just the cached ones, must also be cached on SSD at every site. SSDs are necessary because the user sees a full representation of every file in the entire file system, even though fewer than 5 percent of the files are cached locally. When the user navigates up and down files and folders in the network drive, it has to "feel" like all those files are there. Because a portion of the file metadata is often displayed alongside the file name, and file locking has to be instantaneous for any file even if it is not locally cached, metadata has to be accessed as fast as possible. Without all the file metadata in cache, users think that their computer or network is running slow, since navigating a folder is one of the most basic functions.
* Support for "chatty" applications: Applications must work across sites as well as they work at a single site. Many technical applications (CAD, PLM, BIM) are extremely chatty, which normally increases the time to open, save, or sync a file from less than 30 seconds on a local NAS to over 20 minutes when the data is centralized in the cloud. Most people assume this is a bandwidth issue, but the real cause is latency: a chatty application issues thousands of sequential operations, and each one has to wait for a full round trip.
For example, a common CAD application performs nearly 16,000 sequential file operations before a file is opened. If the authoritative copy is on the same LAN, the file lock is only 0.5 milliseconds away, so the file opens in 8 seconds (16,000 * 0.5 milliseconds). Over a WAN, however, chattiness causes massive delays. If a file centralized in Syracuse is opened from San Diego, the file lock is 86 milliseconds away (the round-trip latency from San Diego to Syracuse), so it takes 16,000 * 86 milliseconds for the file to open -- roughly 23 minutes. The actual data transfer accounts for only a fraction of that time.
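The arithmetic in the CAD example above can be checked with a short sketch. It assumes, as the example does, that every operation waits for one full round trip before the next begins and that transfer time is ignored. The function and constant names are illustrative only and do not come from any product.

```python
# Figures taken from the CAD example in the text; names are hypothetical.
SEQUENTIAL_OPS = 16_000   # sequential file operations before the file opens
LAN_RTT_MS = 0.5          # round trip to the file lock on the same LAN
WAN_RTT_MS = 86           # round trip from San Diego to Syracuse


def open_time_seconds(ops: int, rtt_ms: float) -> float:
    """Time spent waiting on round trips when operations run strictly one after another."""
    return ops * rtt_ms / 1000.0


lan_s = open_time_seconds(SEQUENTIAL_OPS, LAN_RTT_MS)
wan_s = open_time_seconds(SEQUENTIAL_OPS, WAN_RTT_MS)

print(f"LAN: {lan_s:.0f} s")
print(f"WAN: {wan_s:.0f} s (about {wan_s / 60:.1f} min)")
```

In this model the open time scales linearly with round-trip latency and does not depend on bandwidth at all, which is why adding bandwidth does little for a chatty application.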
* Data integrity and cross-site locking: When data lives on a file server, we only have to worry about maintaining one consistent copy (as long as the file is locked when a user is editing it). This changes when data lives in the cloud but is accessed from many sites. To avoid file corruption when using cloud storage, you need two things:
- A clear separation between the authoritative copy of data in the cloud and the local cache copy at each site. A "transactionally consistent" file system can maintain file integrity even if there's a hardware or power failure -- without falling back on a file system check or earlier file version. This assures data integrity in a distributed environment.