Disaster-Resource.com

Is Your Data at Risk in the Cloud?

By Cameron Bahar, Chief Technology Officer, ParaScale, Inc.

Cloud storage is being offered up as the ultimate business continuity solution with built-in disaster recovery. However, many professionals are unclear on this new technology and how it can be leveraged. This is further complicated by a growing number of vendors jumping on the bandwagon and claiming their solutions are cloud storage. The cloud can offer a safe and economical solution for business continuity and disaster recovery. This article defines the different types of cloud storage and the strategies and challenges associated with keeping your data in the cloud. Readers will come away with a clear understanding of where to apply this technology in their environments.

Public versus Private Cloud Storage
When considering cloud storage it is important to understand that there are two primary categories: private and public. Both are defined by three common characteristics: storage as a service delivered over a network, scalability, and manageability. What defines a cloud as private or public is who controls the data and the network connection.

Public clouds are common infrastructures where many customers share physical resources, with security provided via virtualization. The connection is typically over the internet using a web-based protocol such as REST or WebDAV. Public cloud storage is an on-demand resource where the user pays for consumption of resources. Typically this is charged in dollars per GB of storage in the cloud per month, plus a bandwidth fee for writing data to or reading data from the public cloud. Since the user is only billed for consumption, there is no capital outlay necessary to begin using the cloud.
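The pricing model described above (a per-GB monthly storage charge plus a per-GB bandwidth charge) can be sketched in a few lines of Python. The function name and the rates used here are hypothetical and chosen only for illustration; actual provider pricing varies and often includes tiers or per-request fees that this sketch ignores.

```python
def monthly_cloud_cost(stored_gb, transferred_gb, storage_rate, bandwidth_rate):
    """Estimate a monthly public cloud bill in dollars.

    stored_gb       -- average GB held in the cloud during the month
    transferred_gb  -- GB written to or read from the cloud during the month
    storage_rate    -- dollars per GB stored per month (assumed flat rate)
    bandwidth_rate  -- dollars per GB transferred (assumed flat rate)
    """
    if min(stored_gb, transferred_gb, storage_rate, bandwidth_rate) < 0:
        raise ValueError("quantities and rates must be non-negative")
    return stored_gb * storage_rate + transferred_gb * bandwidth_rate


# Hypothetical example: 2 TB stored, 500 GB moved, illustrative rates only.
estimate = monthly_cloud_cost(stored_gb=2000, transferred_gb=500,
                              storage_rate=0.15, bandwidth_rate=0.10)
print(f"Estimated monthly bill: ${estimate:,.2f}")
```

Note that a full restore counts as transferred data, so the bandwidth term can dominate the bill in the month a recovery actually happens.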

Private (or internal) clouds are typically deployed inside the enterprise and managed by internal employees. This is essentially the same technology that an Amazon or a Google uses, but now available for purchase and deployment in your own data center. Designed to scale to petabytes (PB) using commodity hardware without downtime, private clouds can emulate the scalability of tape silos without the need to buy and replace media. Private cloud storage uses a LAN or intranet connection, providing high-speed data exchange. Private clouds can start at a few TBs and grow to PBs as the user adds commodity hardware. Some will argue that a private cloud is an oxymoron, since a cloud necessarily means that the infrastructure is owned by a third party. At this point that is a religious debate; for lack of a better term we will stick with "private cloud" and instead focus on how IT organizations can use these emerging technologies intelligently and cost-effectively for storage and DR.

Both private and public cloud storage solutions provide valuable options for business continuity and disaster recovery, each with use cases where one or the other is a better fit.

If we deploy a cloud, what should our provisions be for disaster recovery?
Public cloud storage is, by definition, located at one or more sites remote from the originating site and thus naturally enables off-site backup. Customers can copy their data to the cloud and be protected against site failures or local disruption. There is no need to build a second data center or maintain administrators to manage the data. Instead you leverage the service provider’s infrastructure and manpower in exchange for a monthly bill.

Many of the larger public cloud providers offer global data distribution as an option, for an extra charge. This can be strategic for multinational organizations that want backups located in multiple geographies. Recovery can be performed from a regional cloud site, limiting long-distance WAN latency and high charges.

Public clouds also introduce challenges that should not be overlooked. The first challenge is slower access to data over WANs. Remember the old adage: “Who cares about backups? It’s the restore that matters.” Recovering multiple TBs over an internet connection can take days, and will likely break corporate recovery SLAs. When selecting a provider, be sure to inquire about options for fast recovery by physically shipping hardware.
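A quick back-of-the-envelope calculation shows why a multi-TB restore over the internet can take days. The sketch below is an estimate under stated assumptions: decimal units (1 TB = 10^12 bytes), a link that is available for the whole restore, and an assumed `efficiency` factor standing in for protocol overhead and contention. The function name and default values are illustrative, not measurements.

```python
def restore_days(data_tb, link_mbps, efficiency=0.8):
    """Estimate the days needed to pull data_tb terabytes over a WAN link.

    data_tb     -- amount of data to restore, in decimal terabytes
    link_mbps   -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of the nominal speed actually achieved
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, link speed, or efficiency")
    bits_to_move = data_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * efficiency
    seconds = bits_to_move / effective_bits_per_second
    return seconds / 86400  # seconds per day


# Hypothetical example: 5 TB over a 100 Mbps connection.
print(f"{restore_days(5, 100):.1f} days")
```

Comparing the result against your recovery time objective is a simple way to decide whether a physical-shipment option needs to be part of the provider contract.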

Security of data is another challenge that should be considered. Public clouds are shared resources. Providers use virtualization and other strategies to segment customer data, but standards for measuring and enforcing this segmentation are just starting to emerge. Be sure you understand the details of how your data is secured and, if at all possible, how to identify where your data is located, and inquire about encryption and independent audits. In many cases, the customer may be better off encrypting the data before sending it to the cloud in the first place.

Disaster recovery in a private cloud deployment is accomplished by keeping offsite copies on a second private cloud in a remote location or by integrating with a public cloud offering. Policies can be set to move data between locations, so management is simple, but a remote site must be acquired and connected. Moving data between public and private clouds dynamically (sometimes called cloud bursting by the buzzword-happy) is bleeding edge, so spend more time on due diligence.

Private cloud storage requires a base infrastructure to function. While modest compared to many enterprise endeavors, the up-front investment is not zero, as it is with public cloud offerings. Depending on your selected vendor, private clouds can start as small as a few TBs on as few as three commodity servers. Installation is simple but will require your team to set up the hardware and software before use.

Most private clouds provide data redundancy by creating multiple copies of each file and placing them on different hardware resources. Localized failures can be overcome by implementing policies that place the data in a robust and highly available way. For example, policies can be set so that copies of files are always on two different racks within the data center, or in two different buildings on a campus. Therefore, if power is down for a segment of the data center or campus, data is still available at local LAN speeds.

These policies can also include a remote capability. For example, a user could set a policy that creates one copy in the local building, a second copy in a second building, and a third copy in a remote data center thousands of miles away. In the event of a site-level failure, all the data would be immediately available in the remote location, enabling data center operations to continue as soon as the application processes fail over. Recovery can be accelerated as well. When a rack or site comes back online, copies can be pulled from the closest location in parallel from multiple nodes, all automatically and without administrator interaction.
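The placement policies described above can be illustrated with a small Python sketch. This is not any vendor's actual policy engine: the `Node` fields, the rule format (a dictionary of required attribute values), and the `place_copies` function are all hypothetical, chosen to show how a "one copy per location, never two copies on the same rack" rule might be expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A storage node and where it physically lives (hypothetical model)."""
    name: str
    site: str
    building: str
    rack: str


def place_copies(nodes, policy):
    """Pick one node per policy rule, never reusing a rack.

    Each rule is a dict of required attribute values, for example
    {"site": "hq", "building": "b1"}. Nodes are considered in list order.
    """
    chosen = []
    used_racks = set()
    for rule in policy:
        for node in nodes:
            matches = all(getattr(node, key) == value for key, value in rule.items())
            rack_id = (node.site, node.building, node.rack)
            if matches and rack_id not in used_racks:
                chosen.append(node)
                used_racks.add(rack_id)
                break
        else:
            # No node satisfied this rule without sharing a rack.
            raise LookupError(f"no free node satisfies rule {rule}")
    return chosen


# Hypothetical inventory: two buildings at headquarters plus a remote site.
nodes = [
    Node("n1", "hq", "b1", "r1"),
    Node("n2", "hq", "b1", "r2"),
    Node("n3", "hq", "b2", "r1"),
    Node("n4", "remote", "b1", "r1"),
]

# One copy in building one, one in building two, one at the remote site.
three_site_policy = [
    {"site": "hq", "building": "b1"},
    {"site": "hq", "building": "b2"},
    {"site": "remote"},
]
print([node.name for node in place_copies(nodes, three_site_policy)])
```

Repeating the same rule twice, for example two `{"site": "hq", "building": "b1"}` entries, yields two copies on different racks in the same building, which corresponds to the rack-level protection case.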

Considerations when planning or deploying a cloud for business continuity and disaster recovery
Writing data to the public cloud requires an internet connection and integration of your applications with the service provider's API or protocol. One strategy to ease this integration is to work with cloud management and integration tools or vendors. As public cloud storage grows in popularity, more tools and options are becoming available to simplify the work. Nevertheless, many larger DR operations have established applications and processes, and if the incumbent software is not public-cloud aware, the transition will require changing application protocols and accounting for public-cloud file size limits.

Ownership of data in a public cloud is another key consideration, and it is a legal matter without a clear answer. The courts have yet to define whether ownership is based on physical location or on creation. There have been judicial rulings in analogous cases (cell phone records, for example) that have set precedent for both arguments. If subpoenaed, the service provider will have to honor a legal request with or without your knowledge. Until this issue is clearly defined in law, it is a challenge that should be considered when selecting public cloud storage. In certain regulated businesses, this alone will disqualify public cloud storage until there is more legal clarity.

Private cloud storage is designed to support familiar, widely deployed enterprise protocols such as NFS, CIFS and FTP. Backup applications have leveraged these protocols for a very long time, so adapting existing technologies, processes and workflows is straightforward. A common strategy is to change backup targets from existing NAS to private cloud deployments. The result is better performance, economics and manageability for mission-critical data protection without sacrificing business continuity and disaster recovery objectives.

In a large data center, multiple media servers can back up in parallel into a private cloud, with vastly greater parallel ingestion speeds than even a beefy NAS appliance. Data is read from the cloud via tens or hundreds of nodes, all at GigE speed, so recovery times can be extremely fast. A common strategy is to work with the backup vendor to tune file sizes to match cloud defaults and add metadata information for management. The good news here is that the backup vendors are very cloud-aware and are proactively working on these challenges.
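To see how reading from many nodes at once shortens recovery, the following Python sketch estimates restore time from the aggregate throughput of the nodes. It is an idealized model under stated assumptions: every node sustains the same fraction (`efficiency`) of its GigE link, the data is spread evenly, and an optional `client_gbps` cap represents the receiving side's total capacity. All names and numbers are hypothetical.

```python
def parallel_restore_hours(data_tb, nodes, node_gbps=1.0,
                           client_gbps=None, efficiency=0.8):
    """Estimate hours to read data_tb terabytes from `nodes` nodes in parallel.

    node_gbps    -- nominal per-node link speed in gigabits per second
    client_gbps  -- optional cap on what the receiving side can absorb
    efficiency   -- assumed fraction of nominal node speed actually achieved
    """
    if data_tb < 0 or nodes < 1 or node_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, node count, speed, or efficiency")
    aggregate_gbps = nodes * node_gbps * efficiency
    if client_gbps is not None:
        # The restore can go no faster than the receivers can take data in.
        aggregate_gbps = min(aggregate_gbps, client_gbps)
    seconds = data_tb * 1e12 * 8 / (aggregate_gbps * 1e9)
    return seconds / 3600


# Hypothetical example: 10 TB read from 50 nodes, with and without a
# 10 Gbps limit on the receiving media servers.
print(f"{parallel_restore_hours(10, 50):.2f} hours unconstrained")
print(f"{parallel_restore_hours(10, 50, client_gbps=10):.2f} hours with a 10 Gbps cap")
```

In practice the bottleneck often moves to the receiving servers or the network core, which is why the cap is worth modeling before promising a recovery time.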

How to choose
Both public and private cloud storage provide new options for business continuity and disaster recovery, especially in the current difficult budgetary environment. Which to choose depends on your goals, your expectations for security and privacy of the data, the amount of data, and your plan for data growth. If you are a startup with four founders in three cities working out of your homes and handling non-sensitive data, a public cloud is right for you. If you are a company of twenty or more people with a few servers and a part-time IT manager, an internal storage cloud is well within your reach. Relative to other storage technologies like NAS and iSCSI, storage clouds are easier to deploy and manage. We expect to see many companies embrace both options and leverage each where appropriate. The protection, economies, ease of management and scalability are such that in five years this will be a key ingredient of your integrated DR strategy. However, there are several issues to be satisfactorily addressed before storing sensitive data on any of the cloud options.

 

About the Author

To learn more about using cloud storage for business continuity and disaster recovery visit the ParaScale Website.

Cameron Bahar is the CTO of ParaScale, a Silicon Valley start-up focused on addressing the exploding bulk storage requirements for digital content and archival data.