Analyst Report: How to resolve cloud migration challenges in physical and virtual applications

1 Summary

Enterprise IT infrastructure largely predates the emergence of cloud computing as a viable choice for hosting mission-critical applications. Although large organizations are now showing real signs of adopting cloud computing as part of their IT estate, most cloud-based deployments still tend to be either for new and self-contained projects or to meet the needs of traditional development and testing functions.

Compatibility, interoperability, and performance concerns have kept IT administrators from being completely comfortable with the idea of moving their complex core applications to the cloud. And without a seamless application migration blueprint, the project can seem more of a headache – and risk – than it’s worth. This report highlights, for systems administrators, IT directors, cloud architects, and decision-makers at Software as a Service (SaaS) companies and cloud service providers, the different approaches they can take in moving existing applications to the cloud.

  • IT planners with an eye to costs should explore moving their existing applications to the cloud.
  • Automation is a key component for ensuring that IT environments can be reliably replicated, whether that’s on new hardware in the same data center or out in the cloud.
  • In areas such as disaster recovery (DR) or for enterprise applications that lack sufficient justification for the expense of a comprehensive redesign, an efficient and reliable means of augmenting on-premise capabilities with the cloud is likely to be interesting for the foreseeable future.

2 The advantage of migrating enterprise applications to the cloud

Traditional enterprise IT applications are difficult beasts, often spanning multiple servers and requiring complex configurations of server software and inter-server networking to keep the individual components working smoothly together. Typically, hardware is significantly overspecified to ensure that spare capacity is available to cope with spikes in demand. Alongside the unused capacity on the servers actually running an application, identical servers may stand idle in a separate disaster recovery facility, bought, provisioned, and maintained as insurance against serious problems in the primary data center.

Rather than continuing to bear the cost of procuring, overspecifying, and redundantly duplicating the systems required to run core enterprise applications, IT planners have a number of reasons to explore the alternative model of moving their existing applications to the cloud.

Agility

Agility is often cited as a primary motivation for moving workloads to the cloud, where the pay-as-you-go model can be a far more cost-effective way to provision IT resources for new or growing applications, or for those in which resource usage fluctuates dramatically. Mainstream enterprise applications experience less dramatic growth than consumer applications and the products emerging from tech startups. They also tend to experience usage that is predictable, consistent, and likely to remain so for months or even years. Nevertheless, the enterprises running them may find value in exploring the occasional use of cloud-based resources to augment on-premise systems at times of peak demand (around Black Friday for ecommerce systems, around year-end for financial systems, etc.).

Cost-effectiveness

One of the most commonly cited arguments in favor of cloud computing back in its early days was cost. Cloud computing, it was argued, was cheaper than its on-premise equivalents. For many workloads this was – and is – true, but, as we rapidly realized, the cloud’s most vocal proponents were oversimplifying a complex situation. The low hourly rates charged by cloud infrastructure providers add up rapidly when you’re running large numbers of servers, hour after hour, for years at a time. In those situations, continuing to rely on the sunk investment in existing hardware, data centers, and staff will often make far more sense. Buying hardware that you know you will use to capacity over the three-to-five years of its expected life is simply cheaper than renting capacity at the hourly rates charged by Amazon Web Services (AWS), Rackspace, and others.

The cost-effectiveness argument remains particularly compelling in areas directly relevant to enterprise IT, however. These include development and testing (dev/test) functions and disaster recovery (DR). Dev/test (and training) labs are typically created for short periods of time, used intensively, and then shut down completely. This type of workload is well suited to the cloud. DR facilities can, at their most resilient, require creating a second data center to mirror the capabilities and infrastructure of the primary facility. Only one of these facilities runs day to day, meeting the IT requirements of the enterprise and its customers. The second facility stands idle, waiting to take over in the event of significant failure in the primary facility. The cost of equipping and maintaining this mirror image is huge, but the cost of not being able to keep mission-critical applications running could – conceivably – be higher.

Resilience and protection

High-profile outages in the public cloud can be used to suggest that cloud infrastructure is not resilient or reliable. But while individual data centers supporting parts of the public cloud might partially or even wholly fail, the nation- or continent-spanning cloud offerings of large public cloud providers like Amazon are typically extremely reliable. Well-architected applications, built with the cloud in mind, are designed to cope with the loss of individual hardware components. Despite relying upon a data center at the heart of a recent Amazon outage, companies such as Netflix were able to continue serving customers with little or no noticeable degradation in end-user experience. The public cloud has proved to be extremely resilient. It also offers a logical place to house the DR solutions that enterprises need to have but so rarely use, provided an effective way can be found to keep them up to date and available for rapid and reliable failover when required.

3 The challenge of migrating enterprise applications to the cloud

Enterprise applications, as we have already seen, typically require a complex set of interactions between multiple hardware and software components. This complexity is perhaps the single largest reason why established applications do not usually move from a local data center to the public cloud. A number of other issues must also be considered.

Security

For a small number of applications, security concerns may genuinely justify remaining on-premise. These include a subset of financial systems and perhaps a subset of healthcare systems concerned with specific patient data.

Despite recent concern around PRISM and state-sponsored snooping on computer systems, most IT systems could run as securely in the cloud as they do on-premise. Indeed, cloud providers may prove more secure than a typical internal IT operation, as their scale allows them to employ and support a larger number of dedicated security professionals. Cloud providers are also increasingly taking the additional step of gaining and demonstrating compliance with relevant regulatory regimes, such as healthcare’s HIPAA.

Regardless of the relative merits of on-premise and cloud-based security, the perception of cloud insecurity remains a very real challenge to adoption. IT executives from organizations in the business of managing sensitive data often prefer to keep entire workflows under their direct control; whether or not the cloud is more secure, they are able to control and understand the processes at work within their own data center in a way that they cannot in the cloud.

Cost

Established enterprise applications typically run on dedicated infrastructure that has been purchased and deployed within an enterprise data center or co-location facility. Unless a significant upgrade is taking place, there is not typically an opportunity to save money by moving an existing application to the cloud. The complexity normally associated with enterprise applications can also raise the cost of cloud-based deployments, as costly extra features such as secure network connections and fixed IP addresses tend to be required. For an IT team already used to managing its applications in-house, the additional training required to gain familiarity with cloud tools can also be a significant cost.

Integration

Integration of systems across an enterprise can be a complex undertaking. Ensuring that the stock control system communicates with the production line monitoring systems, and that the staff directory reliably provisions email accounts and manages login credentials for everything else is hard enough within a single data center. Keeping all of that working and secure across the public internet is a challenge that IT managers may simply prefer to avoid.

4 Opportunities for the hybrid cloud

The challenges we’ve outlined go some way toward explaining the relatively slow adoption of cloud for running large, complex enterprise applications. Fundamentally, though, the process of moving multiple components from one place to another and ensuring that they run as expected is difficult. Where on-premise solutions remain fit for purpose, the rationale for moving to the cloud is not compelling enough to offset the complexity and risk associated with the move.

Increasingly sophisticated automation, which we’ll explore further on, simplifies the process of migration to the extent that certain use cases become viable and even compelling. This is where cloud migration companies such as CliQr, CloudVelocity, CohesiveFT, and Ravello have initially focused their attention. While competitors such as Ravello emphasize support for just one or two of the three use cases explored below, CloudVelocity seeks to grow its market by addressing all of them.

Flexibility and agility

Seattle-based QL2 Software delivers real-time and historical business analysis solutions to approximately 300 customers in industries such as travel and manufacturing. The company currently runs its application on about 50 servers, operating out of a co-location provider’s data center. With the current co-location contract drawing to a close and much of the hardware in need of replacement, QL2 director of operations Samir Bhakta was interested in evaluating whether the time might be right for a move to the cloud. Bhakta was keen to find ways to spend less staff time on basic hardware management, freeing his team to focus on directly supporting paying customers.

Bhakta’s team began a phased migration earlier in 2013, but the complex and bespoke nature of their environment made the transition harder than anticipated. The size and specification of their servers and applications simply didn’t fit AWS’s standard machine instances well. Intricate rules, interconnections, and dependencies further complicated an already difficult task. The company’s target of migrating before the co-location contract expired began to look unrealistic, forcing Bhakta to begin planning for the cost of an unwanted extension to the co-location arrangement.

According to Bhakta, proof-of-concept trials with CloudVelocity significantly accelerated the migration. He anticipates spending just 25 percent of the budget originally earmarked for migration, and expects to complete the job in one-third of the time originally expected – well ahead of his December deadline. Bhakta is now waiting for the installation of a dedicated network connection between his co-location provider and AWS, which will enable him to complete the migration more securely and rapidly than would be feasible over his current public internet connection.

Bhakta notes that his AWS bill moving forward may be higher than the amount he previously paid his co-location provider, but he stresses that the total cost of ownership for his new system (including staff and licensing savings, cancelled support agreements, etc.) will be demonstrably lower than his current spend.

Training and testing

John Vastano, VP of worldwide customer support and services at Santa Clara–based ScaleArc, has use cases for which cloud infrastructure would be ideal, if only he could configure everything accurately and quickly.

His company is in the business of delivering training and test environments for complex multicomponent IT systems. According to Vastano, trainers in this area have tended to cut corners in the past. It has been too complex for them to create and maintain test or training environments that properly simulate all of the interactions in a real system, so they have simplified and abstracted, providing only parts of the whole. Trainees therefore received a simplified and not entirely accurate environment in which they could learn. ScaleArc is an early customer of CloudVelocity, and uses the company’s migration product to understand all of the software and networking components within a live system in order to accurately replicate it in Amazon’s cloud. According to Vastano, he is now able to accurately simulate entire systems in Amazon’s cloud, and he is also able to cost-effectively create enough systems for every trainee in a class. No more sharing, and no more limited training environments.

Vastano also has an interest in testing systems. To do this, he typically needs to configure a test system using common industry tools like Puppet or Chef, run his test, and then modify the configuration slightly before testing again. Vastano swears by Puppet and its ability to deploy hundreds – or thousands – of identical systems on the basis of a single template. However, he feels that existing tools struggle in a testing situation where he actually wants several hundred systems, all with slightly different configurations. CloudVelocity enables Vastano to specify the components that he wishes to replicate, leaving him free to modify the configuration of the pieces he’s interested in testing.
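
To illustrate the pattern Vastano describes, the Python sketch below generates a batch of test configurations that differ only in the parameters under examination. The template and parameter names are hypothetical, and the sketch does not represent CloudVelocity’s or Puppet’s actual interfaces; it simply shows the idea of producing many near-identical systems that vary in a few controlled ways.

    import copy
    import itertools

    # Hypothetical base template describing a single test system.
    BASE_CONFIG = {
        "db_engine": "mysql",
        "db_cache_mb": 512,
        "app_threads": 8,
        "tls_enabled": True,
    }

    # The parameters we actually want to vary between test runs.
    VARIATIONS = {
        "db_cache_mb": [256, 512, 1024, 2048],
        "app_threads": [4, 8, 16, 32, 64],
    }

    def generate_variants(base, variations):
        """Yield one configuration per combination of the varied parameters."""
        keys = list(variations)
        for values in itertools.product(*(variations[k] for k in keys)):
            variant = copy.deepcopy(base)
            variant.update(dict(zip(keys, values)))
            yield variant

    if __name__ == "__main__":
        variants = list(generate_variants(BASE_CONFIG, VARIATIONS))
        print(len(variants), "test configurations generated")  # 4 x 5 = 20 here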

Disaster recovery

A California investment firm runs a number of mission-critical trading systems and has deployed a traditional DR solution to protect them. A mirror of the live system sits idle on servers in a remote data center. The hardware, the software, and the processes designed to keep its data regularly updated are expensive, and will hopefully never be used. This expensive and wasteful approach is widely used for core enterprise systems, and remains the default solution for ensuring that systems are able to survive outages and failures in a primary data center.

The cost of a traditional DR solution typically means that it is only used for the most critical systems within an organization. Less critical systems (an internal SQL database and various reporting applications, in this investment firm’s case) are backed up to tape or disk, because the cost of short periods of downtime is considered to be less expensive than the cost of retaining a full DR capability for these secondary systems.

Cloud-based solutions, such as those from Bluelock and CloudVelocity, present a viable alternative for less critical systems. System images and data are stored with a cloud provider such as AWS, but only activated – and paid for – when actually required to take over from the primary data center.

Indeed, the investment firm is currently so happy with its CloudVelocity-based solution that it has gone so far as to suggest it might be a suitable replacement for the DR system already covering its primary trading systems.

5 The role of automation

Automation is key to ensuring that IT environments can be reliably replicated, either on new hardware in the same data center or out in the cloud. Operations staff typically uses Chef recipes or Puppet scripts to describe the steps required to configure a system. These recipes or scripts can then be run in order to provision a new system for use.
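
A real environment would express those steps in Chef’s or Puppet’s own language; the minimal Python sketch below is only an illustration of the underlying idea of idempotent, replayable configuration steps. It assumes a Debian-style host, and the package and file names are hypothetical.

    import subprocess

    def ensure_package(name):
        """Install a package only if it is not already present (idempotent)."""
        installed = subprocess.run(["dpkg", "-s", name],
                                   capture_output=True).returncode == 0
        if not installed:
            subprocess.run(["apt-get", "install", "-y", name], check=True)

    def ensure_file(path, content):
        """Write a configuration file only if it differs from the desired state."""
        try:
            current = open(path).read()
        except FileNotFoundError:
            current = None
        if current != content:
            with open(path, "w") as handle:
                handle.write(content)

    # Replaying the same steps provisions an equivalent system anywhere,
    # whether on new hardware in the same data center or in the cloud.
    ensure_package("nginx")
    ensure_file("/etc/nginx/conf.d/app.conf",
                "server { listen 80; server_name app.example.com; }\n")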

Dedicated systems such as CloudVelocity’s One Hybrid Cloud (OHC) platform go further, analyzing multiple facets of a Windows- or Linux-based system in order to accurately replicate the entire application stack in a new environment. The company describes a five-step process that OHC works through. Other cloud migration tools undertake a similar set of processes, although typically in a less automated fashion.

CloudVelocity’s approach to automating a system migration

[Figure: CloudVelocity’s five-step migration process]

Source: CloudVelocity

Discovery

OHC’s first task is discovery, during which it learns all that it can about the target system. OHC builds a picture of both physical hardware and virtualized servers, gathers information about the storage services available to the application, and learns IP addresses and dependencies upon external services such as an Active Directory or LDAP authentication service.
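
As a rough illustration of what a discovery step records, the Python sketch below collects basic hardware, storage, and network facts from a host using the third-party psutil library. It is an illustration of the concept only, not CloudVelocity’s implementation, and listing network connections may require elevated privileges on some systems.

    import socket

    import psutil  # third-party library: pip install psutil

    def discover():
        """Collect basic facts about this host, roughly what a discovery phase records."""
        return {
            "hostname": socket.gethostname(),
            "cpu_count": psutil.cpu_count(logical=True),
            "memory_mb": psutil.virtual_memory().total // (1024 * 1024),
            "disks": [{"mount": p.mountpoint, "fstype": p.fstype}
                      for p in psutil.disk_partitions()],
            # Listening sockets hint at the services this host provides;
            # external dependencies (LDAP, Active Directory, etc.) would be
            # inferred from outbound connections and configuration files.
            "listening_ports": sorted({c.laddr.port
                                       for c in psutil.net_connections(kind="inet")
                                       if c.status == psutil.CONN_LISTEN}),
        }

    if __name__ == "__main__":
        print(discover())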

Blueprinting

Next comes blueprinting. In that phase, OHC takes the information gathered during the discovery phase and turns it into a blueprint or model suitable for running in Amazon’s cloud. Server configurations from the target application are mapped to EC2 instance sizes, storage requirements are compared with Amazon’s disk-based and SSD storage products, etc.
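
Conceptually, blueprinting is a mapping from discovered specifications to the nearest cloud equivalents. The sketch below chooses the smallest EC2 instance type that satisfies a server’s discovered CPU and memory requirements; the instance table is a small illustrative subset and the selection logic is an assumption, not CloudVelocity’s algorithm.

    # Illustrative subset of EC2 instance types: (name, vCPUs, memory in GiB).
    INSTANCE_TYPES = [
        ("m1.small", 1, 1.7),
        ("m1.medium", 1, 3.75),
        ("m1.large", 2, 7.5),
        ("m1.xlarge", 4, 15),
        ("m3.2xlarge", 8, 30),
    ]

    def blueprint(facts):
        """Map discovered CPU and memory facts to the smallest adequate instance type."""
        needed_cpu = facts["cpu_count"]
        needed_mem_gib = facts["memory_mb"] / 1024
        for name, vcpus, mem_gib in INSTANCE_TYPES:
            if vcpus >= needed_cpu and mem_gib >= needed_mem_gib:
                return {"instance_type": name,
                        "volumes": [d["mount"] for d in facts["disks"]]}
        raise ValueError("no instance type is large enough for this server")

    example = {"cpu_count": 4, "memory_mb": 12288, "disks": [{"mount": "/"}]}
    print(blueprint(example))  # {'instance_type': 'm1.xlarge', 'volumes': ['/']}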

Provisioning

During the provisioning phase, a secure connection is created between the target application and Amazon’s cloud. Virtual machines are configured, data is replicated, IP addresses are assigned, and everything is prepared for launch.
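
Against Amazon’s API, the provisioning step corresponds roughly to the calls sketched below using the boto3 SDK. The AMI, subnet, and region values are placeholders, and CloudVelocity’s own tooling wraps this kind of work, together with data replication and the secure connection, behind its own interface.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch an instance matching the blueprint (placeholder IDs shown).
    reservation = ec2.run_instances(
        ImageId="ami-xxxxxxxx",       # placeholder base image
        InstanceType="m1.xlarge",     # size chosen during blueprinting
        SubnetId="subnet-xxxxxxxx",   # placeholder VPC subnet
        MinCount=1,
        MaxCount=1,
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    # Wait until the instance is running, then attach a fixed (Elastic) IP address.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    allocation = ec2.allocate_address(Domain="vpc")
    ec2.associate_address(InstanceId=instance_id,
                          AllocationId=allocation["AllocationId"])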

Synchronization

The synchronization phase runs continually on Amazon, extracting changes from the target system and packaging them up as Amazon Machine Images (AMIs) that are generally 30-60 seconds behind the live application on the target system. Synchronization continues until the Amazon-based copy of the target system is required for use. If operating system or software patches are applied to the target system, the synchronization process ensures that these updates are reflected in the blueprint.
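
A very rough analog of this continual process, which periodically captures a fresh AMI of a cloud-side staging instance and tags it, is sketched below with boto3. It is not CloudVelocity’s mechanism, which extracts changes from the on-premise system itself; the instance ID and interval are placeholders.

    import time
    from datetime import datetime, timezone

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    STAGING_INSTANCE = "i-xxxxxxxx"   # placeholder: cloud-side copy of the target system
    INTERVAL_SECONDS = 60             # placeholder synchronization interval

    while True:
        stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
        image = ec2.create_image(
            InstanceId=STAGING_INSTANCE,
            Name="sync-" + stamp,
            NoReboot=True,            # capture the image without stopping the instance
        )
        ec2.create_tags(Resources=[image["ImageId"]],
                        Tags=[{"Key": "role", "Value": "dr-sync"}])
        time.sleep(INTERVAL_SECONDS)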

Service initiation

In order to keep costs low, most of the virtual machines specified during blueprinting remain dormant until service initiation. At this point, the blueprint is applied to boot a series of virtual machines in the correct order, pulling current data from the AMIs gathered during synchronization. Within a few minutes, a fully featured and accurate copy of the target system is running in Amazon and available for use.
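
Booting the blueprinted machines in dependency order corresponds roughly to the boto3 sketch below. The AMI identifiers, instance sizes, and ordering are hypothetical, and a real failover would also restore the networking, DNS, and security settings recorded in the blueprint.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Hypothetical boot order: database first, then application, then web tier.
    BOOT_ORDER = [
        ("ami-11111111", "m1.xlarge"),   # database server
        ("ami-22222222", "m1.large"),    # application server
        ("ami-33333333", "m1.medium"),   # web front end
    ]

    running = ec2.get_waiter("instance_running")

    for ami_id, instance_type in BOOT_ORDER:
        result = ec2.run_instances(ImageId=ami_id, InstanceType=instance_type,
                                   MinCount=1, MaxCount=1)
        instance_id = result["Instances"][0]["InstanceId"]
        # Wait for each tier to come up before launching the tier that depends on it.
        running.wait(InstanceIds=[instance_id])
        print(instance_type, instance_id, "running from", ami_id)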

6 Addressing security challenges

Security remains a concern for those considering the cloud for their core enterprise applications. Systems such as CloudVelocity’s address valid customer security concerns in a number of ways, including use of data encryption and firewalls.

Communications between customer data centers and Amazon’s cloud are encrypted and pass through a point-to-point SSH tunnel. Data sitting in Amazon’s cloud is stored in encrypted data volumes, secure from inspection by Amazon, other Amazon users, and malicious attacks.
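
Both building blocks described here can be reproduced with standard tooling. The sketch below opens a point-to-point SSH tunnel with the system ssh client and creates an encrypted EBS volume with boto3; the hostnames, ports, and sizes are placeholders, and this is an illustration rather than CloudVelocity’s implementation.

    import subprocess

    import boto3

    # 1. A point-to-point SSH tunnel: forward a local port to a database port on a
    #    cloud-side host so that replication traffic never travels in the clear.
    tunnel = subprocess.Popen([
        "ssh", "-N",
        "-L", "5432:10.0.1.20:5432",        # local port -> private cloud address
        "ec2-user@gateway.example.com",     # placeholder gateway host in the VPC
    ])

    # 2. An encrypted volume for data at rest in Amazon's cloud.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    volume = ec2.create_volume(
        AvailabilityZone="us-east-1a",
        Size=100,            # GiB, placeholder
        Encrypted=True,      # contents are encrypted at rest
    )
    print("created encrypted volume", volume["VolumeId"])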

Servers are protected by their own firewalls inside the Amazon cloud, and existing application security features from the target system are replicated in the Amazon-hosted version. No system with a connection to the internet can be completely secure, but it is possible and feasible to extend traditional enterprise security practices from the data center to the cloud.

Maintaining data security, in the cloud and as it moves between data centers

[Figure: data encrypted in transit between the data center and Amazon’s cloud, and at rest in encrypted volumes]

Source: CloudVelocity

7 Migration as a sustainable strategy for the enterprise

The complexity of legacy enterprise applications traditionally makes them difficult to move to the cloud without significant reengineering or virtualization. The workload migration approach proposed by companies like CloudVelocity may avoid the need for expensive and time-consuming reengineering, but there are concerns that this approach is little more than a short-term fix. Simple migration, it is suggested, fails to exploit the underlying capabilities of the chosen cloud computing platform; the migrated workload may run, but often inefficiently.

CloudVelocity argues that the intelligence built into the blueprinting and provisioning phases of its process serves to reduce the validity of this concern. Early evidence from the company’s first customers suggests that the performance gap may, indeed, be narrowing. QL2 Software’s Samir Bhakta, for one, is quick to praise CloudVelocity’s automation. The system will not, he attests, completely automate his complex migration, but it does more than enough to significantly reduce the cost, complexity, and time that would otherwise be required.

For mission-critical applications, where performance matters, and where a clear case for redesigning in order to “leverage the native features of the cloud” exists, rearchitecting as much as possible has value. But many more enterprise applications have insufficient justification for the expense of a comprehensive redesign. In these cases and, especially, in areas such as DR, an efficient and reliable means of augmenting on-premise capabilities with the cloud is likely to be of interest for the foreseeable future.

8 Key takeaways

  • Traditional IT applications depend upon detailed configuration and a complex set of interactions between hardware, software, and network. This complexity has reduced the appetite for migrating these applications to the cloud, regardless of potential cost and agility benefits.
  • Short-term and highly replicable workloads such as those typically associated with training or dev/test activities were early examples of enterprise use of the cloud. A new generation of more capable cloud migration tools offers a richer and more complete simulation of complex enterprise IT environments.
  • Specific enterprise use cases such as DR present a more compelling economic case for exploring the cloud, even for the most complex applications.
  • Automation is key to ensuring that enterprise applications can be replicated in the cloud in a consistent, timely, and cost-effective manner. Complex multiserver system migrations may not be completely automated yet, but tools are now available that significantly simplify the process.
  • Migration of workloads to the cloud is a cost-effective way to address a particular set of business challenges. Migration is not the only option. Rearchitecting applications to take advantage of cloud-based resources may prove more beneficial in the longer term.

9 About Paul Miller

Paul Miller is an analyst and consultant, based in the East Yorkshire (U.K.) market town of Beverley and working with clients worldwide. He helps clients understand the opportunities (and pitfalls) around cloud computing, big data, and open data, as well as presenting, podcasting, and writing for a number of industry channels. His background includes public policy and standards roles, several years in senior management at a U.K. software company, and a Ph.D. in archaeology.

Paul was curator for Gigaom Research’s infrastructure/cloud computing channel during 2011, routinely acts as moderator for Gigaom Research webinars, and has authored a number of underwritten research papers such as this one.

10 About Gigaom Research

Gigaom Research gives you insider access to expert industry insights on emerging markets. Focused on delivering highly relevant and timely research to the people who need it most, our analysis, reports, and original research come from the most respected voices in the industry. Whether you’re beginning to learn about a new market or are an industry insider, Gigaom Research addresses the need for relevant, illuminating insights into the industry’s most dynamic markets.

Visit us at: research.gigaom.com.

11 Copyright

© Knowingly, Inc. 2013. "How to resolve cloud migration challenges in physical and virtual applications" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.
