Only 5–6 years ago, the idea that the cloud would be in widespread use was viewed by most IT businesses as dubious at best. Why give away your own — or, worse, your clients’ — sensitive data to an independent company, which could prove careless or malicious? Why upload it somewhere on the Web, when you have your own private and reliable data center with carefully developed systems and databases and physically available servers, over which you have full control?

Today, we can safely say that a mind shift has happened: with the rise of massive data objects and distributed data management, the cloud has become a major trend, and general interest in cloud technologies continues to grow. According to Forbes, spending on cloud computing has grown at 4.5 times the rate of IT spending since 2009 and is expected to grow at more than 6 times the rate of IT spending through 2020.

However, the issue of cloud migration and cloud operations is tricky. There are numerous questions that arise from the idea of moving to the cloud. We have addressed some of these questions here and hope that this paper will help you better understand the issues surrounding cloud computing.

Every business situation is unique, and we recommend working with an IT consultancy to identify and implement the cloud solutions that best fit your needs. Should you need deeper insights tailored specifically to your business needs, we are always here to help.

1. Introduction

1.1. What are the advantages of moving to the cloud?

1.2. Is it for everyone, or are there any companies that are better with an in-house system?

We should distinguish between three options: the traditional in-house model, the public cloud and a private cloud. The traditional model is characterized by the manual or semi-automated provisioning of server resources, which means that the data is much less safe in such environments. Cloud computing means the fully automated provisioning of server resources (the ‘pets vs. cattle’ metaphor: servers are treated as interchangeable, replaceable units rather than as individually maintained machines).

Usually, when discussing the advantages of cloud migration, we are speaking of the public cloud, which is owned and maintained by third-party cloud service providers. Private clouds are those that are built exclusively for an individual organization. Some organizations that use huge data warehouses and BI workloads may prefer a private cloud solution due to the benefits in workload performance, data integration, privacy and data security. Take, for example, the cases of Dropbox [see Bloomberg, 01.03.2018] and Walmart [see Reuters, 14.02.2018], which recently migrated from AWS, built their own data centers and saved millions by doing so. But you need to be that big. For most small and mid-sized companies without a huge data warehouse like Walmart’s, a public cloud is probably the most efficient option.

There is also another option that could be beneficial for customers who want to use the cloud but are precluded from using the public model by regulations, data sensitivity, or the location of data. Recently, Microsoft developed its own hybrid solution, Azure Stack, which gives customers a way to use a familiar cloud platform without placing their sensitive data into a multi-tenant environment. It should not be viewed as a standalone virtualization platform: it includes the basic infrastructure-as-a-service (IaaS) functions that make up a cloud, such as virtual machines, storage and virtual networking, as well as some platform-as-a-service (PaaS) features, including a container service, serverless computing software, and MySQL and SQL Server support.

1.3. What should we do in advance? Are there any typical mistakes that we should watch for?

To name just a few:

For example, a good prototype for a database should include the deployment and benchmarking of queries with the volume of data that the database owners expect in a year.
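As a rough illustration of such a prototype benchmark, the sketch below loads synthetic rows into an in-memory SQLite database and times one query at the current volume and at the volume expected in a year. SQLite, the `orders` table, the row counts and the growth factor are all assumptions made for the example; a real prototype would run against the target cloud database with representative data and queries.

```python
import sqlite3
import time


def benchmark_query(row_count: int, query: str, repeats: int = 5) -> float:
    """Load `row_count` synthetic rows and return the best wall-clock
    time (in seconds) of `query` over `repeats` runs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, used only for this illustration.
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        ((i % 1000, float(i % 97)) for i in range(row_count)),
    )
    conn.commit()

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(query).fetchall()
        best = min(best, time.perf_counter() - start)
    conn.close()
    return best


if __name__ == "__main__":
    current_rows = 10_000   # hypothetical current data volume
    yearly_growth = 5       # hypothetical growth factor over one year
    query = "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    for rows in (current_rows, current_rows * yearly_growth):
        print(f"{rows:>8} rows: {benchmark_query(rows, query):.4f} s")
```

Comparing the two timings gives an early signal of whether the chosen database and instance size will still cope with next year's data.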

2. Technical issues

2.1. How do we plan the migration? How should we organize the entire process to make it efficient and painless?

Here is our typical recommended migration framework:

One important recommendation is to always keep in mind the wider development context within the organization. Just like any other complex transformation program, the cloud migration roadmap should be aligned with the roadmaps of other company-wide projects. This avoids conflicts and duplication of effort, and ensures that we do not overlook critical milestones in those other roadmaps as we design and plan our own.

2.2. How do we select a proper migration strategy? Can we just ‘lift and shift’ our apps, or should we transform the entire architecture?

There are several migration strategies:

Rehost

Lift and shift style migration by deployment of the app in the cloud or migrating VMs.

Pros/Cons

Replatform

Change the platform to be able to run in cloud.

Pros/Cons:

Refactor

Change the implementation and architecture of your solution to remove dependencies, optimize performance and make it more robust, scalable and fault tolerant.

Pros/Cons

At first, we can always choose the quickest and easiest option and then adapt the solution to the cloud as necessary. In our experience, approximately one third of clients are fine with just the rehosting option, and another third with the re-platforming option. However, bear in mind that the cloud has huge transformational potential and the migration is a great opportunity to optimize the old architecture, get rid of unnecessary dependencies and replace the obsolete parts of the system.

A typical symptom suggesting that some re-architecture is required is a client saying: “We are happy with how everything works, but we have this small problem…” (the cost of the cloud solution, MS SQL servers, etc.).

2.3. Our system is huge. How do we transfer our data to avoid high network costs, long transfer times and security concerns?

There are several solutions. One example is Amazon Snowball. It is a data storage and transfer appliance for AWS with a capacity of up to 50 TB that you can request from AWS (one or several in parallel). It is almost indestructible, self-contained and tamper-resistant. To quote the Amazon website, “It is rugged enough to withstand a 6G jolt and light enough for one person to carry. It is weather-resistant and serves as its own shipping container.” Amazon Snowball was the first solution of its kind on the market, and since 2017 the Azure Data Box and Google Transfer Appliance have also been introduced.

2.4. Our service processes multiple transactions per second. How do we move it to the cloud and ensure we don’t lose any transactions?

With large, complex and specialized transactional systems (banking, for example), we never just ‘switch’ from the old system to the new one. We always strive to establish a double-write system, with one copy in the cloud and another backup copy remaining on premises, to allow for a rollback if something goes wrong. The migration is always phased to avoid any risk of malfunction in the new system. During the pilot migration we could transfer 5% of all transactions, or a particular type of transaction, or only transactions belonging to a particular client. Only after thorough testing, when the correct processing of all types of transactions is confirmed, would we proceed with the following phases of migration.
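The phased, double-write idea can be sketched roughly as follows. This is a simplified illustration, not a production design: the function names, the `id` field and the choice of hashing the transaction ID to select the pilot group are assumptions made for the example. Every transaction is still written on premises, and only the pilot share is also written to the cloud.

```python
import hashlib
from typing import Callable, Dict

Transaction = Dict[str, str]


def in_pilot(transaction_id: str, pilot_percent: int) -> bool:
    """Deterministically decide whether a transaction is in the pilot group.

    Hashing the ID (rather than drawing a random number) keeps the
    decision stable if the same transaction is retried.
    """
    digest = hashlib.sha256(transaction_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < pilot_percent


def process(
    transaction: Transaction,
    on_prem_write: Callable[[Transaction], None],
    cloud_write: Callable[[Transaction], None],
    pilot_percent: int = 5,
) -> bool:
    """Write on premises always; also write to the cloud for the pilot share.

    Returns True if the transaction was double-written.
    """
    on_prem_write(transaction)  # on-premises copy kept for rollback
    if in_pilot(transaction["id"], pilot_percent):
        cloud_write(transaction)
        return True
    return False


if __name__ == "__main__":
    on_prem, cloud = [], []
    for i in range(1000):
        process({"id": f"tx-{i}"}, on_prem.append, cloud.append)
    print(f"on premises: {len(on_prem)}, cloud pilot: {len(cloud)}")
```

Raising `pilot_percent` step by step corresponds to the later phases of the migration; the same selection function could instead filter by transaction type or by client.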

2.5. Should we transfer the entire system or are there parts better left out of cloud?

A common approach is to keep a hybrid infrastructure and use the cloud to enhance one’s technical capabilities, rather than to move to it, accept it as the only new model and never look back. One type of data often left out of the cloud is third-party confidential data or software: many companies still prefer to store such data on premises for security and compliance reasons. Another type is specialized computing-intensive operations, e.g. scientific GPU computing, rendering and so on. There is usually a highly developed on-premises infrastructure in place for such operations, and the cloud may lack sufficient computing power or simply be too expensive.

2.6. Which particular technologies could you recommend?

Based on our experience, the suggested approaches are, first and foremost, continuous integration and delivery (CI/CD), and second, infrastructure as code (IaC). Those two practices involve massive automation of infrastructure, DevOps and release management.

In particular, we recommend the following implementations:

2.7. Should we use any cloud-specific SaaS or PaaS components to replace parts of the legacy systems?

It depends on the particular business case.

Yes: because such components make it possible to cut numerous costs on licenses, operation, security updates and patches, etc.

No: if you use the system in a very specific way that requires either numerous customizations and enhancements, or very specific performance levels or SLAs, standard SaaS and PaaS offerings will not work for you.

One example of the latter could be the choice between hosting a relational database on an Amazon EC2 instance or migrating its contents to an Amazon RDS instance. Amazon RDS is easier to set up, manage, and maintain than a database in Amazon EC2, and lets you focus on other tasks rather than day-to-day database administration. It is a simple out-of-the-box solution if you just want to run a regular database. Alternatively, running a database in Amazon EC2 gives you more control, flexibility, and choice, and makes it possible to set up specific performance or a customized configuration.

2.8. Should we ‘buy’ instead of ‘build’ (i.e., replace legacy components or entire solutions with services on the market)?

The answer is the same as for the question about SaaS or PaaS components. Buying is reasonable and cost-effective if your needs are not too specific. Building is recommended if you need a fully customized solution.

We did solution design for a client, a multinational travel corporation, who wanted to combine several proprietary apps into one and needed to select a platform for this new system. They had very specific requirements with regard to UX, and they also needed to integrate with their other systems.

We had to decide between building it ourselves, which would require millions in investment and many months of development, or buying and adapting ready-made components from MS Dynamics 365.

At first, the second option seemed cheaper. However, eventually we discovered that:

  1. MS Dynamics products lacked most of the required functionality, so we would have to customize them to make them work with our other systems, and this would still cost much more than building from scratch;
  2. any new features that we add would become the intellectual property of Microsoft.

These are the two main dealbreakers that lead some companies to develop their own solutions instead of extending third-party products.

3. Financials

3.1. How do we measure financial gains? (Also, ‘cheaper or not?’)

There are several factors you should consider in order to determine if the cloud is less expensive for you.

  1. Costs: a monthly (or yearly) invoice from a cloud provider, which could prove quite high depending on usage. The cloud is not cheap.
  2. Savings: your current IT expenses, including hardware and its maintenance, software licenses, rented data center space, and labor (not just server admins, but also information security, audit, physical security, building maintenance, etc.). Most of these will no longer be necessary after permanently migrating systems to the cloud.
  3. Costs: your new IT expenses. Some expenses may be gone, but there will be new ones. The cloud involves a new discipline and different DevOps practices. You will need to hire new people and train existing employees in the new skill set, and they will need to spend time optimizing and managing CI/CD in the cloud.
  4. Savings: elasticity. If you need a thousand VMs for the duration of the Thanksgiving sale, you can easily get them in the cloud.
  5. Costs: if you migrate to the cloud lift-and-shift style, without any optimization, operational costs will be higher in the mid-term.
  6. Savings: in the long term, however, the cloud will allow you to reduce costs even without any optimization, because you no longer replace and maintain hardware. (Hardware breaks. If it is your server, you buy another one or deal with a vendor, you replace disks and clean the dust from server blades. In the cloud, you receive an email 30 days in advance, and in about 10 minutes you replace the VM in question with a newer version.) There are also many options for optimization.
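The factors above can be combined into a simple back-of-the-envelope comparison. The sketch below is only an illustration: the three cost categories are a simplification of the list, and all figures are hypothetical placeholders, not benchmarks.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCosts:
    """Hypothetical monthly figures in USD."""
    cloud_invoice: float           # invoice from the cloud provider
    new_staff_and_training: float  # new cloud and DevOps skills
    retired_on_prem: float         # hardware, licenses, space, labor no longer needed

    def net_change(self) -> float:
        """Positive: the cloud costs more per month. Negative: it saves money."""
        return self.cloud_invoice + self.new_staff_and_training - self.retired_on_prem


if __name__ == "__main__":
    # Placeholder numbers chosen only to show the arithmetic.
    estimate = MonthlyCosts(
        cloud_invoice=12_000.0,
        new_staff_and_training=4_000.0,
        retired_on_prem=18_500.0,
    )
    print(f"Net monthly change: {estimate.net_change():+,.2f} USD")
```

Running the same calculation for the mid-term (lift-and-shift, no optimization) and for the long term (after optimization) makes the difference between the two scenarios explicit.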

3.2. What are possible cost optimization options?

To name just a few:

3.3. What are the hidden costs of the cloud and how can we avoid them?

One big source of unexpected costs is a lack of governance combined with too much technical freedom. Sometimes people simply forget to turn off virtual machines that they do not really need. If nobody monitors resource allocation, costs can start to grow uncontrollably. It is highly recommended to plan ahead for at least 12–18 months.
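A basic form of such monitoring is a periodic report of machines that look idle. The sketch below assumes you can already export average CPU utilization and monthly cost per VM from your provider's monitoring and billing tools; the `VmUsage` record, the 5% threshold and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmUsage:
    name: str
    avg_cpu_percent: float   # average CPU over the review window
    monthly_cost_usd: float


def find_idle_vms(vms: List[VmUsage], cpu_threshold: float = 5.0) -> List[VmUsage]:
    """Return VMs whose average CPU is below the threshold."""
    return [vm for vm in vms if vm.avg_cpu_percent < cpu_threshold]


def wasted_spend(vms: List[VmUsage], cpu_threshold: float = 5.0) -> float:
    """Monthly cost of the VMs flagged as idle."""
    return sum(vm.monthly_cost_usd for vm in find_idle_vms(vms, cpu_threshold))


if __name__ == "__main__":
    # Hypothetical sample data.
    fleet = [
        VmUsage("build-01", 62.0, 250.0),
        VmUsage("build-02", 1.5, 250.0),
        VmUsage("demo-old", 0.3, 120.0),
    ]
    for vm in find_idle_vms(fleet):
        print(f"idle candidate: {vm.name} ({vm.monthly_cost_usd:.0f} USD/month)")
    print(f"potential monthly saving: {wasted_spend(fleet):.0f} USD")
```

Low CPU alone does not prove that a machine is unused, so the output is best treated as a list of candidates to review with their owners.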

In one case, a client used cloud virtual machines to run builds on CI. They started with 5 servers and the costs were very moderate, but after several months this number grew to 80 and the cloud costs went up to 20,000 USD/month. The system was designed in such a way that it was not possible to simply downscale the number of machines. We solved this problem by redesigning the system and moving CI processing to spot instances.

Another example is Redshift. Redshift is an amazing database, very fast and very scalable, but also very expensive. You need to have a very good reason to use it. You can scale it up at any time by adding more servers. The cost of your cluster can grow to thousands of dollars per month, and if you are not careful, this cluster becomes an essential part of your system that your business depends on. Without thoughtful long-term planning, it is highly risky to invest in such technology, as it can make you bleed money.

Software licenses are another topic. Enterprise systems, such as database servers, are sometimes licensed per CPU core. The more CPUs your server has, the more your license costs. If your cloud solution is very scalable and you have a lot of CPU cores in the cloud, the license can become prohibitively expensive.

4. Security and Compliance

4.1. This old question again: why should we even trust a third party — a public cloud provider — with any sensitive data?

Mature cloud providers invest heavily in security and compliance. The effort required to match their level may simply be too expensive to implement within most companies’ on-premises systems. Check the main providers’ security pages: Microsoft Azure, AWS, and Google.

4.2. How do we select a cloud provider?

A potential cloud service provider must be checked carefully to make sure it is qualified, reliable, financially stable enough to operate over the long term, and has a good reputation on the market. Look for compliance with industry standards and certifications such as ISO 27001 or the Cyber Essentials scheme. Also check Microsoft’s short guide on provider selection.

4.3. How do we ensure the security of the cloud and DevOps?

From the security point of view, the cloud is not much different from any other service: you must implement best practices for the security management of your data, authentication and access controls. However, cloud providers give you tools designed specifically for securing a cloud environment.

  1. User permissions. You must manage permissions for your cloud’s users, operators and developers. This is called identity and access management (IAM). The most important best practice is that you should limit user access to resources and data on a need-to-have basis, striving to have a minimal set of permissions.
  2. Networking infrastructure security includes managing network topology, firewall rules, traffic security policies, and points of contact with the Internet, covering data both ‘at rest’ and ‘in transit’.
  3. Data security includes access control to the data storage, including files and databases, and encryption of data to mitigate damage even in the event of a data leak.
  4. Secure software development. Software developed for the cloud should follow the usual set of industry-accepted security best practices: practice security-first programming, audit code changes, and conduct code reviews, security audits, etc.
  5. Secret management. Do not leak your security credentials to the code repository, but instead keep credentials (such as the database server password) in secret management vaults integrated in your cloud solution. This allows you to easily monitor access to passwords, rotate credentials and revoke access that is no longer necessary.
  6. Human factor. At the end of the day, the biggest threat is neither security vulnerabilities nor breaches: it’s people. Your users may not be careful with their passwords or keys. Educate them, remind them regularly of the importance of data security, and do not forget to set a password rotation policy.
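To illustrate the secret management point, here is a minimal sketch of code that never contains the credential itself. It assumes that your deployment pipeline or the cloud provider's secret vault injects the secret into an environment variable at runtime; the variable name `DB_PASSWORD` and the helper `get_secret` are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when an expected secret has not been provided at runtime."""


def get_secret(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it.

    Assumption: the secret vault or deployment pipeline sets the
    variable before the application starts.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"Secret {name!r} is not set")
    return value


if __name__ == "__main__":
    try:
        password = get_secret("DB_PASSWORD")  # hypothetical variable name
        print("Database password loaded (value not printed).")
    except MissingSecretError as error:
        print(f"Configuration problem: {error}")
```

Because the repository holds only the variable name, rotating or revoking the credential is a change in the vault rather than a code change.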

By Yuri Gubin, solutions consultant & cloud expert at DataArt

Originally published at blog.dataart.com on October 01, 2018.