Discussions of vendor strategy for technical systems have traditionally surfaced conflicting requirements in software engineering. This is no different in today’s business, where cloud-based technology is an innovation accelerator. One example: you want to be fast, for instance by not redeveloping what others already provide, yet you do not want to depend too heavily on your vendors, in order to limit strategic risk. There are a number of drivers that may push businesses towards basing their products on more than one cloud provider’s services. These drivers include reducing perceived risk from a single supplier, meeting regulatory requirements, avoiding lock-in, and meeting end-customer demands.
In terms of risk reduction, some organizations have rules or preferences to reduce the dependency on a single supplier. Risk reduction through diversification is a classic approach. In such cases, the organization may view the cloud provider as a supplier and may adopt a multi-provider strategy to reduce risks. Such risks include the relationship with a particular cloud vendor turning sour, operational issues the cloud provider may run into, sudden price increases, changes in licensing, breaking changes, or similar scenarios.
From a regulatory perspective, businesses may face challenges meeting certain regulatory requirements such as data residency in particular locations or other similar concerns that may make it difficult to deploy solutions on only one cloud provider, due to the regional locations of the provider’s data centers. When expanding globally, businesses experience very different local regulations and laws, which may prohibit usage of certain providers as well.
Finally, many organizations offer solutions to their end customers that are built on a particular set of cloud services. Such organizations may feel pressure to deploy their solutions on other cloud providers as well to broaden their addressable market, especially if some of their end customers have strong preferences or significant contractual relationships with a particular cloud provider. In addition, this plays a role in scenarios where large amounts of data are used in integrated use cases that would otherwise require cross-provider data transfer and integration.
For all the aforementioned reasons, customers may find themselves in a position in which a multi-vendor strategy appears preferable to incurring the opportunity cost, such as not entering a particular market or not addressing certain customer segments. Then the question becomes, “what is the recommended way to navigate the complexities and challenges of a multi-cloud-vendor approach?” To answer this question adequately, it may be best to recap the drivers for cloud adoption once more. What are the benefits of using cloud services in the first place? Then we will discuss the consequences of going multi-vendor.
Cloud Benefits in a Nutshell
There are numerous benefits the cloud offers that broadly fall into three main categories: financial, operational, and competitive benefits.
The financial benefits of the cloud include trading inflexible and potentially risky capital investments for variable expenses that closely track the amount of actual resource usage. This pay-as-you-go approach lowers the bar for investing in new products and services, as it allows you to try out ideas with minimal resources and then scale up or shut them down based on the value the products or services actually create.
The ability to easily scale resources up and down as needed also simplifies operations. Instead of needing to worry about capacity planning and then spending time and other resources on racking, stacking, and powering servers, you can focus on projects that differentiate your business. At the same time, the cloud enables you to go global in minutes, expanding your reach around the world with minimal operational overhead. Finally, the rapid pace of innovation in cloud services makes it much easier for customers to be agile from an operational perspective. You can easily turn on new features and quickly integrate new cloud services into your existing projects when you are already present in this environment. This applies to very basic services like virtual machines and virtual network infrastructure, but also to higher-level services like machine learning and AI services.
The combination of the financial and operational benefits of the cloud allows you to create and sustain competitive advantage. With a low financial barrier to entry to bring new services and features to market, and significantly lower time-to-market at virtually limitless scale, you can create a culture of innovation that encourages experiments, gains valuable feedback from your customers, and rapidly responds to your customers’ needs – setting your business apart from the competition.
Consequences of a Multi-Vendor Approach
A cloud vendor strategy may contain several different elements, for example to account for large portfolios of different products, solutions, and trials. In larger organizations, you may find a strategy that includes a vendor concept spanning different departments and businesses. In this section, we focus on the effect of a multi-vendor approach on a single product, solution, or business unit. So, what does it mean for an organization to run a product or solution on top of at least two different cloud providers? When bringing a solution to market on multiple cloud providers, there are typically trade-offs between the inherent benefits of each cloud provider’s features and the consistency of the solution in question.
The two extreme approaches are:
- Build the same product twice, fully leveraging each provider’s strengths and technology, but ending up with two product implementations that differ at least to some extent
- Build on the least common denominator between the providers, resulting in a more consistent implementation but using essentially none of the higher-level services of a given provider
Of course, there are shades of gray in between, but for the moment, let’s analyze these extremes, starting with the former:
Building it natively twice
The natural way to start a business based on cloud technology is to start with one provider and use all the available features. The goal is to quickly build a minimum viable product and involve customers for early feedback. While the product is maturing, it also needs to scale with growing customer demand, and the teams must keep innovating. This works best when you use the full breadth of a provider’s services. You may want to use tools for operational monitoring and audit logs, services for dynamic authentication, automatic scaling mechanisms that fit your resources to customer demand, any of the many purpose-built databases for optimal data storage, services for machine learning, and storage services for cost-optimized data storage. With every new feature in a provider’s service portfolio and with each emerging service, your product can adopt innovations quickly and so respond to emerging requirements. Your software developers’ engineering hours stay focused on differentiating features and customer value.
Now, when expanding the solution to a second cloud provider, you will face a major challenge. Your product depends on many interfaces and technology modules that either do not exist in the second provider’s portfolio or are not compatible in interface or behavior. Behavioral compatibility mainly concerns the operational qualities of the services, such as security, scaling, and backup and restore. Therefore, if you want to preserve the speed of development and an architecture optimized for operational requirements like cost and performance, you would need to build your system a second time around the available services and interfaces of the other provider. This applies to all code elements and scripts that make direct use of cloud provider interfaces. Of course, the effort depends on how many services you have used and which ones. In any case, you need to review the implementation, determine how much of the code base needs to be touched, and decide how to map the features of provider A to those of provider B. This determines how much engineering staff you have to add to build a second version of the solution.
In this case, you should expect that the two solutions will not be equal. They will certainly differ at the technical, code level. However, they may also differ from a customer’s point of view. The two systems will not have equal features for two main reasons: 1) the two solutions will have two different teams with different levels of maturity, especially if the solutions were not started at the same time; 2) depending on which of the first cloud vendor’s services you used, there might be services missing from the second cloud provider’s portfolio, so that you cannot provide the resulting product features to your customers unless you compensate with your own development. Over time, feature drift is hard to manage, especially because the underlying cloud vendors will release different new features with different priorities. You cannot build something new on vendor A and expect it to be available on vendor B as well, and vice versa.
Responding quickly to customer demand may thus become harder when you first need to figure out the differences between the two systems. In addition, educating the marketing and sales teams to explain precisely how the two solution variants differ may become an obstacle and potentially lead to customer confusion. This needs to be managed explicitly.
Building an abstracted solution
The other approach is to create an abstracted solution that runs in a self-contained manner so that it can be deployed on multiple cloud providers with nearly no modification. Such an approach may initially seem ideal as the product is developed once and deployed in multiple cloud environments. However, different cloud providers have different features and they are at different levels of maturity in their cloud service offerings. Consequently, the abstracted approach necessarily implies using the lowest common denominator in terms of cloud services.
At the same time, abstraction layers create numerous other challenges over time. The burden of developing each service that will be used in a self-contained fashion or, alternatively, creating and maintaining an integration layer between the abstraction layer and cloud native services, results in significantly less agility, slower time to market, and increased operational overhead. It also creates a blocker for innovative experimentation with new features and services that the cloud provider releases. Employing an abstraction layer to maintain consistency between cloud providers can effectively tie the hands of product teams.
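The lowest-common-denominator effect of an abstraction layer can be illustrated with a minimal sketch. Everything here is hypothetical: `BlobStore`, `ProviderAStore`, `ProviderBStore`, and `archive_after` are made-up names, and plain dictionaries stand in for real provider storage services. The point is only that application code written against the portable interface cannot reach a feature that just one provider offers.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Portable interface: only operations that both providers support."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class ProviderAStore:
    """Hypothetical adapter; a dict stands in for provider A's storage service."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self.archive_rules: dict[str, int] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def archive_after(self, key: str, days: int) -> None:
        # Provider-specific extra with no counterpart on provider B,
        # so it cannot be part of the portable BlobStore interface.
        self.archive_rules[key] = days


class ProviderBStore:
    """Hypothetical adapter for a second provider with a smaller feature set."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: BlobStore, name: str, body: bytes) -> bytes:
    """Application code written once against the portable interface."""
    key = f"reports/{name}"
    store.put(key, body)
    return store.get(key)
```

`save_report` works unchanged with either adapter, but it has no way to call `archive_after`. To use that feature, a team would have to either bypass the abstraction or build and maintain an equivalent for provider B, which is the integration burden described above.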
In addition to abstraction layers, there is also the concept of focusing on basic compute only. In this concept, you would only use virtual machines and/or container services and run the entire software stack yourself, without using any higher-level services of the cloud provider. You would deploy and maintain databases, messaging systems, and machine learning environments (to name a few prominent examples), and all your application code would only interface with the software that you have deployed. In this case you would not need an abstraction layer, but you would need to maintain the base software and ensure its operational qualities yourself (security, performance, scalability, cost optimization). Going this way, you will find that your engineers focus on providing, stabilizing, and optimizing basic services like database clusters or message brokers, and on dealing with basic security and scalability issues. A considerable number of engineers with highly specialized technological knowledge will be required, in addition to the staff that you need to build the actual solution with your product’s differentiating features. While this solution will be highly portable and consistent, it will be the most costly to implement and will deliver very little customer value relative to the effort invested.
Conclusion on consequences
Whether you choose to deploy a solution using a lowest common denominator, highly abstracted approach or to create multiple versions of the product using cloud native services, there is the added operational complexity of learning and operating multiple clouds. Operational efficiency is lowered as teams must handle security, networking, provisioning, and operations in more than one cloud environment. It is a significant enough effort to hire, train, and retain individuals to effectively manage one provider’s cloud, so the scale of this challenge should be factored in when considering a multi-vendor approach.
Finally, there is the consideration around cost. Customers have better purchasing power and end up paying less with a predominant provider because of volume discounts and other advantages. These savings aren’t possible when you split what you buy across multiple vendors, and are typically eroded further the more you abstract out the underlying native services you choose to build your solution on.
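The volume-discount effect can be made concrete with a small arithmetic sketch. The price tiers below are purely hypothetical and are not taken from any provider's price list. The sketch also assumes that both providers use the same tiered schedule, which will rarely hold exactly in practice.

```python
from typing import Optional

# Hypothetical schedule: (tier size in units, price per unit in cents).
# A size of None means the tier is unbounded.
TIERS: list[tuple[Optional[int], int]] = [
    (100, 100),   # first 100 units at 100 cents each
    (None, 80),   # every further unit at 80 cents
]


def cost_cents(units: int) -> int:
    """Total cost in cents for `units` under the hypothetical TIERS."""
    total = 0
    remaining = units
    for size, price in TIERS:
        used = remaining if size is None else min(remaining, size)
        total += used * price
        remaining -= used
        if remaining == 0:
            break
    return total


single_vendor = cost_cents(200)                   # all usage with one provider
split_vendors = cost_cents(100) + cost_cents(100)  # same usage split in half
```

Under these assumed tiers, 200 units with one provider cost 18,000 cents, while splitting the same usage evenly across two providers costs 20,000 cents, because neither half reaches the discounted tier.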
Taken together, it is clear that a number of challenges and trade-offs emerge when supporting a multi-vendor approach to building, deploying, and operating a solution. It is important to fully factor in the financial, operational, and competitive benefits that you may be giving up and determine whether the market opportunity truly justifies the trade-offs.
How many additional sales will you actually close in exchange for the substantially higher cost of going multi-vendor?
Strategies & Approaches
Most multi-vendor strategies include one or more of the following three approaches: self-contained hosting, third-party services for feature abstraction, and cloud native services.
The idea of self-contained hosting is to use containers, virtualization, or other common abstraction layers to enable your solution to be deployed virtually as-is across different vendors’ clouds. On the surface, this approach appears to offer straightforward security boundaries, simple units of scale, and minimal feature drift between versions of your solution that run on different clouds. However, in practice, you may encounter a number of challenges with this approach. For example, your solution’s current requirements or future roadmap may include higher-level services such as data lakes, machine learning/artificial intelligence, or others that cannot be easily replicated in a self-contained environment. Furthermore, the unit of scale may not be cost-effective compared to other approaches (e.g. containers versus functions). And this approach does not provide a means to take advantage of the rapid pace of innovation of the cloud providers.
To address the need for higher-level services while preventing feature drift between versions of a solution that runs on multiple vendors, you can employ third-party services that have already been built to run on the clouds you plan to use. For example, you could use a third-party solution for machine learning that runs on multiple cloud providers as a means of integrating ML into your multi-cloud solution. Like the self-contained hosting option, the third-party services option minimizes feature drift at the cost of agility, operational efficiency, and ongoing cost-effectiveness. You become responsible for testing, deploying, and integrating updates to the third-party software; securing, scaling, and operating the infrastructure the software runs on; and maintaining the relationship with the third-party vendor(s) for licensing, feature requests, and support across multiple clouds. Here, you may look for software-as-a-service solutions, where a vendor provides a service model that makes you independent of the underlying cloud infrastructure for that service.
Alternatively, you can use cloud native services in your solution. The challenge with this approach is that different cloud vendors are at different levels of maturity in both the breadth and depth of the native services they offer. You may face trade-offs such as using third-party services on some clouds to make up for incomplete feature sets or services that are lacking on one hand, or launching different versions of your solution that have differences based on the underlying native services available on the other.
Whichever approach, or combination of approaches, you choose, there is a fundamental need to address end-customer data in terms of tenancy and availability to your services. Does your solution have the requirement that your end-customer must choose a version of your product based on the underlying cloud vendor? If not, does the solution require customer data to be in sync if they use your product/service on more than one cloud?
If data needs to live in multiple places, you may need a migration mechanism to move data from one cloud environment to another. Or, in many cases, you may need to continually synchronize the data between the different clouds. Continually moving data between cloud providers creates architectural challenges around synchronization latency (how does your solution handle updates that have yet to fully replicate and how does this handling affect your end-user experience?) and cost of data transfer.
One solution is to abstract the data access through an API and implement architectural components to reduce and, when necessary, accelerate data transfer. Another approach is to enforce separation through tenancy. A tenant on AWS, for example, is separate from a tenant on another cloud provider. If your end-customer creates the tenant on AWS, they will not have access to the data they put on another tenant in some other cloud, and vice versa.
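Separation through tenancy can be sketched as one independent tenant store per cloud deployment. This is a minimal illustration under stated assumptions: `TenantStore` is a hypothetical class, and an in-memory dictionary stands in for whatever database each deployment would really use. The deployments share no state, so no cross-cloud synchronization is needed.

```python
class TenantStore:
    """Tenant data for one cloud deployment; deployments share nothing."""

    def __init__(self, cloud: str) -> None:
        self.cloud = cloud
        self._data: dict[str, list[str]] = {}

    def create_tenant(self, tenant_id: str) -> None:
        if tenant_id in self._data:
            raise ValueError(f"tenant {tenant_id!r} already exists on {self.cloud}")
        self._data[tenant_id] = []

    def write(self, tenant_id: str, item: str) -> None:
        self._records(tenant_id).append(item)

    def read(self, tenant_id: str) -> list[str]:
        # Return a copy so callers cannot mutate stored data.
        return list(self._records(tenant_id))

    def _records(self, tenant_id: str) -> list[str]:
        if tenant_id not in self._data:
            raise KeyError(f"no tenant {tenant_id!r} on {self.cloud}")
        return self._data[tenant_id]


# The same end-customer creates a tenant on each of two clouds.
aws = TenantStore("aws")
other = TenantStore("other-cloud")
aws.create_tenant("acme")
other.create_tenant("acme")
aws.write("acme", "invoice-1")
```

After these calls, `aws.read("acme")` contains `"invoice-1"` while `other.read("acme")` is empty: data written to the tenant on one cloud is simply not visible from the tenant on the other.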
Remember: Data has gravity. Applications will always need to be deployed close to where the data is.
Summary
When defining their vendor and cloud strategies, today’s companies face a complex set of requirements to build a sustainable business based on cloud technology. Fast time to market and agility are as essential as high-quality products and services that serve the customers well. In addition, total cost of ownership, strategic risk considerations, and strategies for maximizing the customer base are substantial considerations.
In conclusion, you should not use a multi-vendor strategy for one solution or product unless you can justify the additional effort and complexity with a substantial increase in market reach or customer adoption.
