During my time at Palantir, I have spent significant time deploying our software in cloud environments, and a good chunk of time deploying it in on-premise (on-prem) environments, including starting a team dedicated to exactly that. I have noticed that despite the common preference for cloud deployment, there are still real merits to deploying on-prem.
The Shift from On-Prem to Cloud Computing
Over recent years, the IT landscape has increasingly favored cloud computing, driven by the flexibility of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) offerings. The global cloud computing market grew from $24.63 billion in 2010 to $156.4 billion in 2020, and the trend continues: the market is predicted to surpass $1 trillion by 2028. This meteoric rise is powered both by new demand for compute and by the migration of existing on-prem workloads to the cloud.
There are good reasons for this shift: the cloud enables rapid provisioning of resources, geographic redundancy, and a shift from capital expenditures (CapEx) to operational expenditures (OpEx). However, I believe there are still scenarios where on-prem infrastructure is the better choice, particularly where specific technical requirements, such as deterministic latency, hardware-level control, and stringent security measures, are paramount.
Before we dive into the meat of comparing cloud and on-prem setups, let’s take a little time to explore how each deployment is usually set up.
Canonical On-Prem Setup
A typical on-prem setup involves a fully controlled environment where the enterprise manages all layers of the technology stack. This includes:
- Physical Layer: Hardware such as servers, storage arrays (SAN/NAS), and networking equipment (routers, switches, firewalls). You also need a data center to run your servers in.
- Virtualization Layer: Often implemented using hypervisors like VMware vSphere, Microsoft Hyper-V, or open-source alternatives like KVM, providing virtualized resources and isolation (a minimal KVM example follows this list).
- Storage and Compute: Directly managed, often optimized for specific workloads with custom configurations (e.g., RAID levels, caching mechanisms).
- Networking: Full control over networking protocols, routing, and security policies, enabling fine-tuned QoS (Quality of Service) and minimizing latency.
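To make the virtualization layer a little more concrete, below is a minimal sketch of defining and booting a guest VM on a KVM host using the libvirt Python bindings. The connection URI, domain name, disk image path, and resource sizes are all placeholder assumptions for a host where KVM and libvirt are already installed; a real deployment would layer proper image management and configuration tooling on top.

```python
# Minimal sketch: define and start a KVM guest via libvirt's Python bindings.
# Assumes a host with KVM/libvirt already set up; names, paths, and sizes are placeholders.
import libvirt

# Domain XML describing a small guest; the disk image path and bridge name are hypothetical.
DOMAIN_XML = """
<domain type='kvm'>
  <name>app-server-01</name>
  <memory unit='GiB'>8</memory>
  <vcpu>4</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/app-server-01.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'><source bridge='br0'/></interface>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
domain = conn.defineXML(DOMAIN_XML)     # register the guest definition
domain.create()                         # boot the guest
print(f"Started guest: {domain.name()}")
conn.close()
```

Everything below this API call is yours to manage: the hypervisor version, the storage backing the qcow2 image, and the bridge the guest attaches to.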
Canonical Cloud Setup
In a typical cloud setup, the infrastructure is abstracted and managed by the cloud provider, which offers:
- Virtualized Infrastructure: Compute instances, virtual networks, and storage are provisioned via APIs (see the sketch after this list). Cloud-native technologies such as Kubernetes and serverless architectures are leveraged for orchestration and scaling.
- Managed Services: Databases (e.g., Amazon RDS, Google Cloud SQL), data lakes, AI/ML services, and other advanced analytics tools are offered as managed services, reducing the operational overhead.
- Multi-Tenant Architecture: Resources are often shared among multiple tenants, with virtualization and containerization providing isolation.
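To contrast with the on-prem flow above, here is a minimal sketch of provisioning a comparable compute instance purely through the provider’s API, using boto3 against AWS EC2. The region, AMI ID, and instance type are placeholder assumptions, and in practice you would usually drive this through an IaC tool rather than an ad-hoc script:

```python
# Minimal sketch: provision a compute instance through a cloud API (AWS EC2 via boto3).
# The region, AMI ID, and instance type below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical AMI ID
    InstanceType="m5.xlarge",          # placeholder instance size
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "app-server-01"}],
    }],
)

instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched {instance_id}; no hardware to rack, but you pay for every hour it runs.")
```

There is no hardware to buy, rack, or cable here, which is exactly the trade-off the comparison below digs into.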
Comparison
- Scale and speed: The cloud excels in elastic scaling, facilitated by horizontal scaling mechanisms like auto-scaling groups and serverless functions. On-prem infrastructure requires careful capacity planning and investment in physical hardware, often involving vertical scaling. If you are a start-up that cannot wait 9 months for new compute, cloud is the way to go. However, if you are a big enterprise with the ability to forecast compute load over the next year, then on-prem could be a viable option.
- Latency: On-prem setups can achieve deterministic low latency due to the proximity of servers and direct control over network paths. For latency-sensitive applications (e.g., high-frequency trading, real-time analytics), this can be critical. Cloud environments, while optimized for low latency through features like AWS Direct Connect or Google Cloud Interconnect, can introduce variable latency due to factors like network congestion (the noisy-neighbor problem) and virtualization overhead. So if predictable, low-latency performance is critical to your business, an on-prem setup might not be a bad idea!
- Cost: Cloud pricing models (pay-as-you-go, reserved instances) provide flexibility but can become costly for sustained, high-volume workloads, particularly with high data egress costs. On-prem solutions, while requiring significant initial CapEx for hardware acquisition, can offer a lower total cost of ownership (TCO) over time, especially for predictable, high-utilization workloads. Cloud models can get pricey: Dropbox saved $16.8 million in 2016 when it moved most of its storage away from AWS to its own data centers. Here is an article that goes deeper into how being on-prem might save you $$. A simple break-even sketch follows this list.
- Expertise: On-prem environments demand deep expertise in hardware maintenance, network engineering, and systems administration. Conversely, cloud environments offload much of the infrastructure management to the provider, allowing teams to focus on application development and deployment, often using DevOps practices and Infrastructure as Code (IaC) tools like Terraform or CloudFormation. SREs and sysadmins are a rare breed nowadays, and you will probably need a sizable chunk of compute to justify hiring a team of these folks to maintain and operate an on-prem setup (and do not forget that you need a sustainable on-call rotation!).
- Security: On-prem setups provide complete control over security configurations, from physical security measures to granular network segmentation and encryption protocols. This control is essential for meeting specific compliance standards (e.g., PCI-DSS, HIPAA). In contrast, cloud environments require trust in the provider’s security measures, though features like Virtual Private Clouds (VPCs), dedicated hardware (e.g., AWS Outposts), and customer-managed encryption keys can mitigate some concerns. If you are running workloads that necessitate these security controls and protocols, deploying on-prem might be the only way to go; even if you are not, tighter security is still worth weighing. That said, I do think current cloud providers are adapting and have offerings that meet some of these standards (e.g., AWS GovCloud).
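To make the cost point above concrete, here is a back-of-the-envelope break-even sketch comparing a steady, always-on workload rented from a cloud provider against buying the hardware outright. Every figure is a hypothetical placeholder; the shape of the comparison matters, not the specific dollar amounts:

```python
# Back-of-the-envelope TCO comparison for a steady, always-on workload.
# Every number here is a hypothetical placeholder -- plug in your own quotes.

MONTHS = 36  # evaluation horizon

# Cloud: pure OpEx -- instances, storage, and data egress, billed monthly.
cloud_compute_per_month = 20_000
cloud_storage_per_month = 4_000
cloud_egress_per_month = 6_000
cloud_monthly = cloud_compute_per_month + cloud_storage_per_month + cloud_egress_per_month

# On-prem: CapEx up front, plus ongoing power, colocation, and staffing.
onprem_capex = 350_000          # servers, storage arrays, switches
onprem_opex_per_month = 12_000  # power, colo space, support contracts, ops staff share

cloud_total = cloud_monthly * MONTHS
onprem_total = onprem_capex + onprem_opex_per_month * MONTHS

# First month at which cumulative cloud spend overtakes cumulative on-prem spend (if ever).
breakeven = next(
    (m for m in range(1, MONTHS + 1)
     if cloud_monthly * m > onprem_capex + onprem_opex_per_month * m),
    None,
)

print(f"Cloud total over {MONTHS} months:   ${cloud_total:,}")
print(f"On-prem total over {MONTHS} months: ${onprem_total:,}")
print(f"Break-even month: {breakeven}")
```

With numbers like these, cumulative cloud spend overtakes the on-prem CapEx-plus-OpEx curve around month 20; if utilization drops or the workload is bursty, the comparison can flip just as quickly in the cloud’s favor.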
Conclusion
While cloud computing offers unmatched flexibility, scalability, and access to advanced managed services, on-premise solutions are still indispensable in scenarios requiring low latency, high security, and full control over hardware and software configurations. As the IT landscape evolves, the decision between cloud and on-prem should be guided by your specific technical and business requirements, so that the chosen infrastructure aligns with the organization’s strategic goals. Use the factors and comparisons above to make a more informed choice about which setup is better for you. Good luck!
This article was originally published by Adam (Xing Liang) Zhao on HackerNoon.