Why 99.9% Uptime Requires Infrastructure You Cannot Build at Home
A 99.9% uptime SLA means the server is unavailable for a maximum of 8.76 hours per year — roughly 44 minutes per month. Achieving this requires eliminating every single point of failure across power, cooling, network, and hardware. That is not possible in a standard office environment; it requires the layered redundancy architecture that only professional data centres provide.
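The downtime budget follows directly from the percentage. A minimal sketch (the helper function is illustrative, not part of any SLA tooling):

```python
# Downtime budget implied by an uptime SLA percentage.
# The function name is a hypothetical helper for illustration.

def downtime_budget(uptime_pct: float) -> dict:
    """Return the maximum downtime allowed by an uptime percentage."""
    down_fraction = 1 - uptime_pct / 100
    minutes_per_year = 365 * 24 * 60
    return {
        "hours_per_year": down_fraction * 365 * 24,
        "minutes_per_month": down_fraction * minutes_per_year / 12,
    }

budget = downtime_budget(99.9)
print(f"{budget['hours_per_year']:.2f} h/year")        # 8.76 h/year
print(f"{budget['minutes_per_month']:.1f} min/month")  # 43.8 min/month
```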
Power Redundancy: Never Running on One Source
Power failure is the most common cause of server downtime in Indian office environments. Data centres address this at multiple levels:
- Dual utility feeds: Connection to the electricity grid from two independent substations — an entire substation failure does not cut power
- Uninterruptible Power Supply (UPS): Large battery banks providing instantaneous switchover during grid interruptions — no gap between grid failure and generator startup
- Diesel generators: Automatic startup within 10–15 seconds — sustains power through extended grid outages for as long as fuel is replenished
- PDU redundancy: Servers with dual power supplies connected to independent power distribution units — a single PDU failure does not affect servers with redundant power
The contrast with an office UPS is stark: office units provide 15–60 minutes of backup power before their batteries are exhausted. During extended power cuts — 3–6 hours is common in many Indian areas — office servers are down. Data centre generators run for as long as diesel is replenished.
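The value of independent power sources can be made concrete with basic probability: the feed fails only if every source fails at once. A minimal sketch, where the per-source availability figures are illustrative assumptions, not measured values:

```python
# Availability of independent parallel power sources: the supply is
# down only when every source is down simultaneously.
# Per-source availabilities below are illustrative assumptions.

def parallel_availability(*availabilities: float) -> float:
    """Probability that at least one independent source is up."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1 - a)
    return 1 - p_all_down

grid = 0.99      # assumed: utility grid up 99% of the time
ups_gen = 0.995  # assumed: UPS + generator chain availability
print(parallel_availability(grid, ups_gen))  # 0.99995
```

Two mediocre sources in parallel already outperform either one alone — the same principle applied at every layer below.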
Cooling Redundancy: Servers Cannot Overheat
Servers generate substantial heat. Uncontrolled temperature rise causes thermal throttling (reduced performance) and eventually hardware failure. Data centres maintain precision cooling:
- Computer Room Air Handlers (CRAHs): Multiple redundant cooling units — typically N+1 configuration where N units cool the facility and one is on standby
- Chilled water systems: Chillers with redundant units providing cold water to air handlers
- Hot aisle / cold aisle containment: Server racks arranged to direct hot exhaust air to hot aisles and cool air to cold aisles — maximising cooling efficiency
- Temperature monitoring: Continuous temperature sensors throughout the facility with automated alerts
In contrast, a server room in an Indian office typically has a single split AC unit. When that unit fails or is turned off, server temperature rises rapidly. Many Indian server failures trace back to cooling failures — particularly in summer months when ambient temperatures are high.
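N+1 sizing from the cooling list above can be sketched in a few lines: provision enough units to carry the full heat load, plus one standby. The load and capacity figures are illustrative assumptions:

```python
import math

# N+1 cooling sizing: the facility stays fully cooled even with any
# single unit out of service. Heat load and unit capacity are
# illustrative assumptions, not real facility figures.

def units_needed_n_plus_1(heat_load_kw: float, unit_capacity_kw: float) -> int:
    """Units required so the load is still covered after one failure."""
    n = math.ceil(heat_load_kw / unit_capacity_kw)  # units to carry the load
    return n + 1                                    # plus one on standby

print(units_needed_n_plus_1(heat_load_kw=250, unit_capacity_kw=60))  # 6
```

With six units installed, five carry the 250 kW load and any one can fail or be serviced without losing cooling capacity — unlike a single split AC, which is effectively N+0.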
Network Redundancy: Multiple Internet Connections
Data centres connect to the internet through multiple independent internet service providers (ISPs) with diverse physical routing. If one ISP fails or one fibre cable is cut, traffic automatically reroutes through other providers. Border Gateway Protocol (BGP) routing manages this automatic failover at the network layer.
In practice, internet connectivity is one of the most reliable aspects of data centre infrastructure — multiple independent redundant paths make a complete internet connectivity failure extremely rare. Individual ISP outages are transparent to hosted servers.
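The failover principle is simple even though real BGP path selection on edge routers is far richer: traffic follows the most-preferred provider whose link is currently up. A simplified sketch — the provider names and preference values are hypothetical:

```python
# Simplified model of multi-homed ISP failover: pick the highest-
# preference provider with a healthy link, mirroring (loosely) the
# LOCAL_PREF step of BGP path selection. Names are hypothetical.

def active_provider(providers):
    """providers: list of (name, local_preference, link_up) tuples."""
    up = [p for p in providers if p[2]]
    if not up:
        raise RuntimeError("all upstream links down")
    return max(up, key=lambda p: p[1])[0]

links = [("ISP-A", 200, False),  # primary link has failed
         ("ISP-B", 150, True),
         ("ISP-C", 100, True)]
print(active_provider(links))  # ISP-B
```

When ISP-A's link recovers, the same selection rule shifts traffic back automatically — no manual intervention, which is what makes individual ISP outages transparent to hosted servers.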
Hardware Redundancy: Components That Fail Gracefully
- RAID storage: Multiple drives in redundant array — single drive failure does not cause data loss; array continues operating while failed drive is replaced
- Dual power supplies in servers: Independent power paths — single PSU failure does not power down the server
- ECC memory: Detects and corrects single-bit memory errors automatically — prevents silent data corruption
- Hot-spare hardware: Replacement components available on-site for rapid swap without procurement delays
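The RAID principle in the list above can be shown with parity: in a RAID-5-style layout, the parity block is the XOR of the data blocks, so any single lost block is rebuilt from the survivors. A toy sketch with made-up block contents:

```python
# RAID-5-style parity: parity = XOR of all data blocks, so any one
# lost block can be reconstructed from the remaining blocks + parity.
# Block contents are toy examples.

def xor_blocks(*blocks: bytes) -> bytes:
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# The drive holding d1 fails; rebuild it from the survivors.
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1  # data recovered with no loss
```

The array keeps serving reads and writes during the rebuild, which is why a single drive failure never becomes a data-loss event.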
The Compounded Effect: Why 99.9% Is Achievable
No single layer of redundancy achieves 99.9% uptime alone. The compounded effect of multiple independent redundant systems does. Power fails, but UPS and generator cover it. An ISP fails, but BGP routes traffic through another. A drive fails, but RAID continues. A cooling unit fails, but the standby unit takes over. Each layer's redundancy covers the other layers' failures — simultaneous independent failures across multiple layers are extremely rare.
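The compounding argument can be made quantitative: each layer's availability improves because its redundant copies must all fail together, and the layers then multiply. A minimal sketch — the per-component availabilities are illustrative assumptions, not vendor figures:

```python
# Compounded availability across independent layers, each with its
# own internal redundancy. Per-component availabilities below are
# illustrative assumptions.

def redundant_layer(component_availability: float, copies: int = 2) -> float:
    """Layer is up unless every redundant copy fails simultaneously."""
    return 1 - (1 - component_availability) ** copies

layers = {
    "power":   redundant_layer(0.995),
    "cooling": redundant_layer(0.99),
    "network": redundant_layer(0.98),
    "storage": redundant_layer(0.99),
}

total = 1.0
for availability in layers.values():
    total *= availability
print(f"{total:.5f}")  # exceeds 0.999 despite modest per-component figures
```

Even with individual components in the 98–99.5% range, two copies per layer push the compounded figure past three nines — which is the whole point of layered redundancy.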
Frequently Asked Questions
Why 99.9% and not 100%?
Even with full redundancy, some downtime is inevitable: planned maintenance windows (OS updates, hardware maintenance, infrastructure upgrades), rare simultaneous failures exceeding redundancy capacity, or events that overwhelm all redundancy layers (natural disasters, extended utility failures beyond generator fuel capacity). M A Global Network schedules maintenance during off-hours and provides advance notice. The 0.1% accounts for these unavoidable events — in practice, managed uptime often significantly exceeds the 99.9% SLA.
99.9% Uptime SLA — Backed by Redundant Infrastructure
Redundant power · Precision cooling · Multi-ISP network · RAID storage. ₹700/user/month + 18% GST. 7-day risk-free guarantee.