Azure resiliency concepts
Last updated:
AZURERESILIENCY
1. Fault domains
- A common set of hardware that shares a single point of failure (SPoF), such as a rack.
2. Update domains
- A group of nodes that are upgraded (and possibly rebooted) together during planned maintenance.
2. Availability sets (99.95%)
Protection against rack-level failures.
- Logical grouping of VMs so that they are deployed across different racks (fault domains).
- If one rack goes down, the VMs on the other racks stay available.
- Don't mix functions in one set. For example, use one availability set for domain controllers (DCs) and a separate one for SQL servers.
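To make the fault domain / update domain idea concrete, here is a minimal sketch of how VMs in one availability set could be spread out. The round-robin assignment, the function names, and the domain counts (2 fault domains, 5 update domains) are assumptions for illustration only; Azure's real placement logic is internal to the platform.

```python
def place_vms(vm_names, fault_domains=2, update_domains=5):
    """Assign each VM a (fault_domain, update_domain) pair round-robin.

    Hypothetical model: the real Azure placement algorithm is not public.
    """
    placement = {}
    for i, name in enumerate(vm_names):
        placement[name] = (i % fault_domains, i % update_domains)
    return placement


def survivors_after_rack_failure(placement, failed_fd):
    """Return the VMs that are NOT in the failed fault domain (rack)."""
    return sorted(name for name, (fd, _) in placement.items() if fd != failed_fd)


vms = [f"sql-{i}" for i in range(4)]
placement = place_vms(vms)
survivors = survivors_after_rack_failure(placement, failed_fd=0)

print(placement)
print(survivors)
```

With four VMs over two fault domains, losing one rack still leaves half of the set running, which is the point of keeping one availability set per function.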
3. [[202404081830 Azure Availability Zones|Availability Zones]] (99.99%)
Racks live in a DC set (separate power, cooling, network). This provides protection against DC-level failures. Minimum of 3 zones in every region that supports availability zones. Even if a region has more physical zones, your subscription will see 3 logical zones.
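The two SLA figures (99.95% for availability sets, 99.99% for availability zones) are easier to compare as allowed downtime. The sketch below is plain arithmetic over a 365-day year; the function name is made up for this note, and actual SLA measurement windows and terms should be checked against the official SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def max_downtime_hours(sla_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in the period while still meeting the SLA."""
    return period_hours * (1 - sla_percent / 100)


for label, sla in [("Availability set", 99.95), ("Availability zones", 99.99)]:
    hours = max_downtime_hours(sla)
    print(f"{label}: {sla}% -> {hours:.2f} h/year ({hours * 60:.0f} min)")
```

This works out to roughly 4.4 hours per year for an availability set versus about 53 minutes per year for availability zones.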
4. Regions and Pairs
A set of DC sets forms a region. Availability zones (DC sets) within a region sit within a round-trip latency window of about 2 ms of each other. Paired regions: the main point is that Azure does not update both regions of a pair at the same time.
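The paired-region rule can be pictured as a scheduling constraint: the two members of a pair never land in the same update wave. The following is a minimal sketch of that constraint only. The two-wave model and the function names are assumptions, not how Azure actually schedules platform updates; the pairs listed are just two examples.

```python
# Two example region pairs; see Azure docs for the full, current list.
REGION_PAIRS = [("East US", "West US"), ("North Europe", "West Europe")]


def plan_update_waves(pairs):
    """Split regions into two waves so paired regions are never updated together."""
    waves = [[], []]
    for first, second in pairs:
        waves[0].append(first)
        waves[1].append(second)
    return waves


def violates_pairing(waves, pairs):
    """True if any single wave contains both regions of a pair."""
    for wave in waves:
        members = set(wave)
        if any(a in members and b in members for a, b in pairs):
            return True
    return False


waves = plan_update_waves(REGION_PAIRS)
print(waves)
print(violates_pairing(waves, REGION_PAIRS))
```

If a bad update breaks the first wave, the paired region in the second wave is still on the previous version and can take over.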