New challenge: earn every DevOps certification from the 3 most popular public cloud providers. Azure is my first target. Let's fucking go!!!

I will try to refer to AWS as much as I can because AWS is the first cloud I learned.

This article may contain some wrong information since I'm new to this. Correct me if I'm wrong, thanks xD


Prepare for the exam

Useful resources:

Some sections need to be understood before taking the exam.

  • Domain 1 — Identity & Governance (20–25%)
  • Domain 2 — Compute (20–25%)
  • Domain 3 — Storage (15–20%)
  • Domain 4 — Networking (15–20%)
  • Domain 5 — Monitor & Backup (10–15%)
  • Exam Tip/Tricks
  • Secure Weapons

Ok, let's take them one by one!


Domain 1 - Identity & Governance

Hierarchy

We can understand it simply from the following diagram (generated by Sonnet 4.6 xD):

[Figure: Hierarchy]

RBAC

It is like AWS IAM Roles, but a role can be assigned at multiple levels:

Management Group  →  inherited by everything below
  └── Subscription  →  inherited by all Resource Groups below
        └── Resource Group  →  inherited by all resources below
              └── Resource  →  only that resource

Here are common built-in roles of Azure RBAC:

Role                          Permissions
───────────────────────────────────────────────────
Owner                         Full access + assign roles
Contributor                   Full access, cannot assign roles
Reader                        View only
User Access Administrator     Manage access only

Resource reference: https://learn.microsoft.com/en-us/azure/role-based-access-control/overview

Inheritance is very, very important: assign a role once at the Management Group, and every subscription, resource group, and resource below it inherits the assignment.
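To make the inheritance rule concrete, here is a minimal Python sketch of how an assignment at a higher scope reaches everything below it. The scope names (`mg:root`, `sub:prod-app1`, ...), the principals, and the `effective_roles` helper are all made up for illustration; this is a toy model, not how Azure evaluates access internally.

```python
# Toy model of Azure RBAC inheritance. All names here are hypothetical.
# Each scope points at its parent; a Management Group sits at the top.
PARENT = {
    "mg:root": None,
    "sub:prod-app1": "mg:root",
    "rg:web": "sub:prod-app1",
    "vm:web-01": "rg:web",
}

# (principal, scope) -> role assigned directly at that scope.
ASSIGNMENTS = {
    ("alice", "mg:root"): "Reader",
    ("bob", "rg:web"): "Contributor",
}


def effective_roles(principal, scope):
    """Collect roles assigned at the scope itself or at any ancestor scope."""
    roles = []
    current = scope
    while current is not None:
        role = ASSIGNMENTS.get((principal, current))
        if role is not None:
            roles.append((role, current))
        current = PARENT[current]
    return roles


# alice gets Reader on the VM because it was assigned at the Management Group.
print(effective_roles("alice", "vm:web-01"))
# bob has nothing at the subscription: his assignment sits lower, at rg:web.
print(effective_roles("bob", "sub:prod-app1"))
```

Inheritance only flows downward in this model, which is why bob's Resource Group assignment never shows up at the Subscription.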

Azure Policy vs IAM Policy

Totally different from AWS IAM Policy. Azure Policy enforces rules on resources, not permissions.

AWS                              Azure
───────────────────────────────────────────────────
IAM Policy = who can do what     RBAC = who can do what
SCP = restrict accounts          Azure Policy = enforce rules on resources

Examples of Azure Policy:

  • VMs can only be created in Southeast Asia.
  • Storage Accounts must have encryption enabled.
  • All resources must have the tag Environment.
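As a rough sketch of what two of these rules could look like, the snippet below writes them as Python dicts that mirror the `if`/`then` shape of the `policyRule` section of an Azure Policy definition. The variable names are mine, and the rules are simplified (no parameters, no display name), so treat this as an illustration rather than a ready-to-assign definition.

```python
import json

# Hypothetical policy rules written as Python dicts, mirroring the "if"/"then"
# shape of the policyRule section in an Azure Policy definition.

# Deny any VM whose location is not Southeast Asia.
vm_location_rule = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
            {"field": "location", "notIn": ["southeastasia"]},
        ]
    },
    "then": {"effect": "deny"},
}

# Deny any resource that has no Environment tag.
require_environment_tag_rule = {
    "if": {"field": "tags['Environment']", "exists": "false"},
    "then": {"effect": "deny"},
}

for rule in (vm_location_rule, require_environment_tag_rule):
    print(json.dumps(rule, indent=2))
```

Note that neither rule mentions a user or a role: the condition is about the resource itself, which is the whole difference from RBAC.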

Management Groups

Like AWS Organizations: it groups Subscriptions so that policies/RBAC can be applied to them together.

Root Management Group
  ├── Production
  │     ├── Subscription: prod-app1
  │     └── Subscription: prod-app2
  └── Development
        └── Subscription: dev-sandbox

Policy/RBAC assigned at a Management Group is inherited by all subscriptions under it. Easy to understand, right?

Resource reference:

https://learn.microsoft.com/en-us/azure/governance/management-groups/overview

Little concept comparison

Concept             AWS                                      Azure
──────────────────────────────────────────────────────────────────────────────────────────
Identity provider   IAM                                      Entra ID (Azure AD)
Permissions         IAM Policies                             RBAC Role Assignments
Resource rules      Service Control Policies                 Azure Policies
Account grouping    Organizations and Organizational Units   Management Groups
Billing boundary    Account                                  Subscription
Resource grouping   Tags only                                Resource Groups + Tags
Delete protection   Termination protection (EC2)             Resource Locks (any resource)

Entra ID (Azure AD) vs AWS IAM

B2B: Business to Business

B2C: Business to Consumer

Feature        AWS IAM               Entra ID
──────────────────────────────────────────────────────────────────────────────
Built-in MFA   IAM MFA               Azure MFA (enforce via Conditional Access)
SSO to apps    IAM Identity Center   Enterprise Apps
B2B access     Cross-account roles   Guest Users
B2C access     Cognito               Azure AD B2C

System-Assigned Managed Identity (SAMI) vs User-Assigned Managed Identity (UAMI)

Resource reference:

https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview

https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/managed-identity-best-practice-recommendations

            SAMI                                             UAMI
──────────────────────────────────────────────────────────────────────────────────────────────────────────
Lifecycle   Tied to the resource; deleted together with it   Independent; stays after the resource is deleted xD
Reuse       1 resource → 1 identity                          1 identity → multiple resources
When        Throwaway, 1-1 with a VM/App Service             Reusing one identity across multiple resources

Control Plane vs Data Plane

Subscription Owner
├── Control plane: create/delete/manage resource (Microsoft.Storage/*, Microsoft.KeyVault/*)
└── Data plane: No → must assign role explicitly

For example:

  • Owner can create a Storage Account
  • Owner wants to read a blob (Binary Large Object) → needs the role "Storage Blob Data Reader"
  • Owner can create a Key Vault
  • Owner wants to read a secret → needs the role "Key Vault Secrets User"
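The split becomes clearer once you look at how a role definition separates control plane operations (`Actions`) from data plane operations (`DataActions`). The sketch below is a simplified check, assuming abridged role definitions and a hypothetical `is_allowed` helper; real evaluation also involves scopes, deny assignments, and more.

```python
from fnmatch import fnmatchcase

BLOB_READ = "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
ACCOUNT_WRITE = "Microsoft.Storage/storageAccounts/write"

# Abridged role definitions; the real built-in roles list more operations.
ROLES = {
    "Owner": {"actions": ["*"], "data_actions": []},
    "Storage Blob Data Reader": {
        "actions": ["Microsoft.Storage/storageAccounts/blobServices/containers/read"],
        "data_actions": [BLOB_READ],
    },
}


def is_allowed(role_names, operation, plane):
    """plane is "control" (checks actions) or "data" (checks data_actions)."""
    key = "actions" if plane == "control" else "data_actions"
    return any(
        fnmatchcase(operation.lower(), pattern.lower())
        for name in role_names
        for pattern in ROLES[name][key]
    )


# Owner's "*" only covers the control plane.
print(is_allowed(["Owner"], ACCOUNT_WRITE, "control"))  # create the account
print(is_allowed(["Owner"], BLOB_READ, "data"))         # read a blob
# Adding the data role is what unlocks the blob read.
print(is_allowed(["Owner", "Storage Blob Data Reader"], BLOB_READ, "data"))
```

The wildcard in Owner looks like it should cover everything, but it lives in the control plane list, so the data plane check never sees it.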

Resource Reference:

Managed Identity + Key Vault

Pattern:

App Service → [SAMI] → Key Vault Reference → Secret

Setup steps:

  • Enable SAMI on the App Service
  • Assign the Key Vault Secrets User role to the managed identity on the Key Vault
  • App setting uses @Microsoft.KeyVault(SecretUri=...)

No need to store credentials anywhere: the managed identity automatically gets a token from Entra ID (Azure AD).
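For the last step, a tiny helper can build the app setting value so the reference syntax is not typed by hand. This is only a sketch: `key_vault_reference` is a hypothetical helper, and the vault name `myvault` and secret name `db-password` are placeholders, not values from this article.

```python
def key_vault_reference(vault_name, secret_name):
    """Build an App Service Key Vault reference for the latest secret version.

    vault_name and secret_name are placeholders supplied by the caller.
    """
    secret_uri = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}"
    return f"@Microsoft.KeyVault(SecretUri={secret_uri})"


# Hypothetical app settings: the value is a reference, never the secret itself.
app_settings = {"DB_PASSWORD": key_vault_reference("myvault", "db-password")}
print(app_settings["DB_PASSWORD"])
```

The app setting only stores a pointer to the secret, so rotating the secret in Key Vault does not require touching the application code.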

K8s analogy: This pattern is identical to Workload Identity (AKS/GKE) or IRSA (EKS — IAM Roles for Service Accounts):

K8s:    Pod → [ServiceAccount] → IAM Role / Managed Identity → AWS/Azure resource
Azure:  App Service → [SAMI] → RBAC role → Key Vault secret

Same idea: workload is bound to an identity, the identity is granted permissions, and the runtime auto-exchanges tokens (via OIDC / IMDS endpoint). Code only calls the SDK — no secrets or keys needed anywhere. When Pod/App restarts, the identity persists — no manual credential rotation.

It also applies to in-cluster K8s RBAC: a Pod runs as a ServiceAccount, that SA is bound via Role/RoleBinding (or ClusterRole/ClusterRoleBinding), and calls to the K8s API server (default endpoint https://kubernetes.default.svc inside the Pod) are authenticated using the projected token at /var/run/secrets/kubernetes.io/serviceaccount/token. Same identity-binds-to-workload pattern, just scoped to the cluster instead of cloud resources.

Resource Reference:

Conclusion and Gotchas

  • RBAC = IAM Roles.
  • Azure Policy = Service Control Policies (but with more flexibility)
  • SAMI is often preferred because it does not leave orphaned identities xD
  • A Subscription Owner cannot read Key Vault secrets by default. Owner only has control plane permissions and must self-grant a data plane role (e.g. Key Vault Secrets User) if access is needed
  • Key Vault Contributor = manage vault configuration (control plane), NOT read secrets
  • Key Vault Administrator = full data plane access (read/write/delete secrets, keys, certs)
  • For the vault access model, best practice is to pick RBAC!

Resource Reference:


Domain 2 - Compute

Virtual Machine (VM) Availability

                 Availability Set             Availability Zones        VM Scale Sets (VMSS)
───────────────────────────────────────────────────────────────────────────────────────────────
Protection       Hardware/rack failure        Datacenter failure        Scale + HA
SLA              99.95%                       99.99%                    Depends on config
Use case         Legacy app, lift-and-shift   Production critical       Web tier, stateless
Region support   All regions                  Only regions with zones   All regions

Exam Gotchas

  • SLA 99.99% → must pick Zones, not Set
  • A VM cannot be placed in both an Availability Set and an Availability Zone
  • Availability Set, Fault Domain (FD): a separate physical rack (power, switch). Max 3, default 2
  • Availability Set, Update Domain (UD): a group of VMs that Azure may reboot together when patching; VMs in different UDs are not restarted at the same time. Max 20
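A quick way to picture fault and update domains is to spread a few VMs across them. The sketch below assumes a simple round-robin assignment and a default of 5 update domains; both are my assumptions for illustration (the notes above only give the maximums), and `place_vms` is a hypothetical helper.

```python
MAX_FAULT_DOMAINS = 3    # limit quoted in the notes above
MAX_UPDATE_DOMAINS = 20  # limit quoted in the notes above


def place_vms(vm_count, fault_domains=2, update_domains=5):
    """Return (vm_index, fault_domain, update_domain) tuples, assigned round-robin."""
    if not 1 <= fault_domains <= MAX_FAULT_DOMAINS:
        raise ValueError("fault_domains must be between 1 and 3")
    if not 1 <= update_domains <= MAX_UPDATE_DOMAINS:
        raise ValueError("update_domains must be between 1 and 20")
    return [(i, i % fault_domains, i % update_domains) for i in range(vm_count)]


placement = place_vms(6)
for vm, fd, ud in placement:
    print(f"vm-{vm}: FD {fd}, UD {ud}")

# VMs that would reboot together when Azure patches update domain 0.
rebooted_together = [vm for vm, _, ud in placement if ud == 0]
print(rebooted_together)
```

With 6 VMs and 5 update domains, only two VMs share an update domain, so a patch round takes down at most a third of the set at once.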

Resource Reference:

VM Encryption

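Type                           What                                                                           When
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
SSE (Server-Side Encryption)   Azure automatically encrypts at the storage layer                              Default
ADE (Azure Disk Encryption)    Encrypts OS + data disks at OS level (BitLocker/DM-Crypt), key in Key Vault    When compliance requires OS-level encryption

Verdict: SSE is enough for almost everything. Use ADE only when compliance requires it.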

Resource Reference:

Note: ADE is scheduled for retirement on September 15, 2028. Microsoft recommends migrating to encryption at host for new VMs.

Deployment slot

Azure App Service deployment slots are live, distinct environments for your web apps. They allow you to test changes, warm up instances, and achieve zero-downtime deployments. By managing configurations independently, you can deploy updates to a staging slot, verify them, and instantly swap them into production. (from AI Overview when google search xD).

Some notes for the exam:

  • Deployment slots are not available in the Free/Basic plans. If a question is about a staging environment, pick Standard (or higher).
  • After a slot swap, both slots keep running. To roll back, simply swap back.
  • App Service Plan = compute resource.

Resource Reference:

ARM (Azure Resource Manager) and Bicep

  • ARM Template = AWS CloudFormation Template
  • Bicep is still ARM, just with better syntax (Azure only)
  • I still prefer Terraform over this fucking shit when it comes to production!

Containers

Comparison between 3 services

Service                           What                                                        When
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ACI (Azure Container Instances)   Run 1 container fast, no fucking cluster                    Dev/test, batch jobs, CI/CD runners
Container Apps                    Serverless containers with scaling (KEDA inside), ingress   Production apps, microservices
AKS                               Managed Kubernetes                                          You know when....

Resource Reference:


Domain 3 - Storage

Azure Storage Account vs AWS S3 — Mental Model

AWS                              Azure
───────────────────────────────────────────────────
(No)                            Storage Account (wrap layer)
S3 Bucket                       Blob Container
S3 Object                       Blob
S3 Bucket Policy                Container Access Policy / RBAC
S3 Lifecycle Rules              Lifecycle Management
S3 Versioning                   Blob Versioning
S3 Storage Classes              Access Tiers (Hot/Cool/Archive)
S3 Presigned URL                SAS Token
S3 Static Website               Static Website (in Blob)
───────────────────────────────────────────────────
EFS                             Azure Files
SQS                             Queue Storage
DynamoDB (simple use)           Table Storage

Key Differences:

AWS                                Azure
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Create an S3 bucket directly       Create a Storage Account first, then create containers
Each bucket has its own settings   The Storage Account holds the settings for all services inside it
Bucket name is globally unique     Storage Account name is globally unique; a Blob container name only needs to be unique within its Storage Account
Bucket policy per bucket           RBAC can be set at account or container level

Example:

AWS:
  - s3://my-images-bucket        (Create)
  - s3://my-backup-bucket        (Create)  
  - EFS: my-team-share           (Other service)

Azure:
  - mystorageaccount
      ├── container: images      (Like S3 bucket)
      ├── container: backups     (Like S3 bucket)
      └── fileshare: team-share  (Like EFS)

Immutable vs Mutable Settings

Cannot be changed after creation:

  • Storage Account name
  • Location/Region
  • Performance tier (Standard ↔ Premium)
  • Hierarchical namespace (for the exam; in practice there is a migration tool)

Can be changed after creation:

  • Redundancy (LRS/ZRS/GRS), with some limits
  • Access tier (Hot/Cool)
  • Network/Firewall rules
  • Data protection (soft delete, versioning)
  • Encryption settings

We need to pick the Name, Region, and Performance tier right the first damn time!

Replication Types

In single region:

Type                      What                     When
───────────────────────────────────────────────────────────────────────────
LRS (Locally Redundant)   3 copies, 1 datacenter   Hardware failure
ZRS (Zone Redundant)      3 copies, 3 zones        Datacenter / zone failure

Cross-region:

Type                       What                          When                    Read from secondary?
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
GRS (Geo-Redundant)        6 copies (3+3), 2 regions     Region failure          No. Must fail over first; after failover, the secondary becomes the new primary
RA-GRS (Read-Access GRS)   6 copies, 2 regions           Region failure          Yes (always)
GZRS                       ZRS primary + LRS secondary   Zone + region failure   No
RA-GZRS                    ZRS primary + LRS secondary   Zone + region failure   Yes

Pick replication by keyword:

When                                                      What
──────────────────────────────────────────────────────────────────────────────────────────────
Disk/hardware failure, cheapest                           LRS
Zone failure, within region                               ZRS
Region down, data still available                         GRS (if continuous reads are not needed)
Region down + can read secondary continuously             RA-GRS
Zone + region, highest availability                       RA-GZRS
Available even if a region goes down and cost-effective   GRS
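The keyword table can also be read as a small decision function. The sketch below is one way to encode it; `pick_redundancy` and its three boolean parameters are names I made up, and it ignores cost and regional availability, which matter in real life.

```python
def pick_redundancy(survive_zone_failure, survive_region_failure, read_from_secondary):
    """Map three yes/no requirements to a storage redundancy option."""
    if survive_region_failure:
        base = "GZRS" if survive_zone_failure else "GRS"
        # The RA- prefix is what makes the secondary readable without failover.
        return f"RA-{base}" if read_from_secondary else base
    if read_from_secondary:
        raise ValueError("a readable secondary only exists with geo-redundancy")
    return "ZRS" if survive_zone_failure else "LRS"


print(pick_redundancy(False, False, False))  # cheapest, hardware failure only
print(pick_redundancy(True, False, False))   # zone failure within a region
print(pick_redundancy(False, True, True))    # region down, keep reading
print(pick_redundancy(True, True, True))     # zone + region, highest availability
```

Reading it as code makes the exam pattern obvious: "region" decides the G, "zone" decides the Z, and "read from secondary" decides the RA- prefix.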

Resource Reference:

Exam Tricks for Storage Features (non-replication)

For replication keywords (LRS/ZRS/GRS/RA-GRS/GZRS/RA-GZRS), see the Replication Types section above.

When                                                        What
──────────────────────────────────────────────────────────────────────────────────────────────────────────
Data cannot be deleted/modified for compliance              Immutable storage (WORM - Write once read many)
Automatically move old data to cheaper tier                 Lifecycle Management Policy
Share files between VMs (SMB)                               Azure Files
Store VM disk                                               Managed Disk (Premium SSD/Standard SSD/HDD)
Cheapest storage for rarely accessed archived data          Archive tier
Restore accidentally deleted blob                           Soft Delete
Grant temp access to a specific blob without sharing key    SAS token

Domain 4 - Networking

Resource Reference:

Quick Mapping from AWS VPC xD

AWS                              Azure
───────────────────────────────────────────────────
VPC                          →   VNet
Subnet                       →   Subnet
Security Group               →   NSG (Network Security Group)
Elastic IP                   →   Public IP (Static)
ENI                          →   NIC (Network Interface Card)
VPC Peering                  →   VNet Peering
NAT Gateway                  →   NAT Gateway
Route Table                  →   Route Table

But there are still some small differences, so we will go through them one by one.

Subnet and Availability Zones (AZs)

  • AWS: each Subnet is locked to 1 AZ. To achieve HA, we need to create multiple subnets in different AZs.
  • Azure: a Subnet spans all AZs in a region. We can place VMs in AZ 1, AZ 2, and AZ 3 in the same subnet.

NSG (Network Security Group) vs SG (Security Group) / NACL

  • AWS: uses SG (applied at ENI/instance) and Network ACL (NACL) (applied at Subnet)
  • Azure: uses only NSG. An NSG can be applied to a NIC, a Subnet, or both
  • NSG is stateful, NACL is stateless

Stateful vs Stateless — what does it mean?

  • Stateful (NSG, AWS SG): firewall remembers connections. When outbound traffic is allowed, the return traffic is automatically allowed — the firewall tracks the session in a connection table.
  • Stateless (AWS NACL): firewall has no memory. Every packet is evaluated independently → you must define rules for both directions (inbound AND outbound).

Quick example — VM calling GitHub API (HTTPS port 443):

VM (random port 54321) ──[request]──→ GitHub (port 443)
VM (random port 54321) ←──[response]── GitHub (port 443)
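
                          NSG (stateful)               NACL (stateless)
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Rules needed              1 outbound: Allow TCP 443    Outbound: Allow TCP 443 + Inbound: Allow TCP from 443 → ephemeral ports 1024-65535
Forget the return rule?   Still works (auto-tracked)   Response dropped → connection fails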

That's why Azure only needs one NSG instead of AWS's two-layer SG + NACL model — simpler, no need to open wide ephemeral port ranges, less misconfiguration risk.
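The GitHub example can be simulated in a few lines. This is a toy model under my own assumptions: the `Packet`, `StatelessFilter`, and `StatefulFilter` classes are invented for illustration and track a flow only by its two ports, which is far simpler than a real connection table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    direction: str    # "outbound" or "inbound", from the VM's point of view
    local_port: int   # port on the VM
    remote_port: int  # port on the remote host


def allow_https_out(packet):
    """The single rule both filters get: allow outbound TCP 443."""
    return packet.direction == "outbound" and packet.remote_port == 443


class StatelessFilter:
    """Evaluates every packet on its own, like an AWS NACL."""

    def __init__(self, rules):
        self.rules = rules

    def allows(self, packet):
        return any(rule(packet) for rule in self.rules)


class StatefulFilter(StatelessFilter):
    """Remembers allowed outbound flows, like an NSG or an AWS SG."""

    def __init__(self, rules):
        super().__init__(rules)
        self.connections = set()

    def allows(self, packet):
        flow = (packet.local_port, packet.remote_port)
        if packet.direction == "inbound" and flow in self.connections:
            return True  # return traffic of a tracked connection
        if super().allows(packet):
            if packet.direction == "outbound":
                self.connections.add(flow)
            return True
        return False


request = Packet("outbound", 54321, 443)
response = Packet("inbound", 54321, 443)

nsg = StatefulFilter([allow_https_out])
nacl = StatelessFilter([allow_https_out])

nsg_results = (nsg.allows(request), nsg.allows(response))
nacl_results = (nacl.allows(request), nacl.allows(response))
print("stateful: ", nsg_results)   # (True, True)
print("stateless:", nacl_results)  # (True, False)
```

Both filters carry the same single outbound rule; only the stateful one lets the response back in, because it remembered the request.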

Resource Reference:

VNet Peering & Gateway

AWS                              Azure
───────────────────────────────────────────────────
VPC Peering (same region)       VNet Peering
VPC Peering (cross-region)      Global VNet Peering
VPN Gateway                     Virtual Network Gateway
Transit Gateway                 Virtual WAN / VNet Gateway
Site-to-Site VPN                VPN Gateway Connection
Direct Connect                  ExpressRoute

When                               What
───────────────────────────────────────────────────
Connect 2 VNets same region        VNet Peering
Connect 2 VNets different region   Global Peering
On-prem ↔ Azure (internet)         VPN Gateway
On-prem ↔ Azure (dedicated)        ExpressRoute

Resource Reference:

Azure DNS

AWS                              Azure
───────────────────────────────────────────────────
Route53                      →   Azure DNS
Route53 Hosted Zone (public) →   Public DNS Zone
Route53 Private Hosted Zone  →   Private DNS Zone
Route53 Domain Registration  →   App Service Domain
Route53 Health Checks        →   Traffic Manager
Route53 Routing Policies     →   Traffic Manager

Azure handles health checks and traffic routing in a separate service, Traffic Manager, instead of building them into DNS like Route53 does.

Azure Load Balancing

AWS                              Azure
───────────────────────────────────────────────────
NLB (Layer 4)                →   Azure Load Balancer
ALB (Layer 7)                →   Application Gateway
CloudFront + ALB (global)    →   Azure Front Door
Route53 routing policies     →   Traffic Manager (DNS-based)

Service               Layer       Use case
──────────────────────────────────────────────────────────────────
Load Balancer         L4          VM load balancing, non-HTTP
Application Gateway   L7          Web apps, SSL termination, WAF
Front Door            L7 Global   Global apps, CDN + LB + WAF
Traffic Manager       DNS         Failover, geo routing

Resource Reference:

Azure Network Watcher (Debug tools)

AWS                              Azure
───────────────────────────────────────────────────
VPC Flow Logs                →   NSG Flow Logs
VPC Reachability Analyzer    →   IP Flow Verify / Connection Troubleshoot
VPC Traffic Mirroring        →   Packet Capture

Verdict: almost 1:1 with AWS VPC/Route53/ELB, just with different fucking names.

Resource Reference:

Exam Gotchas

  • ExpressRoute does not go over the internet → a question about a "secure private connection" = ExpressRoute.
  • Policy-based VPN does not support Point-to-Site.
  • Active-Active VPN Gateway = 2 tunnels, high availability.
  • ExpressRoute Global Reach: connects 2 on-prem sites via the Azure backbone (no internet needed).
  • Question "consistent latency, high bandwidth" = ExpressRoute.
  • Question "quick setup, cost-effective" = VPN Gateway.

Domain 5 - Monitor & Backup

Quick Mapping from AWS for Backup xD

AWS                              Azure
───────────────────────────────────────────────────
AWS Backup                   →   Azure Backup
Backup Vault                 →   Recovery Services Vault
AWS DRS / CloudEndure        →   Azure Site Recovery (ASR)
EBS Snapshots                →   VM Backup (disk snapshots)

Backup Services

Service                     What
──────────────────────────────────────────────────────────────────────────────────
Azure Backup                Backup VMs, databases, files (daily, point-in-time)
ASR (Azure Site Recovery)   Replicate VMs to another region, failover when DR

Verdict: Azure Backup = AWS Backup, ASR = DRS (AWS Elastic Disaster Recovery).

Resource Reference:

Quick Mapping from AWS CloudWatch

AWS                              Azure
───────────────────────────────────────────────────
CloudWatch                   →   Azure Monitor
CloudWatch Metrics           →   Metrics
CloudWatch Logs              →   Log Analytics Workspace
CloudWatch Logs Insights     →   Kusto Query Language (KQL)
CloudWatch Alarms            →   Alerts
CloudWatch Agent             →   Azure Monitor Agent
X-Ray                        →   Application Insights

Resource Reference:


Exam Tip/Tricks

A summary of common pitfalls. This is not a study guide, only patterns for dropping wrong answers xD

1. "X replaces NSG/Firewall"

This is always wrong. Bastion, Private Endpoint, VPN Gateway, Front Door, App Gateway... none of them replaces NSG. NSG is a separate filtering layer that always applies alongside them.

2. VNet Peering

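Trap                                               Correct
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Peering is transitive                              Wrong: A↔B, B↔C does not imply A↔C
Enabling Use Remote Gateways on 1 side is enough   Wrong: must be paired with Allow Gateway Transit on the gateway side
1 spoke uses multiple remote gateways              Wrong: only one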
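The first two traps can be checked with a small model of peering links. Everything below is a sketch with invented names (`Peering`, `can_talk`, `spoke_can_use_hub_gateway`); each `Peering` object represents one side of a peering, with the two gateway flags as plain booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peering:
    source: str
    target: str
    allow_gateway_transit: bool = False  # set on the side that owns the gateway
    use_remote_gateways: bool = False    # set on the side that borrows it


def find(peerings, source, target):
    """Return the peering link from source to target, or None."""
    return next((p for p in peerings if p.source == source and p.target == target), None)


def can_talk(peerings, a, b):
    """VNets talk only when they are peered directly, in both directions."""
    return find(peerings, a, b) is not None and find(peerings, b, a) is not None


def spoke_can_use_hub_gateway(peerings, spoke, hub):
    """Both flags are required: one on the spoke side, one on the hub side."""
    spoke_side = find(peerings, spoke, hub)
    hub_side = find(peerings, hub, spoke)
    return bool(
        spoke_side and hub_side
        and spoke_side.use_remote_gateways
        and hub_side.allow_gateway_transit
    )


# B is the hub with the gateway; A and C are spokes.
PEERINGS = [
    Peering("A", "B", use_remote_gateways=True),
    Peering("B", "A", allow_gateway_transit=True),
    Peering("C", "B", use_remote_gateways=True),
    Peering("B", "C"),  # hub side forgot Allow Gateway Transit
]

print(can_talk(PEERINGS, "A", "B"))                   # peered directly
print(can_talk(PEERINGS, "A", "C"))                   # no transitivity via B
print(spoke_can_use_hub_gateway(PEERINGS, "A", "B"))  # both flags set
print(spoke_can_use_hub_gateway(PEERINGS, "C", "B"))  # only one side set
```

Spoke C shows the second trap: its own flag is set, but the hub side of that peering never allowed transit, so the gateway stays out of reach.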

3. Storage Replication

See Domain 3 — Replication Types for the full picker table.

Pitfall: "highest availability within a region" → ZRS (not GRS). GRS is cross-region; the G stands for Geo.

4. Backup vs Site Recovery

  • Azure Backup = data recovery (file, VM disk, DB). RPO is measured in hours/days
  • Azure Site Recovery (ASR) = disaster recovery (replicates VMs to another region). Low RPO/RTO

Question "recover deleted file" → Azure Backup. "Region down, switch to another region" → ASR.

5. Recovery Services vault vs Backup vault (naming pitfall)

Vault                           Workload
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Recovery Services vault (Old)   Azure VM, MARS agent (files/folders/system state on-prem), SQL/SAP HANA in VM, Azure File Shares
Azure Backup vault (New)        PostgreSQL, Blob, Managed Disks, AKS, MySQL

The keywords "files, folders, system state" always mean Recovery Services vault (because they go with the MARS agent). Don't confuse it with "Azure Backup vault".

Resource Reference:

6. RBAC vs Azure Policy

  • RBAC = who can do what (access)
  • Policy = whether a resource is allowed to exist or not (compliance, deny missing tags, deny SKUs ...)

If the question says "enforce tag" or "block region" → Policy. If it says "grant read access" → RBAC.

7. Four built-in roles

Role                        Manage resources   Assign roles
──────────────────────────────────────────────────────────────
Owner                       Ok                 Ok
Contributor                 Ok                 Hell no
Reader                      No (read-only)     No
User Access Administrator   No                 Ok

Common pitfall: for the exam question "fully manage resources but cannot assign roles", the answer is Contributor, NOT User Access Administrator (UAA). UAA is the reverse role: it can only assign roles and has no permission to touch resources.

Trick:

  • Owner = Contributor + UAA
  • Contributor = everything except granting permissions
  • UAA = only able to grant permissions
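The table falls out naturally if each role is read as an allow list minus an exclusion list. The sketch below assumes abridged `actions`/`not_actions` lists for the four roles and a hypothetical `can` helper; the real built-in definitions contain more entries.

```python
from fnmatch import fnmatchcase

VM_WRITE = "Microsoft.Compute/virtualMachines/write"
ROLE_ASSIGN = "Microsoft.Authorization/roleAssignments/write"

# Abridged versions of the four built-in role definitions.
ROLES = {
    "Owner": {"actions": ["*"], "not_actions": []},
    "Contributor": {
        "actions": ["*"],
        "not_actions": [
            "Microsoft.Authorization/*/Write",
            "Microsoft.Authorization/*/Delete",
        ],
    },
    "Reader": {"actions": ["*/read"], "not_actions": []},
    "User Access Administrator": {
        "actions": ["*/read", "Microsoft.Authorization/*"],
        "not_actions": [],
    },
}


def matches(operation, patterns):
    """Case-insensitive wildcard match against a list of patterns."""
    return any(fnmatchcase(operation.lower(), p.lower()) for p in patterns)


def can(role, operation):
    """Allowed when an action matches and no not_action excludes it."""
    definition = ROLES[role]
    return matches(operation, definition["actions"]) and not matches(
        operation, definition["not_actions"]
    )


for role in ROLES:
    print(f"{role}: manage={can(role, VM_WRITE)}, assign={can(role, ROLE_ASSIGN)}")
```

Contributor and Owner share the same wildcard; the only difference is the exclusion list, which is exactly the "Owner = Contributor + UAA" trick.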

8. RBAC scope — 4 levels

Management Group  (top, contains multiple subscription)
    ↓ inherit
Subscription
    ↓ inherit
Resource Group
    ↓ inherit
Resource          ← VNet, VM, Storage, Key Vault... individual resource.

Rule: anything that has a Resource ID can be a scope. A role assigned at a higher scope is inherited by the lower scopes.

Pitfall: specific items such as "Virtual Network" or "Storage Account" are listed to look like distractors, but they are still valid scopes (Resource level).

Least privilege pattern: instead of assigning Contributor at the Subscription, assign a role at a lower scope, like Network Contributor on that damn Virtual Network.

9. Locks

Lock           What it blocks
───────────────────────────────────────────────────
ReadOnly       Both modify + delete
CanNotDelete   Only delete; modify is still allowed

Gotcha: a ReadOnly lock can break some operations even though they look like "read" operations.

Example: listing the keys of a Storage Account is actually a POST request.
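One way to remember the gotcha is to think of locks in terms of HTTP methods on the management API. The mapping below is my simplification for illustration, and `lock_allows` is a hypothetical helper; the point is only that a ReadOnly lock lets GET through and nothing else.

```python
# Simplified view: which HTTP methods each lock level blocks.
BLOCKED_METHODS = {
    "CanNotDelete": {"DELETE"},
    "ReadOnly": {"DELETE", "PUT", "PATCH", "POST"},
}


def lock_allows(lock, http_method):
    """Return True when the request method passes the given lock (None = no lock)."""
    return http_method.upper() not in BLOCKED_METHODS.get(lock, set())


# Listing storage account keys is a POST, so it "reads" data but is not a GET.
print(lock_allows("ReadOnly", "GET"))       # plain read still works
print(lock_allows("ReadOnly", "POST"))      # list keys is blocked
print(lock_allows("CanNotDelete", "POST"))  # list keys still works
print(lock_allows("CanNotDelete", "DELETE"))
```

So when a question says "users can no longer list storage keys after a lock was added", the lock is ReadOnly, not CanNotDelete.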

Resource Reference:

10. Cost Management Pitfalls

  • Reserved Instance = commit for 1 or 3 years, save up to 72%
  • Spot VM = cheap but can be evicted
  • Hybrid Benefit = bring on-prem Windows/SQL licenses to Azure

If the question says "predictable workload, lowest cost" → Reserved Instance. "Batch job, can be restarted" → Spot.

Resource Reference:

11. An overly absolute answer → 90% chance of being a distractor

"red flag" in answers xD:

Red flag         Why
──────────────────────────────────────────────────────────────────────────────────────────────────────────────
automatically    Azure rarely does things for the user: license, scaling, backup... all need config
all / every      "All PaaS services", "every VM in subscription" → Azure doesn't apply things in a blanket way
only             "Each user can have only one license", "only via portal" → Azure normally offers multiple ways
never / cannot   Azure is dynamic; absolute prohibitions are rare
replaces         Service A "replaces" Service B → mostly they are separate layers

12. Network Watcher — choose the right tool

What                                              Tool
──────────────────────────────────────────────────────────────────────────────
Latency / packet loss VM↔VM                       Connection Monitor
One-time test: can VM A connect to VM B port X?   Connection Troubleshoot
Would this packet be dropped by an NSG or not?    IP Flow Verify
Effective NSG rules applied to a NIC              Effective Security Rules
Log all traffic through an NSG                    NSG Flow Logs
Draw network topology                             Topology
Capture packets to analyze in Wireshark           Packet Capture

Need to remember:

  • If the question says "between two VMs" + "network" → reflex: Network Watcher.
  • If the question says "continuous/monitor over time" → Connection Monitor.
  • If the question says "one-time test" → Connection Troubleshoot.

13. Azure Monitor — Metrics vs Logs vs Insights

Pitfall: if the exam asks for "diagnostics between 2 specific VMs" → hell no, it is NOT Insights. Insights is an overview, not a deep dive.

Telling Insights apart from Log Analytics when the exam says "analyze":

Exam                                                     Pick
──────────────────────────────────────────────────────────────────────────────────────────────────
view dashboard / performance overview / dependency map   Insights
analyze / query / compliance report                      Log Analytics
across multiple VMs + summary data                       Log Analytics
Update Management / patch compliance                     Log Analytics (data put into workspace)
Change tracking                                          Log Analytics

14. Private DNS + Private Endpoint + Hybrid resolution

Questions revolve around "Can a VM/on-prem host resolve the private name or not?". You need to remember the scope of each component.

Component                                          Scope                       Note
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Private DNS Zone                                   Global                      1 zone can be linked to multiple VNets in any region/sub
VNet                                               Region                      No cross-region
VNet Peering                                       Cross-region/cross-sub OK   Does not share DNS by default
DNS Private Resolver (inbound/outbound endpoint)   Region                      HA needs a resolver in each region
Private Endpoint NIC                               Region (in subnet)

Pitfall 1: Peering does not share DNS

  • VNet-A has a Private Endpoint and the zone is already linked → VM-A resolves OK. VM-B in VNet-B (peered) fails because the zone is not linked to VNet-B.

  • Fix: link the zone to both VNets. The zone is global and can be linked to many VNets.
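Pitfall 1 can be reproduced with a toy resolver: the lookup only consults zone links and never looks at peering. The VNet names, the record, the private IP `10.0.1.4`, and the `resolve` helper are all hypothetical.

```python
ZONE = "privatelink.blob.core.windows.net"

# Hypothetical A record inside the private zone.
RECORDS = {ZONE: {"mystorage": "10.0.1.4"}}

# vnet-a and vnet-b are peered, but resolve() deliberately never reads this:
# peering carries traffic, it does not carry DNS.
PEERINGS = {frozenset({"vnet-a", "vnet-b"})}


def resolve(vnet, host, zone_links):
    """Return the private IP if the VNet is linked to the zone, else "public"."""
    if vnet in zone_links.get(ZONE, set()):
        return RECORDS[ZONE][host]
    return "public"  # falls back to public DNS, so the public IP comes back


links_before = {ZONE: {"vnet-a"}}
links_after = {ZONE: {"vnet-a", "vnet-b"}}

print(resolve("vnet-a", "mystorage", links_before))  # private IP
print(resolve("vnet-b", "mystorage", links_before))  # public, despite peering
print(resolve("vnet-b", "mystorage", links_after))   # private IP after the fix
```

The only thing that changes between the second and third call is the zone link, which is the fix described above.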

Pitfall 2: Hybrid DNS needs a Private Resolver

  • On-prem needs to resolve mystorage.privatelink.blob...
On-prem DNS  ──conditional fwd──→  DNS Private Resolver
                                   (inbound endpoint, region X)
                                          ↓
                                   linked Private DNS Zones
                                          ↓
                                   return IP private endpoint
  • Inbound endpoint = on-prem DNS forwards queries into Azure (resolves privatelink zones)
  • Outbound endpoint = Azure VMs resolve on-prem DNS names via conditional forwarding rules.
  • HA cross-region → deploy a resolver in 2 regions

Common Scenario:

Symptom                                            Reason                        Fix
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
VM resolves private endpoint → public IP           Zone not linked to the VNet   Link the zone to the VNet
Spoke cannot resolve even though peered with hub   Zone linked only to the hub   Link the zone to the spoke too
On-prem resolves → public IP                       No hybrid DNS                 Deploy DNS Private Resolver + conditional fwd
HA cross-region hybrid                             Resolver is region-bound      Resolver in 2 regions
Zone in a different sub without link               Cross-sub permission          Grant Network Contributor on the zone

Resource Reference:

