Series: Container Networking Deep Dive

In the previous two posts, we performed a deep dive into Docker's Ingress Flow and Egress Flow by manually tinkering with veth, iptables, and nsenter.

But in Kubernetes (K8s), thousands of Pods are created and destroyed continuously. Kubelet can't afford the manual labor of typing those commands for every single Pod. It needs an "assistant" to do the heavy lifting. That assistant is the CNI (Container Network Interface).

Today, we will "strip down" CNI by running it manually (by hand), without using any sophisticated CNI suites like Calico or Cilium.


1. Prerequisites (Setup)

To follow this lab on Ubuntu 24.04, you will need:

  1. Docker: The foundational engine to run Kind.
  2. Kind v0.31.0 (Released Dec 18, 2025): A tool to create K8s Nodes as Docker containers.
  3. Kubectl: The CLI to control your cluster.
  4. jq: A divine JSON processor for parsing PIDs.

Detailed Installation:

# Install Kind v0.31.0
curl -Lo ./kind https://github.com/kubernetes-sigs/kind/releases/download/v0.31.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
kind version 
# Expected output: kind v0.31.0 go1.25.5 linux/amd64

# Install jq
sudo apt update && sudo apt install jq -y

2. Decoding the "Core" Concepts

What exactly is CNI?

It's not a background service. CNI is actually just a Specification that defines how Kubelet calls binaries located in /opt/cni/bin/. Kubelet feeds it JSON, the Binary executes the task and returns the result.

IPAM (IP Address Management)

If the main CNI Plugin (like bridge or ptp) handles "plugging the wires," then IPAM handles the "ledgering" and decides which IP to use.

  • It manages IP ranges (Subnets), tracking which IPs are assigned and which are free.
  • The Delegation: The main CNI binary (like ptp) doesn't choose the IP itself. It reads the "ipam" section in the JSON config, sees "type": "host-local", and implicitly executes the host-local binary (also located in /opt/cni/bin/).
  • The IPAM plugin returns an available IP back to the main plugin, and the main plugin then applies that IP to the container's network interface.
  • In this lab, host-local stores its IP database simply as text files on the Node's disk (in /var/lib/cni/networks/).

crictl - When Docker is no longer "The Boss"

Modern K8s uses the CRI (Container Runtime Interface - API between Kubelet and Container Runtime) β€” usually containerd. crictl is the tool to interact with the CRI. We use it here because Kind runs K8s nodes using containerd inside Docker.


3. Lab: Running CNI by Hand (Manual Execution)

Step 1: Spin up a "Naked" Cluster

Delete the old cluster (if any), since I have only 1 kind cluster for testing purpose.

root@kienlt-lab-utilities:~# kind get clusters
kind
root@kienlt-lab-utilities:~# kind delete cluster --name kind
Deleting cluster "kind" ...
Deleted nodes: ["kind-control-plane"]

Create a kind-config.yaml file to block the default network:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true 
kind create cluster --config kind-config.yaml

Output:

root@kienlt-lab-utilities:~# kind create cluster --config kind-config.yaml
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.35.0) πŸ–Ό
 βœ“ Preparing nodes πŸ“¦
 βœ“ Writing configuration πŸ“œ
 βœ“ Starting control-plane πŸ•ΉοΈ
 βœ“ Installing StorageClass πŸ’Ύ
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a nice day! πŸ‘‹
root@kienlt-lab-utilities:~# kubectl get nodes
NAME                 STATUS     ROLES           AGE   VERSION
kind-control-plane   NotReady   control-plane   12s   v1.35.0

Check running containers:

root@kienlt-lab-utilities:~# docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED         STATUS         PORTS                       NAMES
e174ac300eb9   kindest/node:v1.35.0   "/usr/local/bin/entr…"   4 minutes ago   Up 4 minutes   127.0.0.1:43013->6443/tcp   kind-control-plane

Step 2: Create a Pod β€” Witness Kubelet's "Helplessness"

Try to create a Pod WITHOUT any CNI config:

kubectl run nginx-test --image=nginx

Result: The Pod is stuck in Pending:

root@kienlt-lab-utilities:~# kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   0/1     Pending   0          12s

Run kubectl describe to see why:

root@kienlt-lab-utilities:~# kubectl describe pod nginx-test
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  10s   default-scheduler  0/1 nodes are available: 
           1 node(s) had condition {Ready False}. preemption: ...

# Check node status
root@kienlt-lab-utilities:~# k get nodes
NAME                 STATUS     ROLES           AGE   VERSION
kind-control-plane   NotReady   control-plane   13s   v1.35.0

Why? No CNI config β†’ Kubelet marks the Node as NotReady (as seen in Step 1) β†’ The Node is tainted with node.kubernetes.io/not-ready:NoSchedule β†’ The Scheduler cannot place the Pod on any Node β†’ The Pod stays in Pending indefinitely.

Step 3: Call the CNI Binary Manually β€” Exposing the Truth

Now, we will prove that CNI is just: a binary + environment variables + JSON via stdin. No magic involved. But let's me show you complete flows first (made with Mermaid)

cni flow

Enter the Kind node:

docker exec -it kind-control-plane bash

Check available CNI binaries:

root@kind-control-plane:/# ls -l /opt/cni/bin
total 18804
-rwxr-xr-x 1 root root 4354987 Dec 15 23:24 host-local
-rwxr-xr-x 1 root root 4306167 Dec 15 23:24 loopback
-rwxr-xr-x 1 root root 5108415 Dec 15 23:24 portmap
-rwxr-xr-x 1 root root 5475215 Dec 15 23:24 ptp

Manually create a network namespace (simulating the namespace containerd would create for a Pod):

ip netns add demo-cni-ns

Set all required Environment Variables according to the CNI Spec:

export CNI_COMMAND=ADD
export CNI_CONTAINERID=demo-container-001
export CNI_NETNS=/var/run/netns/demo-cni-ns
export CNI_IFNAME=eth0
export CNI_PATH=/opt/cni/bin

Variable Breakdown:

Variable Meaning
CNI_COMMAND=ADD Tells the plugin: "Add a network interface to this namespace"
CNI_CONTAINERID Arbitrary ID to identify the container (used for IPAM tracking)
CNI_NETNS Path to the network namespace to be configured
CNI_IFNAME Name of the interface to be created inside the namespace
CNI_PATH Directory containing CNI binaries (used to find auxiliary plugins like IPAM)

We gonna use ptp for this demo, why? ptp easier than bridge because it doesn't need to create bridge device, suitable for demo.

Feed the JSON config into the ptp binary's stdin:

cat <<EOF | /opt/cni/bin/ptp
{
  "cniVersion": "0.4.0",
  "name": "demo",
  "type": "ptp",
  "ipam": {
    "type": "host-local",
    "subnet": "10.249.0.0/24"
  }
}
EOF

Wait, where the fuck cniVersion comes from? can I put 9.9.9?

Same energy as the question: But here you go:

https://github.com/containernetworking/cni/blob/main/SPEC.md#version

Result: The ptp binary returns JSON confirming a successful IP assignment:

{
    "cniVersion": "0.4.0",
    "interfaces": [
        {
            "name": "vetha6b605d3",
            "mac": "8e:c0:6e:6a:da:c4"
        },
        {
            "name": "eth0",
            "mac": "da:4d:d4:6f:fb:84",
            "sandbox": "/var/run/netns/demo-cni-ns"
        }
    ],
    "ips": [
        {
            "version": "4",
            "interface": 1,
            "address": "10.249.0.2/24",
            "gateway": "10.249.0.1"
        }
    ],
    "dns": {}
}

Verify β€” the namespace now has a real IP:

root@kind-control-plane:/# ip netns exec demo-cni-ns ip addr show eth0
# Output
2: eth0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether da:4d:d4:6f:fb:84 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.249.0.2/24 brd 10.249.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::d84d:d4ff:fe6f:fb84/64 scope link
       valid_lft forever preferred_lft forever
root@kind-control-plane:/# ip netns
demo-cni-ns
# Check what ip that have been assigned
root@kind-control-plane:/# ls /var/lib/cni/networks/demo/
10.249.0.2  last_reserved_ip.0  lock

Boom! With just a binary call + env vars + JSON stdin, we created a functional network interface with an IP for a namespace. This is exactly what Kubelet does behind the scenes whenever it creates a Pod.

Clean up the demo namespace, so we will have to do this everytime when we create new namespace, that is why we are heading to step 4 for automation. But please clear old namespace first:

# Call DEL to release the assigned IP (returning it to IPAM)
# The assigned IP WAS 10.249.0.2
export CNI_COMMAND=DEL # We need to set DEL to release the IP
export CNI_CONTAINERID=demo-container-001    # must be same as ADD
export CNI_NETNS=/var/run/netns/demo-cni-ns
export CNI_IFNAME=eth0                   # interface name in netns
export CNI_PATH=/opt/cni/bin

cat <<EOF | /opt/cni/bin/ptp
{
  "cniVersion": "0.4.0",
  "name": "demo",
  "type": "ptp",
  "ipam": {
    "type": "host-local",
    "subnet": "10.249.0.0/24"
  }
}
EOF

# Manually delete!
root@kind-control-plane:/# ip netns del demo-cni-ns

# Validate after delete!
root@kind-control-plane:/# ls /var/lib/cni/networks/demo/
last_reserved_ip.0  lock
root@kind-control-plane:/# cat /var/lib/cni/networks/demo/last_reserved_ip.0
10.249.0.2

Step 4: "Rescuing" the Pod β€” Letting Kubelet Call CNI

Now, let's create a CNI config file so Kubelet knows which plugin to use:

# Still inside the Kind node
mkdir -p /etc/cni/net.d/

cat <<EOF > /etc/cni/net.d/10-dummy.conf
{
  "cniVersion": "0.4.0",
  "name": "dummy",
  "type": "ptp",
  "ipam": {
    "type": "host-local",
    "subnet": "10.249.0.0/24"
  }
}
EOF

Back on your Host machine, wait a few seconds and check:

root@kienlt-lab-utilities:~# kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   0/1     Pending   0          118s
root@kienlt-lab-utilities:~# kubectl get pods
NAME         READY   STATUS              RESTARTS   AGE
nginx-test   0/1     ContainerCreating   0          2m1s
root@kienlt-lab-utilities:~# kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   1/1     Running   0          2m28s
root@kienlt-lab-utilities:~# k get pod -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE                 NOMINATED NODE   READINESS GATES
nginx-test   1/1     Running   0          87m   10.249.0.4   kind-control-plane   <none>           <none>

Check how many IP it assigned inside control plane

root@kind-control-plane:/# ls /var/lib/cni/networks/dummy/
10.249.0.2  10.249.0.3  10.249.0.4  10.249.0.5  last_reserved_ip.0  lock

Ok, we see IP 10.249.0.4. How about other IPs? Easy AF, get all pod and grep! (forget about local-path-provisioner xD)

root@kienlt-lab-utilities:~# k get pod -owide -A|grep 10.249
default              nginx-test                                   1/1     Running   0               91m   10.249.0.4   kind-control-plane   <none>           <none>
kube-system          coredns-7d764666f9-hdwlt                     0/1     Running   0               91m   10.249.0.2   kind-control-plane   <none>           <none>
kube-system          coredns-7d764666f9-p94hv                     0/1     Running   0               91m   10.249.0.5   kind-control-plane   <none>           <none>
local-path-storage   local-path-provisioner-67b8995b4b-qxhwt      0/1     Error     6 (3m45s ago)   91m   10.249.0.3   kind-control-plane   <none>           <none>

So, we know where the IP assigned from!

The Pod lives! Here is what just happened:

cni flow

  1. Kubelet detects the new config file and initializes its network state.
  2. Kubelet updates the Node's status to Ready (removing the NotReady taint).
  3. The Kubernetes Scheduler sees a Ready Node and finally schedules our pending nginx-test Pod to kind-control-plane.
  4. Kubelet receives the assigned Pod and tells containerd to create the Pod Sandbox.
  5. containerd calls /opt/cni/bin/ptp (exactly as we did in Step 3) β†’ Network setup succeeds β†’ Pod transitions to Running.

Verify using crictl + nsenter:

docker exec -it kind-control-plane bash

POD_ID=$(crictl pods --name nginx-test -q | head -n 1)
PID=$(crictl inspectp $POD_ID | jq '.info.pid')
echo "PID: $PID"

nsenter -t $PID -n ip addr show eth0

Output:

PID: 1516
2: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:68:56:c0:9c:a3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.249.0.4/24 brd 10.249.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::868:56ff:fec0:9ca3/64 scope link
       valid_lft forever preferred_lft forever

Kubelet did exactly what we did manually in Step 3: set env vars β†’ piped JSON into the binary β†’ received the IP.


4. Summary

Step What happens
No CNI config Kubelet is helpless, Node is NotReady, Pod is stuck in Pending
Manual CNI execution Proves CNI = binary + env vars + JSON stdin
Create config file Kubelet automates the process for every Pod
  • CNI isn't magic; it's a coordination between a Binary (execution) and JSON (declaration).
  • IPAM (host-local) manages IP addresses β€” stored directly on disk at /var/lib/cni/networks/ to prevent conflicts.
  • Kubelet is just a middleman β€” reading config from /etc/cni/net.d/, setting env vars, then calling the binary. That's it.

Understanding how CNI operates "by hand" will take the fear out of networking errors. In the next post, we'll see how Cilium levels up this game using eBPF.

References:


Published

Category

Knowledge Base

Tags

Contact