Kubernetes From Beginner to Practice: Building K3s Cluster Environment Across Clouds with WireGuard

Introduction#

Recently, the article "Deploying k3s Cluster Across Cloud Vendors" showed me another way to have fun and inspired this post. With the recent Double 11 promotions from the major cloud providers, Tencent Cloud is again offering small instances for just a few bucks for three years. The catch is that each cloud provider only sells one instance at the promotional price, so our cloud instances end up scattered across different providers and underutilized. Can we integrate them into a single pool of computing power? Of course! Use WireGuard to connect them into a Kubernetes cluster.

Since the software versions mentioned in that article are outdated, I will redo the setup with the latest versions while following its approach.

PS: After completing the setup, I was reminded of a hard truth: busy or not, you should regularly read the official documentation and follow the official community (e.g. on GitHub). The docs are detailed enough, and someone in the community has already hit every pitfall. When in doubt, read the official documentation.

Environment Preparation#

Software    Version
Ubuntu      20.04
Docker      20.10
WireGuard   v1.0.20200513
K3s         v1.23.14+k3s1

I have prepared several cloud instances pre-installed with Ubuntu 20.04 on Tencent Cloud and Vultr. They can come from any cloud provider, as long as they have public IPs and can run Linux.

Cloud Provider   Public IP        Configuration   Node Name     Node Role              OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
Tencent Cloud    42.193.XXX.XXX   4C4G            k3s-node-01   control-plane,master   Ubuntu 20.04 LTS     5.4.0-96-generic    docker://20.10.13
Vultr            45.63.YYY.YYY    1C1G            k3s-node-02   agent/worker           Ubuntu 20.04.3 LTS   5.4.0-131-generic   docker://20.10.11
Vultr            13.22.ZZZ.ZZZ    1C1G            k3s-node-03   agent/worker           Ubuntu 20.04.5 LTS   5.4.0-122-generic   docker://20.10.12

Install Docker#

sudo apt install docker.io -y
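
Before moving on, it is worth confirming that the Docker daemon is enabled and responding; a quick check (assuming systemd, as on these Ubuntu 20.04 instances):

# Start Docker now and on every boot
sudo systemctl enable --now docker
# Print the server version; 20.10.x is expected here
sudo docker info --format '{{.ServerVersion}}'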

Install WireGuard#

Make sure to install the WireGuard software on each node. The installation details for Ubuntu 20.04 are as follows:

# Switch to root privileges
sudo -i

# Update software sources
apt update

# Install WireGuard software
apt install wireguard resolvconf -y

# Enable IPv4 IP forwarding
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
sysctl -p

At this point, WireGuard only needs to be installed correctly; no configuration or startup is required. The network configuration can be left entirely to K3s. In fact, K3s has already prepared everything for us; we only have to set a few startup parameters.
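
To sanity-check the installation on each node before handing things over to K3s, the following should all succeed (a quick verification sketch; there is still nothing to configure):

# The kernel module should load (WireGuard is backported into Ubuntu 20.04's 5.4 kernel)
modprobe wireguard && lsmod | grep wireguard
# The userspace tooling should report its version, e.g. v1.0.20200513
wg --version
# IPv4 forwarding should now be enabled (= 1)
sysctl net.ipv4.ip_forward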

Building K3s Cluster Across Clouds#

Since my cloud instances are distributed across different cloud providers, they cannot reach each other through any provider's internal network. Instead, we use WireGuard for the cross-cloud networking; because K3s already integrates WireGuard through Flannel, a few simple configuration options complete the setup.

You need to install WireGuard on every node, both server and agents, before attempting to use the WireGuard flannel backend option. Note that as of v1.26 the wireguard backend has been removed in favor of the wireguard-native backend provided natively by Flannel.

Before starting, I recommend reading the official guide; understanding the background clears up most of the confusion. It points out that the configuration parameters vary between K3s versions; notably, starting from v1.26 the startup parameter changes from flannel-backend: wireguard to flannel-backend: wireguard-native.
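
As an aside, K3s can also read these options from /etc/rancher/k3s/config.yaml, where each key mirrors a CLI flag. A minimal sketch of the Server-side WireGuard settings as a config file (the IP is this article's masked server address; adjust per node):

# Optional: persist the flags used below as a K3s config file instead of CLI arguments
cat > /etc/rancher/k3s/config.yaml <<'EOF'
flannel-backend: wireguard-native
flannel-external-ip: true
node-external-ip: 42.193.XXX.XXX
node-ip: 42.193.XXX.XXX
EOF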

We can refer to the deployment commands in the previous article "Kubernetes Getting Started to Practice: Initial Experience with K3s Cluster".

Install K3s Server#

When starting each Server, you need to add the following startup parameters to activate WireGuard:

--node-external-ip <SERVER_EXTERNAL_IP> --flannel-backend wireguard-native --flannel-external-ip

The complete startup process for K3s Server is as follows:

# Specify K3s version
export INSTALL_K3S_VERSION=v1.23.14+k3s1
# Only install, do not start
export INSTALL_K3S_SKIP_START=true
# Design a unique name for each node added to the cluster: https://docs.rancher.cn/docs/k3s/installation/installation-requirements/_index#先决条件
export K3S_NODE_NAME=k3s-node-01
# Get the public IP of the node, or set it manually
export PUBLIC_IP=`curl -sSL https://ipconfig.sh`
##
# Custom startup execution command:
# --docker: Enable Docker runtime
# --disable servicelb: (optional) Disable servicelb
# --disable traefik: (optional) Disable traefik
# --node-ip $PUBLIC_IP --node-external-ip $PUBLIC_IP: Set the public IP of the node for inter-node communication
# --flannel-backend wireguard-native --flannel-external-ip: Enable Flannel CNI and use WireGuard for networking
export INSTALL_K3S_EXEC="--docker --disable servicelb --disable traefik --node-ip $PUBLIC_IP --node-external-ip $PUBLIC_IP --flannel-backend wireguard-native --flannel-external-ip"
# Install using Alibaba Cloud mirror source script
curl -sfL https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -
# Manually start K3s service, as INSTALL_K3S_SKIP_START=true has been configured above
systemctl enable --now k3s
# Check K3s service status
systemctl status k3s
# Check metrics-server running logs for troubleshooting
kubectl logs -f pod/metrics-server-5bb8d5f679-btt96 -n kube-system
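
Once the Server is up, grab the join token for the Agents and confirm the node has registered (K3s stores the token at the path referenced again in the Agent section below):

# Shared secret that Agents will pass as K3S_TOKEN
cat /var/lib/rancher/k3s/server/node-token
# The Server should now appear as a Ready control-plane node
kubectl get nodes -owide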

Install K3s Agent#

When starting each Agent, you need to add the following startup parameters to activate WireGuard:

--node-external-ip <AGENT_EXTERNAL_IP>

The complete startup process for K3s Agent is as follows:

# Basic parameters remain consistent with Server startup parameters
export INSTALL_K3S_VERSION=v1.23.14+k3s1
export INSTALL_K3S_SKIP_START=true
# Only configure Agent-specific parameters
export K3S_NODE_NAME=k3s-node-02
export PUBLIC_IP=`curl -sSL https://ipconfig.sh`
export INSTALL_K3S_EXEC="--docker --node-ip $PUBLIC_IP --node-external-ip $PUBLIC_IP"
# Setting K3S_URL makes the installer configure K3s as an agent; if K3S_URL is not set, it installs a server
# Point it at the Server's public address (API port TCP 6443)
export K3S_URL=https://42.193.XXX.XXX:6443
# Shared secret for joining server or agent to the cluster
# Obtain on the main node: cat /var/lib/rancher/k3s/server/node-token
export K3S_TOKEN=K105a308b09e583fccd1dd3a11745826736d440577d1fafa5d9dbaf5213a7150f5f::server:88e21efdad8965816b1da61e056ac7c4
# Since my nodes are on Vultr, I will use the official source script for installation; domestic hosts can still use Alibaba Cloud mirror source script
# curl -sfL https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -
curl -sfL https://get.k3s.io | sh -
# Manually start K3s service, as INSTALL_K3S_SKIP_START=true has been configured above
systemctl enable --now k3s-agent
# Check K3s service status
$ systemctl status k3s-agent
 k3s-agent.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2022-11-19 12:23:59 UTC; 50s ago
       Docs: https://k3s.io
    Process: 707474 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
    Process: 707476 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 707477 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 707478 (k3s-agent)
      Tasks: 14
     Memory: 255.6M
     CGroup: /system.slice/k3s-agent.service
             └─707478 /usr/local/bin/k3s agent

Nov 19 12:23:59 vultr k3s[707478]: I1119 12:23:59.986059  707478 network_policy_controller.go:163] Starting network policy controller
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.057408  707478 network_policy_controller.go:175] Starting network policy controller full sync goroutine
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.698216  707478 kube.go:133] Node controller sync successful
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.702931  707478 kube.go:331] Overriding public ip with '45.63.XXX.XXX' from node annotation 'flannel.alpha.coreos.com/public-ip-overwrite'
Nov 19 12:24:00 vultr k3s[707478]: time="2022-11-19T12:24:00Z" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
Nov 19 12:24:00 vultr k3s[707478]: time="2022-11-19T12:24:00Z" level=info msg="Running flannel backend."
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.865979  707478 wireguard_network.go:78] Watching for new subnet leases
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.866258  707478 wireguard_network.go:172] Subnet added: 10.42.0.0/24 via 42.193.XXX.XXX:51820
Nov 19 12:24:00 vultr k3s[707478]: I1119 12:24:00.956299  707478 iptables.go:260] bootstrap done
# Check cluster node information; you will find that the INTERNAL-IP of k3s-node-02 is the public IP
$ kubectl get nodes -owide
NAME          STATUS   ROLES                  AGE   VERSION         INTERNAL-IP     EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k3s-node-01   Ready    control-plane,master   19m   v1.23.14+k3s1   10.0.20.12      42.193.XXX.XXX   Ubuntu 20.04 LTS     5.4.0-96-generic    docker://20.10.13
k3s-node-02   Ready    <none>                 14m   v1.23.14+k3s1   45.63.YYY.YYY   45.63.YYY.YYY    Ubuntu 20.04.3 LTS   5.4.0-131-generic   docker://20.10.11

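With the nodes Ready, you can also inspect the WireGuard tunnel that Flannel created. With the wireguard-native backend the interface is typically named flannel-wg; on any node, the peer endpoints should be the other nodes' public IPs:

# List the WireGuard interface, its peers, and transfer counters
wg show flannel-wg
# The tunnel also appears as a regular network device
ip -brief addr show flannel-wg
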
About metrics-server Issues#

The metrics-server cannot retrieve metrics because kubelet-preferred-address-types prefers InternalIP, and each node's internal IP belongs to its own cloud provider's private network; internal addresses on different providers cannot reach each other.

The result is that metrics-server cannot collect core metrics such as CPU and memory utilization, which used to require manual intervention to fix. The newly released v1.23.14+k3s1 corrects this: when the flannel-external-ip=true option is enabled, K3s dynamically adjusts the priority order to --kubelet-preferred-address-types=ExternalIP,InternalIP,Hostname.

Below, I will elaborate on how this feature adjustment affects the K3s cluster:

In versions v1.23.13+k3s1 and older#

Check the default configuration:

# Cannot obtain core metrics such as CPU and memory utilization
$ kubectl top node
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

$ kubectl top node
NAME          CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
k3s-node-01   133m         3%          2232Mi          56%
k3s-node-02   <unknown>    <unknown>   <unknown>       <unknown>

$ kubectl describe pod/metrics-server-d76965d8f-t2sll -n kube-system | grep -i types
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname

You need to modify the metrics-server manifest; edit it in place with the following command:

kubectl -n kube-system edit deploy metrics-server

Adjust the following execution parameters and save:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
-       - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
+       - --kubelet-preferred-address-types=ExternalIP
        - --kubelet-insecure-tls
        - --kubelet-use-node-status-port
        - --metric-resolution=15s

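The same change can also be applied non-interactively; a sketch using kubectl patch (the args index 2 assumes the argument order shown above):

kubectl -n kube-system patch deploy metrics-server --type=json \
  -p '[{"op":"replace","path":"/spec/template/spec/containers/0/args/2","value":"--kubelet-preferred-address-types=ExternalIP"}]'
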
After saving, wait for the resources to be rescheduled, and this will allow the metrics-server to use the public IP to communicate with the nodes. Check the core metrics again:

$ kubectl top node
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k3s-node-01   259m         6%     2269Mi          57%
k3s-node-02   203m         20%    534Mi           54%

$ kubectl top pods -A
NAMESPACE     NAME                                      CPU(cores)   MEMORY(bytes)
default       nginx-85b98978db-659cl                    0m           5Mi
default       nginx-85b98978db-tt2hh                    0m           5Mi
default       nginx-85b98978db-zr47g                    0m           2Mi
kube-system   coredns-d76bd69b-k8949                    4m           15Mi
kube-system   local-path-provisioner-6c79684f77-nc2xn   1m           7Mi
kube-system   metrics-server-d76965d8f-t2sll            6m           25Mi

In versions v1.23.14+k3s1 and newer#

  1. Describe the metrics-server Pod and inspect its args: with flannel-external-ip=false you will see --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  2. Repeat the same steps with flannel-external-ip: true, and the args change to --kubelet-preferred-address-types=ExternalIP,InternalIP,Hostname
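
A quick way to check which order your cluster ended up with (the label selector assumes the stock metrics-server manifest bundled with K3s):

# Show the address-type priority currently in effect
kubectl -n kube-system describe pod -l k8s-app=metrics-server | grep -i address-types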

⚠️ Notes ⚠️#

(Don't ask why, just try it)

  • Security group firewalls need to allow the relevant ports (a host-firewall sketch follows this list)
    • TCP 6443: K3s Server API port
    • TCP 10250: metrics-server port, used for communication between K3s Server and Agent to collect metrics; if blocked, core metrics such as CPU and memory utilization cannot be obtained
    • UDP 51820: default port of the flannel-backend: wireguard-native backend, i.e. Flannel's WireGuard tunnel
    • TCP 30000-32767: Kubernetes NodePort range, convenient for external debugging
  • Optional startup parameters
    • --tls-san
      • Adds extra hostnames or IPs as Subject Alternative Names to the TLS certificate
      • This allows the remote cluster to be accessed and operated through its public IP in a public environment
      • It is also needed when deploying multiple servers behind a load balancer, where the public address must be kept in the certificate
    • --disable servicelb
    • --disable traefik
  • Disable unused components to save resources
    • Service Load Balancer
      • K3s ships an embedded load balancer called Klipper Load Balancer, which uses available host ports. It lets you create Services of type LoadBalancer without providing an LB implementation yourself: such Services normally depend on a cloud provider (for example Amazon EC2), whereas the K3s service LB works without one.
      • To disable the embedded LB, start each Server with the --disable servicelb option.
    • Traefik Ingress Controller
      • Traefik is a modern HTTP reverse proxy and load balancer. It is deployed by default when the Server starts, and its ingress controller occupies ports 80 and 443 on the host (these ports then cannot be used for HostPort or NodePort).
      • To disable it, start each Server with the --disable traefik option.
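
Security groups are usually configured in the provider's console, but if a host firewall such as ufw is also active, the same ports need to be opened there; a sketch for the list above:

sudo ufw allow 6443/tcp          # K3s Server API
sudo ufw allow 10250/tcp         # kubelet / metrics-server
sudo ufw allow 51820/udp         # flannel wireguard-native
sudo ufw allow 30000:32767/tcp   # Kubernetes NodePort range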

Verify K3s Cross-Cloud Cluster and Network#

Verify Cross-Cloud Cluster#

$ kubectl get nodes -owide
NAME          STATUS   ROLES                  AGE   VERSION         INTERNAL-IP     EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k3s-node-01   Ready    control-plane,master   69m   v1.23.14+k3s1   10.0.20.12      42.193.XXX.XXX   Ubuntu 20.04 LTS     5.4.0-96-generic    docker://20.10.13
k3s-node-02   Ready    <none>                 63m   v1.23.14+k3s1   45.63.YYY.YYY   45.63.YYY.YYY    Ubuntu 20.04.3 LTS   5.4.0-131-generic   docker://20.10.11
k3s-node-03   Ready    <none>                 16m   v1.23.14+k3s1   13.22.ZZZ.ZZZ   13.22.ZZZ.ZZZ    Ubuntu 20.04.5 LTS   5.4.0-122-generic   docker://20.10.12
$ kubectl create deploy whoami --image=traefik/whoami --replicas=3
deployment.apps/whoami created

$ kubectl get pod -owide
NAME                      READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
whoami-84d974bbd6-57bnt   1/1     Running   0          10m   10.42.1.4   k3s-node-02   <none>           <none>
whoami-84d974bbd6-hlhdq   1/1     Running   0          10m   10.42.2.2   k3s-node-03   <none>           <none>
whoami-84d974bbd6-g894t   1/1     Running   0          10m   10.42.0.6   k3s-node-01   <none>           <none>

$ kubectl create deploy nginx --image=nginx --replicas=3
deployment.apps/nginx created

$ kubectl get pod -owide
NAME                      READY   STATUS    RESTARTS   AGE   IP           NODE          NOMINATED NODE   READINESS GATES
whoami-84d974bbd6-hlhdq   1/1     Running   0          82m   10.42.2.2    k3s-node-03   <none>           <none>
whoami-84d974bbd6-g894t   1/1     Running   0          82m   10.42.0.6    k3s-node-01   <none>           <none>
nginx-85b98978db-ptvcb    1/1     Running   0          32s   10.42.1.5    k3s-node-02   <none>           <none>
nginx-85b98978db-m2nlm    1/1     Running   0          32s   10.42.2.3    k3s-node-03   <none>           <none>
nginx-85b98978db-fs8gk    1/1     Running   0          32s   10.42.0.17   k3s-node-01   <none>           <none>

Verify Cross-Cloud Network#

Use the built-in CoreDNS, Service, and Pod to debug the network and verify whether the network is reachable between different nodes.

Before starting, quickly create a Service named whoami:

$ kubectl expose deploy whoami --type LoadBalancer --port 80 --external-ip 42.193.XXX.XXX
service/whoami exposed

$ kubectl get svc -owide
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)        AGE   SELECTOR
kubernetes   ClusterIP      10.43.0.1      <none>           443/TCP        75m   <none>
whoami       LoadBalancer   10.43.77.192   42.193.XXX.XXX   80:32064/TCP   12s   app=whoami

$ kubectl describe svc whoami
Name:                     whoami
Namespace:                default
Labels:                   app=whoami
Annotations:              <none>
Selector:                 app=whoami
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.43.77.192
IPs:                      10.43.77.192
External IPs:             42.193.XXX.XXX
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  32064/TCP
Endpoints:                10.42.0.6:80,10.42.1.4:80,10.42.2.2:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

How to test the load balancing effect of the Service?#

Since Service and Pod IP addresses all belong to the Kubernetes cluster's internal network segments, we need to kubectl exec into a Pod (or SSH into any cluster node) and then access the Service with a tool like curl.

Thanks to the built-in CoreDNS of the cluster, we can access the corresponding Service and Pod internally via domain names:

  • The fully qualified domain name of the Service object is "object.namespace.svc.cluster.local", but often the latter part can be omitted, and just writing "object.namespace" or even "object" is sufficient, as it defaults to the namespace where the object is located (in this case, default)
    • For example, whoami, whoami.default, etc.
  • Kubernetes also assigns a domain name to each Pod, in the form of "IP address.namespace.pod.cluster.local", but the . in the IP address needs to be replaced with -
    • For example, 10.42.2.2 corresponds to the domain name 10-42-2-2.default.pod

This way, we no longer need to worry about the IP addresses of Service and Pod objects; we just need to know their names and can access the backend services using DNS.
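
To see this in action, resolve both name forms from inside any Pod (the Pod-form name uses 10.42.2.2 from the listing above; curl is assumed to be available in the image, as the sessions below also rely on it):

# Service name: the short form and the fully qualified form both resolve
kubectl exec deploy/nginx -- curl -s whoami.default.svc.cluster.local
# Pod name: dots in the IP replaced by dashes
kubectl exec deploy/nginx -- curl -s 10-42-2-2.default.pod.cluster.local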

Access via External Network#

$ curl http://42.193.XXX.XXX:32064
Hostname: whoami-84d974bbd6-57bnt
IP: 127.0.0.1
IP: 10.42.1.4
RemoteAddr: 10.42.0.0:42897
GET / HTTP/1.1
Host: 42.193.XXX.XXX:32064
User-Agent: curl/7.68.0
Accept: */*

$ curl http://42.193.XXX.XXX:32064
Hostname: whoami-84d974bbd6-hlhdq
IP: 127.0.0.1
IP: 10.42.2.2
RemoteAddr: 10.42.0.0:3478
GET / HTTP/1.1
Host: 42.193.XXX.XXX:32064
User-Agent: curl/7.68.0
Accept: */*

$ curl http://42.193.XXX.XXX:32064
Hostname: whoami-84d974bbd6-g894t
IP: 127.0.0.1
IP: 10.42.0.6
RemoteAddr: 10.42.0.1:3279
GET / HTTP/1.1
Host: 42.193.XXX.XXX:32064
User-Agent: curl/7.68.0
Accept: */*

By repeatedly accessing the Service from the external network, we can see that the master node (http://42.193.XXX.XXX:32064) serves as the entry point and successfully load-balances requests to Pods on different nodes, all of which respond correctly.

Accessing Service CLUSTER-IP from Each Node in the Cluster#

root@k3s-node-01:~# curl 10.43.77.192:80
Hostname: whoami-84d974bbd6-g894t
IP: 127.0.0.1
IP: 10.42.0.6
RemoteAddr: 10.42.0.1:22291
GET / HTTP/1.1
Host: 10.43.77.192
User-Agent: curl/7.68.0
Accept: */*

root@k3s-node-01:~# curl 10.43.77.192:80
Hostname: whoami-84d974bbd6-57bnt
IP: 127.0.0.1
IP: 10.42.1.4
RemoteAddr: 10.42.0.0:23957
GET / HTTP/1.1
Host: 10.43.77.192
User-Agent: curl/7.68.0
Accept: */*

root@k3s-node-01:~# curl 10.43.77.192:80
Hostname: whoami-84d974bbd6-hlhdq
IP: 127.0.0.1
IP: 10.42.2.2
RemoteAddr: 10.42.0.0:26130
GET / HTTP/1.1
Host: 10.43.77.192
User-Agent: curl/7.68.0
Accept: */*

Accessing the Service directly from a node likewise load-balances across Pods on different nodes, with correct responses on every request.

Accessing Service and Pod from Within the Cluster#

$ kubectl exec -it nginx-85b98978db-ptvcb -- sh
# curl whoami
Hostname: whoami-84d974bbd6-g894t
IP: 127.0.0.1
IP: 10.42.0.6
RemoteAddr: 10.42.1.5:36010
GET / HTTP/1.1
Host: whoami
User-Agent: curl/7.74.0
Accept: */*

# curl whoami.default
Hostname: whoami-84d974bbd6-57bnt
IP: 127.0.0.1
IP: 10.42.1.4
RemoteAddr: 10.42.1.5:33050
GET / HTTP/1.1
Host: whoami.default
User-Agent: curl/7.74.0
Accept: */*

# curl whoami.default.svc.cluster.local
Hostname: whoami-84d974bbd6-hlhdq
IP: 127.0.0.1
IP: 10.42.2.2
RemoteAddr: 10.42.1.5:57358
GET / HTTP/1.1
Host: whoami.default.svc.cluster.local
User-Agent: curl/7.74.0
Accept: */*

These network checks from every angle show that the cross-cloud network built on Flannel with WireGuard works correctly and can be used with confidence.


Original address: https://y0ngb1n.github.io/a/setup-k3s-cluster-multicloud-with-wireguard.html

If you found this content useful, please like, share, or favorite it; these free encouragements affect how quickly follow-up content gets updated.

