6. Installing Kubernetes
This document is maintained for installing Kubernetes on different operating systems, covering both x86 and aarch64 CPU architectures. It is primarily a guide for deploying a k8s cluster on projects; points that need attention in mixed-architecture setups are called out in the relevant steps. Note that k8s evolves quickly and this document may lag behind; use it as a reference only.
Note: unless stated otherwise, run every step on all nodes.
6.1. Environment Configuration
6.1.1. Hostname Configuration
On the master node:
# hostnamectl set-hostname master
On node01:
# hostnamectl set-hostname node01
On node02:
# hostnamectl set-hostname node02
6.1.2. Hostname and IP Resolution
# vi /etc/hosts
192.168.103.101 master k8s.talkedu
192.168.103.102 node01
192.168.103.103 node02
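A quick check that name resolution works on every node (getent reads /etc/hosts as well as DNS):
# getent hosts master k8s.talkedu node01 node02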
6.1.3. System Configuration
6.1.3.1. Disable swap
Disable the swap partition (by default kubelet refuses to start while swap is enabled):
# swapoff -a
# sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
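To verify swap is off (the Swap line of free should read 0 and swapon should print nothing):
# free -h
# swapon --show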
6.1.3.2. Disable the Firewall
Disable firewalld:
# systemctl disable firewalld
# systemctl stop firewalld
# firewall-cmd --state
Disable SELinux:
# setenforce 0
# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
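setenforce 0 only switches the running system to permissive mode; the sed edit makes the change persist across reboots. Verify with:
# getenforce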
6.1.3.3. Time Synchronization
Synchronize time across all nodes (internet access required):
# yum install chrony
# systemctl start chronyd
# systemctl enable chronyd
# chronyc sources -v
6.1.3.4. Kernel Forwarding and Bridge Filtering
Load the br_netfilter module:
# cat > /etc/sysconfig/modules/br_netfilter.module <<EOF
modprobe -- br_netfilter
EOF
# chmod 755 /etc/sysconfig/modules/br_netfilter.module && /etc/sysconfig/modules/br_netfilter.module
# lsmod | grep br_netfilter
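The script under /etc/sysconfig/modules is a legacy RHEL convention and is not re-run automatically on every systemd-based distribution. As an alternative sketch for persistence, systemd-modules-load.service reads module names from /etc/modules-load.d/*.conf at boot:
# cat > /etc/modules-load.d/br_netfilter.conf <<EOF
br_netfilter
EOF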
Create the bridge-filter and kernel-forwarding configuration file:
# cat <<EOF >/etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
# sysctl -p /etc/sysctl.d/k8s.conf
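To confirm the settings took effect (the first two keys should print 1, swappiness should print 0):
# sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward vm.swappiness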
6.1.3.5. Install ipset and ipvsadm
Install ipset and ipvsadm (required for kube-proxy in IPVS mode):
# yum -y install ipset ipvsadm
Configure module loading for IPVS:
# cat > /etc/sysconfig/modules/ipvs.module <<EOF
modprobe -- ip_vs
modprobe -- ip_vs_sh
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- nf_conntrack
EOF
# chmod 755 /etc/sysconfig/modules/ipvs.module && /etc/sysconfig/modules/ipvs.module
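Verify that the modules are loaded (the persistence caveat from 6.1.3.4 applies here too; /etc/modules-load.d can be used instead):
# lsmod | grep -e ip_vs -e nf_conntrack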
6.2. Installing Docker
It is recommended to give Docker's data directory its own disk dedicated to container data (or an LVM volume, which is easy to grow). Note that data-root in daemon.json below also points Docker at /data/docker; the symlink additionally covers any tooling that assumes the default /var/lib/docker path.
# mkdir -pv /data/docker
# ln -sv /data/docker /var/lib/docker
Prepare the Docker package repo:
# wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.huaweicloud.com/docker-ce/linux/centos/docker-ce.repo
# sed -i 's+download.docker.com+mirrors.huaweicloud.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
Note: this repo file auto-detects the local OS release and CPU architecture, but Docker upstream only supports mainstream distributions and architectures. If yours is not covered, you can try editing the baseurl manually. For example, on openEuler change:
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/stable
to:
baseurl=https://download.docker.com/linux/centos/8/$basearch/stable
Install Docker:
Check which Docker versions the repo offers and pick a suitable one:
# yum list docker-ce --showduplicates | sort -r
Since k8s v1.28.2 is used later, choose docker-ce v24.0.9-1.el8 or newer:
# yum install docker-ce-24.0.9-1.el8 docker-ce-cli-24.0.9-1.el8 containerd.io
Configure the Docker daemon, including a registry mirror (the file is /etc/docker/daemon.json):
# vim /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-level": "warn",
"log-opts": {
"max-size": "100m",
"max-file": "5"
},
"storage-driver": "overlay2",
"registry-mirrors":["https://docker.410006.xyz"],
"data-root": "/data/docker"
}
Start Docker:
# systemctl daemon-reload
# systemctl start docker
# systemctl enable docker
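A quick sanity check that the daemon picked up daemon.json (Cgroup Driver should be systemd and Docker Root Dir should be /data/docker):
# docker info | grep -E 'Cgroup Driver|Docker Root Dir'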
6.3. Installing cri-dockerd
k8s has gone through two common architectures for calling a container runtime:
Legacy architecture: Kubernetes --> dockershim --> Docker Daemon --> Docker Engine --> container process.
- Deprecated since Kubernetes v1.20 and removed entirely in v1.24.
Modern architecture: Kubernetes --> containerd --> containerd-shim --> container process.
- containerd ships a built-in CRI plugin (since containerd v1.1) that talks to Kubernetes directly and is fully CRI-compliant.
- CRI-O is a container runtime purpose-built for Kubernetes; it supports CRI natively and keeps the runtime implementation simple.
cri-dockerd belongs to the legacy architecture; it is chosen here mainly for compatibility with earlier projects. How it differs from dockershim:
- dockershim was part of early Kubernetes itself: an adapter layer between Kubernetes and the Docker daemon that spoke to Docker Engine directly.
- cri-dockerd is an adapter developed to keep Docker usable as a Kubernetes container runtime. It exposes a CRI-compatible interface, so Kubernetes can still start and manage containers through Docker.
Project: https://github.com/Mirantis/cri-dockerd
Note: the tarball below is the arm64 build for aarch64 nodes; on x86_64 nodes download the corresponding amd64 tarball from the same release page instead.
# wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.16/cri-dockerd-0.3.16.arm64.tgz
# tar xf cri-dockerd-0.3.16.arm64.tgz
# cp cri-dockerd/cri-dockerd /usr/bin/
# chmod +x /usr/bin/cri-dockerd
Configure the systemd unit:
# cat <<"EOF" > /usr/lib/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
Create the socket unit:
# cat <<"EOF" > /usr/lib/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
Start cri-docker:
# systemctl daemon-reload
# systemctl start cri-docker.service
# systemctl enable cri-docker.service
# systemctl status cri-docker.service
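cri-dockerd is socket-activated; confirm the socket file exists and both units are active:
# ls -l /run/cri-dockerd.sock
# systemctl is-active cri-docker.socket cri-docker.service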
6.4. Installing k8s
6.4.1. Preparation
This installs k8s version 1.28.2. Configure the yum repo:
# cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/repodata/repomd.xml.key
EOF
# yum makecache
# yum list kubeadm kubelet kubectl --showduplicates | sort -r
# yum install kubeadm-1.28.2 kubelet-1.28.2 kubectl-1.28.2
Note: the kubelet service is configured and brought up by kubeadm init; do not configure it by hand, or a bad configuration may keep the service from starting.
6.4.2. Initializing the master
kubeadm init can be driven either by command-line flags or by a configuration file; a configuration file is used here:
# vim kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.103.101   # master IP
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
controlPlaneEndpoint: "k8s.talkedu:6443"
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: "10.244.0.0/16"
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
List the images that will be used; if everything is in order you will get output like this:
# kubeadm config images list --config kubeadm.yaml
registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
registry.aliyuncs.com/google_containers/pause:3.9
registry.aliyuncs.com/google_containers/etcd:3.5.9-0
registry.aliyuncs.com/google_containers/coredns:v1.10.1
Pull the images ahead of time:
# kubeadm config images pull --config kubeadm.yaml
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.9-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1
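Since the images are pulled through cri-dockerd into Docker's store, you can optionally confirm they landed locally:
# docker images | grep google_containers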
Initialize the cluster:
# kubeadm init --config=kubeadm.yaml
......
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join k8s.talkedu:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:4edd6dc747bc908fd624c765c9a05f44b1aa1bd795dd0dccd879f73e554fcebc \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join k8s.talkedu:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:4edd6dc747bc908fd624c765c9a05f44b1aa1bd795dd0dccd879f73e554fcebc
Note: if initialization fails, reset the node before retrying:
# kubeadm reset --cri-socket unix:///run/cri-dockerd.sock
Verify that the kubelet service has been brought up:
# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2025-01-13 18:26:18 CST; 14h ago
Docs: https://kubernetes.io/docs/
Main PID: 38985 (kubelet)
Tasks: 58 (limit: 402004)
Memory: 59.4M ()
CGroup: /system.slice/kubelet.service
└─38985 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=>
Jan 14 08:57:34 master kubelet[38985]: E0114 08:57:34.134173 38985 kubelet.go:2855] "Container runtime network >
Configure the kubectl client:
# mkdir -p $HOME/.kube
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# chown $(id -u):$(id -g) $HOME/.kube/config
Check node status; since no network plugin is installed yet, the node is still NotReady:
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master NotReady control-plane 14h v1.28.2
Install the calico network plugin:
- calico is validated against specific k8s versions; make sure the versions match, see https://docs.tigera.io/calico/3.28/getting-started/kubernetes/requirements
- When following the official steps, do not swap kubectl create for kubectl apply (there is a known quirk). The official steps install online; if your network is unreliable, download the manifests and install from local files, see https://docs.tigera.io/calico/3.28/getting-started/kubernetes/quickstart#install-calico
- Read the install steps before running them: k8s was initialized above with an explicit pod network, so change the pod CIDR in custom-resources.yaml to the CIDR specified in our kubeadm.yaml, 10.244.0.0/16. See the sketch after the commands below.
# kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.2/manifests/tigera-operator.yaml
# kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.2/manifests/custom-resources.yaml
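Because the default manifest ships a different pod CIDR, in practice you download custom-resources.yaml, edit it, and run kubectl create against the local copy rather than the raw URL. A sketch of the relevant section (field layout follows the v3.28 manifest; verify against the file you actually download):
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16   # changed from the default to match podSubnet in kubeadm.yaml
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
Then create it from the local file:
# kubectl create -f custom-resources.yaml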
Check whether calico installed successfully:
# watch kubectl get pods -n calico-system
Remove the control-plane taint so that pods can be scheduled onto the master:
# kubectl taint nodes --all node-role.kubernetes.io/control-plane-
At this point the control-plane node is ready:
# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane 15h v1.28.2 192.168.103.101 <none> openEuler 24.03 (LTS) 6.6.0-28.0.0.34.oe2403.aarch64 docker://24.0.9
6.4.3. Adding Nodes
Enable kubectl command auto-completion:
# yum install bash-completion -y
# source /usr/share/bash-completion/bash_completion
# source <(kubectl completion bash)
# kubectl completion bash >/etc/bash_completion.d/kubectl
Print the cluster join command (run this on the master node):
# kubeadm token create --print-join-command
kubeadm join k8s.talkedu:6443 --token c2qh08.mkbf4b1lcu3twgfa --discovery-token-ca-cert-hash sha256:4edd6dc747bc908fd624c765c9a05f44b1aa1bd795dd0dccd879f73e554fcebc
- By default, a token created by kubeadm token is valid for 24 hours.
- When a node joins, append --cri-socket unix:///run/cri-dockerd.sock to the command above, because cri-dockerd is used here instead of the default containerd or cri-o.
# kubeadm join k8s.talkedu:6443 --token c2qh08.mkbf4b1lcu3twgfa --discovery-token-ca-cert-hash sha256:4edd6dc747bc908fd624c765c9a05f44b1aa1bd795dd0dccd879f73e554fcebc --cri-socket unix:///run/cri-dockerd.sock
Check whether the kubelet service has been brought up:
# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2025-01-14 11:47:49 CST; 15s ago
Docs: https://kubernetes.io/docs/
Main PID: 31819 (kubelet)
Tasks: 34 (limit: 402004)
Memory: 38.5M ()
CGroup: /system.slice/kubelet.service
└─31819 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=>
Jan 14 11:47:53 node01 kubelet[31819]: W0114 11:47:53.907806 31819 driver-call.go:149] FlexVolume: driver call >
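Back on the master, confirm the new nodes registered; they turn Ready once calico pods are running on them. Optionally give workers a role label (purely cosmetic, it fills the ROLES column):
# kubectl get nodes
# kubectl label node node01 node-role.kubernetes.io/worker=worker
# kubectl label node node02 node-role.kubernetes.io/worker=worker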