Installing Kubernetes on Ubuntu 20.04

Preparation

Machine setup

Three virtual machines were created locally:

  • k8s-master 172.16.230.10
  • k8s-node01 172.16.230.11
  • k8s-node02 172.16.230.12

Change the apt sources

Switch the apt sources on all three machines to a domestic mirror:

# Back up the existing source list
sudo mv /etc/apt/sources.list /etc/apt/sources.list.bak

# One-step switch to the 163 mirror and update (the fastest in my tests)
sudo bash -c "cat << EOF > /etc/apt/sources.list && apt update
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
EOF"


# Or: one-step switch to the Aliyun mirror

sudo bash -c "cat << EOF > /etc/apt/sources.list && apt update
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
EOF"

Set the hostname

# On the master node
hostnamectl --static set-hostname k8s-master

# On the worker nodes

hostnamectl --static set-hostname k8s-node01
hostnamectl --static set-hostname k8s-node02

# Reboot afterwards, or run the following so the new name takes effect immediately
hostname "$(hostnamectl --static)"

Disable the swap partition

Disable swap now and keep it off after reboot

# Comment out the swap entry in /etc/fstab so it stays disabled after reboot
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Turn swap off immediately
swapoff -a
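
To double-check, free -h should now report 0B on the Swap line:

# Verify that swap is disabled (all Swap totals should read 0B)
free -h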

Disable the firewall

systemctl stop ufw
systemctl disable ufw
# Check the firewall status
root@k8s-node02:~# ufw status
Status: inactive

Install Docker

A one-shot Docker install script, install_docker.sh:

#!/bin/bash
apt update
apt install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository \
"deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"
apt update
apt install -y docker-ce docker-ce-cli containerd.io
docker --version

Or run the steps one by one:

# 1. Remove any pre-installed Docker packages
sudo apt-get remove docker docker-engine docker-ce docker.io

# 2. Update the package index
sudo apt-get update

# 3. Install the required tools
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

# 4. Add the official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Or use Aliyun's:
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -

# 5. Set up the stable repository
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Or use Aliyun's:
add-apt-repository \
"deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"

# 6. Refresh the package index again
sudo apt-get update

# 7. List the available docker-ce versions
apt-cache madison docker-ce

The available Docker versions look roughly like this:

root@k8s-master:~# apt-cache madison docker-ce
docker-ce | 5:20.10.17~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.16~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.15~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.14~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.13~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.12~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.10~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.9~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.8~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.7~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.6~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.5~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.4~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.3~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.2~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.1~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.0~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.15~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.14~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.13~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.12~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.11~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.10~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.9~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages

The second column is the version string; here I pick 5:19.03.15~3-0~ubuntu-focal. Put it after docker-ce= and run:

sudo apt-get install docker-ce=5:19.03.15~3-0~ubuntu-focal

After the installation finishes, check Docker's status:

systemctl status docker

Configure Docker:

cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "registry-mirrors": ["https://6rf3psgj.mirror.aliyuncs.com"],
  "storage-driver": "overlay2"
}
EOF

Here, registry-mirrors adds a domestic Docker registry mirror; I use Aliyun's. To get your own address, log in to Aliyun, go to cr.console.aliyun.com/cn-guangzhou... , and open 「镜像工具」→「镜像加速」(Image Tools → Image Accelerator) in the left menu to see the mirror URL.
Finally, restart Docker and enable it at boot:

# Enable Docker at boot
sudo systemctl enable docker
# Reload the daemon configuration
sudo systemctl daemon-reload
# Restart Docker
sudo systemctl restart docker

Check the Docker info:

root@k8s-master:~# docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
scan: Docker Scan (Docker Inc., v0.17.0)

Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 19.03.15
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc version: v1.1.4-0-g5fd4c4d
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-125-generic
Operating System: Ubuntu 20.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.81GiB
Name: k8s-master
ID: YBJ7:B3NK:EN3P:4TTH:P2ZF:THUX:PW6K:4P3S:TLRR:ZT4Q:QIPQ:UJI3
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://6rf3psgj.mirror.aliyuncs.com/
Live Restore Enabled: false

WARNING: No swap limit support

Install kubeadm, kubelet and kubectl

Installation

First install the dependencies and set up the apt source:

# Install system tools
apt-get update && apt-get install -y apt-transport-https

# Install the GPG key
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

# Add the package source. Note: Ubuntu 20.04's codename is focal, but the Aliyun Kubernetes repo does not publish a suite for it, so the 16.04 (xenial) suite is used
cat << EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

Then install:

apt-get update  
apt-get install -y kubelet kubeadm kubectl

Or pin a specific version:

apt-get install -y kubelet=1.19.8-00 kubeadm=1.19.8-00 kubectl=1.19.8-00

What each component does:

  • kubeadm: the command used to bootstrap the cluster.
  • kubelet: runs on every node in the cluster and starts Pods and containers.
  • kubectl: the command-line tool for talking to the cluster.
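
A quick sanity check that all three components were installed at the pinned version (a minimal check):

kubeadm version -o short
kubelet --version
kubectl version --client --short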

Next, enable kubelet at boot and start it:

systemctl enable kubelet && systemctl start kubelet

If the installation went wrong, you can reset it with:

kubeadm reset

apt autoremove -y kubelet kubectl kubeadm kubernetes-cni

Configure kubeadm

# Export the default configuration
# (run this in the home directory)
kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml

Edit the configuration file kubeadm.yml:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Change to the master node's IP
  advertiseAddress: 172.16.230.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
# Change the image repository to Aliyun's mirror
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
# Change the version; it must match the installed kubeadm version
kubernetesVersion: 1.19.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  # Add this entry; it is used later by the Calico network plugin
  podSubnet: 192.168.0.0/16
scheduler: {}

Run the following command to pull the images in advance:

kubeadm config images pull --config kubeadm.yml

Initialize the master node

kubeadm init --config=kubeadm.yml --upload-certs | tee kubeadm-init.log

It finishes after a few minutes. The end of the output contains something like this:

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.230.10:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:7250987e25e31d2576e94d2d918b72d90377ef3cabe3e00dddab673821832ea0

Worker nodes are added later by running that kubeadm join... command on them.
Next, configure kubectl:

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

# Change the file owner (only needed for non-root users)
chown $(id -u):$(id -g) $HOME/.kube/config

Verify that it worked:

kubectl get nodes

NAME         STATUS     ROLES    AGE   VERSION
k8s-master   NotReady   master   74s   v1.19.8

This means the master node has been configured successfully. If the setup fails and you retry, run kubeadm reset first to clean up the old configuration.

Configure the worker nodes

First, bring up a new machine and repeat everything before the "Configure kubeadm" section above: install docker, kubeadm, kubectl and kubelet on the worker node. The only difference is the hostname, which must be set to a distinct name; after installing, there is no need to run the initialization command.
Then, on the master node, either reuse the join command printed earlier or generate a new one. The steps are:

  1. Create a new token:

     kubeadm token generate

     Sample output: 18ezo4.huyg5hw5tv0g07kg

  2. Use that token to generate the join command:

     kubeadm token create 18ezo4.huyg5hw5tv0g07kg --print-join-command --ttl=0

     ttl is the token's lifetime; 0 means it never expires. Existing tokens can be listed with kubeadm token list. The IP in the result below is the master node's IP.
     Sample result:

     kubeadm join 172.16.230.10:6443 --token 18ezo4.huyg5hw5tv0g07kg     --discovery-token-ca-cert-hash sha256:05bc648be3f406d2bde977cdfe1c0dfe4fb50c3d7b62b8f0950c66c0052e27c0 
  3. Copy that command to each worker node and run it there. On success you will see:

     This node has joined the cluster:
     * Certificate signing request was sent to apiserver and a response was received.
     * The Kubelet was informed of the new secure connection details.

     Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Run on the master node:

kubectl get nodes

The newly added nodes show up:

NAME         STATUS     ROLES    AGE     VERSION
k8s-master   NotReady   master   6m32s   v1.19.8
k8s-node01   NotReady   <none>   3m52s   v1.19.8
k8s-node02   NotReady   <none>   3m28s   v1.19.8

Repeat the same steps to add more nodes.

Install the CNI network plugin: Calico

On the master node, run kubectl get pod -n kube-system -o wide. The output looks like this:

NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE         NOMINATED NODE   READINESS GATES
coredns-6d56c8448f-l259c 0/1 Pending 0 6m17s <none> <none> <none> <none>
coredns-6d56c8448f-zjzjz 0/1 Pending 0 6m17s <none> <none> <none> <none>
etcd-k8s-master 1/1 Running 0 6m35s 172.16.230.10 k8s-master <none> <none>
kube-apiserver-k8s-master 1/1 Running 0 6m35s 172.16.230.10 k8s-master <none> <none>
kube-controller-manager-k8s-master 1/1 Running 0 6m35s 172.16.230.10 k8s-master <none> <none>
kube-proxy-cpl52 1/1 Running 0 3m43s 172.16.230.12 k8s-node02 <none> <none>
kube-proxy-mw62n 1/1 Running 0 4m7s 172.16.230.11 k8s-node01 <none> <none>
kube-proxy-q795w 1/1 Running 0 6m17s 172.16.230.10 k8s-master <none> <none>
kube-scheduler-k8s-master 1/1 Running 0 6m35s 172.16.230.10 k8s-master <none> <none>

coredns is not running yet, which means the network plugin is missing. Here I choose Calico as the CNI plugin. Different Kubernetes versions require different Calico versions; I am installing v3.13.

Run the following command:

kubectl apply -f https://docs.projectcalico.org/archive/v3.13/manifests/calico.yaml

Then watch the pod status:

watch kubectl get pods --all-namespaces

Once every pod is Running, the installation has succeeded.

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system calico-kube-controllers-675b7c9569-97vln 1/1 Running 0 6m40s
kube-system calico-node-49dqn 1/1 Running 0 6m41s
kube-system calico-node-fjtlp 1/1 Running 0 6m41s
kube-system calico-node-t68pw 1/1 Running 0 6m41s
kube-system coredns-6d56c8448f-l259c 1/1 Running 0 15h
kube-system coredns-6d56c8448f-zjzjz 1/1 Running 0 15h
kube-system etcd-k8s-master 1/1 Running 0 15h
kube-system kube-apiserver-k8s-master 1/1 Running 0 15h
kube-system kube-controller-manager-k8s-master 1/1 Running 1 15h
kube-system kube-proxy-cpl52 1/1 Running 0 15h
kube-system kube-proxy-mw62n 1/1 Running 0 15h
kube-system kube-proxy-q795w 1/1 Running 0 15h
kube-system kube-scheduler-k8s-master 1/1 Running 1 15h

List all nodes:

root@k8s-master:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master Ready master 15h v1.19.8 172.16.230.10 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic docker://19.3.15
k8s-node01 Ready <none> 15h v1.19.8 172.16.230.11 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic docker://19.3.15
k8s-node02 Ready <none> 15h v1.19.8 172.16.230.12 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic docker://19.3.15

All nodes are now Ready.

Check the control-plane components

Run kubectl get cs:

root@k8s-master:~# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0 Healthy {"health":"true"}

  • scheduler: the scheduling service; its main job is to schedule Pods onto Nodes
  • controller-manager: the self-healing service; its main job is to bring a Node back to a normal working state after it goes down
  • etcd-0: the familiar service registration and discovery store

Two of the components report unhealthy. This happens because kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests set the port to 0 by default; commenting out the - --port=0 line (prefix it with #) in each file fixes it. Run the check again afterwards and both components report ok.
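
One way to make that edit in a single command (a sketch; if you want a backup, copy the files somewhere outside the manifests directory first, since kubelet treats every file in it as a static pod manifest):

# Comment out the "- --port=0" line in both static pod manifests;
# kubelet notices the change and recreates the pods on its own.
sed -i 's/- --port=0/#- --port=0/' \
  /etc/kubernetes/manifests/kube-controller-manager.yaml \
  /etc/kubernetes/manifests/kube-scheduler.yaml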

View the file: cat /etc/kubernetes/manifests/kube-controller-manager.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=192.168.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    #- --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: registry.aliyuncs.com/google_containers/kube-controller-manager:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      name: flexvolume-dir
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}

View the file: cat /etc/kubernetes/manifests/kube-scheduler.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    #- --port=0
    image: registry.aliyuncs.com/google_containers/kube-scheduler:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}

After saving the changes, the kubelet picks up edits to the static pod manifests automatically; you can also re-apply them explicitly:

kubectl apply -f kube-controller-manager.yaml
kubectl apply -f kube-scheduler.yaml

Check the component status again:

root@k8s-master:/etc/kubernetes/manifests# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}

Everything is healthy now.

Install a dashboard

There are many dashboards for Kubernetes; here I chose Kuboard. It can be installed in several ways; I chose to install it into the Kubernetes cluster itself.

The installation steps are simple:

kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
# You can also use the command below; the only difference is that it distributes the images Kuboard needs from Huawei Cloud's registry instead of Docker Hub
# kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml

Wait for Kuboard to become ready:

watch kubectl get pods -n kuboard

Then open http://your-node-ip-address:30080 and log in with the initial credentials:

  • Username: admin
  • Password: Kuboard123
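
If the page does not come up right away, a quick check (assuming the default kuboard namespace and NodePort) is:

# Kuboard's workloads run in the "kuboard" namespace; the service exposes NodePort 30080.
kubectl get pods,svc -n kuboard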

Install the CSI storage plugin: Rook

CSI is an important storage interface for Kubernetes. Here I chose Rook, version v1.8.6.

Preparation

First make sure each node has a spare disk that is not mounted anywhere:

root@k8s-master:~# lsblk -f
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
fd0
loop0 squashfs 0 100% /snap/snapd/16292
loop1 squashfs 0 100% /snap/core20/1611
loop2 squashfs 0 100% /snap/lxd/22753
loop3 squashfs 0 100% /snap/core20/1623
sda
├─sda1
├─sda2 ext4 36284784-7e9f-4a31-aaf7-cf96db42ef7e 1.5G 6% /boot
├─sda3 LVM2_member vfOvy4-plCm-iMCb-DqmI-ivqU-dkc3-WWXTYH
│ └─ubuntu--vg-ubuntu--lv ext4 659500da-700c-456a-838a-11815444f44c 48.6G 20% /
└─sda4 LVM2_member E3IMGN-TQfe-oNB0-cdRy-T0jc-IpBn-VAtfYE
└─ubuntu--vg-ubuntu--lv ext4 659500da-700c-456a-838a-11815444f44c 48.6G 20% /
sdb ceph_bluestore
sr0 iso9660 VMware Tools 2020-07-17-17-46-47-00
sr1 iso9660 Ubuntu-Server 20.04.5 LTS amd64 2022-08-31-07-37-40-00

As shown above, I am using the sdb disk. This is what it looks like after Rook has already taken it over; before that, its FSTYPE column was empty. If yours is not empty, or there is no spare unmounted disk, you will need to clear one first (a wipe sketch follows the note below).

Note: the disk given to Ceph must not be partitioned (really, do not partition it)! Since these are virtual machines with only one master and two worker nodes, I also made the master schedulable.
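
If the spare disk still carries an old filesystem signature or partition table, one way to clear it is sketched below; double-check the device name first, since this destroys everything on it:

# DANGER: wipes /dev/sdb completely; make sure it is the spare disk.
sudo wipefs --all /dev/sdb
sudo sgdisk --zap-all /dev/sdb   # sgdisk comes from the gdisk package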

First clone the source:

git clone --single-branch --branch v1.8.6 https://github.com/rook/rook.git
cd rook/deploy/examples/

Check the list of required images. Some of them cannot be pulled from inside China; you can use a GitHub Action to pull them and push them to Docker Hub or a domestic Aliyun registry (a retag sketch follows the list):

root@k8s-master:~/rook/deploy/examples# cat images.txt 
k8s.gcr.io/sig-storage/csi-attacher:v3.4.0
k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
k8s.gcr.io/sig-storage/csi-provisioner:v3.1.0
k8s.gcr.io/sig-storage/csi-resizer:v1.4.0
k8s.gcr.io/sig-storage/csi-snapshotter:v5.0.1
quay.io/ceph/ceph:v16.2.7
quay.io/cephcsi/cephcsi:v3.5.1
quay.io/csiaddons/k8s-sidecar:v0.2.1
quay.io/csiaddons/volumereplication-operator:v0.3.0
rook/ceph:v1.8.6
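
If any of the k8s.gcr.io images cannot be pulled directly, one workaround is to mirror them to a repository you can reach and retag them locally. A sketch for a single image (the registry path below is a placeholder for your own repository; repeat for each image in images.txt on every node):

# Pull from your own mirror, then retag back to the name Rook expects.
docker pull registry.cn-hangzhou.aliyuncs.com/<your-namespace>/csi-attacher:v3.4.0
docker tag registry.cn-hangzhou.aliyuncs.com/<your-namespace>/csi-attacher:v3.4.0 \
  k8s.gcr.io/sig-storage/csi-attacher:v3.4.0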

Deploy the operator

You also need to edit the operator file: newer Rook releases disable the discovery DaemonSet by default, so find ROOK_ENABLE_DISCOVERY_DAEMON and change it to true, then apply the manifests:
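
A one-liner for that edit (a sketch, assuming the key appears in operator.yaml with the quoted default "false"):

# Turn on the discovery DaemonSet before applying operator.yaml.
sed -i 's/ROOK_ENABLE_DISCOVERY_DAEMON: "false"/ROOK_ENABLE_DISCOVERY_DAEMON: "true"/' operator.yaml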

kubectl create -f crds.yaml -f common.yaml -f operator.yaml

Wait for the operator to come up:

kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-675f59664d-b9nch 1/1 Running 0 32m
rook-discover-4m68r 1/1 Running 0 40m
rook-discover-chscc 1/1 Running 0 40m
rook-discover-mmk69 1/1 Running 0 40m

Deploy the cluster

cluster.yaml has quite a few pitfalls. The first thing to change is the dashboard section: turn off SSL and change the port, otherwise the dashboard will not be reachable after deployment:

dashboard:
  enabled: true
  # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
  # urlPrefix: /ceph-dashboard
  # serve the dashboard at the given port.
  # port: 8443
  # serve the dashboard using SSL
  ssl: false
  port: 8555

Edit the nodes section to match your machines:

nodes:
- name: "k8s-node01"
  devices:
  - name: "sdb"
- name: "k8s-node02"
  devices:
  - name: "sdb"

Because the virtual machines have limited resources, set the mon count to one:

mon:
  count: 1

Start the deployment:

kubectl create -f cluster.yaml

Watch the pods come up. This takes a long time, probably about half an hour, because the images are hosted overseas:

root@k8s-master:~/rook/deploy/examples# kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-cgxsm 3/3 Running 0 170m
csi-cephfsplugin-provisioner-6d4bd9b669-8nl4z 6/6 Running 0 170m
csi-cephfsplugin-provisioner-6d4bd9b669-gljfg 6/6 Running 4 170m
csi-cephfsplugin-qb44p 3/3 Running 0 170m
csi-cephfsplugin-trkd5 3/3 Running 0 170m
csi-rbdplugin-2spzv 3/3 Running 0 170m
csi-rbdplugin-4c8sc 3/3 Running 0 170m
csi-rbdplugin-provisioner-6bcd78bb89-jf59t 6/6 Running 4 170m
csi-rbdplugin-provisioner-6bcd78bb89-m2m9q 6/6 Running 0 170m
csi-rbdplugin-smpf2 3/3 Running 0 170m
rook-ceph-crashcollector-k8s-master-6b6698cb7f-tr7lx 1/1 Running 0 122m
rook-ceph-crashcollector-k8s-node01-6666667d8d-6jg47 1/1 Running 0 120m
rook-ceph-crashcollector-k8s-node02-b485dd74c-6wvww 1/1 Running 0 120m
rook-ceph-mgr-a-57dd6d89-rtwlv 1/1 Running 0 27m
rook-ceph-mon-a-7b485cc8b5-tllmm 1/1 Running 0 140m
rook-ceph-operator-7b4f6fd594-nfqwq 1/1 Running 0 3h29m
rook-ceph-osd-0-7959cf9f58-ccxzq 1/1 Running 0 120m
rook-ceph-osd-1-5d674c9c7f-6h2sx 1/1 Running 0 120m
rook-ceph-osd-2-854d4c5b6b-s8jfx 1/1 Running 0 120m
rook-ceph-osd-prepare-k8s-master-tzk2v 0/1 Completed 0 19m
rook-ceph-osd-prepare-k8s-node01-ptzcc 0/1 Completed 0 19m
rook-ceph-osd-prepare-k8s-node02-llvws 0/1 Completed 0 19m
rook-ceph-tools-55ddbc9f78-tt4xh 1/1 Running 0 98m
rook-discover-ndw8c 1/1 Running 0 3h28m
rook-discover-p6bq2 1/1 Running 0 3h28m
rook-discover-zgqrq 1/1 Running 0 3h28m
root@k8s-master:~/rook/deploy/examples#

Once everything is Running, check the disks again:

root@k8s-master:~/rook/deploy/examples# lsblk -f
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
fd0
loop0 squashfs 0 100% /snap/snapd/16292
loop1 squashfs 0 100% /snap/core20/1611
loop2 squashfs 0 100% /snap/lxd/22753
loop3 squashfs 0 100% /snap/core20/1623
sda
├─sda1
├─sda2 ext4 36284784-7e9f-4a31-aaf7-cf96db42ef7e 1.5G 6% /boot
├─sda3 LVM2_member vfOvy4-plCm-iMCb-DqmI-ivqU-dkc3-WWXTYH
│ └─ubuntu--vg-ubuntu--lv ext4 659500da-700c-456a-838a-11815444f44c 49G 19% /
└─sda4 LVM2_member E3IMGN-TQfe-oNB0-cdRy-T0jc-IpBn-VAtfYE
└─ubuntu--vg-ubuntu--lv ext4 659500da-700c-456a-838a-11815444f44c 49G 19% /
sdb ceph_bluestore
sr0 iso9660 VMware Tools 2020-07-17-17-46-47-00
sr1 iso9660 Ubuntu-Server 20.04.5 LTS amd64 2022-08-31-07-37-40-00

The disk has now been taken over by Ceph.

Verify with the rook-ceph toolbox client

kubectl create -f toolbox.yaml

# Once the toolbox container is up, run this to get a shell inside the pod.
kubectl exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -n rook-ceph -- bash

Inside the container:

# Check the ceph status
ceph status
cluster:
id: 267cab45-5967-4e5a-ac9e-04bf66ba41d8
health: HEALTH_WARN
mon is allowing insecure global_id reclaim

services:
mon: 1 daemons, quorum a (age 29m)
mgr: a(active, since 28m)
osd: 3 osds: 3 up (since 28m), 3 in (since 28m)

data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
# Check the OSD status
ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 k8s-node02 1027M 48.9G 0 0 0 0 exists,up
1 k8s-node01 1027M 48.9G 0 0 0 0 exists,up
2 k8s-node03 1027M 48.9G 0 0 0 0 exists,up
# View the ceph configuration file
cat /etc/ceph/ceph.conf
[global]
mon_host = 10.254.74.252:6789

[client.admin]
keyring = /etc/ceph/keyring

Configure the Ceph dashboard

Here you need to modify dashboard-external-https.yaml as follows:

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph # namespace:cluster
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph # namespace:cluster
spec:
  ports:
  - name: dashboard
    port: 8555
    protocol: TCP
    targetPort: 8555
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort

Then apply it:

kubectl apply -f dashboard-external-https.yaml

Check the port assigned to the service:

root@k8s-master:~/rook/deploy/examples# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-cephfsplugin-metrics ClusterIP 10.101.41.23 <none> 8080/TCP,8081/TCP 174m
csi-rbdplugin-metrics ClusterIP 10.110.186.150 <none> 8080/TCP,8081/TCP 174m
rook-ceph-mgr ClusterIP 10.110.218.187 <none> 9283/TCP 125m
rook-ceph-mgr-dashboard ClusterIP 10.110.241.111 <none> 8555/TCP 125m
rook-ceph-mgr-dashboard-external-https NodePort 10.97.161.152 <none> 8555:32474/TCP 99m
rook-ceph-mon-a ClusterIP 10.102.70.11 <none> 6789/TCP,3300/TCP 144m

The assigned NodePort is 32474, so the dashboard is reachable at nodeIp:32474. The username is admin, and the password can be read with:

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}"|base64 --decode && echo

Deploy block storage with Rook

The deployment YAML files are under deploy/examples/csi/rbd:

$ kubectl apply -f  storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created

Check the created StorageClass:

# kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 10d

The created storage pool is also visible on the dashboard:

(Screenshot: the storage pool)

The storage is now ready to use: just set storageClassName: rook-ceph-block when declaring a PVC.
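
For example, a minimal PVC using this StorageClass might look like the sketch below (the name and size are placeholders):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-demo-pvc   # placeholder name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF

kubectl get pvc rbd-demo-pvc should then report the claim as Bound once Ceph has provisioned the underlying RBD image.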

