Preparation
Machine setup
Three virtual machines were created locally:
k8s-master 172.16.230.10
k8s-node01 172.16.230.11
k8s-node02 172.16.230.12
Changing the apt sources
Switch the apt sources on all three machines to a domestic (China) mirror:
# Back up the existing sources
sudo mv /etc/apt/sources.list /etc/apt/sources.list.bak

# One-shot switch to the 163 mirror (the faster one in my testing)
sudo bash -c "cat << EOF > /etc/apt/sources.list && apt update
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
EOF"

# One-shot switch to the Aliyun mirror
sudo bash -c "cat << EOF > /etc/apt/sources.list && apt update
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
EOF"
Changing the hostname
Run the matching command on each machine:

hostnamectl --static set-hostname k8s-master
hostnamectl --static set-hostname k8s-node01
hostnamectl --static set-hostname k8s-node02
hostname $hostname
Disabling the swap partition
# Comment out the swap line in /etc/fstab
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Turn swap off for the current session
swapoff -a
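To confirm that swap is really off (a quick check, not part of the original steps):

swapon --show    # prints nothing once swap is disabled
free -h          # the Swap line should report 0B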
Disabling the firewall

systemctl stop ufw
systemctl disable ufw

root@k8s-node02:~# ufw status
Status: inactive
Installing Docker
One-shot install
The script install_docker.sh:
#!/bin/bash
apt update
# Tools needed to add the repository
apt install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common
# Add the Aliyun Docker GPG key and repository
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository \
  "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
  $(lsb_release -cs) \
  stable"
apt update
apt install -y docker-ce docker-ce-cli containerd.io
docker --version
Or run the steps one by one:
# 1. Remove any old Docker packages
sudo apt-get remove docker docker-engine docker-ce docker.io
# 2. Update package lists
sudo apt-get update
# 3. Install the required tools
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
# 4. Add the GPG key (official, or the Aliyun mirror)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# 5. Set up the stable repository (official, or the Aliyun mirror)
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
add-apt-repository \
  "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
  $(lsb_release -cs) \
  stable"
# 6. Update the package index
sudo apt-get update
# 7. List the available docker-ce versions
apt-cache madison docker-ce
Check the available Docker versions; the output looks roughly like this:
root@k8s-master:~# apt-cache madison docker-ce
docker-ce | 5:20.10.17~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.16~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.15~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.14~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.13~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.12~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.10~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.9~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.8~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.7~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.6~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.5~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.4~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.3~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.2~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.1~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:20.10.0~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.15~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.14~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.13~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.12~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.11~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.10~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:19.03.9~3-0~ubuntu-focal | https://mirrors.aliyun.com/docker-ce/linux/ubuntu focal/stable amd64 Packages
The second column is the version string; here we choose 5:19.03.15~3-0~ubuntu-focal. Append it after docker-ce= and run:
sudo apt-get install docker-ce=5:19.03.15~3-0~ubuntu-focal
After the installation, check Docker's status.
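The check itself isn't shown in the original; a typical way to verify, for example:

systemctl status docker
docker --version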
Next, configure Docker:
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "registry-mirrors": ["https://6rf3psgj.mirror.aliyuncs.com"],
  "storage-driver": "overlay2"
}
EOF
Here, registry-mirrors adds a domestic Docker registry mirror; this example uses Alibaba Cloud's. To get your own address, log in to Alibaba Cloud, go to cr.console.aliyun.com/cn-guangzhou..., and click "Image Tools" → "Registry Mirror" (镜像工具 → 镜像加速) in the left-hand menu to see the mirror URL. Finally, enable Docker at boot and restart it:
sudo systemctl enable docker
sudo systemctl daemon-reload
sudo systemctl restart docker
Check the Docker info:
root@k8s-master:~# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.15
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-125-generic
 Operating System: Ubuntu 20.04.5 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.81GiB
 Name: k8s-master
 ID: YBJ7:B3NK:EN3P:4TTH:P2ZF:THUX:PW6K:4P3S:TLRR:ZT4Q:QIPQ:UJI3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://6rf3psgj.mirror.aliyuncs.com/
 Live Restore Enabled: false

WARNING: No swap limit support
Installing kubeadm, kubelet and kubectl
Installation
First install the dependencies and set up the apt repository:
apt-get update && apt-get install -y apt-transport-https curl
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat << EOF > /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
Now install:

apt-get update
apt-get install -y kubelet kubeadm kubectl
To pin a specific version:

apt-get install -y kubelet=1.19.8-00 kubeadm=1.19.8-00 kubectl=1.19.8-00
What each component does:
kubeadm: the command that bootstraps the cluster.
kubelet: runs on every node in the cluster and starts Pods and containers.
kubectl: the command-line tool for talking to the cluster.
Next, enable kubelet at boot and start it:

systemctl enable kubelet && systemctl start kubelet
If the installation goes wrong, you can reset with:

kubeadm reset
apt autoremove -y kubelet kubectl kubeadm kubernetes-cni
Configuring kubeadm

kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml
Edit the configuration file kubeadm.yml:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.230.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.19.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 192.168.0.0/16
scheduler: {}
Run the following command to pull the images:

kubeadm config images pull --config kubeadm.yml
Initializing the master node
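The init command itself isn't reproduced in the original; with the configuration file generated above, a typical invocation would be:

kubeadm init --config=kubeadm.yml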
It finishes after a few minutes. The end of the output contains something like:
Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.230.10:6443
Worker nodes can later be added by running that kubeadm join... command. Next, configure kubectl:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Verify that it works:

kubectl get nodes
NAME         STATUS     ROLES    AGE   VERSION
k8s-master   NotReady   master   74s   v1.19.8
This shows the master node has been set up successfully. If initialization fails and you want to retry, run kubeadm reset first to clean up the configuration.
Configuring the worker nodes
First, bring up a new machine and repeat everything above up to (but not including) "Configuring kubeadm": install docker, kubeadm, kubectl and kubelet on the worker node, set its hostname to a different name, and skip the start command after installation. Then, on the master node, either reuse the join command printed to the console earlier or generate a new one. The steps are as follows:
Create a new token. Example output: 18ezo4.huyg5hw5tv0g07kg
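The command that produced this token isn't shown in the original; presumably a bare kubeadm token create was run, which prints a freshly generated token:

kubeadm token create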
Use the token generated above to produce the join command:
kubeadm token create 18ezo4.huyg5hw5tv0g07kg --print-join-command --ttl=0
ttl is the token's lifetime; 0 means it never expires. You can list the existing tokens with kubeadm token list. In the output below, the IP is the master node's IP. Example result:
kubeadm join 172.16.230.10:6443 --token 18ezo4.huyg5hw5tv0g07kg --discovery-token-ca-cert-hash sha256:05bc648be3f406d2bde977cdfe1c0dfe4fb50c3d7b62b8f0950c66c0052e27c0
Copy this command to each worker node and run it. On success you will see:
This node has joined the cluster:

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Back on the master node, run kubectl get nodes again to see the newly added nodes:
NAME         STATUS     ROLES    AGE     VERSION
k8s-master   NotReady   master   6m32s   v1.19.8
k8s-node01   NotReady   <none>   3m52s   v1.19.8
k8s-node02   NotReady   <none>   3m28s   v1.19.8
Follow the same steps to add more nodes.
Installing the CNI network plugin (Calico)
On the master node, run kubectl get pod -n kube-system -o wide; the output looks like this:
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE         NOMINATED NODE   READINESS GATES
coredns-6d56c8448f-l259c             0/1     Pending   0          6m17s   <none>          <none>       <none>           <none>
coredns-6d56c8448f-zjzjz             0/1     Pending   0          6m17s   <none>          <none>       <none>           <none>
etcd-k8s-master                      1/1     Running   0          6m35s   172.16.230.10   k8s-master   <none>           <none>
kube-apiserver-k8s-master            1/1     Running   0          6m35s   172.16.230.10   k8s-master   <none>           <none>
kube-controller-manager-k8s-master   1/1     Running   0          6m35s   172.16.230.10   k8s-master   <none>           <none>
kube-proxy-cpl52                     1/1     Running   0          3m43s   172.16.230.12   k8s-node02   <none>           <none>
kube-proxy-mw62n                     1/1     Running   0          4m7s    172.16.230.11   k8s-node01   <none>           <none>
kube-proxy-q795w                     1/1     Running   0          6m17s   172.16.230.10   k8s-master   <none>           <none>
kube-scheduler-k8s-master            1/1     Running   0          6m35s   172.16.230.10   k8s-master   <none>           <none>
coredns is not running yet, which means the network plugin is missing. Calico is used as the CNI plugin here. Different Kubernetes versions require different Calico versions; I'm installing v3.13. Run:
kubectl apply -f https://docs.projectcalico.org/archive/v3.13/manifests/calico.yaml
Then check the pod status again (e.g. with kubectl get pods -A); the installation has succeeded once every pod is Running:
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-675b7c9569-97vln   1/1     Running   0          6m40s
kube-system   calico-node-49dqn                          1/1     Running   0          6m41s
kube-system   calico-node-fjtlp                          1/1     Running   0          6m41s
kube-system   calico-node-t68pw                          1/1     Running   0          6m41s
kube-system   coredns-6d56c8448f-l259c                   1/1     Running   0          15h
kube-system   coredns-6d56c8448f-zjzjz                   1/1     Running   0          15h
kube-system   etcd-k8s-master                            1/1     Running   0          15h
kube-system   kube-apiserver-k8s-master                  1/1     Running   0          15h
kube-system   kube-controller-manager-k8s-master         1/1     Running   1          15h
kube-system   kube-proxy-cpl52                           1/1     Running   0          15h
kube-system   kube-proxy-mw62n                           1/1     Running   0          15h
kube-system   kube-proxy-q795w                           1/1     Running   0          15h
kube-system   kube-scheduler-k8s-master                  1/1     Running   1          15h
Check all nodes:
root@k8s-master:~# kubectl get nodes -o wide
NAME         STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8s-master   Ready    master   15h   v1.19.8   172.16.230.10   <none>        Ubuntu 20.04.5 LTS   5.4.0-125-generic   docker://19.3.15
k8s-node01   Ready    <none>   15h   v1.19.8   172.16.230.11   <none>        Ubuntu 20.04.5 LTS   5.4.0-125-generic   docker://19.3.15
k8s-node02   Ready    <none>   15h   v1.19.8   172.16.230.12   <none>        Ubuntu 20.04.5 LTS   5.4.0-125-generic   docker://19.3.15
The nodes are now Ready.
Checking the components
Run kubectl get cs:
root@k8s-master:~# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
scheduler is the scheduling service; its main job is to schedule Pods onto Nodes.
controller-manager is the automated-repair service; its main job is to bring a Node back to a normal working state after it goes down.
etcd-0 is the familiar service registration and discovery store.
In the output, two components are unhealthy. This happens because kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests set the default port to 0; commenting out the - --port=0 line in each file (prefix it with #) fixes it. Run the check again after the change and both components report ok.
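For example, a sed sketch for commenting the flag out on the master (it assumes the line appears exactly as - --port=0 in both manifests):

sed -i 's/^\([[:space:]]*\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -i 's/^\([[:space:]]*\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-scheduler.yaml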
View the file: cat /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=192.168.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: registry.aliyuncs.com/google_containers/kube-controller-manager:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      name: flexvolume-dir
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}
View the file: cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    image: registry.aliyuncs.com/google_containers/kube-scheduler:v1.19.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}
After editing, apply the changes:

kubectl apply -f kube-controller-manager.yaml
kubectl apply -f kube-scheduler.yaml
Check the component status again:
root@k8s-master:/etc/kubernetes/manifests# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
Everything is healthy now.
Installing a dashboard
There are many dashboards for Kubernetes; here I chose Kuboard. It can be installed in several ways; I went with the in-cluster Kubernetes install. The steps are simple:
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
# You can also use the command below; the only difference is that it pulls the images Kuboard needs from Huawei Cloud's registry instead of Docker Hub
# kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml
Wait for Kuboard to become ready:

watch kubectl get pods -n kuboard
Then open http://your-node-ip-address:30080 and log in with the initial credentials:
Username: admin
Password: Kuboard123
Installing the CSI storage plugin (Rook)
An important CSI plugin for Kubernetes. I chose Rook v1.8.6 here.
Preparation
First make sure the system has a spare disk that is not mounted!
root@k8s-master:~# lsblk -f
NAME                      FSTYPE          LABEL                             UUID                                     FSAVAIL FSUSE% MOUNTPOINT
fd0
loop0                     squashfs                                                                                         0   100% /snap/snapd/16292
loop1                     squashfs                                                                                         0   100% /snap/core20/1611
loop2                     squashfs                                                                                         0   100% /snap/lxd/22753
loop3                     squashfs                                                                                         0   100% /snap/core20/1623
sda
├─sda1
├─sda2                    ext4                                              36284784-7e9f-4a31-aaf7-cf96db42ef7e        1.5G     6% /boot
├─sda3                    LVM2_member                                       vfOvy4-plCm-iMCb-DqmI-ivqU-dkc3-WWXTYH
│ └─ubuntu--vg-ubuntu--lv ext4                                              659500da-700c-456a-838a-11815444f44c       48.6G    20% /
└─sda4                    LVM2_member                                       E3IMGN-TQfe-oNB0-cdRy-T0jc-IpBn-VAtfYE
  └─ubuntu--vg-ubuntu--lv ext4                                              659500da-700c-456a-838a-11815444f44c       48.6G    20% /
sdb                       ceph_bluestore
sr0                       iso9660         VMware Tools                      2020-07-17-17-46-47-00
sr1                       iso9660         Ubuntu-Server 20.04.5 LTS amd64   2022-08-31-07-37-40-00
As shown above, I'm using the sdb disk. This is what it looks like after Rook has already taken it over; before that its FSTYPE was empty. If it isn't empty, or you don't have a spare unmounted disk, you'll have to sort that out yourself.
Note: do not partition the disk used for storage (worth repeating: no partitions!). Because these are VMs with only one master and two nodes, I also made the master schedulable.
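The original doesn't show the command used for that; on a v1.19 cluster it is typically done by removing the master taint, for example:

kubectl taint nodes k8s-master node-role.kubernetes.io/master-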
First clone the source:

git clone --single-branch --branch v1.8.6 https://github.com/rook/rook.git
cd rook/deploy/examples/
Check the list of required images. Some of them are unreachable from China; you can use GitHub Actions to pull them and push them to Docker Hub or an Aliyun registry inside China:
root@k8s-master:~/rook/deploy/examples# cat images.txt
 k8s.gcr.io/sig-storage/csi-attacher:v3.4.0
 k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
 k8s.gcr.io/sig-storage/csi-provisioner:v3.1.0
 k8s.gcr.io/sig-storage/csi-resizer:v1.4.0
 k8s.gcr.io/sig-storage/csi-snapshotter:v5.0.1
 quay.io/ceph/ceph:v16.2.7
 quay.io/cephcsi/cephcsi:v3.5.1
 quay.io/csiaddons/k8s-sidecar:v0.2.1
 quay.io/csiaddons/volumereplication-operator:v0.3.0
 rook/ceph:v1.8.6
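The GitHub Actions workflow itself isn't included here; once the images have been pushed somewhere reachable, a rough retag loop looks like this (MIRROR is a hypothetical namespace, adjust it to wherever you pushed the images):

MIRROR=registry.cn-hangzhou.aliyuncs.com/your-namespace   # hypothetical mirror namespace
while read -r img; do
  [ -z "$img" ] && continue            # skip blank lines
  name=${img##*/}                      # e.g. csi-attacher:v3.4.0
  docker pull "$MIRROR/$name"          # pull from the mirror
  docker tag  "$MIRROR/$name" "$img"   # retag to the name rook expects
done < images.txt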
Deploying the operator
The operator file also needs a change: newer Rook releases disable the discovery-daemon deployment by default. Find ROOK_ENABLE_DISCOVERY_DAEMON and change it to true, then apply directly:
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
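If you'd rather script the ROOK_ENABLE_DISCOVERY_DAEMON edit (done before applying the manifests) than change it by hand, a minimal sketch, assuming the setting ships as "false" in operator.yaml:

sed -i 's/ROOK_ENABLE_DISCOVERY_DAEMON: "false"/ROOK_ENABLE_DISCOVERY_DAEMON: "true"/' operator.yaml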
Wait for the operator to finish:

kubectl -n rook-ceph get pods
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-675f59664d-b9nch   1/1     Running   0          32m
rook-discover-4m68r                   1/1     Running   0          40m
rook-discover-chscc                   1/1     Running   0          40m
rook-discover-mmk69                   1/1     Running   0          40m
Deploying the cluster
cluster.yaml has quite a few pitfalls. The first thing to change is the dashboard section: turn off SSL and change the port, otherwise the dashboard will be unreachable after deployment:
dashboard:
  enabled: true
  ssl: false
  port: 8555
Change the nodes settings to match your own machines:

nodes:
- name: "k8s-node01"
  devices:
  - name: "sdb"
- name: "k8s-node02"
  devices:
  - name: "sdb"
Because the VM resources are limited, the mon count in cluster.yaml is also set to one.
Start the deployment:

kubectl create -f cluster.yaml
Check the pods. This takes a long time, roughly half an hour, because the images are hosted overseas:
root@k8s-master:~/rook/deploy/examples# kubectl -n rook-ceph get pods
NAME                                                    READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-cgxsm                                  3/3     Running     0          170m
csi-cephfsplugin-provisioner-6d4bd9b669-8nl4z           6/6     Running     0          170m
csi-cephfsplugin-provisioner-6d4bd9b669-gljfg           6/6     Running     4          170m
csi-cephfsplugin-qb44p                                  3/3     Running     0          170m
csi-cephfsplugin-trkd5                                  3/3     Running     0          170m
csi-rbdplugin-2spzv                                     3/3     Running     0          170m
csi-rbdplugin-4c8sc                                     3/3     Running     0          170m
csi-rbdplugin-provisioner-6bcd78bb89-jf59t              6/6     Running     4          170m
csi-rbdplugin-provisioner-6bcd78bb89-m2m9q              6/6     Running     0          170m
csi-rbdplugin-smpf2                                     3/3     Running     0          170m
rook-ceph-crashcollector-k8s-master-6b6698cb7f-tr7lx    1/1     Running     0          122m
rook-ceph-crashcollector-k8s-node01-6666667d8d-6jg47    1/1     Running     0          120m
rook-ceph-crashcollector-k8s-node02-b485dd74c-6wvww     1/1     Running     0          120m
rook-ceph-mgr-a-57dd6d89-rtwlv                          1/1     Running     0          27m
rook-ceph-mon-a-7b485cc8b5-tllmm                        1/1     Running     0          140m
rook-ceph-operator-7b4f6fd594-nfqwq                     1/1     Running     0          3h29m
rook-ceph-osd-0-7959cf9f58-ccxzq                        1/1     Running     0          120m
rook-ceph-osd-1-5d674c9c7f-6h2sx                        1/1     Running     0          120m
rook-ceph-osd-2-854d4c5b6b-s8jfx                        1/1     Running     0          120m
rook-ceph-osd-prepare-k8s-master-tzk2v                  0/1     Completed   0          19m
rook-ceph-osd-prepare-k8s-node01-ptzcc                  0/1     Completed   0          19m
rook-ceph-osd-prepare-k8s-node02-llvws                  0/1     Completed   0          19m
rook-ceph-tools-55ddbc9f78-tt4xh                        1/1     Running     0          98m
rook-discover-ndw8c                                     1/1     Running     0          3h28m
rook-discover-p6bq2                                     1/1     Running     0          3h28m
rook-discover-zgqrq                                     1/1     Running     0          3h28m
Once everything is Running, check the disks again:
root@k8s-master:~/rook/deploy/examples# lsblk -f
NAME                      FSTYPE          LABEL                             UUID                                     FSAVAIL FSUSE% MOUNTPOINT
fd0
loop0                     squashfs                                                                                         0   100% /snap/snapd/16292
loop1                     squashfs                                                                                         0   100% /snap/core20/1611
loop2                     squashfs                                                                                         0   100% /snap/lxd/22753
loop3                     squashfs                                                                                         0   100% /snap/core20/1623
sda
├─sda1
├─sda2                    ext4                                              36284784-7e9f-4a31-aaf7-cf96db42ef7e        1.5G     6% /boot
├─sda3                    LVM2_member                                       vfOvy4-plCm-iMCb-DqmI-ivqU-dkc3-WWXTYH
│ └─ubuntu--vg-ubuntu--lv ext4                                              659500da-700c-456a-838a-11815444f44c         49G    19% /
└─sda4                    LVM2_member                                       E3IMGN-TQfe-oNB0-cdRy-T0jc-IpBn-VAtfYE
  └─ubuntu--vg-ubuntu--lv ext4                                              659500da-700c-456a-838a-11815444f44c         49G    19% /
sdb                       ceph_bluestore
sr0                       iso9660         VMware Tools                      2020-07-17-17-46-47-00
sr1                       iso9660         Ubuntu-Server 20.04.5 LTS amd64   2022-08-31-07-37-40-00
The disk has now been taken over by Rook.
Verifying with the rook-ceph toolbox client

kubectl create -f toolbox.yaml
kubectl exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -n rook-ceph -- bash
Once inside the container:
ceph status
  cluster:
    id:     267cab45-5967-4e5a-ac9e-04bf66ba41d8
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim

  services:
    mon: 1 daemons, quorum a (age 29m)
    mgr: a(active, since 28m)
    osd: 3 osds: 3 up (since 28m), 3 in (since 28m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 147 GiB / 150 GiB avail
    pgs:     1 active+clean

ceph osd status
ID  HOST         USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  k8s-node02  1027M  48.9G      0        0       0        0   exists,up
 1  k8s-node01  1027M  48.9G      0        0       0        0   exists,up
 2  k8s-node03  1027M  48.9G      0        0       0        0   exists,up

cat /etc/ceph/ceph.conf
[global]
mon_host = 10.254.74.252:6789

[client.admin]
keyring = /etc/ceph/keyring
Configuring the Ceph dashboard
This requires changing dashboard-external-https.yaml as follows:
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  ports:
  - name: dashboard
    port: 8555
    protocol: TCP
    targetPort: 8555
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
Then apply it:

kubectl apply -f dashboard-external-https.yaml
Check the port assigned to the service:
root@k8s-master:~/rook/deploy/examples# kubectl get svc -n rook-ceph
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
csi-cephfsplugin-metrics                  ClusterIP   10.101.41.23     <none>        8080/TCP,8081/TCP   174m
csi-rbdplugin-metrics                     ClusterIP   10.110.186.150   <none>        8080/TCP,8081/TCP   174m
rook-ceph-mgr                             ClusterIP   10.110.218.187   <none>        9283/TCP            125m
rook-ceph-mgr-dashboard                   ClusterIP   10.110.241.111   <none>        8555/TCP            125m
rook-ceph-mgr-dashboard-external-https    NodePort    10.97.161.152    <none>        8555:32474/TCP      99m
rook-ceph-mon-a                           ClusterIP   10.102.70.11     <none>        6789/TCP,3300/TCP   144m
The assigned NodePort is 32474. Open nodeIp:32474 to reach the dashboard. The username is admin, and the password can be read with the following command:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
Deploying block storage with Rook
The YAML manifests are under deploy/examples/csi/rbd:
$ kubectl apply -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
Check the created StorageClass:
# kubectl get storageclass
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   10d
The newly created storage pool can also be seen in the dashboard.
The storage is now ready to use: just add storageClassName: rook-ceph-block when declaring a PVC.
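For example, a minimal PVC sketch against this storage class (the claim name and size are arbitrary, not from the original):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-rbd-pvc            # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF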