This is the last part of the basic setup. In this article I will walk through the basic components I install right after a Kubernetes cluster is up, and beyond the installation steps themselves I will also share the problems I ran into along the way and how I solved them.
There are alternatives to each of these tools, but from a beginner's point of view they are still a useful reference. The sections of this article are as follows:
First, a CNI is a mandatory component, and there are many different CNIs to choose from; this article uses Calico. There are two installation methods, Operator and Manifest, and here I use the manifest approach.
Before installing, pay attention to the firewall requirements on every node (see the figure below).
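As a reference for that firewall step, here is a minimal sketch of opening the Calico-related ports with firewalld (assuming the nodes run firewalld; the port list follows Calico's published network requirements and should be adjusted to your environment):

[all nodes]# firewall-cmd --permanent --add-port=179/tcp    # BGP between Calico nodes
[all nodes]# firewall-cmd --permanent --add-port=4789/udp   # VXLAN (only if VXLAN mode is used)
[all nodes]# firewall-cmd --permanent --add-port=5473/tcp   # Typha (only if Typha is deployed)
[all nodes]# firewall-cmd --reload
※ If IP-in-IP encapsulation is enabled (the default in calico.yaml), IP protocol 4 traffic between the nodes must also be allowed.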
#------------------------------------------------------
# S1-1. Install with the manifest (master01) (LAB)
#------------------------------------------------------
[master]# wget https://docs.projectcalico.org/manifests/calico.yaml
[master]# vim calico.yaml   (check the pod network CIDR)
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
  value: "192.168.0.0/16"
[master]# kubectl create -f calico.yaml
[master]# watch kubectl get pods -n kube-system
※ Question: calico-node not ready
# kubectl describe pod calico-node-ngznh -n kube-system
kubelet (combined from similar events): Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
Locate the problem:
[master]# kubectl get ds -A
[master]# kubectl get ds -n kube-system calico-node -oyaml | grep -i ip_auto -C 3
[master]# kubectl exec -ti <pod_name> -n kube-system -- bash
/# cat /etc/calico/confd/config/bird.cfg
=> check which address is used as the router id
=> map it to a physical interface (Calico's BGP uses a physical NIC as the virtual router)
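A quick way to map the router id back to a host NIC is to list the node's addresses (a sketch using iproute2; the interface name and address below are illustrative):

[worker]# ip -br addr show
ens192    UP    10.107.88.13/24
=> the interface whose address matches the router id is the one Calico has picked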
※ Solution:
[method1] Check the firewall ports
[method2] Force a specific interface
[master]# kubectl delete -f calico.yaml
[master]# vim calico.yaml
>> search
- name: IP
  value: "autodetect"
>> add
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens192"   (separate multiple interface patterns with ",")
[master]# kubectl create -f calico.yaml
[master]# kubectl get pods -n kube-system
[Why] The default "first-found" method looks at the host's first NIC, but by default it expects "eth*", so it fails when the VM's NICs are named ens*.
(REF: https://www.unixcloudfusion.in/2022/02/solved-caliconode-is-not-ready-bird-is.html)
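As an alternative to deleting and re-applying the whole manifest, the same setting can usually be pushed straight onto the running DaemonSet (a sketch; the DaemonSet name and namespace match the manifest install above):

[master]# kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=ens192
[master]# kubectl rollout status daemonset/calico-node -n kube-system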
#------------------------------------------------------
# S1-2. Install calicoctl (binary) (master01)
#------------------------------------------------------
[master]# wget https://github.com/projectcalico/calico/releases/download/v3.26.3/calicoctl-linux-amd64 -O calicoctl
[master]# chmod +x calicoctl ; cp calicoctl /usr/local/bin/
[master]# calicoctl version
Client Version: v3.26.3
Git commit: bdb7878af
Cluster Version: v3.26.1
Cluster Type: k8s,bgp,kubeadm,kdd
#------------------------------------------------------
# S1-3. Verify with calicoctl (master01)
#------------------------------------------------------
[master]# mkdir -p /data/calico
[master]# cd /data/calico ; vim calicoctl.cfg
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: 'kubernetes'
  kubeconfig: '/etc/kubernetes/admin.conf'
[master]# calicoctl node status
[master]# calicoctl get ipPool --allow-version-mismatch
NAME                  CIDR             SELECTOR
default-ipv4-ippool   192.168.0.0/16   all()
※ node-to-node mesh: since this cluster is small, we simply use BGP full mesh between all nodes; since v3.4.0 this mode can already handle more than 100 nodes.
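For reference, if the cluster ever grows beyond what a full mesh handles comfortably, the mesh can be switched off through a BGPConfiguration resource and route reflectors used instead. A sketch (the asNumber is illustrative), applied with calicoctl:

apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  asNumber: 64512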
#------------------------------------------------------
# S1-4. Test
#------------------------------------------------------
[master]# vim nginx-quic.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-quic
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-quic-deployment
  namespace: nginx-quic
spec:
  selector:
    matchLabels:
      app: nginx-quic
  replicas: 4
  template:
    metadata:
      labels:
        app: nginx-quic
    spec:
      containers:
      - name: nginx-quic
        image: tinychen777/nginx-quic:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-quic-service
  namespace: nginx-quic
spec:
  selector:
    app: nginx-quic
  ports:
  - protocol: TCP
    port: 8080        # Service access port
    targetPort: 80    # Pod (container) port
    nodePort: 30088   # external access port
  type: NodePort
[master]# kubectl create -f nginx-quic.yaml
[master]# kubectl get deployment -o wide -n nginx-quic
[master]# kubectl get service -o wide -n nginx-quic
[master]# kubectl get pods -o wide -n nginx-quic
-----
※ Test:
[master]# kubectl exec -it <pod> -- bash
# ping 192.168.50.66
64 bytes from 192.168.165.2: icmp_seq=1 ttl=62 time=0.652 ms
64 bytes from 192.168.165.2: icmp_seq=2 ttl=62 time=0.475 ms
64 bytes from 192.168.165.2: icmp_seq=3 ttl=62 time=0.465 ms
# curl 192.168.50.66:80
192.168.35.4:51610
[master]# curl worker02.test.example.poc:30088
[master]# curl worker03.test.example.poc:30088
[master]# curl worker01.test.example.poc:30088
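Once the connectivity test is done, everything created above lives in the nginx-quic namespace, so it can be cleaned up with the same manifest:

[master]# kubectl delete -f nginx-quic.yaml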
The Metrics Server is the aggregator that collects the cluster's core resource metrics, but kubeadm does not install it by default.
It is mainly consumed by other components such as the Dashboard, and it depends on the API Aggregator, which must be enabled on kube-apiserver before installation.
Its requirements are as follows:
#---------------------------------------------------
# S2-1. Check whether the API Aggregator is already enabled
#---------------------------------------------------
[master]# ps -ef | grep apiserver
=> confirm whether "--enable-aggregator-routing=true" is present
#---------------------------------------------------
# S2-2. Edit kube-apiserver.yaml to enable the API Aggregator
#       After the change, the apiserver Pod is automatically re-created and the setting takes effect (all masters)
#---------------------------------------------------
[master]# vim /etc/kubernetes/manifests/kube-apiserver.yaml
...
spec:
  containers:
  - command:
    ...
    - --enable-aggregator-routing=true
[master]# ps -ef | grep apiserver
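A quick way to double-check that the flag made it into the static Pod manifest (the same file edited above) and that the apiserver Pod came back:

[master]# grep enable-aggregator-routing /etc/kubernetes/manifests/kube-apiserver.yaml
    - --enable-aggregator-routing=true
[master]# kubectl get pods -n kube-system | grep kube-apiserver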
#---------------------------------------------------
# S2-3. Check whether the Metrics Server is already installed (master)
#---------------------------------------------------
[master]# kubectl top node
error: Metrics API not available
#---------------------------------------------------
# S2-4. Deploy the Metrics Server (HA) (master)
#---------------------------------------------------
[master]# wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml
[master]# kubectl create -f high-availability-1.21+.yaml
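To wait for the rollout to finish before moving on (a sketch, assuming the upstream manifest's Deployment name and label, both metrics-server):

[master]# kubectl -n kube-system rollout status deployment/metrics-server
[master]# kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide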
#---------------------------------------------------
# S2-5. Check the Service
#---------------------------------------------------
[master]# kubectl get svc --all-namespaces | grep metrics-server
kube-system metrics-server ClusterIP 10.104.138.184 <none> 443/TCP 53s
#---------------------------------------------------
# S2-6. Confirm the API server can reach the Metrics Server
#---------------------------------------------------
[master]# kubectl describe svc metrics-server -n kube-system
[master]# ping <Endpoint>
[Question]
The Pod is not Running. `kubectl describe pod` shows "Readiness probe failed: HTTP probe failed with statuscode: 500".
Inspecting the pod log shows: "x509: cannot validate certificate for 10.107.88.16 because it doesn't contain any IP SANs" node="worker02.test.example.poc"
[Debug]
(1) [master]# vim high-availability-1.21+.yaml
=> Check the readiness section: "httpGet" makes kubelet periodically send HTTP requests to /readyz to check the state
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /readyz
    port: https
    scheme: HTTPS
  initialDelaySeconds: 20   << wait 20s after the container starts before the first check
  periodSeconds: 10         << check every 10s
(2) For the "x509" error, the official installation docs state:
Kubelet certificate needs to be signed by cluster Certificate Authority (or disable certificate validation by passing --kubelet-insecure-tls to Metrics Server)
[Workaround]
In a test environment the "--kubelet-insecure-tls" flag can be used:
[master]# vim high-availability-1.21+.yaml   (add "--kubelet-insecure-tls" to the metrics-server container args)
[master]# kubectl delete -f high-availability-1.21+.yaml ; kubectl create -f high-availability-1.21+.yaml
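For reference, the flag sits under the metrics-server container's args in the manifest; the surrounding flags below follow the upstream defaults and may differ slightly between releases:

      containers:
      - args:
        - --cert-dir=/tmp
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls    # lab-only workaround added here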
[Question]
"Failed to scrape node" err="Get \"https://10.107.88.17:10250/metrics/resource\": dial tcp 10.107.88.17:10250: connect: no route to host" node="worker03.test.example.poc"
=> check the firewall on the worker (the kubelet port 10250/tcp must be reachable)
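On firewalld-based nodes (as assumed in this lab), opening the kubelet port on the affected worker looks like this:

[worker03]# firewall-cmd --permanent --add-port=10250/tcp
[worker03]# firewall-cmd --reload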
[Question] `kubectl get apiservice` shows FailedDiscoveryCheck
=> The connection between Calico and the Metrics Server is sometimes unstable; restarting the Kubernetes services on the node (or rebooting it) resolves this.
#---------------------------------------------------
# S2-7. Confirm the Metrics Server is deployed correctly
#---------------------------------------------------
[master]# kubectl get apiservice
[master]# kubectl top nodes
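Another end-to-end check is to query the aggregated metrics API directly through the apiserver:

[master]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | head -c 300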
#-----------------------------------------------------------
# S3-1. Download the manifest and deploy
#-----------------------------------------------------------
[master]# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
[master]# vim recommended.yaml
=> change the kubernetes-dashboard Service type to NodePort
[master]# kubectl create -f recommended.yaml
[master]# kubectl get pod -n kubernetes-dashboard -o wide
[master]# kubectl get svc -n kubernetes-dashboard
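For reference, after the NodePort change in S3-1 the kubernetes-dashboard Service section ends up looking roughly like this (the nodePort 31000 matches the URL used in S3-2; the 443/8443 ports follow the upstream recommended.yaml):

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31000
  selector:
    k8s-app: kubernetes-dashboard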
#-----------------------------------------------------------
# S3-2. Log in to the UI
#-----------------------------------------------------------
https://<node_ip>:31000
Login method:
[master]# kubectl get secrets -n kubernetes-dashboard
[master]# kubectl get sa -n kubernetes-dashboard
#------------------------------------------------------------------
# Create a ServiceAccount
#------------------------------------------------------------------
[master]# vim dashboard-admin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
[master]# kubectl create -f dashboard-admin.yaml
#-----------------------------------------------------------------
# Create the Secret & get the token
#----------------------------------------------------------------
[master]# vim dashboard-admin-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: "admin-user"
type: kubernetes.io/service-account-token
[master]# kubectl create -f dashboard-admin-secret.yaml
[master]# kubectl get secrets -n kubernetes-dashboard
NAME                              TYPE                                  DATA   AGE
admin-user                        kubernetes.io/service-account-token   3      4s
kubernetes-dashboard-certs        Opaque                                0      23m
kubernetes-dashboard-csrf         Opaque                                1      23m
kubernetes-dashboard-key-holder   Opaque                                2      23m
[master]# kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath='{.data.token}' | base64 -d
=> paste the token into the UI to log in
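On Kubernetes 1.24 and later, a short-lived token can also be requested directly from the ServiceAccount, without creating the Secret above:

[master]# kubectl -n kubernetes-dashboard create token admin-user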
These are the three things I routinely do right after the initial cluster deployment; depending on your needs, adding a few more tools will make the cluster even more convenient to work with.
Here are a few more that I would add myself:
For Kubernetes newcomers, once you have worked through these three articles you have a working Kubernetes cluster ready to use.
Thanks for reading, and see you in the next article.