Once you've gained access to your terminal, it might be wise to spend ~1 minute setting up your environment. You could set these:
alias k=kubectl # will already be pre-configured
export do="--dry-run=client -o yaml" # k create deploy nginx --image=nginx $do
export now="--force --grace-period 0" # k delete pod x $now
The following settings will already be configured in your real exam environment in ~/.vimrc. But it can never hurt to be able to type these from memory:
set tabstop=2
set expandtab
set shiftwidth=2
More setup suggestions are in the tips section.
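The $do and $now exports only work because the shell splits an unquoted variable into separate arguments. The following is a minimal sketch of that behaviour, assuming bash or POSIX sh (zsh does not split unquoted variables by default); count_args is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Same value as the export suggested above.
do="--dry-run=client -o yaml"

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the value on spaces into three separate flags.
count_args $do

# Quoted: the whole value arrives as one argument, which kubectl would
# most likely not parse as intended.
count_args "$do"
```

This is why the examples in this document always write $do without quotes.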
You have access to multiple clusters from your main terminal through kubectl contexts. Write all those context names into /opt/course/1/contexts.
Next write a command to display the current context into /opt/course/1/context_default_kubectl.sh, the command should use kubectl.
Finally write a second command doing the same thing into /opt/course/1/context_default_no_kubectl.sh, but without the use of kubectl.
Maybe the fastest way is just to run:
k config get-contexts # copy manually
k config get-contexts -o name > /opt/course/1/contexts
Or using jsonpath:
k config view -o yaml # overview
k config view -o jsonpath="{.contexts[*].name}"
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" # new lines
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" > /opt/course/1/contexts
The content should then look like:
# /opt/course/1/contexts
k8s-c1-H
k8s-c2-AC
k8s-c3-CCC
Next create the first command:
# /opt/course/1/context_default_kubectl.sh
kubectl config current-context
➜ sh /opt/course/1/context_default_kubectl.sh
k8s-c1-H
And the second one:
# /opt/course/1/context_default_no_kubectl.sh
cat ~/.kube/config | grep current
➜ sh /opt/course/1/context_default_no_kubectl.sh
current-context: k8s-c1-H
In the real exam you might need to filter and find information from bigger lists of resources, hence knowing a little jsonpath and simple bash filtering will be helpful.
The second command could also be improved to:
# /opt/course/1/context_default_no_kubectl.sh
cat ~/.kube/config | grep current | sed -e "s/current-context: //"
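The grep and sed filtering above can also be done in a single awk call. This is only a sketch: current_context is a hypothetical helper name, and it assumes current-context: is a top-level key on its own line with an unquoted value. The demo runs against a throwaway sample file rather than a real kubeconfig.

```shell
#!/bin/sh
# Hypothetical helper: print the current context of a kubeconfig file
# without using kubectl. Defaults to ~/.kube/config when no file is given.
current_context() {
  awk '$1 == "current-context:" { print $2 }' "${1:-$HOME/.kube/config}"
}

# Demo against a sample file that mimics the relevant kubeconfig keys.
sample=$(mktemp)
cat > "$sample" <<'EOF'
apiVersion: v1
kind: Config
current-context: k8s-c1-H
contexts:
- name: k8s-c1-H
- name: k8s-c2-AC
EOF
current_context "$sample"
rm -f "$sample"
```

Matching on the first field avoids picking up other lines that merely contain the word "current".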
Use context: kubectl config use-context k8s-c1-H
Create a single Pod of image httpd:2.4.41-alpine in Namespace default. The Pod should be named pod1 and the container should be named pod1-container. This Pod should only be scheduled on controlplane nodes. Do not add new labels to any nodes.
First we find the controlplane node(s) and their taints:
k get node # find controlplane node
k describe node cluster1-controlplane1 | grep Taint -A1 # get controlplane node taints
k get node cluster1-controlplane1 --show-labels # get controlplane node labels
Next we create the Pod template:
# check the export on the very top of this document so we can use $do
k run pod1 --image=httpd:2.4.41-alpine $do > 2.yaml
vim 2.yaml
Perform the necessary changes manually. Search the Kubernetes docs for tolerations and nodeSelector to find examples:
# 2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
spec:
  containers:
  - image: httpd:2.4.41-alpine
    name: pod1-container                        # change
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  tolerations:                                  # add
  - effect: NoSchedule                          # add
    key: node-role.kubernetes.io/control-plane  # add
  nodeSelector:                                 # add
    node-role.kubernetes.io/control-plane: ""   # add
status: {}
It is important here to add the toleration for running on controlplane nodes, but also the nodeSelector to make sure it only runs on controlplane nodes. If we only specify a toleration, the Pod can be scheduled on controlplane or worker nodes.
Now we create it:
k -f 2.yaml create
Let's check if the pod is scheduled:
➜ k get pod pod1 -o wide
NAME   READY   STATUS    RESTARTS   ...   NODE                     NOMINATED NODE
pod1   1/1     Running   0          ...   cluster1-controlplane1   <none>
Use context: kubectl config use-context k8s-c1-H
There are two Pods named o3db-* in Namespace project-c13. C13 Management asked you to scale the Pods down to one replica to save resources.
If we check the Pods we see two replicas:
➜ k -n project-c13 get pod | grep o3db
o3db-0   1/1   Running   0   52s
o3db-1   1/1   Running   0   42s
From their name it looks like these are managed by a StatefulSet. But if we're not sure we could also check for the most common resources which manage Pods:
➜ k -n project-c13 get deploy,ds,sts | grep o3db
statefulset.apps/o3db   2/2   2m56s
Confirmed, we have to work with a StatefulSet. To find this out we could also look at the Pod labels:
➜ k -n project-c13 get pod --show-labels | grep o3db
o3db-0   1/1   Running   0   3m29s   app=nginx,controller-revision-hash=o3db-5fbd4bb9cc,statefulset.kubernetes.io/pod-name=o3db-0
o3db-1   1/1   Running   0   3m19s   app=nginx,controller-revision-hash=o3db-5fbd4bb9cc,statefulset.kubernetes.io/pod-name=o3db-1
To fulfil the task we simply run:
➜ k -n project-c13 scale sts o3db --replicas 1
statefulset.apps/o3db scaled
➜ k -n project-c13 get sts o3db
NAME   READY   AGE
o3db   1/1     4m39s
C13 Management is happy again.
Use context: kubectl config use-context k8s-c1-H
Do the following in Namespace default. Create a single Pod named ready-if-service-ready of image nginx:1.16.1-alpine. Configure a LivenessProbe which simply executes command true. Also configure a ReadinessProbe which does check if the url http://service-am-i-ready:80 is reachable, you can use wget -T2 -O- http://service-am-i-ready:80 for this. Start the Pod and confirm it isn't ready because of the ReadinessProbe.
Create a second Pod named am-i-ready of image nginx:1.16.1-alpine with label id: cross-server-ready. The already existing Service service-am-i-ready should now have that second Pod as endpoint.
Now the first Pod should be in ready state, confirm that.
It's a bit of an anti-pattern for one Pod to check another Pod for being ready using probes, hence the normally available readinessProbe.httpGet doesn't work for absolute remote urls. Still the workaround requested in this task should show how probes and Pod<->Service communication works.
First we create the first Pod:
k run ready-if-service-ready --image=nginx:1.16.1-alpine $do > 4_pod1.yaml
vim 4_pod1.yaml
Next perform the necessary additions manually:
# 4_pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: ready-if-service-ready
  name: ready-if-service-ready
spec:
  containers:
  - image: nginx:1.16.1-alpine
    name: ready-if-service-ready
    resources: {}
    livenessProbe:                                     # add from here
      exec:
        command:
        - 'true'
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - 'wget -T2 -O- http://service-am-i-ready:80'  # to here
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Then create the Pod:
k -f 4_pod1.yaml create
And confirm it's in a non-ready state:
➜ k get pod ready-if-service-ready
NAME                     READY   STATUS    RESTARTS   AGE
ready-if-service-ready   0/1     Running   0          7s
We can also check the reason for this using describe:
➜ k describe pod ready-if-service-ready
 ...
  Warning  Unhealthy  18s   kubelet, cluster1-node1  Readiness probe failed: Connecting to service-am-i-ready:80 (10.109.194.234:80)
wget: download timed out
Now we create the second Pod:
k run am-i-ready --image=nginx:1.16.1-alpine --labels="id=cross-server-ready"
The already existing Service service-am-i-ready should now have an Endpoint:
k describe svc service-am-i-ready
k get ep # also possible
Which will result in our first Pod being ready, just give it a minute for the Readiness probe to check again:
➜ k get pod ready-if-service-ready
NAME                     READY   STATUS    RESTARTS   AGE
ready-if-service-ready   1/1     Running   0          53s
Look at these Pods coworking together!
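Both probes in this task are exec probes, which count as successful when the command exits with status 0. The following sketch simulates that rule locally so the behaviour can be tried without a cluster; probe_result is a hypothetical helper, and false merely stands in for a wget call that cannot reach the Service.

```shell
#!/bin/sh
# Hypothetical helper: judge a command the way an exec probe is judged,
# by its exit status only (0 means success, anything else means failure).
probe_result() {
  if sh -c "$1" >/dev/null 2>&1; then
    echo success
  else
    echo failure
  fi
}

probe_result 'true'    # the liveness command from the task
probe_result 'false'   # stand-in for a wget call that times out
```

The same check can be used to try out a probe command by hand, for example inside a container shell, before putting it into the Pod yaml.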
Use context: kubectl config use-context k8s-c1-H
There are various Pods in all namespaces. Write a command into /opt/course/5/find_pods.sh which lists all Pods sorted by their AGE (metadata.creationTimestamp).
Write a second command into /opt/course/5/find_pods_uid.sh which lists all Pods sorted by field metadata.uid. Use kubectl sorting for both commands.
A good resource here (and for many other things) is the kubectl cheat sheet. You can reach it quickly by searching for "cheat sheet" in the Kubernetes docs.
# /opt/course/5/find_pods.sh
kubectl get pod -A --sort-by=.metadata.creationTimestamp
And to execute:
➜ sh /opt/course/5/find_pods.sh
NAMESPACE     NAME                                             ...   AGE
kube-system   kube-scheduler-cluster1-controlplane1            ...   63m
kube-system   etcd-cluster1-controlplane1                      ...   63m
kube-system   kube-apiserver-cluster1-controlplane1            ...   63m
kube-system   kube-controller-manager-cluster1-controlplane1   ...   63m
...
For the second command:
# /opt/course/5/find_pods_uid.sh
kubectl get pod -A --sort-by=.metadata.uid
And to execute:
➜ sh /opt/course/5/find_pods_uid.sh
NAMESPACE         NAME                                      ...   AGE
kube-system       coredns-5644d7b6d9-vwm7g                  ...   68m
project-c13       c13-3cc-runner-heavy-5486d76dd4-ddvlt     ...   63m
project-hamster   web-hamster-shop-849966f479-278vp         ...   63m
project-c13       c13-3cc-web-646b6c8756-qsg4b              ...   63m
Use context: kubectl config use-context k8s-c1-H
Create a new PersistentVolume named safari-pv. It should have a capacity of 2Gi, accessMode ReadWriteOnce, hostPath /Volumes/Data and no storageClassName defined.
Next create a new PersistentVolumeClaim in Namespace project-tiger named safari-pvc. It should request 2Gi storage, accessMode ReadWriteOnce and should not define a storageClassName. The PVC should be bound to the PV correctly.
Finally create a new Deployment safari in Namespace project-tiger which mounts that volume at /tmp/safari-data. The Pods of that Deployment should be of image httpd:2.4.41-alpine.
vim 6_pv.yaml
Find an example from https://kubernetes.io/docs and alter it:
# 6_pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: safari-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/Volumes/Data"
Then create it:
k -f 6_pv.yaml create
Next the PersistentVolumeClaim:
vim 6_pvc.yaml
Find an example from https://kubernetes.io/docs and alter it:
# 6_pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: safari-pvc
  namespace: project-tiger
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
Then create:
k -f 6_pvc.yaml create
And check that both have the status Bound:
➜ k -n project-tiger get pv,pvc
NAME                         CAPACITY   ...   STATUS   CLAIM                      ...
persistentvolume/safari-pv   2Gi        ...   Bound    project-tiger/safari-pvc   ...
NAME                               STATUS   VOLUME      CAPACITY   ...
persistentvolumeclaim/safari-pvc   Bound    safari-pv   2Gi        ...
Next we create a Deployment and mount that volume:
k -n project-tiger create deploy safari \
  --image=httpd:2.4.41-alpine $do > 6_dep.yaml
vim 6_dep.yaml
Alter the yaml to mount the volume:
# 6_dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: safari
  name: safari
  namespace: project-tiger
spec:
  replicas: 1
  selector:
    matchLabels:
      app: safari
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: safari
    spec:
      volumes:                          # add
      - name: data                      # add
        persistentVolumeClaim:          # add
          claimName: safari-pvc         # add
      containers:
      - image: httpd:2.4.41-alpine
        name: container
        volumeMounts:                   # add
        - name: data                    # add
          mountPath: /tmp/safari-data   # add
k -f 6_dep.yaml create
We can confirm it's mounting correctly:
➜ k -n project-tiger describe pod safari-5cbf46d6d-mjhsb | grep -A2 Mounts:
    Mounts:
      /tmp/safari-data from data (rw) # there it is
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-n2sjj (ro)
Use context: kubectl config use-context k8s-c1-H
The metrics-server has been installed in the cluster. Your colleague would like to know the kubectl commands to:
show Nodes resource usage
show Pods and their containers resource usage
Please write the commands into /opt/course/7/node.sh and /opt/course/7/pod.sh.
The command we need to use here is top:
➜ k top -h
Display Resource (CPU/Memory/Storage) usage.
 The top command allows you to see the resource consumption for nodes or pods.
 This command requires Metrics Server to be correctly configured and working on the server.
Available Commands:
  node        Display Resource (CPU/Memory/Storage) usage of nodes
  pod         Display Resource (CPU/Memory/Storage) usage of pods
We see that the metrics server provides information about resource usage:
➜ k top node
NAME                     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
cluster1-controlplane1   178m         8%     1091Mi          57%
cluster1-node1           66m          6%     834Mi           44%
cluster1-node2           91m          9%     791Mi           41%
We create the first file:
# /opt/course/7/node.sh
kubectl top node
For the second file we might need to check the docs again:
➜ k top pod -h
Display Resource (CPU/Memory/Storage) usage of pods.
...
Namespace in current context is ignored even if specified with --namespace.
      --containers=false: If present, print usage of containers within a pod.
      --no-headers=false: If present, print output without headers.
...
With this we can finish this task:
# /opt/course/7/pod.sh
kubectl top pod --containers=true
Use context: kubectl config use-context k8s-c1-H
Ssh into the controlplane node with ssh cluster1-controlplane1. Check how the controlplane components kubelet, kube-apiserver, kube-scheduler, kube-controller-manager and etcd are started/installed on the controlplane node.
Also find out the name of the DNS application and how it's started/installed in the cluster.
Write your findings into file /opt/course/8/controlplane-components.txt. The file should be structured like:
# /opt/course/8/controlplane-components.txt
kubelet: [TYPE]
kube-apiserver: [TYPE]
kube-scheduler: [TYPE]
kube-controller-manager: [TYPE]
etcd: [TYPE]
dns: [TYPE] [NAME]
Choices of [TYPE] are: not-installed, process, static-pod, pod
We could start by finding processes of the requested components, especially the kubelet at first:
➜ ssh cluster1-controlplane1
root@cluster1-controlplane1:~# ps aux | grep kubelet # shows kubelet process
We can see which components are controlled via systemd by looking at the /usr/lib/systemd directory:
➜ root@cluster1-controlplane1:~# find /usr/lib/systemd | grep kube
/usr/lib/systemd/system/kubelet.service
/usr/lib/systemd/system/kubelet.service.d
/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
➜ root@cluster1-controlplane1:~# find /usr/lib/systemd | grep etcd
This shows the kubelet is controlled via systemd, but there is no other service named kube or etcd. It seems that this cluster has been set up using kubeadm, so we check in the default manifests directory:
➜ root@cluster1-controlplane1:~# find /etc/kubernetes/manifests/
/etc/kubernetes/manifests/
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/etcd.yaml
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml
(The kubelet could also have a different manifests directory specified via parameter --pod-manifest-path in its systemd startup config)
This means the main 4 controlplane services are set up as static Pods. Actually, let's check all Pods running in the kube-system Namespace on the controlplane node:
➜ root@cluster1-controlplane1:~# kubectl -n kube-system get pod -o wide | grep controlplane1
coredns-5644d7b6d9-c4f68                         1/1   Running   ...   cluster1-controlplane1
coredns-5644d7b6d9-t84sc                         1/1   Running   ...   cluster1-controlplane1
etcd-cluster1-controlplane1                      1/1   Running   ...   cluster1-controlplane1
kube-apiserver-cluster1-controlplane1            1/1   Running   ...   cluster1-controlplane1
kube-controller-manager-cluster1-controlplane1   1/1   Running   ...   cluster1-controlplane1
kube-proxy-q955p                                 1/1   Running   ...   cluster1-controlplane1
kube-scheduler-cluster1-controlplane1            1/1   Running   ...   cluster1-controlplane1
weave-net-mwj47                                  2/2   Running   ...   cluster1-controlplane1
There we see the 4 static pods, with -cluster1-controlplane1 as suffix.
We also see that the dns application seems to be coredns, but how is it controlled?
➜ root@cluster1-controlplane1$ kubectl -n kube-system get ds
NAME         DESIRED   CURRENT   ...   NODE SELECTOR            AGE
kube-proxy   3         3         ...   kubernetes.io/os=linux   155m
weave-net    3         3         ...   <none>                   155m
➜ root@cluster1-controlplane1$ kubectl -n kube-system get deploy
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
coredns   2/2     2            2           155m
Seems like coredns is controlled via a Deployment. We combine our findings in the requested file:
# /opt/course/8/controlplane-components.txt
kubelet: process
kube-apiserver: static-pod
kube-scheduler: static-pod
kube-controller-manager: static-pod
etcd: static-pod
dns: pod coredns
You should be comfortable investigating a running cluster, know the different ways a cluster and its services can be set up, and be able to troubleshoot and find error sources.
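The manifest check from this investigation can be sketched as a small shell function. This is an illustration under assumptions, not a complete classifier: component_type is a hypothetical helper that only answers "is there a manifest file for this component", and the demo uses a temporary directory standing in for /etc/kubernetes/manifests.

```shell
#!/bin/sh
# Hypothetical helper: report "static-pod" when a manifest for the component
# exists in the given manifests directory, otherwise "unknown".
component_type() {
  manifests_dir=$1
  component=$2
  if [ -f "$manifests_dir/$component.yaml" ]; then
    echo "static-pod"
  else
    echo "unknown"
  fi
}

# Demo with a temporary directory instead of a real controlplane node.
demo=$(mktemp -d)
touch "$demo/etcd.yaml" "$demo/kube-apiserver.yaml"
for c in etcd kube-apiserver kubelet; do
  echo "$c: $(component_type "$demo" "$c")"
done
rm -rf "$demo"
```

An "unknown" result still needs the other checks shown above, such as looking for a process or a systemd unit.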
Use context: kubectl config use-context k8s-c2-AC
Ssh into the controlplane node with ssh cluster2-controlplane1. Temporarily stop the kube-scheduler, this means in a way that you can start it again afterwards.
Create a single Pod named manual-schedule of image httpd:2.4-alpine, confirm it's created but not scheduled on any node.
Now you're the scheduler and have all its power, manually schedule that Pod on node cluster2-controlplane1. Make sure it's running.
Start the kube-scheduler again and confirm it's running correctly by creating a second Pod named manual-schedule2 of image httpd:2.4-alpine and check if it's running on cluster2-node1.
First we find the controlplane node:
➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster2-controlplane1   Ready    control-plane   26h   v1.31.1
cluster2-node1           Ready    <none>          26h   v1.31.1
Then we connect and check if the scheduler is running:
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule
kube-scheduler-cluster2-controlplane1   1/1   Running   0   6s
Kill the Scheduler (temporarily):
➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster2-controlplane1:~# mv kube-scheduler.yaml ..
And it should be stopped:
➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule
➜ root@cluster2-controlplane1:~#
Now we create the Pod:
k run manual-schedule --image=httpd:2.4-alpine
And confirm it has no node assigned:
➜ k get pod manual-schedule -o wide
NAME              READY   STATUS    ...   NODE     NOMINATED NODE
manual-schedule   0/1     Pending   ...   <none>   <none>
Let's play the scheduler now:
k get pod manual-schedule -o yaml > 9.yaml
# 9.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2020-09-04T15:51:02Z"
  labels:
    run: manual-schedule
  managedFields:
...
    manager: kubectl-run
    operation: Update
    time: "2020-09-04T15:51:02Z"
  name: manual-schedule
  namespace: default
  resourceVersion: "3515"
  selfLink: /api/v1/namespaces/default/pods/manual-schedule
  uid: 8e9d2532-4779-4e63-b5af-feb82c74a935
spec:
  nodeName: cluster2-controlplane1        # add the controlplane node name
  containers:
  - image: httpd:2.4-alpine
    imagePullPolicy: IfNotPresent
    name: manual-schedule
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-nxnc7
      readOnly: true
  dnsPolicy: ClusterFirst
...
The only thing a scheduler does is set the nodeName for a Pod declaration. How it finds the correct node to schedule on is a much more complicated matter and takes many variables into account.
As we cannot kubectl apply or kubectl edit, in this case we need to delete and create or replace:
k -f 9.yaml replace --force
How does it look?
➜ k get pod manual-schedule -o wide
NAME              READY   STATUS    ...   NODE
manual-schedule   1/1     Running   ...   cluster2-controlplane1
It looks like our Pod is running on the controlplane now as requested, although no tolerations were specified. Only the scheduler takes taints/tolerations/affinity into account when finding the correct node name. That's why it's still possible to assign Pods manually directly to a controlplane node and skip the scheduler.
Start the kube-scheduler again:
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster2-controlplane1:~# mv ../kube-scheduler.yaml .
Check that it's running:
➜ root@cluster2-controlplane1:~# kubectl -n kube-system get pod | grep schedule
kube-scheduler-cluster2-controlplane1   1/1   Running   0   16s
Schedule a second test Pod:
k run manual-schedule2 --image=httpd:2.4-alpine
➜ k get pod -o wide | grep schedule
manual-schedule    1/1   Running   ...   cluster2-controlplane1
manual-schedule2   1/1   Running   ...   cluster2-node1
Back to normal.
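Stopping and starting the scheduler in this task comes down to moving one manifest file out of the static Pod directory and back in. A minimal sketch of that pattern follows; pause_static_pod and resume_static_pod are hypothetical helper names, and the demo runs on a temporary directory tree rather than on a real node.

```shell
#!/bin/sh
# Hypothetical helpers: "pause" a static Pod by moving its manifest to the
# parent directory, and "resume" it by moving the manifest back.
pause_static_pod() {   # usage: pause_static_pod <manifests-dir> <name>
  mv "$1/$2.yaml" "$1/../$2.yaml"
}
resume_static_pod() {  # usage: resume_static_pod <manifests-dir> <name>
  mv "$1/../$2.yaml" "$1/$2.yaml"
}

# Demo on a temporary directory tree, not on a real controlplane node.
root=$(mktemp -d)
mkdir "$root/manifests"
touch "$root/manifests/kube-scheduler.yaml"
pause_static_pod "$root/manifests" kube-scheduler
ls "$root/manifests"     # prints nothing: the manifest is gone from the directory
resume_static_pod "$root/manifests" kube-scheduler
ls "$root/manifests"     # prints kube-scheduler.yaml again
rm -rf "$root"
```

Keeping the file next to the manifests directory, as the solution above does, makes it easy to find when the component has to be started again.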
Use context: kubectl config use-context k8s-c1-H
Create a new ServiceAccount processor in Namespace project-hamster. Create a Role and RoleBinding, both named processor as well. These should allow the new SA to only create Secrets and ConfigMaps in that Namespace.
A ClusterRole|Role defines a set of permissions and where it is available, in the whole cluster or just a single Namespace.
A ClusterRoleBinding|RoleBinding connects a set of permissions with an account and defines where it is applied, in the whole cluster or just a single Namespace.
Because of this there are 4 different RBAC combinations and 3 valid ones:
Role + RoleBinding (available in single Namespace, applied in single Namespace)
ClusterRole + ClusterRoleBinding (available cluster-wide, applied cluster-wide)
ClusterRole + RoleBinding (available cluster-wide, applied in single Namespace)
Role + ClusterRoleBinding (NOT POSSIBLE: available in single Namespace, applied cluster-wide)
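The four combinations listed above can be written down as a small lookup. This is only a sketch to help memorise the rule; rbac_combo_valid is a hypothetical helper and does not talk to a cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a role kind can be referenced
# by a binding kind, following the four combinations listed above.
rbac_combo_valid() {
  case "$1+$2" in
    Role+RoleBinding)               return 0 ;;
    ClusterRole+ClusterRoleBinding) return 0 ;;
    ClusterRole+RoleBinding)        return 0 ;;
    *)                              return 1 ;;  # includes Role+ClusterRoleBinding
  esac
}

for combo in "Role RoleBinding" "ClusterRole RoleBinding" "Role ClusterRoleBinding"; do
  set -- $combo   # split "Kind BindingKind" into $1 and $2
  if rbac_combo_valid "$1" "$2"; then
    echo "$1 + $2: valid"
  else
    echo "$1 + $2: not possible"
  fi
done
```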
We first create the ServiceAccount:
➜ k -n project-hamster create sa processor
serviceaccount/processor created
Then for the Role:
k -n project-hamster create role -h # examples
So we execute:
k -n project-hamster create role processor \
  --verb=create \
  --resource=secret \
  --resource=configmap
Which will create a Role like:
# kubectl -n project-hamster create role processor --verb=create --resource=secret --resource=configmap
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: processor
  namespace: project-hamster
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  - configmaps
  verbs:
  - create
Now we bind the Role to the ServiceAccount:
k -n project-hamster create rolebinding -h # examples
So we create it:
k -n project-hamster create rolebinding processor \
  --role processor \
  --serviceaccount project-hamster:processor
This will create a RoleBinding like:
# kubectl -n project-hamster create rolebinding processor --role processor --serviceaccount project-hamster:processor
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: processor
  namespace: project-hamster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: processor
subjects:
- kind: ServiceAccount
  name: processor
  namespace: project-hamster
To test our RBAC setup we can use kubectl auth can-i:
k auth can-i -h # examples
Like this:
➜ k -n project-hamster auth can-i create secret --as system:serviceaccount:project-hamster:processor
yes
➜ k -n project-hamster auth can-i create configmap --as system:serviceaccount:project-hamster:processor
yes
➜ k -n project-hamster auth can-i create pod --as system:serviceaccount:project-hamster:processor
no
➜ k -n project-hamster auth can-i delete secret --as system:serviceaccount:project-hamster:processor
no
➜ k -n project-hamster auth can-i get configmap --as system:serviceaccount:project-hamster:processor
no
Done.
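The --as value used in these checks follows the pattern system:serviceaccount:<namespace>:<name>. The sketch below builds that string and prints the can-i commands instead of running them, so it works without a cluster; sa_user is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: build the user name that identifies a ServiceAccount
# for impersonation, in the form system:serviceaccount:<namespace>:<name>.
sa_user() {
  echo "system:serviceaccount:$1:$2"
}

# Print (rather than run) the checks so the sketch needs no cluster access.
as=$(sa_user project-hamster processor)
for check in "create secret" "create configmap" "delete secret"; do
  echo "kubectl -n project-hamster auth can-i $check --as $as"
done
```

Removing the echo in the loop would run the checks against the current context.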
Use context: kubectl config use-context k8s-c1-H
Use Namespace project-tiger for the following. Create a DaemonSet named ds-important with image httpd:2.4-alpine and labels id=ds-important and uuid=18426a0b-5f59-4e10-923f-c0e078e82462. The Pods it creates should request 10 millicore cpu and 10 mebibyte memory. The Pods of that DaemonSet should run on all nodes, also controlplanes.
As of now we aren't able to create a DaemonSet directly using kubectl, so we create a Deployment and just change it up:
k -n project-tiger create deployment --image=httpd:2.4-alpine ds-important $do > 11.yaml
vim 11.yaml
(Sure you could also search for a DaemonSet example yaml in the Kubernetes docs and alter it.)
Then we adjust the yaml to:
# 11.yaml
apiVersion: apps/v1
kind: DaemonSet                                     # change from Deployment to Daemonset
metadata:
  creationTimestamp: null
  labels:                                           # add
    id: ds-important                                # add
    uuid: 18426a0b-5f59-4e10-923f-c0e078e82462      # add
  name: ds-important
  namespace: project-tiger                          # important
spec:
  #replicas: 1                                      # remove
  selector:
    matchLabels:
      id: ds-important                              # add
      uuid: 18426a0b-5f59-4e10-923f-c0e078e82462    # add
  #strategy: {}                                     # remove
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: ds-important                            # add
        uuid: 18426a0b-5f59-4e10-923f-c0e078e82462  # add
    spec:
      containers:
      - image: httpd:2.4-alpine
        name: ds-important
        resources:
          requests:                                 # add
            cpu: 10m                                # add
            memory: 10Mi                            # add
      tolerations:                                  # add
      - effect: NoSchedule                          # add
        key: node-role.kubernetes.io/control-plane  # add
#status: {}                                         # remove
It was requested that the DaemonSet runs on all nodes, so we need to specify the toleration for this.
Let's confirm:
k -f 11.yaml create
➜ k -n project-tiger get ds
NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ds-important   3         3         3       3            3           <none>          8s
➜ k -n project-tiger get pod -l id=ds-important -o wide
NAME                 READY   STATUS    ...   NODE
ds-important-6pvgm   1/1     Running   ...   cluster1-node1
ds-important-lh5ts   1/1     Running   ...   cluster1-controlplane1
ds-important-qhjcq   1/1     Running   ...   cluster1-node2
Use context: kubectl config use-context k8s-c1-H
Implement the following in Namespace project-tiger:
Create a Deployment named deploy-important with 3 replicas
The Deployment and its Pods should have label id=very-important
It should have two containers:
First named container1 with image nginx:1.17.6-alpine
Second named container2 with image google/pause
There should only ever be one Pod of that Deployment running on one worker node, use topologyKey: kubernetes.io/hostname for this
ℹ️ Because there are two worker nodes and the Deployment has three replicas the result should be that the third Pod won't be scheduled. In a way it simulates the behaviour of a DaemonSet, but using a Deployment and a fixed number of replicas.
There are two possible ways, one using podAntiAffinity and one using topologySpreadConstraint.
The idea here is that we create an "Inter-pod anti-affinity" which allows us to say a Pod should only be scheduled on a node where another Pod of a specific label (here the same label) is not already running.
Let's begin by creating the Deployment template:
k -n project-tiger create deployment \
  --image=nginx:1.17.6-alpine deploy-important $do > 12.yaml
vim 12.yaml
Then change the yaml to:
# 12.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    id: very-important                  # change
  name: deploy-important
  namespace: project-tiger              # important
spec:
  replicas: 3                           # change
  selector:
    matchLabels:
      id: very-important                # change
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: very-important              # change
    spec:
      containers:
      - image: nginx:1.17.6-alpine
        name: container1                # change
        resources: {}
      - image: google/pause             # add
        name: container2                # add
      affinity:                                             # add
        podAntiAffinity:                                    # add
          requiredDuringSchedulingIgnoredDuringExecution:   # add
          - labelSelector:                                  # add
              matchExpressions:                             # add
              - key: id                                     # add
                operator: In                                # add
                values:                                     # add
                - very-important                            # add
            topologyKey: kubernetes.io/hostname             # add
status: {}
Specify a topologyKey, which is a pre-populated Kubernetes label; you can find it by describing a node.
We can achieve the same with topologySpreadConstraints. Best to try out and play with both.
# 12.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    id: very-important                  # change
  name: deploy-important
  namespace: project-tiger              # important
spec:
  replicas: 3                           # change
  selector:
    matchLabels:
      id: very-important                # change
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: very-important              # change
    spec:
      containers:
      - image: nginx:1.17.6-alpine
        name: container1                # change
        resources: {}
      - image: google/pause             # add
        name: container2                # add
      topologySpreadConstraints:                 # add
      - maxSkew: 1                               # add
        topologyKey: kubernetes.io/hostname      # add
        whenUnsatisfiable: DoNotSchedule         # add
        labelSelector:                           # add
          matchLabels:                           # add
            id: very-important                   # add
status: {}
Let's run it:
k -f 12.yaml create
Then we check the Deployment status where it shows 2/3 ready count:
➜ k -n project-tiger get deploy -l id=very-important
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
deploy-important   2/3     3            2           2m35s
And running the following we see one Pod on each worker node and one not scheduled.
➜ k -n project-tiger get pod -o wide -l id=very-important
NAME                                READY   STATUS    ...   NODE
deploy-important-58db9db6fc-9ljpw   2/2     Running   ...   cluster1-node1
deploy-important-58db9db6fc-lnxdb   0/2     Pending   ...   <none>
deploy-important-58db9db6fc-p2rz8   2/2     Running   ...   cluster1-node2
If we kubectl describe the Pod deploy-important-58db9db6fc-lnxdb it will show us that the reason for not scheduling is our podAntiAffinity rule:
Warning  FailedScheduling  63s (x3 over 65s)  default-scheduler  0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules.
Or our topologySpreadConstraints:
Warning  FailedScheduling  16s  default-scheduler  0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate, 2 node(s) didn't match pod topology spread constraints.
Use context: kubectl config use-context k8s-c1-H
Create a Pod named multi-container-playground in Namespace default with three containers, named c1, c2 and c3. There should be a volume attached to that Pod and mounted into every container, but the volume shouldn't be persisted or shared with other Pods.
Container c1 should be of image nginx:1.17.6-alpine and have the name of the node where its Pod is running available as environment variable MY_NODE_NAME.
Container c2 should be of image busybox:1.31.1 and write the output of the date command every second in the shared volume into file date.log. You can use while true; do date >> /your/vol/path/date.log; sleep 1; done for this.
Container c3 should be of image busybox:1.31.1 and constantly send the content of file date.log from the shared volume to stdout. You can use tail -f /your/vol/path/date.log for this.
Check the logs of container c3 to confirm correct setup.
First we create the Pod template:
k run multi-container-playground --image=nginx:1.17.6-alpine $do > 13.yaml
vim 13.yaml
And add the other containers and the commands they should execute:
# 13.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: multi-container-playground
  name: multi-container-playground
spec:
  containers:
  - image: nginx:1.17.6-alpine
    name: c1                                                                      # change
    resources: {}
    env:                                                                          # add
    - name: MY_NODE_NAME                                                          # add
      valueFrom:                                                                  # add
        fieldRef:                                                                 # add
          fieldPath: spec.nodeName                                                # add
    volumeMounts:                                                                 # add
    - name: vol                                                                   # add
      mountPath: /vol                                                             # add
  - image: busybox:1.31.1                                                         # add
    name: c2                                                                      # add
    command: ["sh", "-c", "while true; do date >> /vol/date.log; sleep 1; done"]  # add
    volumeMounts:                                                                 # add
    - name: vol                                                                   # add
      mountPath: /vol                                                             # add
  - image: busybox:1.31.1                                                         # add
    name: c3                                                                      # add
    command: ["sh", "-c", "tail -f /vol/date.log"]                                # add
    volumeMounts:                                                                 # add
    - name: vol                                                                   # add
      mountPath: /vol                                                             # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                                                                        # add
    - name: vol                                                                   # add
      emptyDir: {}                                                                # add
status: {}
k -f 13.yaml create
Oh boy, lots of requested things. We check if everything is good with the Pod:
➜ k get pod multi-container-playground
NAME                         READY   STATUS    RESTARTS   AGE
multi-container-playground   3/3     Running   0          95s
Good, then we check if container c1 has the requested node name as env variable:
➜ k exec multi-container-playground -c c1 -- env | grep MY
MY_NODE_NAME=cluster1-node2
And finally we check the logging:
➜ k logs multi-container-playground -c c3
Sat Dec  7 16:05:10 UTC 2077
Sat Dec  7 16:05:11 UTC 2077
Sat Dec  7 16:05:12 UTC 2077
Sat Dec  7 16:05:13 UTC 2077
Sat Dec  7 16:05:14 UTC 2077
Sat Dec  7 16:05:15 UTC 2077
Sat Dec  7 16:05:16 UTC 2077
Use context: kubectl config use-context k8s-c1-H
You're asked to find out the following information about the cluster k8s-c1-H:
How many controlplane nodes are available?
How many worker nodes are available?
What is the Service CIDR?
Which Networking (or CNI Plugin) is configured and where is its config file?
Which suffix will static pods have that run on cluster1-node1?
Write your answers into file /opt/course/14/cluster-info, structured like this:
# /opt/course/14/cluster-info
1: [ANSWER]
2: [ANSWER]
3: [ANSWER]
4: [ANSWER]
5: [ANSWER]
➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster1-controlplane1   Ready    control-plane   27h   v1.31.1
cluster1-node1           Ready    <none>          27h   v1.31.1
cluster1-node2           Ready    <none>          27h   v1.31.1
We see one controlplane and two workers.
➜ ssh cluster1-controlplane1
➜ root@cluster1-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep range
    - --service-cluster-ip-range=10.96.0.0/12
➜ root@cluster1-controlplane1:~# find /etc/cni/net.d/
/etc/cni/net.d/
/etc/cni/net.d/10-weave.conflist
➜ root@cluster1-controlplane1:~# cat /etc/cni/net.d/10-weave.conflist
{
    "cniVersion": "0.3.0",
    "name": "weave",
...
By default the kubelet looks into /etc/cni/net.d to discover the CNI plugins. This will be the same on every controlplane and worker node.
The suffix is the node hostname with a leading hyphen. It used to be -static in earlier Kubernetes versions.
The resulting /opt/course/14/cluster-info could look like:
# /opt/course/14/cluster-info
# How many controlplane nodes are available?
1: 1
# How many worker nodes are available?
2: 2
# What is the Service CIDR?
3: 10.96.0.0/12
# Which Networking (or CNI Plugin) is configured and where is its config file?
4: Weave, /etc/cni/net.d/10-weave.conflist
# Which suffix will static pods have that run on cluster1-node1?
5: -cluster1-node1
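On clusters with more than a handful of nodes, counting by eye gets error-prone. The following is a minimal sketch of how the first two answers could be derived from the node listing. The function name count_nodes is made up for illustration, and it assumes the ROLES column contains "control-plane" for controlplane nodes, as in the output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: counts controlplane and worker nodes from the output of
# `kubectl get node --no-headers` read on stdin.
# Assumption: field 3 (ROLES) contains "control-plane" for controlplane nodes;
# every other non-empty line is treated as a worker.
count_nodes() {
  awk '
    $3 ~ /control-plane/ { cp++; next }
    NF                   { worker++ }
    END { printf "controlplane=%d worker=%d\n", cp, worker }
  '
}

# Sample input copied from the node listing of cluster k8s-c1-H.
# On a real cluster: kubectl get node --no-headers | count_nodes
count_nodes <<'EOF'
cluster1-controlplane1   Ready   control-plane   27h   v1.31.1
cluster1-node1           Ready   <none>          27h   v1.31.1
cluster1-node2           Ready   <none>          27h   v1.31.1
EOF
```

For the sample input this prints controlplane=1 worker=2, matching answers 1 and 2.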
Use context: kubectl config use-context k8s-c2-AC
Write a command into /opt/course/15/cluster_events.sh which shows the latest events in the whole cluster, ordered by time (metadata.creationTimestamp). Use kubectl for it.
Now delete the kube-proxy Pod running on node cluster2-node1 and write the events this caused into /opt/course/15/pod_kill.log.
Finally kill the containerd container of the kube-proxy Pod on node cluster2-node1 and write the events into /opt/course/15/container_kill.log.
Do you notice differences in the events both actions caused?
# /opt/course/15/cluster_events.sh
kubectl get events -A --sort-by=.metadata.creationTimestamp
Now we delete the kube-proxy Pod:
k -n kube-system get pod -o wide | grep proxy # find pod running on cluster2-node1
k -n kube-system delete pod kube-proxy-z64cg
Now check the events:
sh /opt/course/15/cluster_events.sh
Write the events the killing caused into /opt/course/15/pod_kill.log:
# /opt/course/15/pod_kill.log
kube-system   9s          Normal    Killing           pod/kube-proxy-jsv7t   ...
kube-system   3s          Normal    SuccessfulCreate  daemonset/kube-proxy   ...
kube-system   <unknown>   Normal    Scheduled         pod/kube-proxy-m52sx   ...
default       2s          Normal    Starting          node/cluster2-node1    ...
kube-system   2s          Normal    Created           pod/kube-proxy-m52sx   ...
kube-system   2s          Normal    Pulled            pod/kube-proxy-m52sx   ...
kube-system   2s          Normal    Started           pod/kube-proxy-m52sx   ...
Finally we will try to provoke events by killing the containerd container belonging to the kube-proxy Pod:
➜ ssh cluster2-node1
➜ root@cluster2-node1:~# crictl ps | grep kube-proxy
1e020b43c4423   36c4ebbc9d979   About an hour ago   Running   kube-proxy   ...
➜ root@cluster2-node1:~# crictl rm 1e020b43c4423
1e020b43c4423
➜ root@cluster2-node1:~# crictl ps | grep kube-proxy
0ae4245707910   36c4ebbc9d979   17 seconds ago      Running   kube-proxy   ...
We killed the main container (1e020b43c4423), but also noticed that a new container (0ae4245707910) was created right away. Thanks Kubernetes!
Now we see if this caused events again and we write those into the second file:
sh /opt/course/15/cluster_events.sh
# /opt/course/15/container_kill.log
kube-system   13s   Normal   Created   pod/kube-proxy-m52sx   ...
kube-system   13s   Normal   Pulled    pod/kube-proxy-m52sx   ...
kube-system   13s   Normal   Started   pod/kube-proxy-m52sx   ...
Comparing the events we see that when we deleted the whole Pod there were more things to be done, hence more events. For example, the DaemonSet had to step in to re-create the missing Pod. When we manually killed the main container of the Pod, the Pod itself still existed and only its container needed to be re-created, hence fewer events.
Use context: kubectl config use-context k8s-c1-H
Write the names of all namespaced Kubernetes resources (like Pod, Secret, ConfigMap...) into /opt/course/16/resources.txt.
Find the project-* Namespace with the highest number of Roles defined in it and write its name and amount of Roles into /opt/course/16/crowded-namespace.txt.
Now we can get a list of all resources like:
k api-resources    # shows all
k api-resources -h # help always good
k api-resources --namespaced -o name > /opt/course/16/resources.txt
Which results in the file:
# /opt/course/16/resources.txt
bindings
configmaps
endpoints
events
limitranges
persistentvolumeclaims
pods
podtemplates
replicationcontrollers
resourcequotas
secrets
serviceaccounts
services
controllerrevisions.apps
daemonsets.apps
deployments.apps
replicasets.apps
statefulsets.apps
localsubjectaccessreviews.authorization.k8s.io
horizontalpodautoscalers.autoscaling
cronjobs.batch
jobs.batch
leases.coordination.k8s.io
events.events.k8s.io
ingresses.extensions
ingresses.networking.k8s.io
networkpolicies.networking.k8s.io
poddisruptionbudgets.policy
rolebindings.rbac.authorization.k8s.io
roles.rbac.authorization.k8s.io
➜ k -n project-c13 get role --no-headers | wc -l
No resources found in project-c13 namespace.
0
➜ k -n project-c14 get role --no-headers | wc -l
300
➜ k -n project-hamster get role --no-headers | wc -l
No resources found in project-hamster namespace.
0
➜ k -n project-snake get role --no-headers | wc -l
No resources found in project-snake namespace.
0
➜ k -n project-tiger get role --no-headers | wc -l
No resources found in project-tiger namespace.
0
Finally we write the name and amount into the file:
# /opt/course/16/crowded-namespace.txt
project-c14 with 300 resources
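Checking each Namespace by hand works for five Namespaces but scales poorly. As a rough sketch, the counting could be done in one pass over the output of kubectl get role -A --no-headers, whose first column is the Namespace. The function name most_roles and the sample rows below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of the form "NAMESPACE NAME CREATED-AT"
# (as printed by `kubectl get role -A --no-headers`) on stdin and prints
# "<count> <namespace>" for the project-* Namespace with the most Roles.
most_roles() {
  awk '$1 ~ /^project-/ { n[$1]++ }
       END { for (ns in n) print n[ns], ns }' |
    sort -rn | head -n 1
}

# Illustrative sample data, not taken from a real cluster.
# On a real cluster: kubectl get role -A --no-headers | most_roles
most_roles <<'EOF'
kube-system   kube-proxy   2024-01-01T00:00:00Z
project-c14   r-001        2024-01-01T00:00:00Z
project-c14   r-002        2024-01-01T00:00:00Z
project-snake r-001        2024-01-01T00:00:00Z
EOF
```

Roles outside project-* Namespaces are ignored on purpose, since the task only asks about those.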
Use context: kubectl config use-context k8s-c1-H
In Namespace project-tiger create a Pod named tigers-reunite of image httpd:2.4.41-alpine with labels pod=container and container=pod. Find out on which node the Pod is scheduled. Ssh into that node and find the containerd container belonging to that Pod.
Using command crictl:
Write the ID of the container and the info.runtimeType into /opt/course/17/pod-container.txt
Write the logs of the container into /opt/course/17/pod-container.log
First we create the Pod:
k -n project-tiger run tigers-reunite \
  --image=httpd:2.4.41-alpine \
  --labels "pod=container,container=pod"
Next we find out the node it's scheduled on:
k -n project-tiger get pod -o wide
# or fancy:
k -n project-tiger get pod tigers-reunite -o jsonpath="{.spec.nodeName}"
Then we ssh into that node and check the container info:
➜ ssh cluster1-node2
➜ root@cluster1-node2:~# crictl ps | grep tigers-reunite
b01edbe6f89ed   54b0995a63052   5 seconds ago   Running   tigers-reunite   ...
➜ root@cluster1-node2:~# crictl inspect b01edbe6f89ed | grep runtimeType
    "runtimeType": "io.containerd.runc.v2",
Then we fill the requested file (on the main terminal):
# /opt/course/17/pod-container.txt
b01edbe6f89ed io.containerd.runc.v2
Finally we write the container logs in the second file:
ssh cluster1-node2 'crictl logs b01edbe6f89ed' &> /opt/course/17/pod-container.log
The &> in the command above redirects both the standard output and standard error.
You could also simply run crictl logs on the node and copy the content manually, if it's not a lot. The file should look like:
# /opt/course/17/pod-container.log
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.44.0.37. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.44.0.37. Set the 'ServerName' directive globally to suppress this message
[Mon Sep 13 13:32:18.555280 2021] [mpm_event:notice] [pid 1:tid 139929534545224] AH00489: Apache/2.4.41 (Unix) configured -- resuming normal operations
[Mon Sep 13 13:32:18.555610 2021] [core:notice] [pid 1:tid 139929534545224] AH00094: Command line: 'httpd -D FOREGROUND'
Use context: kubectl config use-context k8s-c3-CCC
There seems to be an issue with the kubelet not running on cluster3-node1. Fix it and confirm that cluster has node cluster3-node1 available in Ready state afterwards. You should be able to schedule a Pod on cluster3-node1 afterwards.
Write the reason of the issue into /opt/course/18/reason.txt.
The procedure on tasks like these should be to check if the kubelet is running, if not start it, then check its logs and correct errors if there are some.
It's always helpful to check if other clusters already have some of the components defined and running, so you can copy and use existing config files. In this case it might not be necessary though.
Check node status:
➜ k get node
NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   14d   v1.31.1
cluster3-node1           NotReady   <none>          14d   v1.31.1
First we check if the kubelet is running:
➜ ssh cluster3-node1
➜ root@cluster3-node1:~# ps aux | grep kubelet
root     29294  0.0  0.2  14856  1016 pts/0    S+   11:30   0:00 grep --color=auto kubelet
Nope, so we check if it's configured using systemd as service:
➜ root@cluster3-node1:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: inactive (dead) (Result: exit-code) since Thu 2024-01-04 13:12:54 UTC; 1h 23min ago
       Docs: https://kubernetes.io/docs/
    Process: 27577 ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=>
   Main PID: 27577 (code=exited, status=203/EXEC)

Jan 04 13:12:52 cluster3-node1 systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Jan 04 13:12:52 cluster3-node1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 04 13:12:54 cluster3-node1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Yes, it's configured as a service with config at /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf, but we see it's inactive. Let's try to start it:
➜ root@cluster3-node1:~# service kubelet start
➜ root@cluster3-node1:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Thu 2024-01-04 14:37:02 UTC; 6s ago
       Docs: https://kubernetes.io/docs/
    Process: 27935 ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=>
   Main PID: 27935 (code=exited, status=203/EXEC)

Jan 04 14:37:02 cluster3-node1 systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Jan 04 14:37:02 cluster3-node1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
We see it's trying to execute /usr/local/bin/kubelet with some parameters defined in its service config file. A good way to find errors and get more logs is to run the command manually (usually also with its parameters).
➜ root@cluster3-node1:~# /usr/local/bin/kubelet
-bash: /usr/local/bin/kubelet: No such file or directory
➜ root@cluster3-node1:~# whereis kubelet
kubelet: /usr/bin/kubelet
Another way would be to see the extended logging of a service, for example using journalctl -u kubelet.
Well, there we have it, wrong path specified. Correct the path in file /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf and run:
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf # fix binary path
systemctl daemon-reload
service kubelet restart
service kubelet status # should now show running
Also the node should be available for the api server, give it a bit of time though:
➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   14d   v1.31.1
cluster3-node1           Ready    <none>          14d   v1.31.1
Finally we write the reason into the file:
# /opt/course/18/reason.txt
wrong path to kubelet binary specified in service config
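The status 203/EXEC seen above means systemd could not execute the configured binary. A quick way to check for this class of error is to pull the binary path out of the ExecStart line and test whether it exists. The sketch below does that; the function names execstart_binary and check_unit_binary are made up for illustration, and it assumes the unit or drop-in file uses plain ExecStart= lines (kubeadm drop-ins typically contain an empty ExecStart= reset line followed by the real one).

```shell
#!/bin/sh
# Hypothetical helper: prints the binary of the last non-empty ExecStart= line
# in the given systemd unit or drop-in file. An optional leading "-" prefix is
# skipped; empty "ExecStart=" reset lines are ignored.
execstart_binary() {
  sed -n 's/^ExecStart=-\{0,1\}\([^ ]\{1,\}\).*/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: reports whether that binary exists and is executable.
check_unit_binary() {
  bin=$(execstart_binary "$1")
  if [ -x "$bin" ]; then
    echo "ok: $bin"
  else
    echo "missing: $bin"
  fi
}

# Usage on the node from this task (path taken from the service status output):
#   check_unit_binary /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
```

This only catches a wrong binary path. Other kubelet failures still need the logs, for example via journalctl -u kubelet.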
ℹ️ This task can only be solved if questions 18 or 20 have been successfully implemented and the k8s-c3-CCC cluster has a functioning worker node.
Use context: kubectl config use-context k8s-c3-CCC
Do the following in a new Namespace secret. Create a Pod named secret-pod of image busybox:1.31.1 which should keep running for some time.
There is an existing Secret located at /opt/course/19/secret1.yaml, create it in the Namespace secret and mount it readonly into the Pod at /tmp/secret1.
Create a new Secret in Namespace secret called secret2 which should contain user=user1 and pass=1234. These entries should be available inside the Pod's container as environment variables APP_USER and APP_PASS.
Confirm everything is working.
First we create the Namespace and the requested Secrets in it:
k create ns secret
cp /opt/course/19/secret1.yaml 19_secret1.yaml
vim 19_secret1.yaml
We need to adjust the Namespace for that Secret:
# 19_secret1.yaml
apiVersion: v1
data:
  halt: IyEgL2Jpbi9zaAo...
kind: Secret
metadata:
  creationTimestamp: null
  name: secret1
  namespace: secret           # change
k -f 19_secret1.yaml create
Next we create the second Secret:
k -n secret create secret generic secret2 --from-literal=user=user1 --from-literal=pass=1234
Now we create the Pod template:
k -n secret run secret-pod --image=busybox:1.31.1 $do -- sh -c "sleep 5d" > 19.yaml
vim 19.yaml
Then make the necessary changes:
# 19.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: secret-pod
  name: secret-pod
  namespace: secret                       # add
spec:
  containers:
  - args:
    - sh
    - -c
    - sleep 5d
    image: busybox:1.31.1
    name: secret-pod
    resources: {}
    env:                                  # add
    - name: APP_USER                      # add
      valueFrom:                          # add
        secretKeyRef:                     # add
          name: secret2                   # add
          key: user                       # add
    - name: APP_PASS                      # add
      valueFrom:                          # add
        secretKeyRef:                     # add
          name: secret2                   # add
          key: pass                       # add
    volumeMounts:                         # add
    - name: secret1                       # add
      mountPath: /tmp/secret1             # add
      readOnly: true                      # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                                # add
  - name: secret1                         # add
    secret:                               # add
      secretName: secret1                 # add
status: {}
It might not be necessary in current K8s versions to specify readOnly: true because it's the default setting anyway.
And execute:
k -f 19.yaml create
Finally we check if all is correct:
➜ k -n secret exec secret-pod -- env | grep APP
APP_PASS=1234
APP_USER=user1
➜ k -n secret exec secret-pod -- find /tmp/secret1
/tmp/secret1
/tmp/secret1/..data
/tmp/secret1/halt
/tmp/secret1/..2019_12_08_12_15_39.463036797
/tmp/secret1/..2019_12_08_12_15_39.463036797/halt
➜ k -n secret exec secret-pod -- cat /tmp/secret1/halt
#! /bin/sh
### BEGIN INIT INFO
# Provides:          halt
# Required-Start:
# Required-Stop:
# Default-Start:
# Default-Stop:      0
# Short-Description: Execute the halt command.
# Description:
...
All is good.
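Secret values are stored base64-encoded under .data, which is why the halt entry in secret1 looks scrambled in the YAML but is readable inside the Pod. A small sketch of the round trip, using the values of secret2 from this task (the function names are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helpers showing how Secret values map to what is stored in .data.
# base64 encodes without a trailing newline in the input, like --from-literal does.
encode_secret_value() { printf '%s' "$1" | base64; }
decode_secret_value() { printf '%s' "$1" | base64 --decode; }

encode_secret_value user1      # dXNlcjE=
decode_secret_value MTIzNA==   # 1234
echo

# On a real cluster the stored value could be inspected like this:
#   kubectl -n secret get secret secret2 -o jsonpath='{.data.user}' | base64 --decode
```

Keep in mind that base64 is an encoding, not encryption, so anyone who can read the Secret object can read its values.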
Use context: kubectl config use-context k8s-c3-CCC
Your coworker said node cluster3-node2 is running an older Kubernetes version and is not even part of the cluster. Update Kubernetes on that node to the exact version that's running on cluster3-controlplane1. Then add this node to the cluster. Use kubeadm for this.
Search in the docs for kubeadm upgrade: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade
➜ k get node
NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   16h   v1.31.1
cluster3-node1           NotReady   <none>          16h   v1.31.1
Controlplane node seems to be running Kubernetes 1.31.1. Node cluster3-node1 might not yet be Ready or part of cluster depending on the completion of a previous task. But this task is about node cluster3-node2 so we can continue anyways:
➜ ssh cluster3-node2
➜ root@cluster3-node2:~# kubectl version
Client Version: v1.30.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?
➜ root@cluster3-node2:~# kubelet --version
Kubernetes v1.30.5
➜ root@cluster3-node2:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"31", GitVersion:"v1.31.1", GitCommit:"948afe5ca072329a73c8e79ed5938717a5cb3d21", GitTreeState:"clean", BuildDate:"2024-09-11T21:26:49Z", GoVersion:"go1.22.6", Compiler:"gc", Platform:"linux/amd64"}
Above we can see that kubeadm is already installed in the wanted version, so we don't need to install it. Hence we can run:
➜ root@cluster3-node2:~# kubeadm upgrade node
couldn't create a Kubernetes client from file "/etc/kubernetes/kubelet.conf": failed to load admin kubeconfig: open /etc/kubernetes/kubelet.conf: no such file or directory
To see the stack trace of this error execute with --v=5 or higher
This is usually the proper command to upgrade a node. But this error means that this node was never even initialised, so nothing to update here. This will be done later using kubeadm join. For now we can continue with kubelet and kubectl:
➜ root@cluster3-node2:~# apt update
Hit:1 http://ppa.launchpad.net/rmescandon/yq/ubuntu focal InRelease
Hit:3 http://us.archive.ubuntu.com/ubuntu focal InRelease
Get:4 http://security.ubuntu.com/ubuntu focal-security InRelease [128 kB]
Hit:2 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.30/deb InRelease
Get:5 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease [128 kB]
Hit:6 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.31/deb InRelease
Hit:7 http://us.archive.ubuntu.com/ubuntu focal-backports InRelease
Get:8 http://security.ubuntu.com/ubuntu focal-security/main amd64 c-n-f Metadata [14.3 kB]
Get:9 http://security.ubuntu.com/ubuntu focal-security/universe amd64 c-n-f Metadata
...
241 packages can be upgraded. Run 'apt list --upgradable' to see them.
➜ root@cluster3-node2:~# apt show kubectl -a | grep 1.31
Version: 1.31.1-1.1
APT-Sources: https://pkgs.k8s.io/core:/stable:/v1.31/deb Packages
Version: 1.31.0-1.1
APT-Sources: https://pkgs.k8s.io/core:/stable:/v1.31/deb Packages
➜ root@cluster3-node2:~# apt install kubectl=1.31.1-1.1 kubelet=1.31.1-1.1
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  kubectl kubelet
2 upgraded, 0 newly installed, 0 to remove and 239 not upgraded.
Need to get 26.4 MB of archives.
After this operation, 18.3 MB disk space will be freed.
Get:1 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.31/deb  kubectl 1.31.1-1.1 [11.2 MB]
Get:2 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.31/deb  kubelet 1.31.1-1.1 [15.2 MB]
Fetched 26.4 MB in 1s (43.7 MB/s)
(Reading database ... 112531 files and directories currently installed.)
Preparing to unpack .../kubectl_1.31.1-1.1_amd64.deb ...
Unpacking kubectl (1.31.1-1.1) over (1.30.5-1.1) ...
Preparing to unpack .../kubelet_1.31.1-1.1_amd64.deb ...
Unpacking kubelet (1.31.1-1.1) over (1.30.5-1.1) ...
Setting up kubectl (1.31.1-1.1) ...
Setting up kubelet (1.31.1-1.1) ...
➜ root@cluster3-node2:~# kubelet --version
Kubernetes v1.31.1
Now we're up to date with kubeadm, kubectl and kubelet. Restart the kubelet:
➜ root@cluster3-node2:~# service kubelet restart
➜ root@cluster3-node2:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Fri 2024-09-20 09:12:42 UTC; 3s ago
       Docs: https://kubernetes.io/docs/
    Process: 36422 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_>
   Main PID: 36422 (code=exited, status=1/FAILURE)

Sep 20 09:12:42 cluster3-node2 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 09:12:42 cluster3-node2 systemd[1]: kubelet.service: Failed with result 'exit-code'.
These errors occur because we still need to run kubeadm join to join the node into the cluster. Let's do this in the next step.
First we log into the controlplane1 and generate a new TLS bootstrap token, also printing out the join command:
➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# kubeadm token create --print-join-command
kubeadm join 192.168.100.31:6443 --token u9d0wi.hl937rbv168bpfxi --discovery-token-ca-cert-hash sha256:ad62fd26e3e454ac380d006c045fa3665ce20643d79eb0085614a02fa77749a8
➜ root@cluster3-controlplane1:~# kubeadm token list
TOKEN                     TTL         EXPIRES
d7561d.f08jvu4iavd8h88b   7h          2024-09-20T16:17:09Z
u9d0wi.hl937rbv168bpfxi   23h         2024-09-21T09:13:06Z
va6b7i.vnomejzayd2jl59n   <forever>   <never>
We see the expiration of 23h for our token, we could adjust this by passing the --ttl argument.
Next we connect again to cluster3-node2 and simply execute the join command:
➜ ssh cluster3-node2
➜ root@cluster3-node2:~# kubeadm join 192.168.100.31:6443 --token u9d0wi.hl937rbv168bpfxi --discovery-token-ca-cert-hash sha256:ad62fd26e3e454ac380d006c045fa3665ce20643d79eb0085614a02fa77749a8
[preflight] Running pre-flight checks
        [WARNING FileExisting-socat]: socat not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 2.014840474s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
➜ root@cluster3-node2:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Fri 2024-09-20 09:15:38 UTC; 14s ago
       Docs: https://kubernetes.io/docs/
   Main PID: 37859 (kubelet)
      Tasks: 10 (limit: 462)
     Memory: 46.0M
     CGroup: /system.slice/kubelet.service
             └─37859 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubecon>
...
If you have trouble with kubeadm join you might need to run kubeadm reset first.
This looks great though for us. Finally we head back to the main terminal and check the node status:
➜ k get node
NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   16h   v1.31.1
cluster3-node1           NotReady   <none>          16h   v1.31.1
cluster3-node2           NotReady   <none>          14s   v1.31.1
Give it a bit of time till the node is ready.
➜ k get node
NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   16h   v1.31.1
cluster3-node1           NotReady   <none>          16h   v1.31.1
cluster3-node2           Ready      <none>          34s   v1.31.1
We see cluster3-node2 is now available and up to date.
Use context: kubectl config use-context k8s-c3-CCC
Create a Static Pod named my-static-pod in Namespace default on cluster3-controlplane1. It should be of image nginx:1.16-alpine and have resource requests for 10m CPU and 20Mi memory.
Then create a NodePort Service named static-pod-service which exposes that static Pod on port 80 and check if it has Endpoints and if it's reachable through the cluster3-controlplane1 internal IP address. You can connect to the internal node IPs from your main terminal.
➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster3-controlplane1:~# kubectl run my-static-pod --image=nginx:1.16-alpine -o yaml --dry-run=client > my-static-pod.yaml
Then edit the my-static-pod.yaml to add the requested resource requests:
# /etc/kubernetes/manifests/my-static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: my-static-pod
  name: my-static-pod
spec:
  containers:
  - image: nginx:1.16-alpine
    name: my-static-pod
    resources:
      requests:
        cpu: 10m
        memory: 20Mi
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
And make sure it's running:
➜ k get pod -A | grep my-static
NAMESPACE     NAME                                   READY   STATUS   ...   AGE
default       my-static-pod-cluster3-controlplane1   1/1     Running  ...   22s
Now we expose that static Pod:
k expose pod my-static-pod-cluster3-controlplane1 \
  --name static-pod-service \
  --type=NodePort \
  --port 80
This would generate a Service like:
# kubectl expose pod my-static-pod-cluster3-controlplane1 --name static-pod-service --type=NodePort --port 80
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    run: my-static-pod
  name: static-pod-service
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: my-static-pod
  type: NodePort
status:
  loadBalancer: {}
Then run and test:
➜ k get svc,ep -l run=my-static-pod
NAME                         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/static-pod-service   NodePort   10.99.168.252   <none>        80:30352/TCP   30s

NAME                           ENDPOINTS      AGE
endpoints/static-pod-service   10.32.0.4:80   30s
Looking good.
Use context: kubectl config use-context k8s-c2-AC
Check how long the kube-apiserver server certificate is valid on cluster2-controlplane1. Do this with openssl or cfssl. Write the expiration date into /opt/course/22/expiration.
Also run the correct kubeadm command to list the expiration dates and confirm both methods show the same date.
Write the correct kubeadm command that would renew the apiserver server certificate into /opt/course/22/kubeadm-renew-certs.sh.
First let's find that certificate:
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# find /etc/kubernetes/pki | grep apiserver
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.key
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver.key
/etc/kubernetes/pki/apiserver-kubelet-client.key
Next we use openssl to find out the expiration date:
➜ root@cluster2-controlplane1:~# openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt | grep Validity -A2
        Validity
            Not Before: Dec 20 18:05:20 2022 GMT
            Not After : Dec 20 18:05:20 2023 GMT
There we have it, so we write it in the required location on our main terminal:
# /opt/course/22/expiration
Dec 20 18:05:20 2023 GMT
And we use the feature from kubeadm to get the expiration too:
➜ root@cluster2-controlplane1:~# kubeadm certs check-expiration | grep apiserver
apiserver                  Jan 14, 2022 18:49 UTC   363d   ca        no
apiserver-etcd-client      Jan 14, 2022 18:49 UTC   363d   etcd-ca   no
apiserver-kubelet-client   Jan 14, 2022 18:49 UTC   363d   ca        no
Looking good. And finally we write the command that would renew the kube-apiserver certificate into the requested location:
# /opt/course/22/kubeadm-renew-certs.sh
kubeadm certs renew apiserver
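To avoid copying the date by hand, the "Not After" value could be cut out of the openssl text output directly. A minimal sketch, with the function name cert_not_after made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: reads `openssl x509 -noout -text` output on stdin and
# prints only the value of the "Not After" line.
cert_not_after() {
  sed -n 's/^[[:space:]]*Not After[[:space:]]*:[[:space:]]*//p'
}

# Sample input copied from the openssl output of this task.
# On the controlplane node:
#   openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt | cert_not_after
cert_not_after <<'EOF'
        Validity
            Not Before: Dec 20 18:05:20 2022 GMT
            Not After : Dec 20 18:05:20 2023 GMT
EOF
```

openssl can also print just this field itself via openssl x509 -noout -enddate -in FILE, which outputs a line starting with notAfter=.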
Use context: kubectl config use-context k8s-c2-AC
Node cluster2-node1 has been added to the cluster using kubeadm and TLS bootstrapping.
Find the "Issuer" and "Extended Key Usage" values of the cluster2-node1:
kubelet client certificate, the one used for outgoing connections to the kube-apiserver.
kubelet server certificate, the one used for incoming connections from the kube-apiserver.
Write the information into file /opt/course/23/certificate-info.txt.
Compare the "Issuer" and "Extended Key Usage" fields of both certificates and make sense of these.
First we check the kubelet client certificate:
➜ ssh cluster2-node1
➜ root@cluster2-node1:~# openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep Issuer
        Issuer: CN = kubernetes
➜ root@cluster2-node1:~# openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
Next we check the kubelet server certificate:
➜ root@cluster2-node1:~# openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep Issuer
        Issuer: CN = cluster2-node1-ca@1588186506
➜ root@cluster2-node1:~# openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
We see that the server certificate was generated on the worker node itself and the client certificate was issued by the Kubernetes api. The "Extended Key Usage" also shows if it's for client or server authentication.
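Since the task asks for both fields of both certificates, a single pass over the openssl text output saves two of the four grep calls. The following is a sketch only; the function name cert_issuer_and_eku is made up, and it assumes the Extended Key Usage value sits on the line directly after its header, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads `openssl x509 -noout -text` output on stdin and
# prints the Issuer and the Extended Key Usage value, one per line.
# Assumption: the EKU value is on the line right after the
# "X509v3 Extended Key Usage" header.
cert_issuer_and_eku() {
  awk '
    /^[ \t]*Issuer:/ { sub(/^[ \t]*Issuer:[ \t]*/, ""); print "Issuer: " $0 }
    eku              { sub(/^[ \t]*/, ""); print "Extended Key Usage: " $0; eku = 0 }
    /X509v3 Extended Key Usage/ { eku = 1 }
  '
}

# Sample input assembled from the client certificate output of this task.
# On the node:
#   openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | cert_issuer_and_eku
cert_issuer_and_eku <<'EOF'
        Issuer: CN = kubernetes
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
EOF
```

Running it against both certificate files and appending the results would give the content for /opt/course/23/certificate-info.txt.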
Use context: kubectl config use-context k8s-c1-H
There was a security incident where an intruder was able to access the whole cluster from a single hacked backend Pod.
To prevent this create a NetworkPolicy called np-backend in Namespace project-snake. It should allow the backend-* Pods only to:
connect to db1-* Pods on port 1111
connect to db2-* Pods on port 2222
Use the app label of Pods in your policy.
After implementation, connections from backend-* Pods to vault-* Pods on port 3333 should for example no longer work.
First we look at the existing Pods and their labels:
➜ k -n project-snake get pod
NAME        READY   STATUS    RESTARTS   AGE
backend-0   1/1     Running   0          8s
db1-0       1/1     Running   0          8s
db2-0       1/1     Running   0          10s
vault-0     1/1     Running   0          10s
➜ k -n project-snake get pod -L app
NAME        READY   STATUS    RESTARTS   AGE     APP
backend-0   1/1     Running   0          3m15s   backend
db1-0       1/1     Running   0          3m15s   db1
db2-0       1/1     Running   0          3m17s   db2
vault-0     1/1     Running   0          3m17s   vault
We test the current connection situation and see nothing is restricted:
➜ k -n project-snake get pod -o wide
NAME        READY   STATUS    RESTARTS   AGE     IP          ...
backend-0   1/1     Running   0          4m14s   10.44.0.24  ...
db1-0       1/1     Running   0          4m14s   10.44.0.25  ...
db2-0       1/1     Running   0          4m16s   10.44.0.23  ...
vault-0     1/1     Running   0          4m16s   10.44.0.22  ...
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.25:1111
database one
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.23:2222
database two
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.22:3333
vault secret storage
Now we create the NP by copying and changing an example from the K8s Docs:
vim 24_np.yaml
# 24_np.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: np-backend
  namespace: project-snake
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress                    # policy is only about Egress
  egress:
    -                           # first rule
      to:                           # first condition "to"
      - podSelector:
          matchLabels:
            app: db1
      ports:                        # second condition "port"
      - protocol: TCP
        port: 1111
    -                           # second rule
      to:                           # first condition "to"
      - podSelector:
          matchLabels:
            app: db2
      ports:                        # second condition "port"
      - protocol: TCP
        port: 2222
The NP above has two rules with two conditions each, it can be read as:
allow outgoing traffic if:
  (destination pod has label app=db1 AND port is 1111)
  OR
  (destination pod has label app=db2 AND port is 2222)
Now let's briefly look at a wrong example:
# WRONG
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: np-backend
  namespace: project-snake
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress
  egress:
    -                           # first rule
      to:                           # first condition "to"
      - podSelector:                    # first "to" possibility
          matchLabels:
            app: db1
      - podSelector:                    # second "to" possibility
          matchLabels:
            app: db2
      ports:                        # second condition "ports"
      - protocol: TCP                   # first "ports" possibility
        port: 1111
      - protocol: TCP                   # second "ports" possibility
        port: 2222
The NP above has one rule with two conditions and two condition-entries each, it can be read as:
allow outgoing traffic if:
  (destination pod has label app=db1 OR destination pod has label app=db2)
  AND
  (destination port is 1111 OR destination port is 2222)
Using this NP it would still be possible for backend-* Pods to connect to db2-* Pods on port 1111 for example which should be forbidden.
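The difference between the two policies can be made concrete with a small truth table. The sketch below models only the boolean logic described above as two shell functions; it does not talk to a cluster, and the function names allowed_correct and allowed_wrong are made up for illustration.

```shell
#!/bin/sh
# Model of the correct policy: two egress rules, each combining
# one pod selector AND one port; the rules are ORed.
allowed_correct() {   # usage: allowed_correct APP_LABEL PORT
  { [ "$1" = db1 ] && [ "$2" = 1111 ]; } ||
  { [ "$1" = db2 ] && [ "$2" = 2222 ]; }
}

# Model of the wrong policy: one rule, (db1 OR db2) AND (1111 OR 2222).
allowed_wrong() {     # usage: allowed_wrong APP_LABEL PORT
  { [ "$1" = db1 ] || [ "$1" = db2 ]; } &&
  { [ "$2" = 1111 ] || [ "$2" = 2222 ]; }
}

for target in "db1 1111" "db2 2222" "db2 1111" "vault 3333"; do
  set -- $target      # intentionally unquoted: split into label and port
  c=deny; w=deny
  allowed_correct "$1" "$2" && c=allow
  allowed_wrong "$1" "$2" && w=allow
  echo "$1:$2 correct=$c wrong=$w"
done
```

The third line of output, db2:1111 correct=deny wrong=allow, is exactly the connection that the wrong policy lets through.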
We create the correct NP:
k -f 24_np.yaml create

And test again:

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.25:1111
database one

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.23:2222
database two

➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.22:3333
^C

It is also helpful to use kubectl describe on the NP to see how k8s has interpreted the policy.
Great, looking more secure. Task done.
Use context: kubectl config use-context k8s-c3-CCC
Make a backup of etcd running on cluster3-controlplane1 and save it on the controlplane node at /tmp/etcd-backup.db.
Then create any kind of Pod in the cluster.
Finally restore the backup, confirm the cluster is still working and that the created Pod is no longer with us.
First we log into the controlplane and try to create a snapshot of etcd:

➜ ssh cluster3-controlplane1

➜ root@cluster3-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db
{"level":"info","ts":"2024-11-07T14:02:17.746254Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/tmp/etcd-backup.db.part"}
^C

But it fails or hangs because we need to authenticate ourselves. For the necessary information we can check the etcd manifest:
➜ root@cluster3-controlplane1:~# vim /etc/kubernetes/manifests/etcd.yaml

We only check the etcd.yaml for necessary information, we don't change it.

# /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.100.31:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt                           # use
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.100.31:2380
    - --initial-cluster=cluster3-controlplane1=https://192.168.100.31:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key                            # use
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.100.31:2379   # use
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.100.31:2380
    - --name=cluster3-controlplane1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt                    # use
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.15-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd                                                     # important
      type: DirectoryOrCreate
    name: etcd-data
status: {}

But we also know that the api-server is connecting to etcd, so we can check how its manifest is configured:

➜ root@cluster3-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379

We use the authentication information and pass it to etcdctl:

ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key
ℹ️ Don't use snapshot status because it can alter the snapshot file and render it invalid.
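Rather than copying the certificate paths by hand, they could also be read from the manifest. The following is a minimal sketch, assuming the flags appear as "- --name=value" list items as in the manifest shown above; etcd_flag is a hypothetical helper, and the demo runs against a small excerpt written to a temporary file, not against the real manifest.

```shell
# etcd_flag is a hypothetical helper: print the value of one etcd command flag.
# $1 = flag name without leading dashes, $2 = path to the manifest.
# The match is anchored on "- --", so cert-file does not match peer-cert-file.
etcd_flag() {
  sed -n "s/^[[:space:]]*-[[:space:]]*--$1=\([^[:space:]]*\).*/\1/p" "$2" | head -n 1
}

# Demo on an excerpt of the manifest shown above.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
EOF
echo "--cacert $(etcd_flag trusted-ca-file "$manifest")"
echo "--cert $(etcd_flag cert-file "$manifest")"
echo "--key $(etcd_flag key-file "$manifest")"
rm -f "$manifest"
```

On the controlplane node the same helper could be pointed at /etc/kubernetes/manifests/etcd.yaml and its output passed to the etcdctl flags used above.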
Now create a Pod in the cluster and wait for it to be running:
➜ root@cluster3-controlplane1:~# kubectl run test --image=nginx
pod/test created

➜ root@cluster3-controlplane1:~# kubectl get pod -l run=test -w
NAME   READY   STATUS    RESTARTS   AGE
test   1/1     Running   0          60s
ℹ️ If you didn't solve questions 18 or 20 and cluster3 doesn't have a ready worker node then the created pod might stay in a Pending state. This is still ok for this task.
Next we stop all controlplane components:
➜ root@cluster3-controlplane1:~# cd /etc/kubernetes/manifests/

➜ root@cluster3-controlplane1:/etc/kubernetes/manifests# mv * ..

➜ root@cluster3-controlplane1:/etc/kubernetes/manifests# watch crictl ps

Now we restore the snapshot into a specific directory:

➜ root@cluster3-controlplane1:~# ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
--data-dir /var/lib/etcd-backup \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key

2020-09-04 16:50:19.650804 I | mvcc: restore compact to 9935
2020-09-04 16:50:19.659095 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32

We could specify another host to make the backup from by using etcdctl --endpoints http://IP, but here we just use the default value which is: http://127.0.0.1:2379,http://127.0.0.1:4001.
The restored files are located in the new folder /var/lib/etcd-backup. Now we have to tell etcd to use that directory:

➜ root@cluster3-controlplane1:~# vim /etc/kubernetes/etcd.yaml

# /etc/kubernetes/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
...
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-backup                # change
      type: DirectoryOrCreate
    name: etcd-data
status: {}

Now we move all controlplane yaml files back into the manifest directory. Give it some time (up to several minutes) for etcd to restart and for the api-server to be reachable again:

➜ root@cluster3-controlplane1:/etc/kubernetes/manifests# mv ../*.yaml .

➜ root@cluster3-controlplane1:/etc/kubernetes/manifests# watch crictl ps

Then we check again for the Pod:

➜ root@cluster3-controlplane1:~# kubectl get pod -l run=test
No resources found in default namespace.

Awesome, backup and restore worked as our pod is gone.
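Waiting for the api-server to come back can also be scripted instead of watched by hand. The following is a small sketch of a generic retry loop; wait_until is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary assumptions.

```shell
# wait_until is a hypothetical helper: retry a command until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command.
# The body runs in a subshell, so its variables and "exit" stay local to the call.
wait_until() (
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      exit 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  exit 1
)

# Possible use on the controlplane node (not run here):
#   wait_until 60 5 kubectl get nodes && kubectl get pod -l run=test
```

The helper returns success as soon as the command succeeds once, and failure if all attempts are used up.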
Use context: kubectl config use-context k8s-c1-H
Check all available Pods in the Namespace project-c13 and find the names of those that would probably be terminated first if the nodes run out of resources (cpu or memory) to schedule all Pods. Write the Pod names into /opt/course/e1/pods-not-stable.txt.
When available cpu or memory resources on the nodes reach their limit, Kubernetes will look for Pods that are using more resources than they requested. These will be the first candidates for termination. If some Pods' containers have no resource requests/limits set, then by default those are considered to use more than requested.
Kubernetes assigns Quality of Service classes to Pods based on the defined resource requests and limits, read more here: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod
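How the three classes follow from requests and limits can be sketched in a few lines of shell. This is a simplified model for a Pod with a single container, not the actual Kubernetes implementation: qos_class is a hypothetical helper and it compares quantities as plain strings.

```shell
# qos_class is a hypothetical helper for a Pod with exactly one container.
# Arguments: cpu request, memory request, cpu limit, memory limit ("" = not set).
# Quantities are compared as plain strings, so "1" and "1000m" count as different.
qos_class() (
  req_cpu=$1
  req_mem=$2
  lim_cpu=$3
  lim_mem=$4
  # If only a limit is set, Kubernetes defaults the request to that limit.
  : "${req_cpu:=$lim_cpu}"
  : "${req_mem:=$lim_mem}"
  if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
    echo BestEffort      # nothing set at all: first candidate for eviction
  elif [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
       [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo Guaranteed      # cpu and memory both set, requests equal to limits
  else
    echo Burstable       # everything in between
  fi
)

qos_class 50m 20Mi "" ""          # requests only
qos_class "" "" "" ""             # nothing set
qos_class 100m 64Mi 100m 64Mi     # requests equal to limits
```

For a Pod with several containers the real rules apply across all containers, which this sketch does not model.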
Hence we should look for Pods without resource requests defined. We can do this with a manual approach:

k -n project-c13 describe pod | less -p Requests # describe all pods and highlight Requests

Or we do:

k -n project-c13 describe pod | egrep "^(Name:|    Requests:)" -A1

We see that the Pods of Deployment c13-3cc-runner-heavy don't have any resource requests specified. Hence our answer would be:

# /opt/course/e1/pods-not-stable.txt
c13-3cc-runner-heavy-65588d7d6-djtv9
c13-3cc-runner-heavy-65588d7d6-v8kf5
c13-3cc-runner-heavy-65588d7d6-wwpb4
o3db-0
o3db-1 # maybe not existing if already removed via previous scenario
To automate this process you could use jsonpath like this:
➜ k -n project-c13 get pod -o jsonpath="{range .items[*]} {.metadata.name}{.spec.containers[*].resources}{'\n'}"

c13-2x3-api-86784557bd-cgs8gmap[requests:map[cpu:50m memory:20Mi]]
c13-2x3-api-86784557bd-lnxvjmap[requests:map[cpu:50m memory:20Mi]]
c13-2x3-api-86784557bd-mnp77map[requests:map[cpu:50m memory:20Mi]]
c13-2x3-web-769c989898-6hbgtmap[requests:map[cpu:50m memory:10Mi]]
c13-2x3-web-769c989898-g57nqmap[requests:map[cpu:50m memory:10Mi]]
c13-2x3-web-769c989898-hfd5vmap[requests:map[cpu:50m memory:10Mi]]
c13-2x3-web-769c989898-jfx64map[requests:map[cpu:50m memory:10Mi]]
c13-2x3-web-769c989898-r89mgmap[requests:map[cpu:50m memory:10Mi]]
c13-2x3-web-769c989898-wtgxlmap[requests:map[cpu:50m memory:10Mi]]
c13-3cc-runner-98c8b5469-dzqhrmap[requests:map[cpu:30m memory:10Mi]]
c13-3cc-runner-98c8b5469-hbtdvmap[requests:map[cpu:30m memory:10Mi]]
c13-3cc-runner-98c8b5469-n9lswmap[requests:map[cpu:30m memory:10Mi]]
c13-3cc-runner-heavy-65588d7d6-djtv9map[]
c13-3cc-runner-heavy-65588d7d6-v8kf5map[]
c13-3cc-runner-heavy-65588d7d6-wwpb4map[]
c13-3cc-web-675456bcd-glpq6map[requests:map[cpu:50m memory:10Mi]]
c13-3cc-web-675456bcd-knlpxmap[requests:map[cpu:50m memory:10Mi]]
c13-3cc-web-675456bcd-nfhp9map[requests:map[cpu:50m memory:10Mi]]
c13-3cc-web-675456bcd-twn7mmap[requests:map[cpu:50m memory:10Mi]]
o3db-0{}
o3db-1{}

This lists all Pod names and their requests/limits. The name and the resources are printed without a separator, so an empty map[] or {} directly after the name marks a Pod without any defined: the three c13-3cc-runner-heavy Pods and, if present, the two o3db Pods.
Or we look for the Quality of Service classes:
➜ k get pods -n project-c13 -o jsonpath="{range .items[*]}{.metadata.name} {.status.qosClass}{'\n'}"

c13-2x3-api-86784557bd-cgs8g Burstable
c13-2x3-api-86784557bd-lnxvj Burstable
c13-2x3-api-86784557bd-mnp77 Burstable
c13-2x3-web-769c989898-6hbgt Burstable
c13-2x3-web-769c989898-g57nq Burstable
c13-2x3-web-769c989898-hfd5v Burstable
c13-2x3-web-769c989898-jfx64 Burstable
c13-2x3-web-769c989898-r89mg Burstable
c13-2x3-web-769c989898-wtgxl Burstable
c13-3cc-runner-98c8b5469-dzqhr Burstable
c13-3cc-runner-98c8b5469-hbtdv Burstable
c13-3cc-runner-98c8b5469-n9lsw Burstable
c13-3cc-runner-heavy-65588d7d6-djtv9 BestEffort
c13-3cc-runner-heavy-65588d7d6-v8kf5 BestEffort
c13-3cc-runner-heavy-65588d7d6-wwpb4 BestEffort
c13-3cc-web-675456bcd-glpq6 Burstable
c13-3cc-web-675456bcd-knlpx Burstable
c13-3cc-web-675456bcd-nfhp9 Burstable
c13-3cc-web-675456bcd-twn7m Burstable
o3db-0 BestEffort
o3db-1 BestEffort

Here we see the Pods with class BestEffort, which is assigned to Pods that don't have any memory or cpu limits or requests defined: the three c13-3cc-runner-heavy Pods and, if present, the two o3db Pods.
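To avoid picking the names out by eye, the "name class" lines can be filtered with awk. The following is a minimal sketch; best_effort_pods is a hypothetical helper, and the demo input is a handful of lines copied from the output above rather than live cluster data.

```shell
# best_effort_pods is a hypothetical helper: reads "NAME QOSCLASS" lines on stdin
# and prints only the names of Pods whose class is BestEffort.
best_effort_pods() {
  awk '$2 == "BestEffort" { print $1 }'
}

# Demo with a few lines taken from the output above. On the cluster the input
# would instead come from the k get pods ... jsonpath command.
best_effort_pods <<'EOF'
c13-3cc-runner-98c8b5469-n9lsw Burstable
c13-3cc-runner-heavy-65588d7d6-djtv9 BestEffort
c13-3cc-web-675456bcd-twn7m Burstable
o3db-0 BestEffort
EOF
```

Redirecting the filtered output into /opt/course/e1/pods-not-stable.txt would produce the answer file directly.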
A good practice is to always set resource requests and limits. If you don't know the values your containers should have you can find this out using metric tools like Prometheus. You can also use kubectl top pod or even kubectl exec into the container and use top and similar tools.
Use context: kubectl config use-context k8s-c1-H
There is an existing ServiceAccount secret-reader in Namespace project-hamster. Create a Pod of image curlimages/curl:7.65.3 named tmp-api-contact which uses this ServiceAccount. Make sure the container keeps running.
Exec into the Pod and use curl to access the Kubernetes Api of that cluster manually, listing all available secrets. You can ignore insecure https connection. Write the command(s) for this into file /opt/course/e4/list-secrets.sh.
https://kubernetes.io/docs/tasks/run-application/access-api-from-pod
It's important to understand how the Kubernetes API works. For this it helps to connect to the api manually, for example using curl. You can find information fast by searching the Kubernetes docs for "curl api", for example.
First we create our Pod:
➜ k run tmp-api-contact --image=curlimages/curl:7.65.3 $do --command > e2.yaml -- sh -c 'sleep 1d'

➜ vim e2.yaml

Add the service account name and Namespace:

# e2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: tmp-api-contact
  name: tmp-api-contact
  namespace: project-hamster          # add
spec:
  serviceAccountName: secret-reader   # add
  containers:
  - command:
    - sh
    - -c
    - sleep 1d
    image: curlimages/curl:7.65.3
    name: tmp-api-contact
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Then run and exec into:

➜ k -f e2.yaml create

➜ k -n project-hamster exec tmp-api-contact -it -- sh

Once in the container we can try to connect to the api using curl. The api is usually available via the Service named kubernetes in Namespace default (you should know how dns resolution works across Namespaces). Otherwise we can find the endpoint IP via environment variables by running env.
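The environment-variable route can be sketched as follows. KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are the variables Kubernetes normally sets in containers for the kubernetes Service; api_url is a hypothetical helper, and falling back to kubernetes.default and port 443 is an assumption made for this sketch.

```shell
# api_url is a hypothetical helper: it only builds the URL, it does not contact the api.
# $1 = request path, for example /api/v1/secrets.
# Falls back to the Service DNS name and port 443 if the variables are not set.
api_url() {
  echo "https://${KUBERNETES_SERVICE_HOST:-kubernetes.default}:${KUBERNETES_SERVICE_PORT:-443}$1"
}

# Possible use inside the Pod (not run here):
#   curl -k "$(api_url /api/v1/secrets)"
```

Building the URL in one place keeps the later curl calls short and makes it easy to switch between the DNS name and the injected IP.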
So now we can do:
curl https://kubernetes.default
curl -k https://kubernetes.default # ignore insecure as allowed in ticket description
curl -k https://kubernetes.default/api/v1/secrets # should show Forbidden 403

The last command shows 403 forbidden because we are not passing any authorisation information. The Kubernetes Api Server thinks we are connecting as system:anonymous. We want to change this and connect using the Pod's ServiceAccount named secret-reader.

We find the token in the mounted folder at /var/run/secrets/kubernetes.io/serviceaccount, so we do:

➜ TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

➜ curl -k https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
{
  "kind": "SecretList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/secrets",
    "resourceVersion": "10697"
  },
  "items": [
    {
      "metadata": {
        "name": "default-token-5zjbd",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/secrets/default-token-5zjbd",
        "uid": "315dbfd9-d235-482b-8bfc-c6167e7c1461",
        "resourceVersion": "342",
...

Now we're able to list all Secrets, authenticating as the ServiceAccount secret-reader under which our Pod is running.
To verify the api-server certificate instead of skipping the check with -k we can run:

CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
curl --cacert ${CACERT} https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"

For troubleshooting we could also check if the ServiceAccount is actually able to list Secrets using:

➜ k auth can-i get secret --as system:serviceaccount:project-hamster:secret-reader
yes

Finally write the commands into the requested location:

# /opt/course/e4/list-secrets.sh
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -k https://kubernetes.default/api/v1/secrets -H "Authorization: Bearer ${TOKEN}"