Basic processes and methods

  1. Query the status of the pod, for pod Pending scenarios:

    kubectl describe pod <pod-name> -n <namespace>
    
  2. Get abnormal events in the cluster as supplementary information when troubleshooting the cause of pod Pending.

    kubectl get events -n <namespace> --sort-by .lastTimestamp [-w]
    
  3. Get the pod’s log, for pod Error or CrashLoopBackOff scenarios.

    kubectl logs <pod-name> -n <namespace> [-c <container-name>, if multiple] [-f]
    

If the pod is already running and the existing logs do not directly indicate a problem, it is necessary to go into the pod container for further testing, for example to verify the status of a running process, the configuration, or to check the container’s network connection.

How to access the container

  1. via the exec command: kubectl exec -it <podName> -- sh.
  2. via an ephemeral container (requires Kubernetes v1.23 or higher), e.g. kubectl debug -it <podName> --image=busybox --target=<containerName>.
  3. To run common troubleshooting tools and commands when the image lacks the needed binaries:
    1. get the PID of the container on the host: docker ps | grep k8s_<containerName>_<podName> | awk '{print $1}' | xargs docker inspect --format '{{ .State.Pid }}'
    2. execute the tools installed on the host in the container’s network namespace: nsenter -t <PID> -n bash
    3. exit the container’s namespace (i.e. exit the shell running in that namespace): exit

How to access the node

  1. Log in to the node via SSH;
  2. Use journalctl -xeu kubelet to view the kubelet logs for scenarios where the node is NotReady.

Common problems and how to troubleshoot them

kubectl execution result exception

Phenomenon:

Execution of any kubectl command outputs the following result:

  • Error from server (InternalError): an error on the server ("") has prevented the request from succeeding
  • etcdserver: leader changed

Cause:

  • kubectl and apiserver authentication failed
  • apiserver exception
    • apiserver exception usually caused by etcd
  • etcd exceptions
    • frequent leader elections caused by an even number of nodes in the election topology (an odd number of etcd members is recommended)
    • poor disk performance leading to high latency and even frequent leader elections

How to troubleshoot:

  1. check that the configuration file currently used by kubectl is correct: kubectl config view

    1. check that the configuration file ~/.kube/config is generated by the cluster.
    2. Check that the server address and access protocol in the configuration file are correct.
  2. Both apiserver and etcd are static pods. If you cannot use the kubectl command, you can check the container logs directly with a container runtime such as docker.

    # Get the container_id corresponding to apiserver and etcd
    docker ps -a | grep -e k8s_kube-apiserver -e k8s_etcd
    # Get the log
    docker logs <container_id> [-f]
    
  3. Testing the performance of the file system used by etcd

    fio --rw=write --ioengine=sync --fdatasync=1 --filename=<etcd-data-dir>/iotest --size=22m --bs=2300 --name=etcdio-bench
    

    This test focuses on the fsync performance of the file system used by etcd.

    fsync/fdatasync/sync_file_range:
    sync (usec): min=534, max=15766, avg=1273.08, stdev=1084.70
    sync percentiles (usec):
    | 1.00th=[ 553], 5.00th=[ 578], 10.00th=[ 594], 20.00th=[ 627],
    | 30.00th=[ 709], 40.00th=[ 750], 50.00th=[ 783], 60.00th=[ 1549],
    | 70.00th=[ 1729], 80.00th=[ 1991], 90.00th=[ 2180], 95.00th=[ 2278],
    | 99.00th=[ 2376], 99.50th=[ 9634], 99.90th=[15795], 99.95th=[15795],
    | 99.99th=[15795]
    

    If the fsync 99.00th percentile above exceeds 10000 usec (i.e. P99 > 10 ms), the disk is usually considered unsuitable for etcd use and replacing the disk hardware is recommended.
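The P99 check can be scripted against fio’s output. A minimal sketch that extracts the 99.00th percentile from a captured "sync percentiles" line and compares it with the 10 ms threshold (the sample line below is taken from the output above):

```shell
# Sample line captured from fio's "sync percentiles" output.
fio_line='| 99.00th=[ 2376], 99.50th=[ 9634], 99.90th=[15795], 99.95th=[15795],'

# Extract the 99.00th percentile value (in usec).
p99=$(printf '%s\n' "$fio_line" | sed -n 's/.*99\.00th=\[ *\([0-9]*\)\].*/\1/p')

# Compare against the 10 ms (10000 usec) threshold recommended above.
if [ "$p99" -gt 10000 ]; then
  echo "fsync P99=${p99}us > 10ms: disk likely unsuitable for etcd"
else
  echo "fsync P99=${p99}us within the 10ms threshold"
fi
```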

DNS resolution exception

Phenomenon:

The application cannot connect to the kubernetes apiserver or other dependent services such as DBs, proxies, etc. The error message usually includes port 53 access timeout, can’t resolve host, etc.

Cause and troubleshooting:

  • Unable to access the backend pod corresponding to the kube-dns service
    • The DNS configuration (/etc/resolv.conf) in the application container is incorrect, preventing resolution requests from working normally (some older distributions and libc versions are affected by this problem). Check that the /etc/resolv.conf file in the container contains the correct nameserver and search path, e.g.:

      $ cat /etc/resolv.conf
      nameserver 11.96.0.10
      search qfusion-admin.svc.cluster.local svc.cluster.local cluster.local
      options ndots:5
      
    • The service’s load-balancing function is abnormal and outbound requests cannot be DNATed to the actual CoreDNS backend pod

      1. Check the running status of the kube-proxy pod.

      2. Check that the node where the application container is located can match the service properly and complete the DNAT.

        1. Check if the node routing table has a default route or a service routing rule; otherwise the packet will not be DNATed.

        2. Check if there is a DNAT rule for the service, using iptables as an example:

          iptables-save | grep <serviceName>
          

          For each port of each service, there should be one rule and a KUBE-SVC-<hash> chain in the KUBE-SERVICES chain. The exact rules will vary depending on your specific configuration (including node ports and load balancers).

        3. Query the current NAT records:

          conntrack -L | grep <dns-ip>
          # destination
          conntrack -L -d 10.32.0.1
          

        See https://kubernetes.io/docs/tasks/debug/debug-application/debug-service/ for details.

    • Cross-node container network exception, request cannot reach the node where the DNS pod is located

      1. Check the status of the components associated with the CNI plugin by running the plugin’s built-in health or status check command, using Cilium as an example:

        cilium status [--verbose]
        kubectl -n kube-system exec -it cilium-xrd4d -- cilium-health status
        
      2. Check the connectivity of nodes and pods between the application container and coredns.

        • Use nc to test layer-4 connectivity between the two nodes and pods.

          # host1
          nc -l 9999
          # host2
          nc -vz <host1> 9999
          
        • If the container network uses a vxlan scheme (as most do), test whether UDP port 8472, the port vxlan uses on the host by default, is blocked.

          • Use netstat -lnup | grep 8472 or nc -lu 8472 to check if port 8472 UDP is already occupied by another program.

          • Use tcpdump to capture packets on port 8472 to determine if they can be reached over the network. Example:

            tcpdump -i p4p1 dst port 8472 -c 1 -Xvv -nn
            
          • Test if 8472 UDP is available

            #  on the server side
            iperf -s -p 8472 -u
            # on the client side
            iperf -c 172.28.128.103 -u -p 8472 -b 1K
            
    • MTU problems: an intermediate device in the network topology has a smaller MTU than the sending NIC, so packets exceeding that size are dropped. The typical symptom is that ping succeeds while, for the same protocol and port, some requests go through and others do not.

      1. Get the MTU of the sender’s NIC:

        ifconfig
        # or
        ip -4 -s address
        
      2. Use the ping command to test the path MTU to the target host

        ping -M do -s 1472 [dest IP]
        

        The ICMP header takes 8 bytes and the IP header takes 20 bytes, so sending a 1472-byte payload tests whether a packet at the 1500-byte MTU can reach the target host.

      3. Modify the MTU of the NIC on the sending side.

        echo MTU=1450 >> /etc/sysconfig/network-scripts/ifcfg-eth0  # change the file name to match the NIC name
        systemctl restart network
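The payload size used in the ping test above can be derived from the MTU being probed; a small sketch of the arithmetic:

```shell
# MTU to probe (the common Ethernet default).
mtu=1500
# Subtract the 20-byte IP header and the 8-byte ICMP header
# to get the payload size to pass to ping -s.
payload=$((mtu - 20 - 8))
echo "probe command: ping -M do -s ${payload} <dest IP>"
```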
        
    • An exception in the backend pods behind the kube-dns service causes resolution requests to fail or the expected DNS record to be missing

      1. kube-dns service does not exist

        Check the status of the service and endpoint in the kube-system namespace:

        kubectl -n kube-system get svc | grep dns
        kubectl -n kube-system get ep | grep dns
        
      2. coreDNS pod status exception

        Use the kubectl describe and kubectl logs commands described above to obtain the pod status and logs of coreDNS.

      3. coreDNS cannot respond to resolution or does not contain the specified dns record

        Go into the application container and another arbitrary container and test whether coreDNS can resolve other containers, nodes and domains.

        • Request port 53 of the DNS server to determine if the DNS server is accessible and listening: nc -vz <ip-of-dns> 53
        • Test domain resolution: dig example.com @<ip-of-dns> +trace (+trace outputs trace information to distinguish if caching is being used)
        • Other domain name resolution tools: nslookup, host

More information on troubleshooting dns resolution can be found at https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/.
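The resolv.conf check described above can also be scripted. A sketch that parses a copied /etc/resolv.conf (the contents below are hypothetical) and extracts the fields a cluster pod normally needs:

```shell
# Hypothetical contents copied from a pod's /etc/resolv.conf.
resolv='nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5'

# Extract the first nameserver and the ndots option.
ns=$(printf '%s\n' "$resolv" | awk '/^nameserver/ {print $2; exit}')
ndots=$(printf '%s\n' "$resolv" | sed -n 's/.*ndots:\([0-9]*\).*/\1/p')
echo "nameserver=${ns} ndots=${ndots}"
```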

TLS certificate exceptions

The key concept behind TLS certificates is the chain of trust: the server (leaf) certificate is linked through zero or more intermediate certificates up to a root/CA certificate. Only when this chain can be validated can an encrypted communication session between client and server begin.

Phenomenon: The expected request cannot be completed for SSL/TLS reasons; the error message usually contains ssl/tls handshake error

Cause:

  • The root/CA certificate has not been added to the trusted certificate store
  • The server or CA certificate has expired or is not yet valid (the latter usually caused by inconsistent system time when the certificate was generated)

Troubleshooting method:

  • Use openssl to obtain the certificate information for the specified service:

    openssl s_client -connect <fqdn/ip>:<port>
    
  • Use openssl to get the server-side certificate for the specified service (using localhost:6443 as an example) and print it in a readable format.

    openssl s_client -showcerts -connect localhost:6443 </dev/null 2>/dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text
    

    You can usually check the Issuer, Validity and Subject in the output.

  • If you encounter x509: certificate signed by unknown authority (e.g. the docker CLI throws this error when accessing an internal HTTPS image registry), the target server is using a certificate from an untrusted issuer (e.g. a self-signed certificate). You need to install the CA certificate or adjust the trusted-certificate configuration according to the application’s documentation.
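The Validity check can be done mechanically with openssl x509 -checkend. A sketch that generates a throwaway self-signed certificate (the file paths and CN are hypothetical) and applies the same check you would run on a certificate fetched via s_client:

```shell
# Generate a throwaway self-signed certificate valid for 1 day.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=example.local" \
  -keyout /tmp/demo.key -out /tmp/demo.crt -days 1 2>/dev/null

# -checkend N exits 0 if the certificate will still be valid N seconds from now.
if openssl x509 -in /tmp/demo.crt -noout -checkend 3600 >/dev/null; then
  echo "certificate valid for at least another hour"
else
  echo "certificate expires within the hour"
fi
```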

Routing and kernel parameters misconfigured

Phenomenon:

Access to some otherwise normal service inside or outside the cluster cannot be completed; the error message usually contains Unreachable or Connection timeout.

Cause:

  • There is no routing entry corresponding to the access address in the routing table of the pod or host where the application resides, and no default route is configured.
  • Kernel parameters related to routing were not configured correctly, or were enabled during the Kubernetes installation but later reset by another service.

To troubleshoot:

  • Use netcat or curl to test whether the server can be reached at layer 4:

    nc -vz <ip/fqdn> <port>
    curl -kL telnet://<ip/fqdn>:<port> -vvv
    
  • Simple test of network latency using curl:

    curl -s -w 'Total time: %{time_total}s\n' http://example.com
    
  • Query the routing rules for the specified IP:

    ip route get 10.101.203.141
    
  • Trace the routing path of requests:

    traceroute 10.101.203.141
    

    By default, traceroute on Linux sends UDP probes (use -I to send ICMP ECHO instead); some devices do not respond to either type of probe.

  • net.ipv4.ip_forward: A core parameter that allows Linux to forward traffic between network interfaces, most CNIs require this parameter to be enabled for inter-pod access.

    sysctl net.ipv4.ip_forward
    # enable immediately on a running server
    sysctl -w net.ipv4.ip_forward=1
    # persist the setting across reboots (CentOS and other systemd distributions)
    echo net.ipv4.ip_forward=1 >> /etc/sysctl.d/10-ipv4-forwarding-on.conf
    
  • net.bridge.bridge-nf-call-iptables: makes traffic traversing a Linux bridge subject to iptables rules; some Linux-bridge-based CNIs require this parameter so that container traffic to external networks can be NATed.

    sysctl net.bridge.bridge-nf-call-iptables
    modprobe br_netfilter
    # turn the iptables setting on
    sysctl -w net.bridge.bridge-nf-call-iptables=1
    echo net.bridge.bridge-nf-call-iptables=1 >> /etc/sysctl.d/10-bridge-nf-call-iptables.conf
    
  • net.ipv4.conf.all.rp_filter: some CNIs (such as Cilium) require this parameter to be disabled (set to 0) for inter-pod access, but systemd’s sysctl defaults on some distributions may override it.

    cat <<EOF > /etc/sysctl.d/99-override_cilium_rp_filter.conf
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.lxc*.rp_filter = 0
    EOF
    systemctl restart systemd-sysctl
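When the sysctl binary is unavailable (common in minimal container images), the same parameters can be read directly from /proc, which is what sysctl does under the hood; a small sketch:

```shell
# sysctl names map to /proc/sys paths with dots replaced by slashes.
ipfwd=$(cat /proc/sys/net/ipv4/ip_forward)
echo "net.ipv4.ip_forward = ${ipfwd}"
```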
    

Firewall

Phenomenon:

Requests are blocked by some layer-4 or layer-7 firewall rule, usually manifesting as the node being reachable while some of its ports are not; some requests cannot be completed and the error message usually includes Connection Reset.

Cause:

  • Access to certain ports or specific protocols is blocked by the Layer 4 firewall.
  • Layer 7 firewalls audit the destination path of the request and the content of the request body.

To troubleshoot:

  1. nc -vz to test for Layer 4 connectivity and curl -kL to test for Layer 7 connectivity; if a service is connected at Layer 4 but not at Layer 7, it may have a Layer 7 firewall;

  2. use tcpdump for packet-capture analysis, focusing on TCP packets with the RST flag set and on ICMP reject messages;

  3. check whether iptables has a REJECT rule; with --reject-with icmp-host-unreachable, the routing table may look correct yet the request returns no route to host;

    iptables -nvL | grep -i reject
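As an illustration of what the grep above should surface, a sketch run against a hypothetical iptables-save excerpt:

```shell
# Hypothetical excerpt from iptables-save on a node with a blocking rule.
rules='-A INPUT -p tcp --dport 9999 -j REJECT --reject-with icmp-host-unreachable
-A FORWARD -j ACCEPT'

# Count rules that reject traffic, mirroring: iptables -nvL | grep -i reject
reject_count=$(printf '%s\n' "$rules" | grep -ci reject)
echo "reject rules found: ${reject_count}"
```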
    

Configuration errors and program bugs

Phenomenon:

Some applications do not start properly or run as intended; the error message usually contains syntax error

Cause:

A stray blank space or tab causes an error in the configuration file format (common with YAML indentation)

If the above troubleshooting steps do not reveal the problem and you find a suspicious message in the application container logs, you can report or discuss the problem directly with the developer, but be careful not to jump to conclusions.