Linux nslookup command

At work we often need to check the DNS information of an IP or domain name, to see which machine an IP is bound to or whether a domain name resolves properly. This is where the nslookup command comes into play.
Command Details
The nslookup command is mainly used to query the DNS information of a domain name. nslookup has two working modes: "interactive mode" and "non-interactive mode". Entering nslookup on the command line without any parameters starts interactive mode.

Cilium Masquerading Troubleshooting

Background
When upgrading cilium from v1.8.1 to v1.11.1, the business pod reported a mysql authorization error. After checking, we found that the client IP seen by the mysql server was the nodeIP of the business pod rather than the default podIP. Because the mysql server only authorizes the pod CIDR of the current K8s cluster, it reports an authorization error. The contradiction is that when using cilium v1.

Side effects of nodePort

Problem phenomenon
One day I ran into a problem: when accessing a web service, some requests failed with Connection Refused. For some reason, the web service is of hostNetwork type, so it shares the host's network namespace. When I logged into the Node, I could see that the listening socket was still there, yet a direct curl request still returned Connection Refused. Why? Usually Connection Refused means that no listening socket is open on the corresponding port, in which case the kernel returns icmp-port-unreachable, but here the port was obviously open.
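
The usual case, where nothing is listening on the port, can be reproduced with a short Go sketch. It is only an illustration and not part of the original troubleshooting; the helper names `dialClosedPort` and `isRefused` are made up for this example. It binds an ephemeral loopback port, closes the listener, and then dials the same address.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
	"time"
)

// dialClosedPort (hypothetical helper) reserves an ephemeral loopback port,
// closes the listener again, and then tries to connect to that address.
func dialClosedPort() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	addr := ln.Addr().String()
	if err := ln.Close(); err != nil {
		return err
	}
	conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
	if err == nil {
		// Unexpected: something else grabbed the port in the meantime.
		conn.Close()
		return nil
	}
	return err
}

// isRefused reports whether err wraps ECONNREFUSED.
func isRefused(err error) bool {
	return errors.Is(err, syscall.ECONNREFUSED)
}

func main() {
	err := dialClosedPort()
	fmt.Println("error:", err)
	fmt.Println("connection refused:", isRefused(err))
}
```

In the situation described above the listener is still present, so the refusal must come from somewhere other than a missing socket.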

Is Golang's empty array nil?

When reading kubernetes code, you will sometimes see code that compares arrays (strictly speaking, slices) to nil.
    // bindAPIUpdate gets the cached bindings and PVCs to provision in podBindingCache
    // and makes the API update for those PVs/PVCs.
    func (b *volumeBinder) bindAPIUpdate(podName string, bindings []*bindingInfo, claimsToProvision []*v1.PersistentVolumeClaim) error {
        if bindings == nil {
            return fmt.Errorf("failed to get cached bindings for pod %q", podName)
        }
        if claimsToProvision == nil {
            return fmt.
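
To make the question concrete, here is a minimal sketch (not taken from the Kubernetes code base) that compares a declared-but-uninitialized slice with two empty slices. Note that the values in question are slices; a fixed-size Go array cannot be compared to nil at all. The helper `describe` is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// describe reports whether s is nil and how many elements it holds.
func describe(s []string) (isNil bool, length int) {
	return s == nil, len(s)
}

func main() {
	var declared []string     // nil slice: declared but never initialized
	literal := []string{}     // empty, non-nil slice
	made := make([]string, 0) // empty, non-nil slice

	cases := []struct {
		name string
		s    []string
	}{
		{"var s []string", declared},
		{"[]string{}", literal},
		{"make([]string, 0)", made},
	}
	for _, c := range cases {
		isNil, n := describe(c.s)
		fmt.Printf("%-20s nil=%-5v len=%d\n", c.name, isNil, n)
	}
}
```

All three slices have length 0, but only the first one is nil, so a `== nil` check and a `len(s) == 0` check are not interchangeable.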

Tracing nginx ingress maximum open file count issue

Problem phenomenon
Our kubernetes ingress controller is ingress-nginx from kubernetes, and we recently encountered a "Too many open files" problem.
    2019/09/19 09:47:56 [warn] 26281#26281: *97238945 a client request body is buffered to a temporary file /var/lib/nginx/body/0000269456, client: 1.1.1.1, server: xxx.ieevee.com, request: "POST /api/v1/xxx HTTP/1.1", host: "xxx.ieevee.com"
    2019/09/19 09:47:56 [crit] 26281#26281: accept4() failed (24: Too many open files)
    2019/09/19 09:47:56 [crit] 26281#26281: *97238948 open() "/var/lib/nginx/body/0000269457" failed

The disappearing Prometheus indicator

Problem phenomenon
We have some GPU machines and need to collect GPU-related statistics. The data is taken from Prometheus, but one day we suddenly found that some labels of some GPU node metrics were empty. Here are the metrics of a normal node.
    container_accelerator_duty_cycle{acc_id="GPU-d8384090-81d2-eda5-ed02-8137eb037460",container_name="nvidia-device-plugin-ctr",endpoint="https-metrics",id="/kubepods/besteffort/podd917e00f-a779-11e9-b971-6805ca7f5b2a/38a5decf96e9f007fcb0059d79017ea3b3c29ff4c9a81b7fec86cf63c06baf53",image="sha256:7354b8a316796fba0463a68da79d79eb654d741241ca3c4f62a1ef24d8e11718",instance="10.0.0.1:10250",job="kubelet",make="nvidia",model="Tesla P40",name="k8s_nvidia-device-plugin-ctr_nvidia-device-plugin-daemonset-sn2cg_lambda_d917e00f-a779-11e9-b971-6805ca7f5b2a_1",namespace="ieevee",node="x1.ieevee.com",pod_name="nvidia-device-plugin-daemonset-sn2cg",service="kubelet"} 0
Here are the metrics of the abnormal node.
    container_accelerator_duty_cycle{acc_id="GPU-a7b535d0-d6ca-022c-5b23-1bff863646a4",container_name="",endpoint="https-metrics",id="/kubepods/besteffort/pod8bb25662-de9a-11e9-84e7-f8f21e04010c/cde3858becb05366e71f230e876204be586662f274dcb4a6e2b75ea404f2d5a9",instance="10.0.0.2:10250",job="kubelet",make="nvidia",model="Tesla V100-PCIE-16GB",name="",namespace="",pod_name=""}
You can see that in the data returned by Prometheus, the container_name, name, namespace, and pod_name labels are all empty.

Setting up the shared memory of a kubernetes Pod

Problem description
Users can use shared memory for communication between processes (as opposed to Go's "share memory by communicating"). On a kvm or physical machine, the shared memory available to the user is about half of the total memory. Below is the shared memory on my pve machine; the size of /dev/shm is the size of the shared memory.
    root@pve:~# free -h
    total used free shared buff/cache available
    Mem: 47Gi 33Gi 3.

How to access Kubernetes Pods from outside the cluster

A pod running on a kubernetes cluster is easy to access from within the cluster: most simply through the pod's ip, or through the corresponding svc. From outside the cluster, however, the pod ip of a flannel-based kubernetes cluster is not reachable because it is an internal address. To solve this problem, kubernetes provides several methods, as follows.
hostNetwork: true
When hostNetwork is true, the container uses the network of the host node, so the container's services can be accessed from outside the cluster as node-ip + port, as long as you know which node the container is running on.

Secure access to Homelab services with Kubernetes Ingress + Let's Encrypt

Requirements Overview
Previously, some services hosted on Kubernetes at home, such as portal, emby, and weave scope, were accessed by service ip, which was slightly troublesome, mainly because the ip had to be remembered. Kubernetes provides Ingress to address the problems of load balancer type services (vip consumption, L7 load balancing features, etc.). For this requirement, you can set up a wildcard domain on godaddy with a type A record, for example *.

In-depth analysis of the election mechanism in kubernetes

Overview
In Kubernetes, kube-controller-manager, kube-scheduler, and Operators built on controller-runtime all support leader election for high availability. This article focuses on how leader election in controller-runtime (whose underlying implementation is client-go) is implemented for kubernetes controllers.
Background
When running kube-controller-manager, some parameters are provided for leader election; refer to the parameters in the official documentation to understand them.

How to get the caller's function name, filename, and line number in a Go function

Background
When we add business logs to our application code, regardless of the log level, it is very important to know which function printed a line and where it is located, in addition to the information we actively pass to the Logger. Otherwise troubleshooting is likely to be like looking for a needle in a haystack. For logging, it is therefore important to record the function name and line number of the caller of the Logger method.
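
One way to obtain this information with the standard library is `runtime.Callers` together with `runtime.CallersFrames`. The sketch below is only an illustration of that approach, not necessarily the one the article goes on to describe; `callerInfo` and `handleRequest` are hypothetical names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// callerInfo (hypothetical helper) returns the function name, file base name
// and line number of the function that called it. skip moves further up the
// stack: 0 is the direct caller, 1 is the caller's caller, and so on.
func callerInfo(skip int) (string, string, int, bool) {
	pcs := make([]uintptr, 1)
	// For runtime.Callers, skip 0 is Callers itself and 1 is callerInfo,
	// so the direct caller of callerInfo is at skip 2.
	if runtime.Callers(skip+2, pcs) == 0 {
		return "", "", 0, false
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function, filepath.Base(frame.File), frame.Line, true
}

// handleRequest stands in for business code that wants to log its location.
func handleRequest() (string, string, int, bool) {
	return callerInfo(0)
}

func main() {
	fn, file, line, ok := handleRequest()
	fmt.Printf("func=%s file=%s line=%d ok=%v\n", fn, file, line, ok)
}
```

A logger wrapper would typically call such a helper with a larger skip value so that the reported location is the business function rather than the wrapper itself.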

Usage of Grafana Loki Query Language LogQL

Inspired by PromQL, Loki has its own query language, called LogQL, which works like a distributed grep that aggregates views of logs. Like PromQL, LogQL filters using labels and operators, and has two main types of queries: queries that return the contents of log lines, and queries that calculate metrics from the log stream based on filtering rules.
Log queries
A basic log query consists of two parts: a log stream selector and a log pipeline. Due to the design of Loki, all LogQL queries must contain a log stream selector.

How to automatically set worker_processes for nginx containers

Problem description
When containerizing nginx, a common question is: how do I automatically set the number of nginx worker processes? The nginx.conf configuration file of the official nginx container image contains a worker process setting.
    worker_processes 1;
It configures nginx to start only 1 worker. This works well when the nginx container has 1 core. When we give nginx a larger configuration, for example 4c or 16c, we need to make sure that nginx also starts the corresponding number of worker processes.

Encrypt and save the docker login password

When using an image repository in an enterprise, you usually need to enable authentication, and the credentials may be a common account shared by users in the enterprise. However, after docker login, the base64-encoded username and password are saved in .docker/config.json, so on servers used by many people the account can leak. Is there a solution for this? docker provides a credentials store, which means that passwords are kept in an external credentials store.
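
To see why base64 offers no protection, the following sketch decodes the `auth` field of a config.json-style document back into a username and password. The registry name `registry.example.com` and the credentials are invented for the example, and `sampleConfig`, `decodeAuths`, and `credential` are hypothetical names; only the `auths` / `auth` layout is taken from what docker login writes.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// credential holds one decoded username/password pair.
type credential struct{ User, Password string }

// dockerConfig models only the part of config.json used in this sketch.
type dockerConfig struct {
	Auths map[string]struct {
		Auth string `json:"auth"`
	} `json:"auths"`
}

// sampleConfig builds a config.json-like document with made-up credentials.
func sampleConfig(registry, user, password string) ([]byte, error) {
	auth := base64.StdEncoding.EncodeToString([]byte(user + ":" + password))
	return json.Marshal(map[string]map[string]map[string]string{
		"auths": {registry: {"auth": auth}},
	})
}

// decodeAuths recovers the plain-text credentials for every registry entry.
func decodeAuths(raw []byte) (map[string]credential, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	out := make(map[string]credential, len(cfg.Auths))
	for registry, entry := range cfg.Auths {
		decoded, err := base64.StdEncoding.DecodeString(entry.Auth)
		if err != nil {
			return nil, fmt.Errorf("registry %s: %w", registry, err)
		}
		parts := strings.SplitN(string(decoded), ":", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("registry %s: malformed auth entry", registry)
		}
		out[registry] = credential{User: parts[0], Password: parts[1]}
	}
	return out, nil
}

func main() {
	raw, err := sampleConfig("registry.example.com", "alice", "s3cret")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
	creds, err := decodeAuths(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", creds)
}
```

Anyone who can read the file can run the same decoding step, which is the leak the paragraph above describes.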

A problem caused by a Go upgrade: "http2: no cached connection was available"

Direct phenomenon: I compiled kube-controller-manager with Go 1.13, and after running it for a while, I found that the controller did not work. When I checked the logs, I found that it printed "http2: no cached connection was available".
    I0328 09:48:59.925056 1 round_trippers.go:383] GET https://10.220.14.10:8443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager
    I0328 09:48:59.925085 1 round_trippers.go:390] Request Headers:
    I0328 09:48:59.925094 1 round_trippers.go:393] User-Agent: kube-controller-manager/v1.11.1 (linux/amd64) kubernetes/b1b2997/leader-election
    I0328 09:48:59.925102 1 round_trippers.

Find and delete large files that have been opened but deleted

In daily operations and maintenance, we often need to deal with disk space issues. When an alarm arrives, the first step is to look for large files; on centos, for example, the large file may be /var/log/messages. But sometimes you can't find the big files, and the size reported by du doesn't match the space usage reported by df.
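
The situation named in the title, a file that has been deleted but is still held open, can be reproduced with a small Go sketch. It assumes a Unix-like system, where removing a file only removes its name while an open descriptor keeps the data alive; `sizeAfterUnlink` is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"os"
)

// sizeAfterUnlink (hypothetical helper) creates a temporary file, removes its
// name while the descriptor is still open, then writes payload through the
// descriptor. It returns the file size seen through the descriptor and
// whether the path has disappeared from the directory tree.
func sizeAfterUnlink(payload []byte) (int64, bool, error) {
	f, err := os.CreateTemp("", "deleted-but-open-*")
	if err != nil {
		return 0, false, err
	}
	defer f.Close()

	// Remove the name; on Unix-like systems the open descriptor stays valid.
	if err := os.Remove(f.Name()); err != nil {
		return 0, false, err
	}
	if _, err := f.Write(payload); err != nil {
		return 0, false, err
	}
	info, err := f.Stat()
	if err != nil {
		return 0, false, err
	}
	_, statErr := os.Stat(f.Name())
	return info.Size(), os.IsNotExist(statErr), nil
}

func main() {
	size, gone, err := sizeAfterUnlink([]byte("still taking up space"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("bytes held by the open descriptor: %d, path gone: %v\n", size, gone)
}
```

Because the path no longer exists, a directory walk cannot count the file, yet its blocks stay allocated until the last descriptor is closed.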

How to set up a Pod to run on a specific node

1. Specify the Node by nodeSelector when creating the workload
Add a label to the node:
    kubectl label node node2 project=A
Specify the nodeSelector when creating the workload:
    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-nodeselector
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-nodeselector
      template:
        metadata:
          labels:
            app: nginx-nodeselector
        spec:
          nodeSelector:
            project: A
          containers:
          - name: nginx
            image: nginx
    EOF
View Workload

cgroup cpu subsystem

Overview
cgroups are control groups, which are responsible for controlling a range of resources for processes on linux, such as CPU, Memory, Huge Pages, and so on. cgroups are divided into modules by subsystem, and each resource is implemented by a subsystem. cgroups expose their interface through a file system, and can be combined in a hierarchical way. This

Hardware knowledge: how to choose a monitor?

When buying a laptop, we usually focus only on the CPU, memory, SSD, appearance and so on, and pay less attention to the monitor. Some of the terms used in monitor advertising are also unfamiliar, and are the claims actually true? So I took the time to collect and organize some information.
The size of the monitor
Laptop screens generally come in a few fixed mainstream sizes, generally 13.3 inches, 14

Linux Basics: Display Manager

Display Manager (DM) is a program that provides graphical login capabilities for Linux distributions. It controls user sessions and manages user authentication. Display Manager will start the display server and load the desktop environment as soon as you enter your username and password. The display manager is usually synonymous with the login screen. It is, after all, the visible part. However, the visible login screen, also called the welcome page (greeter), is only part of the display manager.