How to handle multi-line logs with regular expressions

Collecting multi-line logs is always a headache. Developers are often unwilling to output logs as JSON, so the logs have to be re-structured when they are collected. Because log collectors are implemented in different ways and follow different standards, each collector handles multi-line logs differently. For example, if we use Fluentd as our log collector, we can use the multiline parser, which parses logs using the formatN and format_firstline parameters.
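The first-line idea can be sketched outside of any collector. The Python below is a minimal illustration, not Fluentd's implementation: a regular expression marks the start of a record, and every line that does not match is appended to the previous record. The pattern, the function name group_records, and the sample lines are all assumptions made up for this example.

```python
import re

# Assumption: every record starts with a "YYYY-MM-DD HH:MM:SS" timestamp.
# This regex plays the role of a first-line pattern; it is illustrative only.
FIRST_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def group_records(lines):
    """Merge continuation lines into the record that precedes them."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if FIRST_LINE.match(line) or not records:
            # A new record starts here (or this is the very first line).
            records.append(line)
        else:
            # Continuation line: attach it to the current record.
            records[-1] += "\n" + line
    return records


# Made-up sample input: one single-line record and one three-line record.
sample = [
    "2022-06-20 07:19:39 INFO starting worker",
    "2022-06-20 07:19:40 ERROR request failed",
    "  caused by: connection refused",
    "  while calling backend",
]
records = group_records(sample)
for record in records:
    print(repr(record))
```

Each grouped record can then be parsed as a single unit, which is the step a collector's per-record format patterns would perform.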

unexpected error getting claim reference: selfLink was empty, can't make reference

The error log is as follows:
I0620 07:19:39.552037 1 leaderelection.go:185] attempting to acquire leader lease cephfs/ceph.com-cephfs...
E0620 07:19:39.610071 1 event.go:259] Could not construct reference to: '&v1.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ceph.com-cephfs", GenerateName:"", Namespace:"cephfs", SelfLink:"", UID:"4f4ead7b-c097-4074-b56c-76c6888ceed7", ResourceVersion:"209907", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63791306379, loc:(*time.Location)(0x19b4b00)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"cephfs-provisioner-688cc75-xgd5g_58e8a7a6-f069-11ec-aba4-c627c8b0022e\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2022-06-20T07:19:39Z\",\"renewTime\":\"2022-06-20T07:19:39Z\",\"leaderTransitions\":0}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]v1.EndpointSubset(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'cephfs-provisioner-688cc75-xgd5g_58e8a7a6-f069-11ec-aba4-c627c8b0022e became leader'
I0620 07:19:39.

How to track network traffic

Background: Monitoring showed that traffic in certain scenarios is unreasonably large, so we need to know which processes are generating this traffic and which services they are accessing. Method: locate which processes are triggering the traffic; locate which IPs account for most of the traffic; locate the specific ports with the heaviest traffic. Tools: nethogs/iftop/tcptrack. Locate the process: sudo nethogs. The nethogs output shows the total traffic and the size of the traffic for each process.

How much overhead is actually required for process/thread context switching?

Processes are a very familiar concept, and we have probably all heard of process context switching overhead. So let’s think about a question today: just how much CPU time does a process context switch consume? Threads are said to be lighter than processes, so do their context switches save much CPU time compared with process switches? With these questions in mind, let’s get to the point. 1. Processes and process switching: Processes are one of the great inventions of operating systems. They shield applications from hardware details such as CPU scheduling and memory management, so that applications can concentrate on implementing their own business logic and perform many tasks “simultaneously” on a limited CPU.

The cost of context switching

Concepts: process switching, soft interrupts, kernel-mode/user-mode switching, and CPU hyper-threading switching. Kernel-mode/user-mode switch: execution stays in the same thread, but moving from user mode into kernel mode requires extra instructions for safety and other reasons. For what a system call additionally does, see: https://github.com/torvalds/linux/blob/v5.2/arch/x86/entry/entry_64.S#L145 Soft interrupts: for example, when a network packet arrives, the ksoftirqd process (one per core) is triggered to handle it, which is a kind of process switch. Process

Multi-clock solves the clock rollback problem of the snowflake algorithm

Distributed ID generation algorithms are used to generate globally unique identifiers in distributed systems. One well-known algorithm is the snowflake algorithm proposed by Twitter, which generates a 64-bit globally unique integer each time and is built on a very clever underlying idea.
 0  1010......101 1010101010 101010101010
\_/ \___________/ \________/ \__________/
 1        2           3           4
1. The first bit is not used
2. 41-bit millisecond timestamp
3. 10-bit machine ID
4. 12-bit serial
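The bit layout above can be made concrete with a little shifting and masking. The Python below is a minimal sketch of the layout only; it does not handle clocks or rollback, and the function names compose and decompose are assumptions made up for this example. It also takes the timestamp as given, whereas real generators usually count milliseconds from a custom epoch.

```python
TIMESTAMP_BITS = 41  # millisecond timestamp
MACHINE_BITS = 10    # machine ID
SEQUENCE_BITS = 12   # serial number within one millisecond

MACHINE_SHIFT = SEQUENCE_BITS                    # 12
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS   # 22


def compose(timestamp_ms, machine_id, sequence):
    """Pack the three fields into one integer; the top (64th) bit stays 0."""
    fields = (
        ("timestamp_ms", timestamp_ms, TIMESTAMP_BITS),
        ("machine_id", machine_id, MACHINE_BITS),
        ("sequence", sequence, SEQUENCE_BITS),
    )
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (timestamp_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def decompose(snowflake_id):
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    sequence = snowflake_id & ((1 << SEQUENCE_BITS) - 1)
    machine_id = (snowflake_id >> MACHINE_SHIFT) & ((1 << MACHINE_BITS) - 1)
    timestamp_ms = snowflake_id >> TIMESTAMP_SHIFT
    return timestamp_ms, machine_id, sequence


example_id = compose(123456789, 5, 7)
print(example_id, decompose(example_id))
```

Because the timestamp occupies the highest used bits, IDs from one machine sort by time as long as the clock never moves backwards, which is why a clock that jumps back is a problem for this scheme.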

5 Suggestions for Error Handling and Log Printing in Golang

Golang has a very distinctive error handling mechanism in its syntax, based on the idea of defensive programming. But today’s article will not discuss the syntax design of Golang error handling. Instead, I would like to think about how errors should be handled and logged in Golang. 5 suggestions for error handling and log printing in Golang: Use the error stack approach. Use logical stack information instead of the code call stack.

How to use SetMemoryLimit?

Go 1.19 finally implements SetMemoryLimit. Go’s GC doesn’t provide as many tuning parameters as Java’s; there is only one, GOGC, so it’s exciting to get another parameter that can adjust the GC. Those who have been following Go performance will know that two hacky ways to tune the Go GC have emerged in recent years: ballast: the ballast technique. This technique uses a “false” memory footprint to make it harder for

Authentication and authorization in kubernetes

1. Overview There are two types of users in kubernetes: service account and regular user. These two user types correspond to two usage scenarios. The service account is provided to the pods running in the cluster, and when these pods want to communicate with the apiserver, it is the serviceaccount that is used for authentication and authorization. Service accounts are stored in the k8s cluster and are RBAC-based and can be bound to roles to have specific permissions for specific resources.

A high latency problem caused by misaligned versions of go-redis and redis server

The company had multiple go-redis client versions and multiple versions of the redis cluster. During a business stress test, we found that even when we only accessed the redis interface, the latency could be as high as a second, which is very counterintuitive. We used different versions of go-redis and different versions of the redis cluster to run a simple stress test. The redis commands are simple GETs, kv size is one

The Complete Guide to Migrating Storage Across StorageClass in Kubernetes

Thinking: Let’s first think about what needs to be done to switch StorageClass. First you need to scale the application down to 0 replicas, then create a new PVC, copy the data from the old PV to the new PV, then let the application use the new PV and scale it back up to the original number of replicas, and finally delete the old PV. Throughout this process, Kubernetes must also be prevented from deleting the PV when the PVC is deleted.

Wishbone Bus Protocol

Background: I have recently been studying how to introduce the Wishbone bus protocol into my Principles of Computer Composition course, so I took the opportunity to learn about the Wishbone protocol. Bus: What is a bus? A bus is usually used to connect the CPU and peripherals. For better compatibility and reusability, a unified protocol can be designed in which the CPU implementation is the party that initiates requests (also known as the master) and the peripheral implementation is the party that receives requests (also known as the slave). Adding peripherals or replacing the CPU implementation then becomes simpler and requires much less adaptation work.

K8s Clustering Stability: Source Code Analysis of LIST Requests, Performance evaluation and tuning for large-scale base service deployments

For unstructured data storage systems, LIST operations are usually very heavyweight. They not only consume a lot of disk IO, network bandwidth and CPU, but also affect other requests in the same time period (especially latency-sensitive leader election requests), which makes them a major killer of cluster stability. For example, for Ceph object storage, each LIST bucket request needs to go to multiple disks to retrieve all the data of

The process of apiserver processing requests

1. Overview The apiserver of k8s is the communication hub for all components, and its importance is self-evident. The apiserver provides HTTP-based services to the outside world, so what steps does a request go through from being issued to being processed? The following is a brief description of the entire process based on the code so that you can get a general impression of the process. Since the code structure of

TrueTime and Atomic Clocks

If you follow distributed databases, you have probably heard of Google’s distributed database Spanner, and of how Spanner uses atomic clocks to build TrueTime and achieve distributed transactions across data centers. Many people have the impression that Google is so rich that it can afford sophisticated high-end devices such as atomic clocks. This statement is not entirely wrong, but it is not completely accurate either.

The boot process of containerd

The overall architecture diagram is as follows.
Start with github.com/urfave/cli, with the commands:
  configCommand: related to containerd configuration
  publishCommand: event related
  ociHook: provides preStart, preStop, etc. container hooks
If no subcommand is executed, the default action is executed:
  Load the configuration file
  Create the top-level folders: root = “/var/lib/containerd”, state = “/run/containerd”
  Create /var/lib/containerd/tmpmounts
  Clean up the temporary mount points under tmpmounts
  Create and initialize the containerd server
  Apply the configuration settings to the server process.

calico IPIP Analysis

Overview: When all hosts in a cluster are on the same Layer 2 network, calico cni can make all Pod networks interoperate through routing alone. However, a pure Layer 2 environment is not possible in many scenarios, so when hosts can reach each other only at Layer 3, the calico IPIP (full name IP in IP) mode can be used. IP in IP is an IP tunneling protocol whose core

Great use of proxy_arp in calico

Overview: proxy_arp is a NIC setting that, when enabled, makes the NIC respond with its own MAC address to ARP requests for IPs that are not its own. A common use is when two hosts’ IPs are in the same network segment but they have no Layer 2 connectivity: you can use an additional host as a proxy and turn on proxy_arp on this host’s NIC so that it acts as an intermediary and connects the two.

Microsoft released a lightweight variant of Windows 11: support for running Win32 applications

Microsoft has quietly released a new lightweight variant of Windows 11, named “Microsoft Validation OS”. According to the introduction, this new lightweight operating system is command-line based and is not intended for the average end-user; rather, it is designed for hardware or software vendors, developers and technicians to help diagnose, mitigate and fix various problems. As such, it is more of a recovery environment of sorts. Microsoft’s official description states.

Kubernetes Evolution in Multiple Server Rooms

1. Matching application architecture with business development and operation and maintenance capabilities In industry conferences and documentation blogs, we often see various excellent solutions, but if we directly copy them to our own business, we often hit a wall. Because these technical solutions are incubated in specific business scenarios, different business forms, different business scales, and different business development stages will affect the implementation of the technology. On the other hand, applications need to be maintained by people, and a suitable platform needs to be built to assist in the management of the application life cycle, which requires matching operation and maintenance capabilities.