Introduction to Karmada

Before we dive into the Resource Interpreter Webhook, we need some understanding of Karmada's architecture and how it distributes applications. That part has been covered in previous posts, so we won't go over it again here.

An example: Creating an nginx application

Let's start with the simplest example: creating and distributing an nginx application in Karmada. The first step is to prepare the nginx resource template, which is a native Kubernetes Deployment and requires no changes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx

Prepare a PropagationPolicy to control which clusters nginx is distributed to.

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2

Here we will distribute it directly to the member1 and member2 clusters.

Distribute to member1 and member2 clusters

The member1 and member2 clusters each end up with an nginx Deployment of 2 replicas, so this resource runs 4 Pods in total.

The example above is very simple: the Deployment is created in each member cluster exactly as defined in the template. But Karmada also supports more advanced replica scheduling policies, such as the following.

replicaScheduling:
  replicaDivisionPreference: Weighted
  replicaSchedulingType: Divided
  weightPreference:
    staticWeightList:
      - targetCluster:
          clusterNames:
            - member1
        weight: 1
      - targetCluster:
          clusterNames:
            - member2
        weight: 1

Once this rule is applied, the number of replicas on each cluster is adjusted dynamically, so Karmada needs an extra step that modifies the replica count when creating the Deployment in each member cluster. With the 1:1 weights above, for example, the 2 replicas in the template are split into 1 replica for member1 and 1 for member2.

For a Kubernetes core resource like Deployment, we can modify the replica count directly in code because its structure is known in advance. But what if I have a custom resource that behaves like Deployment and I also want replica scheduling for it? Can Karmada modify its replica count correctly? Not out of the box, which is why Karmada introduces a new feature to provide deep support for custom resources (CRDs).

Resource Interpreter Webhook

To solve the problem described above, Karmada introduces the Resource Interpreter Webhook, which provides complete custom resource distribution by intervening in the phases from ResourceTemplate to ResourceBinding to Work to Resource.


Between one stage and the next, the request passes through one or more predefined interfaces (hook points), and operations such as modifying the replica count are implemented at these points. The user deploys a separate webhook server that implements the corresponding interfaces, and Karmada calls that server, as directed by the webhook configuration, whenever the corresponding step is executed.
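To give a sense of what that configuration looks like, here is a sketch adapted from the example shipped with the Karmada repository; the webhook name, service URL, and CA bundle are placeholders you would replace with your own values.

apiVersion: config.karmada.io/v1alpha1
kind: ResourceInterpreterWebhookConfiguration
metadata:
  name: examples
webhooks:
  - name: workloads.example.com                 # placeholder webhook name
    rules:
      - operations: ["InterpretReplica", "ReviseReplica", "Retain", "AggregateStatus"]
        apiGroups: ["workload.example.io"]
        apiVersions: ["v1alpha1"]
        kinds: ["Workload"]
    clientConfig:
      # Placeholder: point this at your own webhook service.
      url: https://karmada-interpreter-webhook-example.karmada-system.svc:443/interpreter-workload
      caBundle: "<base64-encoded CA bundle>"    # placeholder
    interpreterContextVersions: ["v1alpha1"]
    timeoutSeconds: 3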

In what follows, we will pick four representative hook points and introduce them one by one, using the following CRD as the running example.

// Workload is a simple Deployment.
type Workload struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    // Spec represents the specification of the desired behavior.
    // +required
    Spec WorkloadSpec `json:"spec"`

    // Status represents most recently observed status of the Workload.
    // +optional
    Status WorkloadStatus `json:"status,omitempty"`
}

// WorkloadSpec is the specification of the desired behavior of the Workload.
type WorkloadSpec struct {
    // Number of desired pods. This is a pointer to distinguish between explicit
    // zero and not specified. Defaults to 1.
    // +optional
    Replicas *int32 `json:"replicas,omitempty"`

    // Template describes the pods that will be created.
    Template corev1.PodTemplateSpec `json:"template" protobuf:"bytes,3,opt,name=template"`

    // Paused indicates that the deployment is paused.
    // Note: both user and controllers might set this field.
    // +optional
    Paused bool `json:"paused,omitempty"`
}

// WorkloadStatus represents most recently observed status of the Workload.
type WorkloadStatus struct {
    // ReadyReplicas represents the total number of ready pods targeted by this Workload.
    // +optional
    ReadyReplicas int32 `json:"readyReplicas,omitempty"`
}

It is very similar to Deployment and is used here to demonstrate how Karmada supports advanced features, such as replica scheduling, for this kind of resource.

InterpretReplica


This hook point occurs during the conversion from ResourceTemplate to ResourceBinding. Resource objects with replica capabilities, such as custom resources that behave like Deployment, implement this interface to tell Karmada how many replicas the resource declares.

apiVersion: workload.example.io/v1alpha1
kind: Workload
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: nginx
          name: nginx

For our example Workload resource, the implementation is also very simple: the webhook server just returns the value of the replicas field.

func (e *workloadInterpreter) responseWithExploreReplica(workload *workloadv1alpha1.Workload) interpreter.Response {
    res := interpreter.Succeeded("")
    res.Replicas = workload.Spec.Replicas
    return res
}

Note: All examples are taken from the official Karmada documentation.

ReviseReplica


This hook point occurs during the conversion from ResourceBinding to Work. Here you modify the replica count of a resource object with replica capabilities according to the request sent by Karmada, which has already computed, via its scheduling policy, how many replicas each cluster needs. All you have to do is assign the final computed value to your CR object, because Karmada does not know the structure of the CRD.

func (e *workloadInterpreter) responseWithExploreReviseReplica(workload *workloadv1alpha1.Workload, req interpreter.Request) interpreter.Response {
    wantedWorkload := workload.DeepCopy()
    wantedWorkload.Spec.Replicas = req.DesiredReplicas
    marshaledBytes, err := json.Marshal(wantedWorkload)
    if err != nil {
        return interpreter.Errored(http.StatusInternalServerError, err)
    }
    return interpreter.PatchResponseFromRaw(req.Object.Raw, marshaledBytes)
}

Again, the core of the code is just the single assignment line.

Workload implements replica scheduling

Going back to our original question: with the InterpretReplica and ReviseReplica hook points understood, you can schedule a custom resource by replica count. Implement the InterpretReplica hook to tell Karmada the total number of replicas of the resource, implement the ReviseReplica hook to modify the object's replica count, and then configure a PropagationPolicy exactly as you would for a resource such as Deployment.

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-workload-propagation
spec:
  resourceSelectors:
    - apiVersion: workload.example.io/v1alpha1
      kind: Workload
      name: nginx
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - member1
            weight: 2
          - targetCluster:
              clusterNames:
                - member2
            weight: 1

The effect is as follows: with 3 replicas and a 2:1 weight, member1 runs 2 replicas of the Workload and member2 runs 1.

Workload implements replica scheduling

Retain


This hook point occurs during the conversion from Work to Resource. It can be used to tell Karmada to keep certain fields unchanged when parts of the spec are updated independently in the member cluster.

apiVersion: workload.example.io/v1alpha1
kind: Workload
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  paused: false

Take paused as an example. This field suspends the workload, and a controller running in the member cluster may update it on its own. The Retain hook exists so that Karmada can cooperate better with that member-cluster controller: you use it to tell Karmada which fields should be kept as they are rather than overwritten.

func (e *workloadInterpreter) responseWithExploreRetaining(desiredWorkload *workloadv1alpha1.Workload, req interpreter.Request) interpreter.Response {
    if req.ObservedObject == nil {
        err := fmt.Errorf("nil observedObject in exploreReview with operation type: %s", req.Operation)
        return interpreter.Errored(http.StatusBadRequest, err)
    }
    observerWorkload := &workloadv1alpha1.Workload{}
    err := e.decoder.DecodeRaw(*req.ObservedObject, observerWorkload)
    if err != nil {
        return interpreter.Errored(http.StatusBadRequest, err)
    }

    // Suppose we want to retain the `.spec.paused` field of the actual observed workload object in member cluster,
    // and prevent from being overwritten by karmada controller-plane.
    wantedWorkload := desiredWorkload.DeepCopy()
    wantedWorkload.Spec.Paused = observerWorkload.Spec.Paused
    marshaledBytes, err := json.Marshal(wantedWorkload)
    if err != nil {
        return interpreter.Errored(http.StatusInternalServerError, err)
    }
    return interpreter.PatchResponseFromRaw(req.Object.Raw, marshaledBytes)
}

The core code is a single line that copies the Paused field observed on the workload in the member cluster into wantedWorkload, so the value set there is preserved.

AggregateStatus


This hook point occurs during the process from ResourceBinding back to the Resource Template. For resource types whose status needs to be aggregated onto the Resource Template, implementing this interface updates the template's status information.

Karmada collects the status of the resource in each cluster into the ResourceBinding.


What the AggregateStatus hook needs to do is take the status information collected in the ResourceBinding and write it back to the Resource Template.

func (e *workloadInterpreter) responseWithExploreAggregateStatus(workload *workloadv1alpha1.Workload, req interpreter.Request) interpreter.Response {
    wantedWorkload := workload.DeepCopy()
    var readyReplicas int32
    for _, item := range req.AggregatedStatus {
        if item.Status == nil {
            continue
        }
        status := &workloadv1alpha1.WorkloadStatus{}
        if err := json.Unmarshal(item.Status.Raw, status); err != nil {
            return interpreter.Errored(http.StatusInternalServerError, err)
        }
        readyReplicas += status.ReadyReplicas
    }
    wantedWorkload.Status.ReadyReplicas = readyReplicas
    marshaledBytes, err := json.Marshal(wantedWorkload)
    if err != nil {
        return interpreter.Errored(http.StatusInternalServerError, err)
    }
    return interpreter.PatchResponseFromRaw(req.Object.Raw, marshaledBytes)
}

The logic is also very simple: based on the status information in the ResourceBinding, compute (aggregate) the overall status of the resource and write it back to the Resource Template. The effect is similar to Deployment: after aggregation, you can query the status of the resource across all clusters directly from the control plane.

kubectl get deploy
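For the Workload example, assuming all of its Pods become ready in the member clusters (2 in member1 plus 1 in member2), the aggregated status written back onto the control-plane object would look roughly like this:

# Illustrative only: readyReplicas reported by the member clusters is summed
# by the AggregateStatus hook and written back to the Resource Template.
status:
  readyReplicas: 3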
