Kubernetes: Exclude Node from default scheduling

Is it possible to create a node pool that the scheduler will ignore by default, but that can still be targeted with a nodeSelector?



Solution 1:[1]

If your node pool has a static size, or at least is not auto-scaling, this is easy to accomplish.

First, taint the nodes in that pool:

kubectl taint node \
  `kubectl get node -l cloud.google.com/gke-nodepool=my-pool -o name` \
  dedicated=my-pool:NoSchedule
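
One way to verify the taint landed on every node in the pool (reusing the label selector from above):

kubectl get node -l cloud.google.com/gke-nodepool=my-pool \
  -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints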

Kubernetes version >= 1.6

Then add affinity and tolerations values under spec: in the Pod (templates) that need to run on these nodes. The taint keeps ordinary Pods off the pool, the toleration lets these Pods back on, and the affinity ensures they are scheduled onto the dedicated pool and nowhere else:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values: ["my-pool"]
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "my-pool"
    effect: "NoSchedule"

Pre 1.6

Then add these annotations to the Pod (templates) that need to run on these nodes:

annotations:
  scheduler.alpha.kubernetes.io/tolerations: >
    [{"key":"dedicated", "value":"my-pool"}]
  scheduler.alpha.kubernetes.io/affinity: >
    {
      "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
          "nodeSelectorTerms": [
            {
              "matchExpressions": [
                {
                  "key": "dedicated",
                  "operator": "In",
                  "values": ["my-pool"]
                }
              ]
            }
          ]
        }
      }
    }

See the design doc for more information.

Autoscaling group of nodes

You need to add the --register-with-taints parameter to kubelet:

Register the node with the given list of taints (comma separated <key>=<value>:<effect>). No-op if register-node is false.
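
For example, to register nodes with the same taint used above (where exactly kubelet flags are configured depends on how your nodes are provisioned):

--register-with-taints=dedicated=my-pool:NoSchedule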

In another answer I gave some examples of how to persist that setting. GKE now also has specific support for tainting node pools.

Solution 2:[2]

GKE now supports node taints natively. Taints are applied to every node in the pool at creation time and are persisted, so you no longer need to run the kubectl taint command yourself. Please check https://cloud.google.com/container-engine/docs/node-taints for more information.
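
For example, a pool can be created with the taint already in place; my-pool and my-cluster below are placeholder names:

gcloud container node-pools create my-pool \
  --cluster=my-cluster \
  --node-taints=dedicated=my-pool:NoSchedule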

Solution 3:[3]

For those on Kubernetes 1.6 without alpha support enabled, you'll need to use the new "beta" level fields. The equivalent of the accepted answer above is shown below, based on the following article in the docs: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values: ["my-pool"]
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "my-pool"
    effect: "NoSchedule"
  containers:
  - name: with-node-affinity
    image: gcr.io/google_containers/pause:2.0
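
After applying the manifest, you can check which node the Pod was scheduled onto:

kubectl get pod with-node-affinity -o wide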

Solution 4:[4]

A full working example that uses nodeSelector instead of affinity, for anyone interested.

apiVersion: v1
kind: Service
metadata:
  name: ilg-banana
  namespace: fruits
spec:
  ports:
    - port: 80
      targetPort: "ilg-banana-port"
  # 3. so that this right here can expose them
  selector:
    app: ilg-banana
    env: fruits
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ilg-banana
  namespace: fruits
spec:
  selector:
    # 1. This right here should always*
    matchLabels:
      app: ilg-banana
      env: fruits
  replicas: 1
  template:
    metadata:
      # 2. *match this right here
      labels:
        app: ilg-banana
        env: fruits
    spec:
      containers:
        - name: ilg-banana
          image: hashicorp/http-echo
          args:
            - "-text=ilg-banana"
          ports:
            - name: ilg-banana-port
              containerPort: 5678
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
      # This nodeSelector restricts the Pod to the node carrying the matching label
      nodeSelector:
        kubernetes.io/hostname: scraper-node-1
      tolerations:
        - key: "dedicated"
          value: "my-pool"
          effect: "NoSchedule"

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1
Solution 2: Ajit Kumar
Solution 3: Aaron
Solution 4: Anis Benna