This page describes the Kubernetes clusters provided by CSCS and gives step-by-step instructions for accessing and interacting with them.
Architecture
All Kubernetes clusters are managed by Rancher (https://www.rancher.com) and run RKE2 (Rancher Kubernetes Engine 2, https://github.com/rancher/rke2).
Access to the Kubernetes API can be given in two main ways:
- direct access over the Internet, by exposing a Virtual IP (source IPs can be restricted);
- through a CSCS jump host (such as ela.cscs.ch) and proxying API calls through Rancher.
You can check which one you are using by looking at the current-context in the kubeconfig file.
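The check can also be done from the command line. The sketch below wraps two standard kubectl config subcommands in a helper; the name describe_access is made up for illustration. As a rule of thumb (an assumption, not something stated on this page), a server URL on the Rancher host suggests proxied access, while any other address suggests direct access through the Virtual IP.

```shell
# Hypothetical helper: print the active context and the API endpoint behind it.
describe_access() {
  ctx=$(kubectl config current-context) || return 1
  # --minify restricts the output to the entries used by the current context.
  server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}') || return 1
  printf 'context: %s\nserver: %s\n' "$ctx" "$server"
}

# Usage: describe_access
```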
For dedicated clusters, the role-based access control (RBAC) cluster-admin role (the super-user role) is granted cluster-wide through ClusterRoleBindings.
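To confirm that your credentials carry this level of access, you can ask the API server directly. The following is a minimal sketch; the helper name check_cluster_admin is hypothetical, and it relies on kubectl auth can-i returning a non-zero exit code when the action is denied.

```shell
# Hypothetical helper: report whether the current user may perform any verb
# on any resource in all namespaces, which is what cluster-admin allows.
check_cluster_admin() {
  if kubectl auth can-i '*' '*' --all-namespaces >/dev/null 2>&1; then
    echo "cluster-admin style access: yes"
  else
    echo "cluster-admin style access: no"
  fi
}

# Usage:
#   check_cluster_admin
#   kubectl get clusterrolebindings -o wide   # shows each binding's role and subjects
```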
CSCS performs periodic Kubernetes version updates on the clusters. These operational tasks are communicated to and agreed with the cluster admin, and are executed in a non-production environment first to validate the changes. CSCS also updates the base applications included in the clusters; for everything else, active involvement from the cluster admin is requested.
Access
To interact with a Kubernetes cluster, you need the kubectl CLI; see https://kubernetes.io/docs/tasks/tools/#kubectl for installation instructions. It is already available if you use the jump host.
Step-by-step guide
- Retrieve the kubeconfig file:
  - If you have a CSCS account, you may be able to access https://rancher.cscs.ch and download the kubeconfig file for the cluster(s) you own;
  - If you don't have a CSCS account, ask CSCS for a local user on Rancher; a kubeconfig file tied to that local user will be provided.
- Save the kubeconfig in the default location or set the KUBECONFIG environment variable:
mv mykubeconfig.yaml ~/.kube/config
or
export KUBECONFIG=/home/user/kubeconfig.yaml
- Test that you can access the cluster by printing a list of cluster nodes:
kubectl get node
Be aware that the kubeconfig file contains credentials for the cluster, so make sure you store it in a safe place.
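One simple precaution is to restrict the file's permissions so that only your own user can read it. This is a minimal sketch, not a CSCS requirement; the helper name secure_kubeconfig is hypothetical and defaults to the standard ~/.kube/config location.

```shell
# Hypothetical helper: make a kubeconfig readable and writable by its owner only.
secure_kubeconfig() {
  file=${1:-"$HOME/.kube/config"}
  [ -f "$file" ] || { echo "no kubeconfig at $file" >&2; return 1; }
  chmod 600 "$file"
}

# Usage:
#   secure_kubeconfig                  # default location
#   secure_kubeconfig "$KUBECONFIG"    # custom location
```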
Base installed applications
The clusters provided by CSCS usually include a set of useful tools that are pre-configured and ready to use.
ceph-csi
The Ceph Container Storage Interface is installed to enable dynamic provisioning of persistent storage. Persistent Volume Claims can be created using different Storage Classes. In general, CSCS provides two storage classes:
- HDD (for large amounts of data)
- NVMe (for high-performance requirements, such as databases)
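As an illustration, a Persistent Volume Claim against one of these classes might look like the sketch below. The actual Storage Class names are not listed on this page, so my-nvme-class is a placeholder; look up the real names with kubectl get storageclass. The ReadWriteOnce access mode is also an assumption, and pvc_manifest is a hypothetical helper that only prints the manifest.

```shell
# Hypothetical helper: print a PersistentVolumeClaim manifest.
# $1 = claim name, $2 = storage class name, $3 = requested size (e.g. 10Gi)
pvc_manifest() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: $2
  resources:
    requests:
      storage: $3
EOF
}

# Usage:
#   kubectl get storageclass          # look up the real class names first
#   pvc_manifest data my-nvme-class 10Gi | kubectl apply -f -
```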
external-dns
This application enables the creation of DNS entries under a specific subdomain, which depends on the scenario.
It watches for Ingresses or annotated Services of type LoadBalancer and creates the corresponding DNS A record on the DNS server.
For example:
kubectl annotate service nginx "external-dns.alpha.kubernetes.io/hostname=nginx.mycluster.tds.cscs.ch."
Note: make sure the hostname falls under the configured subdomain.
More information: https://github.com/kubernetes-sigs/external-dns
cert-manager
This application, integrated with external-dns, can provide valid certificates signed by Let's Encrypt. A preconfigured ClusterIssuer named "letsencrypt" can be used to create certificates, for example:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: echo
spec:
  secretName: echo
  commonName: echo.mycluster.tds.cscs.ch
  dnsNames:
    - echo.mycluster.tds.cscs.ch
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt
This creates a Secret (named echo in this example) containing the valid certificate signed by Let's Encrypt.
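Issuance is not instantaneous, so it can help to wait for the Certificate to become Ready before using the secret. The sketch below assumes the echo Certificate from the example above lives in your current namespace; the helper name wait_for_certificate and the 120s default timeout are arbitrary choices.

```shell
# Hypothetical helper: wait until cert-manager marks a Certificate as Ready,
# then show the TLS secret it produced.
# $1 = Certificate name, $2 = secret name, $3 = timeout (default 120s)
wait_for_certificate() {
  kubectl wait --for=condition=Ready "certificate/$1" --timeout="${3:-120s}" || return 1
  kubectl get secret "$2"
}

# Usage: wait_for_certificate echo echo
```

If the wait times out, kubectl describe certificate echo usually shows which step is pending.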
More information: https://cert-manager.io
metallb
This is a system application that enables the use of Services of type LoadBalancer, which are assigned a public IP.
Note: the pool of public IPs available to metallb is generally very limited (just a few), so prefer Ingresses where possible. To obtain those IPs, request them from CSCS.
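For cases where a LoadBalancer Service is really needed, a minimal manifest could look like the following sketch. The helper lb_service_manifest is hypothetical, and it assumes the target pods carry an app label and listen on the same port that the Service exposes.

```shell
# Hypothetical helper: print a Service of type LoadBalancer.
# $1 = service name, $2 = value of the "app" label on the target pods, $3 = port
lb_service_manifest() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
spec:
  type: LoadBalancer
  selector:
    app: $2
  ports:
    - port: $3
      targetPort: $3
EOF
}

# Usage:
#   lb_service_manifest nginx nginx 80 | kubectl apply -f -
#   kubectl get service nginx    # the EXTERNAL-IP column shows the address once assigned
```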
More information: https://metallb.universe.tf
ingress-nginx
This ingress controller provides a default IngressClass "nginx" which can be used to expose HTTP or HTTPS services.
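A sketch of an Ingress that uses this class is shown below. The helper ingress_manifest is hypothetical. The cert-manager.io/cluster-issuer annotation is an assumption about how you might combine the Ingress with the "letsencrypt" ClusterIssuer described on this page; drop the annotation and the tls section for plain HTTP.

```shell
# Hypothetical helper: print an Ingress served by the "nginx" IngressClass.
# $1 = ingress name, $2 = hostname, $3 = backend service name, $4 = backend port
ingress_manifest() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: $1
  annotations:
    # Assumption: request a certificate from the "letsencrypt" ClusterIssuer.
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - $2
      secretName: $1-tls
  rules:
    - host: $2
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: $3
                port:
                  number: $4
EOF
}

# Usage:
#   ingress_manifest echo echo.mycluster.tds.cscs.ch echo 80 | kubectl apply -f -
```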
More information: https://kubernetes.github.io/ingress-nginx/
external-secrets
This can be used to retrieve secrets stored in external secret management systems such as HashiCorp Vault.
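As a rough sketch, an ExternalSecret that copies one value from an external store into a Kubernetes Secret could look like this. Everything specific here is an assumption: the helper name external_secret_manifest, the store name vault-backend, the use of a namespaced SecretStore, and the v1beta1 API version (the served version depends on the installed operator release; check with kubectl api-resources --api-group=external-secrets.io).

```shell
# Hypothetical helper: print an ExternalSecret that syncs one property of a
# remote secret into a Kubernetes Secret with the same name as the ExternalSecret.
# $1 = name, $2 = SecretStore name, $3 = remote key (path), $4 = remote property
external_secret_manifest() {
  cat <<EOF
# Assumption: adjust apiVersion to the one served by the installed operator.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: $1
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: $2
  target:
    name: $1
  data:
    - secretKey: $4
      remoteRef:
        key: $3
        property: $4
EOF
}

# Usage:
#   external_secret_manifest db-credentials vault-backend path/to/secret password | kubectl apply -f -
```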
More information: https://external-secrets.io/
observability
This installs the Elastic Cloud on Kubernetes (ECK) operator and defines Beats that export logs and metrics to the CSCS central log system.