FAQ

This section provides answers to the questions most frequently asked by the community (Slack, GitHub, etc.).

General

Cluster limits

The official Kubernetes documentation presents some general best practices and considerations for large clusters, defining some cluster limits. Liqo does not enforce any additional limitation; however, since it relies on Kubernetes, most of the Kubernetes limits apply to Liqo as well. Some of them, though, do not. For instance, the limit of 110 pods per node is not enforced on Liqo virtual nodes, as they abstract an entire remote cluster rather than a single node. The same consideration applies to the maximum number of nodes (5000), since all the remote nodes are hidden behind a single virtual node. You can find additional information here.
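
As a quick sanity check, you can inspect the pod capacity advertised by a virtual node, which reflects the aggregated resources of the remote cluster rather than those of a single machine. The liqo.io/type=virtual-node label selector below is an assumption on how virtual nodes are labeled; adapt it to your setup:

# List Liqo virtual nodes (label value assumed to be "virtual-node").
kubectl get nodes -l liqo.io/type=virtual-node
# Show the pod capacity advertised by a given virtual node.
kubectl get node <VIRTUAL_NODE_NAME> -o jsonpath='{.status.allocatable.pods}'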

Why are DaemonSet pods (e.g., kube-proxy, CNI pods) scheduled on virtual nodes in OffloadingBackOff?

The virtual nodes generated by Liqo have a taint that prevents pods from being scheduled on a virtual node (and hence on the remote cluster) unless the pod is created in an offloaded namespace.

However, the majority of DaemonSets (e.g., kube-proxy, CNI pods) have pretty wide tolerations to make sure that their pods are created on every node of the cluster. As a result, their pods are also scheduled on Liqo virtual nodes, which reject them and put them in the OffloadingBackOff state. While this is generally not an issue, during upgrades of those applications a pod in the OffloadingBackOff state might be interpreted as an upgrade failure.

A strategy to prevent the scheduling of DaemonSet pods on Liqo virtual nodes is to use a node affinity. For example, you might add the following under the .spec.template.spec.affinity field of your DaemonSet:

nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: liqo.io/type
        operator: DoesNotExist

This ensures that the pods are not scheduled on any node carrying the liqo.io/type label, i.e., on Liqo virtual nodes.
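
As a usage sketch, the affinity above could be applied to an existing DaemonSet with a merge patch; the DaemonSet name and namespace below are placeholders:

# affinity-patch.yaml (same affinity as above, wrapped in the DaemonSet pod template)
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: liqo.io/type
                operator: DoesNotExist

Then apply it with:

kubectl patch daemonset <DAEMONSET_NAME> -n <NAMESPACE> --type merge --patch-file affinity-patch.yaml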

Installation

Upgrade the Liqo version installed on a cluster

Unfortunately, this feature is not currently fully supported. At the moment, upgrading through liqoctl install or helm upgrade will update manifests and Docker images (excluding the virtual-kubelet one, as it is created dynamically by the controller-manager), but it will not apply any CRD-related changes (see this issue for further details). The easiest way is to unpeer all existing clusters, then uninstall and reinstall Liqo on all clusters (making sure to install the same Liqo version on all peered clusters).
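
A minimal sketch of this procedure with liqoctl could look like the following; the exact unpeer and install flags depend on your Liqo version and provider, so check the liqoctl help before running it:

# On each cluster, remove the existing peerings first (see liqoctl unpeer --help
# for the flags required by your Liqo version).
liqoctl unpeer <...>
# Then remove Liqo, including its CRDs, and reinstall the desired version.
liqoctl uninstall --purge
liqoctl install <PROVIDER> --version <LIQO_VERSION>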

How to install Liqo on DigitalOcean

The installation of Liqo on a DigitalOcean cluster does not work out of the box. The problem is related to the liqo-gateway service and the DigitalOcean load balancer health check, which does not support UDP-based checks. This issue presents a step-by-step solution to overcome the problem.

Peering

How to force unpeer a cluster?

It is highly recommended to unpeer all existing ForeignClusters before upgrading or uninstalling Liqo. If the liqoctl unpeer command does not fix the problem (probably due to some misconfiguration in the cluster setup), you can try to manually unpeer the cluster by force-deleting all Liqo resources associated with that ForeignCluster. To do this, force delete all resources (look also in the tenant namespace) of the following types (possibly in this order):

  • NamespaceMaps

  • ResourceSlices

Make sure to also manually remove any leftover finalizers. At this point, you should be able to delete the ForeignCluster. If there are no peerings left, you can uninstall Liqo if needed.
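
A hypothetical command sequence for this manual cleanup could look like the following (resource, namespace, and cluster names are placeholders; adapt them to your setup):

# Find the leftover Liqo resources (look also in the tenant namespace).
kubectl get namespacemaps,resourceslices -A
# Force delete them (repeat for every leftover resource).
kubectl delete namespacemaps <NAME> -n <TENANT_NAMESPACE>
kubectl delete resourceslices <NAME> -n <TENANT_NAMESPACE>
# If a resource hangs in "Terminating", manually clear its finalizers.
kubectl patch resourceslices <NAME> -n <TENANT_NAMESPACE> --type merge -p '{"metadata":{"finalizers":[]}}'
# Finally, delete the ForeignCluster.
kubectl delete foreignclusters <CLUSTER_NAME>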

Warning

This is not a recommended solution: use it only as a last resort when no other viable option is available. Future releases will make it easier to unpeer a cluster or uninstall Liqo.

Is it possible to peer clusters using an ingress?

It is possible to use an ingress to expose the liqo-auth service, instead of a NodePort/LoadBalancer service, through the Helm values. Make sure to set auth.ingress.enable to true and configure the rest of the values in auth.ingress according to your needs.
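
For instance, a minimal values file enabling the ingress could look like the sketch below; only auth.ingress.enable is taken from this FAQ, while the remaining keys depend on the chart version, so check the chart documentation for the exact names:

# values.yaml (illustrative)
auth:
  ingress:
    enable: true
    # Set the remaining auth.ingress values (e.g., the host to expose) here,
    # according to your environment and to the keys supported by your chart version.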

Note

The liqo-gateway service cannot be exposed through a common ingress (proxies like nginx work with HTTP only), because it uses UDP.

Network

Debug gateway-to-gateway communication issues

Follow these steps only if the connection resources report an error. You can check the status of the connections with the following command:

kubectl get connection -A
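
If any connection reports an error, you can also inspect its full status and events for more details (the resource and namespace names below are placeholders):

kubectl describe connection <CONNECTION_NAME> -n <TENANT_NAMESPACE>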

Check the UDP service

Liqo exposes the gateway server using a UDP service.

In the majority of cases, the issue is related to missing support for UDP services by your cloud provider or in your on-premise environment.

You can manually test whether your UDP LoadBalancer or NodePort services work correctly by creating a dummy UDP echo server:

# echo-server.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
      - name: echo-server
        image: ghcr.io/liqotech/udpecho
        ports:
        - containerPort: 5000
          protocol: UDP
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server-lb
spec:
  selector:
    app: echo-server
  type: LoadBalancer
  ports:
  - protocol: UDP
    port: 5000
    targetPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server-np
spec:
  selector:
    app: echo-server
  type: NodePort
  ports:
  - protocol: UDP
    port: 5000
    targetPort: 5000

Save this file and apply the manifests to create the echo server and expose it:

kubectl apply -f echo-server.yaml

Now you can test the UDP service exposed by the echo server using the following command:

nc -u <IP> <PORT>

If you want to test a LoadBalancer service, replace <IP> and <PORT> with the external IP and port of the echo-server-lb service. Otherwise, if you are testing the NodePort connectivity, replace <IP> with the IP of one of your nodes and <PORT> with the node port assigned to the echo-server-np service.
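
As a reference, these standard kubectl commands retrieve those values:

# External IP and port of the LoadBalancer service.
kubectl get service echo-server-lb
# Node port assigned to the NodePort service.
kubectl get service echo-server-np -o jsonpath='{.spec.ports[0].nodePort}'
# IP addresses of the cluster nodes.
kubectl get nodes -o wide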

After you have run the command, you can type a message and press Enter. If you see the message echoed back in upper case, the UDP service is working correctly.

Debug pod-to-pod communication issues

These steps help you collect information about network issues between two clusters, to be shared with the maintainers when asking for help.

Before starting, check the connection resources on your clusters using kubectl get connection -A. If their status reports an error, refer to the Debug gateway-to-gateway communication issues section.

Warning

It’s strongly recommended to use 2 clusters with different pod CIDRs for debugging.

Deploy debug pods

Create 2 namespaces, one in each cluster, and deploy a debug pod in each of them. You do not need to offload these namespaces.

# Run these commands on both clusters
kubectl create ns liqo-debug
kubectl create deployment nginx --image=nginx -n liqo-debug

Enter the debug pod

Run an interactive shell in the debug pod to test the connectivity between the 2 clusters.

# Run these commands on both clusters
kubectl exec -it deployments/nginx -n liqo-debug -- /bin/bash

Now install the required tools to test the connectivity.

apt update
apt install iputils-ping -y 

Get the remote pod IP

First, obtain the IP addresses to ping in order to test the connectivity.

If you are using 2 different pod CIDRs, you can use the original pod IPs.

kubectl get pods -n liqo-debug -o wide

If you are using the same pod CIDR, you need to remap the IPs of the pods.

If you have 2 clusters, called cluster A and cluster B, to remap the IP of a pod running on cluster B you first need to get the configuration resource on cluster A related to cluster B:

kubectl get configuration -A

Now take the REMAPPED POD CIDR value, keep the network part of the CIDR, and replace the host part with that of the pod you want to reach on cluster B.
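
As a purely illustrative example (all addresses below are made up):

# Pod IP on cluster B:                  10.234.1.17
# Pod CIDR of cluster B:                10.234.0.0/16
# REMAPPED POD CIDR seen by cluster A:  10.71.0.0/16
# IP to ping from cluster A:            10.71.1.17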

If you want a more detailed explanation, you can find an example of remapping here.

Sniff the traffic inside the gateway

In your tenant namespace, you can find a pod called gw-<CLUSTER_ID>. This pod routes the traffic between the clusters.
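
Since the pod name starts with gw-, you can locate it (and its tenant namespace) with a simple filter:

kubectl get pods -A | grep gw-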

To check whether the traffic is correctly routed, you can sniff it inside the gateway pod.

Let's start by opening a shell in the gateway pod.

kubectl exec -it gw-<CLUSTER_ID> -n <TENANT_NAMESPACE> -- /bin/bash

Now you can use tcpdump to sniff the traffic.

tcpdump -tnl -i any icmp

Test the connectivity

Now you can test the connectivity between the 2 pods.

Run the following command in the shell of one of the two debug pods, targeting the IP of the other debug pod.

ping -c1 <REMOTE_POD_IP>

Now check the packets in the gateway pod, and share the output with the maintainers.