Resource sharing and reservation
Liqo enables a cluster, named consumer, to acquire resources from another remote cluster, named provider.
This happens through an explicitly negotiated slice of the provider cluster: the ResourceSlice resource carries the request, the Quota resource enforces it on the provider, and the VirtualNode resource exposes it on the consumer.
However, several questions arise: how many resources the consumer wants, how the provider reserves them, how to guarantee that the consumer does not consume more than planned, and more.
This section of the documentation presents a detailed view of how resource reservation is handled in Liqo, and the possible limitations of the current approach. In particular, it describes how that slice is reserved end-to-end, from both the provider and the consumer point of view: how to configure the default slice size, how to request a specific amount, how strictly the provider enforces it at runtime, and how to suspend or reclaim a reservation that is no longer needed.
Overview
A reservation in Liqo is the result of an exchange between two clusters:
- The consumer declares how much it would like to obtain by creating a ResourceSlice in its tenant namespace, with the requested quantities in spec.resources.
- The provider decides how much it actually grants by writing the accepted quantities back into status.resources of the same object (the slice is replicated across clusters by the crd-replicator).
Once the slice is Accepted, two derived resources materialize the reservation:
- On the provider, a Quota (offloading.liqo.io/Quota) is created in the tenant namespace, mirroring the accepted resources and the desired enforcement strictness.
- On the consumer, a VirtualNode is created with spec.resourceQuota.hard set to the same quantities, and the corresponding Kubernetes Node advertises that capacity to the local scheduler.
By default, a single ResourceSlice is shared across all of the consumer’s offloaded namespaces: pods consume from the same Quota regardless of which offloaded namespace they live in.
When multiple VirtualNodes are created (one per ResourceSlice), pods consume the quota based on the VirtualNode they are scheduled on, not by namespace.
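The exchange described above can be sketched in a few lines of Python. This is only an illustrative model, not Liqo's implementation: the `ResourceSlice` dataclass, the helper names and the shape of the `Quota` dictionary are assumptions made for the example; only `spec.resourceQuota.hard` on the `VirtualNode` comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceSlice:
    """Toy stand-in for the ResourceSlice custom resource (not the real schema)."""
    name: str
    spec_resources: dict                                   # what the consumer asks for
    status_resources: dict = field(default_factory=dict)   # what the provider grants
    accepted: bool = False


def provider_grant(rs, granted):
    """Provider side: write the accepted quantities back into the slice status."""
    rs.status_resources = dict(granted)
    rs.accepted = True
    return rs


def derive_objects(rs):
    """Once the slice is accepted, derive the provider Quota and the consumer VirtualNode."""
    if not rs.accepted:
        return None, None
    # The "resources" key of the Quota is a hypothetical field name.
    quota = {"kind": "Quota", "resources": dict(rs.status_resources)}
    virtual_node = {
        "kind": "VirtualNode",
        "spec": {"resourceQuota": {"hard": dict(rs.status_resources)}},
    }
    return quota, virtual_node


rs = ResourceSlice("gpu-pool", {"cpu": "4", "memory": "8Gi", "pods": "30"})
print(derive_objects(rs))              # (None, None): nothing exists before acceptance
provider_grant(rs, rs.spec_resources)  # a lenient provider grants exactly what was asked
quota, vnode = derive_objects(rs)
print(quota["resources"] == vnode["spec"]["resourceQuota"]["hard"])  # True
```

The point of the sketch is that both derived objects are computed from the granted quantities in the status, never from the request in the spec.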
The diagram below summarizes the flow and makes explicit on which cluster each object is created.
From that point on, the reservation has two enforcement points:
- the consumer’s Kubernetes scheduler, which sees the Liqo virtual Node as a node with finite capacity;
- the provider’s ShadowPod admission webhook, which rejects offloaded pods that would push the consumer past its Quota.
This double mechanism prevents consumers from using more resources than what was negotiated with the provider cluster.
Warning
To make sure the consumer cluster does not exceed the quota of shared resources, the offloaded pods need to be created with resource limits set.
Check the Resource quota enforcement section to learn more about how to configure Liqo to enforce the quota of resources that the provider has granted to the consumer cluster.
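The provider-side check can be pictured as simple bookkeeping: what is already in use plus what the new pod declares must stay within the quota. The sketch below is a simplified model under stated assumptions, not the actual webhook: the function names are hypothetical, and quantities are plain integers (CPU in millicores, memory in MiB) to avoid parsing Kubernetes quantity strings.

```python
def would_exceed(quota, used, pod_limits):
    """Return the resource names that would go over quota if the pod were admitted."""
    over = []
    for name, amount in pod_limits.items():
        # A resource that is absent from the quota is treated as a quota of zero.
        if used.get(name, 0) + amount > quota.get(name, 0):
            over.append(name)
    return over


def admit(quota, used, pod_limits):
    """Admit the pod and account for it, or reject it and leave the usage untouched."""
    if would_exceed(quota, used, pod_limits):
        return False
    for name, amount in pod_limits.items():
        used[name] = used.get(name, 0) + amount
    return True


quota = {"cpu": 4000, "memory": 8192}   # 4 CPU, 8Gi
used = {}
print(admit(quota, used, {"cpu": 3000, "memory": 4096}))  # True
print(admit(quota, used, {"cpu": 2000, "memory": 1024}))  # False: CPU would reach 5000
print(admit(quota, used, {}))                             # True: nothing declared, nothing counted
print(used)                                               # {'cpu': 3000, 'memory': 4096}
```

In this toy model a pod that declares nothing is accounted as zero, which illustrates why the warning above asks for limits to be set on offloaded pods.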
Request a slice after peering
After a peering has been established, a consumer can change the amount of resources granted by either editing an existing ResourceSlice or creating a new one.
By editing an existing ResourceSlice
The amount of resources granted by an existing slice can be increased or decreased by updating its spec.resources.
The provider re-evaluates the request, the Quota is updated to match status.resources, and the VirtualNode reflects the new capacity.
If the provider rejects the new request — only possible with a custom class controller, since the default class always accepts — the ResourceSlice’s Resources condition transitions to Denied, the existing Quota is left unchanged, and the VirtualNode continues to expose the previously granted capacity.
Warning
Reducing the granted resources below what the consumer is currently using will not evict pods immediately: existing offloaded pods continue to run, but new pods that would exceed the new Quota will be denied at admission.
For an immediate effect, see the suspending and reclaiming a reservation section below.
After peering, with a separate ResourceSlice
When the peering already exists, additional capacity can be requested by creating a new ResourceSlice.
This is also the right path when you want a separate slice and a separate VirtualNode for a distinct pool of resources — for example, a dedicated GPU pool kept apart from the main CPU/memory slice — rather than enlarging the original slice:
apiVersion: authentication.liqo.io/v1beta1
kind: ResourceSlice
metadata:
  name: gpu-pool
  namespace: liqo-tenant-cool-firefly
  labels:
    liqo.io/remote-cluster-id: cool-firefly
    liqo.io/replication: "true"
spec:
  class: default
  providerClusterID: cool-firefly
  resources:
    cpu: "4"
    memory: 8Gi
    pods: "30"
    nvidia.com/gpu: "2"
Each ResourceSlice is associated with one Quota on the provider, and with one or more VirtualNodes on the consumer (one by default). Multiple slices toward the same provider originate multiple virtual nodes, which can also be useful to expose heterogeneous resources (for example, separate ARM and x86 pools; see the multiple virtual nodes section).
Inspect the reservation
On the consumer, the resulting Liqo virtual Node exposes the granted capacity to the local scheduler as a regular node.
liqoctl info peer shows information about the shared resources and the peering status:
─ Resource slices ──────────────────────────────────────────────
─ gpu-pool ──────────────────────────────────────────────────
✔ Resource slice accepted
Action: Create
─ Resources ──────────────────────────────────────────────
cpu: 4
memory: 8Gi
pods: 30
nvidia.com/gpu: 2
Pro tip
For lower-level debugging, the underlying ResourceSlice (on the consumer) and Quota (on the provider) custom resources can also be inspected directly with kubectl.
On the consumer, the ResourceSlice shows what was requested and whether it was accepted:
NAMESPACE                  NAME       AUTHENTICATION   RESOURCES   AGE
liqo-tenant-cool-firefly   gpu-pool   Accepted         Accepted    21s
On the provider, the Quota shows what is being enforced:
NAMESPACE                   NAME                    ENFORCEMENT   CORDONED   AGE
liqo-tenant-wispy-firefly   gpu-pool-c34af51dd912   None                     35s
Resource quota enforcement
A Quota defines an upper bound, but whether the consumer can ever exceed it at runtime depends on two further options on the provider, both under controllerManager.config:
- enableResourceEnforcement (default true): turns on the ShadowPod validating webhook that rejects new offloaded pods if they would push the consumer past its Quota. However, this is not on its own a strict guarantee: if limits are higher than requests or if neither is set at all, the consumer can still consume more than the quota at runtime.
- defaultLimitsEnforcement (default None): controls how strictly the webhook interprets the requests and limits of each offloaded pod. It can take three values:
  - None (default): offloaded pods may have neither requests nor limits set, so the consumer might end up using more than the resources negotiated via the ResourceSlice.
  - Soft: offloaded pods must declare requests; pre-allocated resources stay within the quota, but if a pod uses more than its requests (because limits are higher or unset), actual usage can drift over.
  - Hard: offloaded pods must declare both requests and limits, with limits == requests. This is the strictest mode and the only one that turns the per-peering quota into a runtime guarantee, since the kubelet on the provider will throttle or kill containers that try to exceed their declared limits.
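The three modes can be summarized as a small validation function. This is a sketch of the rules as described above, not the webhook's code: `check_pod` is a hypothetical helper, and quantities are compared verbatim as strings, so equivalent spellings such as "1" and "1000m" would be treated as different here.

```python
def check_pod(mode, requests, limits):
    """Return (ok, reason) for a container's resources under a limits-enforcement mode."""
    if mode == "None":
        return True, "no constraint on requests or limits"
    if mode == "Soft":
        if not requests:
            return False, "Soft requires requests to be declared"
        return True, "requests declared"
    if mode == "Hard":
        if not requests or not limits:
            return False, "Hard requires both requests and limits"
        if requests != limits:
            return False, "Hard requires limits == requests"
        return True, "requests and limits match"
    raise ValueError(f"unknown enforcement mode: {mode!r}")


pod = {"requests": {"cpu": "500m"}, "limits": {"cpu": "1"}}
for mode in ("None", "Soft", "Hard"):
    ok, reason = check_pod(mode, pod["requests"], pod["limits"])
    print(mode, ok, reason)
```

The same pod passes under None and Soft but is rejected under Hard, because its limits are higher than its requests.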
The selected mode is propagated to every Quota created on the provider via the Quota.spec.limitsEnforcement field. Both options can be set at installation time or updated later. For example, to set defaultLimitsEnforcement to Hard:
liqoctl install [...ARGS] --set controllerManager.config.defaultLimitsEnforcement=Hard
Warning
Switching to Hard enforcement after workloads are already being offloaded will cause new ShadowPod admissions to fail for any pod that does not declare matching requests and limits.
Verify that the workloads being offloaded already comply, or set sensible defaults via a mutating policy, before applying this change to a production provider.
Custom resource allocation policies
By default, the ResourceSlice class controller shipped with Liqo accepts every incoming request and only fills in the missing keys with offloading.defaultNodeResources. This behavior can be customized by defining a custom ResourceSlice class that implements stricter or cluster-wide policies (for example, capping the total accepted slices at a fraction of cluster capacity, or enforcing per-tenant quotas).
A reusable starting point is provided by the resource-slice-class-controller-template repository, and the general mechanics of class controllers are described in the Custom ResourceSlice classes section of the offloading-in-depth page.
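The behavior of the default class, as described above, amounts to a dictionary merge: requested keys are granted as-is, and keys the consumer left out are taken from the provider defaults. The following is a minimal sketch of that rule; `default_class_grant` is a hypothetical name, and the defaults dictionary only lists the CPU and memory values mentioned in this page.

```python
# Stand-in for offloading.defaultNodeResources (only the values quoted in this page).
DEFAULT_NODE_RESOURCES = {"cpu": "4", "memory": "8Gi"}


def default_class_grant(requested, defaults):
    """Accept every request; fill in only the keys the consumer did not specify."""
    granted = dict(defaults)   # start from the provider defaults
    granted.update(requested)  # requested values always win
    return granted


print(default_class_grant({"cpu": "16", "nvidia.com/gpu": "2"}, DEFAULT_NODE_RESOURCES))
# {'cpu': '16', 'memory': '8Gi', 'nvidia.com/gpu': '2'}
```

Note that nothing in this rule looks at the capacity of the provider, which is why stricter policies need a custom class.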
Warning
The class is chosen by the consumer, and the built-in default class controller is always running alongside any custom one — Liqo does not provide a built-in way to disable it.
A consumer can therefore bypass a strict custom controller simply by selecting class: default (or omitting the class), which routes the request to the lenient built-in controller.
To actually enforce a stricter policy at the provider, the custom controller must be paired with an external mechanism that prevents the lenient path from being used — for example, a Kubernetes admission webhook on the provider that rejects or rewrites the spec.class of incoming ResourceSlice objects.
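The decision logic of such an external guard could look like the sketch below. It is not part of Liqo and leaves out the webhook plumbing (TLS, AdmissionReview request and response handling); the function name and the class name "capped" are hypothetical.

```python
import copy

ALLOWED_CLASSES = {"capped"}  # hypothetical name of the custom class


def review_resource_slice(obj, rewrite_to=None):
    """Decide what to do with an incoming ResourceSlice-like dictionary.

    Returns (allowed, object, message). When rewrite_to is set, a disallowed
    class is replaced instead of being rejected.
    """
    # An omitted class is treated like "default", as described in the warning above.
    requested = obj.get("spec", {}).get("class") or "default"
    if requested in ALLOWED_CLASSES:
        return True, obj, "class allowed"
    if rewrite_to is not None:
        patched = copy.deepcopy(obj)
        patched.setdefault("spec", {})["class"] = rewrite_to
        return True, patched, f"class {requested!r} rewritten to {rewrite_to!r}"
    return False, obj, f"class {requested!r} is not allowed on this provider"


incoming = {"kind": "ResourceSlice", "spec": {"class": "default"}}
print(review_resource_slice(incoming))                       # rejected
print(review_resource_slice(incoming, rewrite_to="capped"))  # accepted after rewrite
```

Rejecting makes the policy visible to the consumer, while rewriting silently routes every request to the strict controller; which one fits depends on the provider's operational preferences.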
Suspend and reclaim a reservation
Liqo exposes two operations to act on an active reservation without unpeering: cordon (stop new allocations, keep existing ones) and drain (reject new allocations and revoke existing ones).
Both operations can be applied at the granularity of a single ResourceSlice or of an entire Tenant (which covers all slices of that consumer).
Cordon a ResourceSlice
Cordoning a slice with liqoctl cordon resourceslice prevents the provider from accepting new resource requests on that slice while leaving existing offloaded workloads running.
Internally, this adds the liqo.io/cordoned-resource annotation to the slice, which causes the Quota to be marked cordoned: true.
The ShadowPod admission webhook then rejects any new offloaded pod from that consumer.
The reverse operation is liqoctl uncordon resourceslice.
Cordon a Tenant
Cordoning a tenant with liqoctl cordon tenant has the same effect as cordoning all of its slices, and it also stops the provider from accepting new ResourceSlice objects from that consumer.
This sets spec.tenantCondition to Cordoned on the provider’s Tenant (authentication.liqo.io/Tenant) for that consumer.
The RemoteResourceSliceReconciler honors the condition by leaving previously-accepted slices untouched and denying new ones.
Drain a Tenant
Draining a tenant with liqoctl drain tenant is the strongest operation: it suspends new allocations and invalidates existing ones, so that the provider stops admitting offloaded pods for that consumer altogether.
This sets spec.tenantCondition to Drained on the same Tenant; the RemoteResourceSliceReconciler then denies the resources of all existing slices.
The reverse operation is liqoctl uncordon tenant, which restores the tenant to Active.
The full reference for these commands is available on the liqoctl cordon, liqoctl drain, and liqoctl uncordon pages.
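The effect of the three tenant conditions can be condensed into two small functions. This is a sketch of the behavior described in this section, assuming the default class (which accepts every request from an active tenant); the function names are hypothetical and do not correspond to Liqo code.

```python
def slice_resources_condition(tenant_condition, previously_accepted):
    """Outcome for a ResourceSlice, given the provider-side tenant condition."""
    if tenant_condition == "Active":
        return "Accepted"  # assumption: default class, which always accepts
    if tenant_condition == "Cordoned":
        # Previously accepted slices are left untouched, new ones are denied.
        return "Accepted" if previously_accepted else "Denied"
    if tenant_condition == "Drained":
        return "Denied"    # existing slices are denied as well
    raise ValueError(f"unknown tenant condition: {tenant_condition!r}")


def admits_new_pods(tenant_condition, slice_cordoned=False):
    """New offloaded pods are admitted only for an active tenant on an uncordoned slice."""
    return tenant_condition == "Active" and not slice_cordoned


for condition in ("Active", "Cordoned", "Drained"):
    print(
        condition,
        slice_resources_condition(condition, previously_accepted=True),
        slice_resources_condition(condition, previously_accepted=False),
        admits_new_pods(condition),
    )
```

The only difference between Cordoned and Drained in this model is the fate of slices that were already accepted.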
Important
Cordon and drain are administrative operations on the provider: they affect what the provider is willing to grant.
Cordoning a tenant on the provider does not unpeer the consumer nor remove its VirtualNode.
As a result, pods scheduled on the virtual node may still appear Running until they are evicted or the slice is fully released.
Limitations
A few aspects of the current design are worth keeping in mind when designing a reservation policy:
- The default ResourceSlice class controller does not perform a cluster-wide capacity check: it accepts every request and may therefore grant more resources than the provider physically has, leaving the final arbitration to the standard Kubernetes scheduler on the provider. Cross-peering reservation requires a custom class controller paired with an admission webhook (or equivalent mechanism) that prevents consumers from selecting the lenient default class; Liqo does not ship that mechanism.
- Reducing the granted resources on a slice does not evict pods that are already running; cordon and drain are the supported way to actively reclaim capacity.
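The first limitation can be made concrete with a toy negotiation. The sketch compares a policy that accepts everything with a hypothetical capped policy that grants at most a fraction of the cluster capacity across all peerings. Both functions and the 50% threshold are assumptions for illustration, and only CPU (in whole cores) is tracked.

```python
def default_policy(requested_cpu, granted_cpu, capacity_cpu):
    """Lenient behavior: every request is accepted, regardless of capacity."""
    return True


def capped_policy(requested_cpu, granted_cpu, capacity_cpu, fraction=0.5):
    """Hypothetical custom policy: total grants never exceed a fraction of capacity."""
    return granted_cpu + requested_cpu <= capacity_cpu * fraction


def negotiate(policy, requests, capacity_cpu):
    """Apply a policy to a sequence of slice requests; return decisions and total granted."""
    granted = 0
    decisions = []
    for cpu in requests:
        accepted = policy(cpu, granted, capacity_cpu)
        decisions.append(accepted)
        if accepted:
            granted += cpu
    return decisions, granted


requests = [8, 8, 8]  # three consumers asking for 8 CPU each on a 16-CPU provider
print(negotiate(default_policy, requests, 16))  # ([True, True, True], 24)
print(negotiate(capped_policy, requests, 16))   # ([True, False, False], 8)
```

With the lenient policy the provider has promised 24 CPU on a 16-CPU cluster, and the shortfall only becomes visible when pods fail to schedule.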
What Liqo does not do
Liqo provides the basic mechanisms to negotiate and reserve resources, but a fully functional service may require cooperation between Liqo and external components that take care of preconditions outside Liqo’s scope. This section lists information and actions that may be required and that must be carried out outside Liqo; skipping them may cause the reservation to fail or to complete with incorrect outcomes.
- Discover what to ask for. The consumer must learn from the provider (out of band) which amounts and resource types are actually available before creating a ResourceSlice. Liqo has no inventory advertisement and the default class (on the provider side) accepts any request, so an improper reservation is automatically accepted by Liqo, and only manifests later with pods that may not be able to start (hence, in Pending state) on the provider cluster.
- Exchange cluster identities and peering credentials. Cluster IDs and a peering kubeconfig (typically generated on the provider with liqoctl generate peering-user) must be exchanged through a secure channel before liqoctl peer runs. Liqo provides no directory or discovery service across clusters.
- Authorize the peering relationship. Someone with organizational authority must decide that the consumer is allowed to peer at all. Liqo’s handshake verifies who a peer is, not whether they should be admitted; there is no built-in approval workflow or admission queue at the organization level. A custom ResourceSlice class controller can implement such a workflow by accepting or denying slices based on organizational policy.
- Choose the provider’s reservation posture. The provider operator must set the Helm chart values offloading.defaultNodeResources and controllerManager.config.defaultLimitsEnforcement before any consumer peers. Both can be updated later via a Helm upgrade: defaultLimitsEnforcement then propagates to existing Quota objects on the next reconciliation, while defaultNodeResources only affects future slice negotiations; slices already accepted retain the amounts they were granted. The chart ships defaults suited to demos (4 CPU, 8Gi, None enforcement); using them unchanged in production typically grants more than the operator intends and enforces nothing at runtime.
- Agree on what each resource key physically represents. Kubernetes resource keys (cpu, memory, nvidia.com/gpu, …) are opaque strings, a limitation of how resource allocation worked in Kubernetes prior to Dynamic Resource Allocation (DRA), which Liqo inherits. What nvidia.com/gpu=2 actually means (A100s versus H100s, full cards versus MIG slices, with or without NVLink) must be agreed out of band. Such a mismatch (for example, the consumer asking for nvidia.com/gpu=2 expecting two H100 cards while the provider grants two A100 cards under the same key) is accepted at negotiation and only surfaces as wrong performance or runtime failure once workloads are scheduled.
- Ensure offloaded pods comply with the enforcement mode. When defaultLimitsEnforcement is Soft or Hard, the consumer’s pods must declare requests (Soft) or matching requests and limits (Hard) before being offloaded (see Resource quota enforcement for the per-mode semantics). The ShadowPod admission webhook rejects non-compliant pods at the provider, leaving the local pod in Pending state; making pod specs compliant is the consumer application teams’ responsibility, not Liqo’s.
- Provision the capacity advertised by the provider. The provider must actually have the physical or cloud capacity to honor what offloading.defaultNodeResources and granted ResourceSlice objects promise. The default class controller does not verify against real capacity, so over-promising surfaces only at runtime with pods resulting in a Pending state; cluster sizing (whether manual, via cluster-autoscaler or Karpenter, or through cloud quota requests) is the provider operator’s job.