Fine Tuning Error Detection
Fleet monitors the status field of deployed resources to determine whether a Bundle is healthy or in error. In certain cases, Fleet may interpret a condition in the status field as an error, even if it is expected or harmless.
You can adjust this behavior in two ways:
- Ignore conditions in fleet.yaml
- Customize error mappings with environment variables
You should rarely need to configure readiness detection in Fleet with environment variables. If you do, open an issue or submit a pull request to help improve the default readiness detection.
Ignore conditions in fleet.yaml
Use the ignore.conditions setting in the fleet.yaml file to tell Fleet to ignore specific conditions.
# from https://fleet.rancher.io/ref-fleet-yaml
# Ignore fields when monitoring a Bundle. This can be used when Fleet thinks
# some conditions in Custom Resources cause the Bundle to be in an error state
# when it shouldn't be.
ignore:
  # Conditions to be ignored
  conditions:
    # In this example a condition will be ignored if it contains
    # {"type": "Active", "status": "False"}
    - type: Active
      status: "False"
This method is useful when a custom resource or controller sets conditions that cause Fleet to mark a Bundle as failed, even though the resource is healthy.
Configure error mapping with environment variables
In Fleet v0.13, error detection was enhanced to give you more control. You can use the environment variable CATTLE_WRANGLER_CHECK_GVK_ERROR_MAPPING to customize how resource conditions are interpreted.
This variable lets you define, per Group, Version, Kind (GVK), which condition values should be treated as errors and which should explicitly not be treated as errors.
Set this variable in your Fleet Helm chart deployment (values.yaml) using extraEnv. The value must be JSON.
# Extra environment variables passed to the fleet pods.
# extraEnv:
# - name: OCI_STORAGE
#   value: "false"
This setting is global to all Fleet controllers and applies to every GitRepo. If you need to adjust error handling only for a specific Bundle, use the ignore.conditions setting in fleet.yaml instead.
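As a minimal sketch, an extraEnv entry that sets this variable might look like the following. It assumes the JSON structure used in the examples later on this page; the sample.cattle.io/v1 Sample kind is a placeholder, not a real resource. Single quotes around the value keep the JSON's double quotes intact in YAML.

```yaml
# values.yaml for the Fleet Helm chart (illustrative sketch)
extraEnv:
  - name: CATTLE_WRANGLER_CHECK_GVK_ERROR_MAPPING
    # The value must be a JSON string. The GVK below is a placeholder.
    value: '[{"gvk":"sample.cattle.io/v1, Kind=Sample","conditionMappings":[{"type":"Failed","status":["True"]}]}]'
```

Keeping the JSON on a single line avoids any question about how multi-line strings are rendered into the pod's environment.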
Merging behavior
When you override mappings with CATTLE_WRANGLER_CHECK_GVK_ERROR_MAPPING:
- New conditions are merged with the predefined conditions.
- Condition values are replaced for any condition you redefine.
For example, consider the default mapping:
HelmChart.Failed=["True"]
This means Failed=True is treated as an error.
When you override with:
HelmChart.Failed=["False"]
HelmChart.Ready=["False"]
This results in:
- Failed=["False"] replaces the default mapping. This means Failed=False is now treated as an error.
- Ready=["False"] is added, so Ready=False is also treated as an error.
- Other conditions are unchanged.
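The override above is written in shorthand. Expressed as an actual value for CATTLE_WRANGLER_CHECK_GVK_ERROR_MAPPING, it might look like the sketch below. This assumes that the GVK string for HelmChart follows the same "group/version, Kind=Name" pattern as the Sample examples on this page, which this page does not confirm.

```yaml
# values.yaml for the Fleet Helm chart (illustrative sketch)
extraEnv:
  - name: CATTLE_WRANGLER_CHECK_GVK_ERROR_MAPPING
    # Assumed GVK string for HelmChart; verify it against your Fleet version.
    # Failed is redefined, so its default values are replaced.
    # Ready is new, so it is added alongside the remaining defaults.
    value: '[{"gvk":"helm.cattle.io/v1, Kind=HelmChart","conditionMappings":[{"type":"Failed","status":["False"]},{"type":"Ready","status":["False"]}]}]'
```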
Disable error interpretation example
Assume that every value of Failed was previously interpreted as an error, for example:
{ "type": "Failed", "status": ["True", "False"] }
You can narrow this mapping to treat only Failed=True as an error by setting:
[
  {
    "gvk": "sample.cattle.io/v1, Kind=Sample",
    "conditionMappings": [
      { "type": "Failed", "status": ["True"] }
    ]
  }
]
This configuration means only Failed=True is treated as an error. Failed=False is no longer considered an error.
You can also disable errors for any value of Failed by setting the status list to a single empty string:
{ "type": "Failed", "status": [""] }
This configuration ensures that no value of Failed is treated as an error.
Overriding conditions only affects the default error mappings (refer to Default error mappings). Fleet may still mark a resource as an error because other checks, such as those from the kstatus library, continue to run after your customization.
Enable error interpretation example
To have Fleet treat a condition value as an error for a given GVK, list that value in the mapping:
[
  {
    "gvk": "sample.cattle.io/v1, Kind=Sample",
    "conditionMappings": [
      { "type": "Failed", "status": ["True"] }
    ]
  }
]
Here, Failed=True is treated as an error.
Default error mappings
Fleet adds default error mappings to interpret certain resource conditions in the status field as errors. These mappings are applied in addition to other readiness checks, such as those performed by the Kubernetes kstatus library.
The following default mappings apply:
- HelmChart (helm.cattle.io/v1)
  - JobCreated: Neither True nor False is considered an error.
  - Failed: True is considered an error.
- Node (v1)
  - OutOfDisk: True is considered an error.
  - MemoryPressure: True is considered an error.
  - DiskPressure: True is considered an error.
  - NetworkUnavailable: True is considered an error.
- Deployment (apps/v1)
  - ReplicaFailure: True is considered an error.
  - Progressing: False is considered an error.
- ReplicaSet (apps/v1)
  - ReplicaFailure: True is considered an error.
Fallback mapping
If a resource does not match the listed GVKs, Fleet applies a fallback mapping:
- Any Group and Version with any Kind
  - Stalled: True is considered an error.
  - Failed: True is considered an error.