Solving 503 Errors in Kubernetes 1.25+

A Tale of Woe

After upgrading to Kubernetes 1.25+, we started getting 503 errors when trying to access many of our services from outside the cluster. We use an Ingress for each service and an NGINX Ingress Controller. A couple of the services worked, but most didn’t. Googling wasn’t much help. Eventually, we found that the working Ingresses had an annotation that the failing ones didn’t. Taking a shot in the dark, we added the annotation to the failing Ingresses, and that fixed the problem.

A few months later, after a restart of the Ingress Controller Deployment, the problem came back. Mysteriously, apps that had been running fine started failing with 503s. In my younger days, I would have remembered the problem from before, but we ended up attacking it anew, which took much longer than it should have. Somehow, the annotation we had added months earlier to fix the issue had been removed from the working Ingresses. As before, Googling wasn’t much help, nor was live support from Microsoft.

Even Googling for the details of the fix today, "nginx.ingress.kubernetes.io/service-upstream" "503", doesn’t return useful results, so I’m writing this post to help me remember, and hopefully help others.

The Solution

To fix the problem, we added the following annotation to the ingress and the 503 errors went away:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/service-upstream: "true"

At the time, we simply used k9s or OpenLens to set the annotation manually. For the permanent fix, we updated the Helm chart for the Ingress to add the annotation. How the annotations were lost after a controller restart is yet to be determined.

Mass Updates of Annotations

In case this happens yet again, I’ve created a PowerShell gist here that adds an annotation to all the Ingresses (or to any other resource type). Here’s an example of using it to add the annotation that fixes the 503 errors:

.\Add-Annotation.ps1 -ResourceName ingress -AnnotationName "nginx.ingress.kubernetes.io/service-upstream" -AnnotationValue "true" | ft

ResourceName Name             Exists ValueMatched Updated
------------ ----             ------ ------------ -------
ingress      minimal2-ingress   True        False    True
ingress      minimal1-ingress   True        False    True

The script first checks whether the annotation is already set and whether its value matches. It outputs objects with the properties shown above. It also supports the -WhatIf parameter, so you can see what it would do without actually doing it.
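
For readers who can’t open the gist, the check-then-update idea can be sketched in a few lines. The Python below is not the gist’s code. It is a minimal illustration that assumes each resource is a dictionary shaped like an item from `kubectl get ingress -o json`, and the names `check_annotation` and `build_merge_patch` are made up for this example.

```python
from typing import Any

# Annotation discussed in this post.
ANNOTATION = "nginx.ingress.kubernetes.io/service-upstream"


def check_annotation(resource: dict[str, Any], name: str, value: str) -> dict[str, Any]:
    """Report whether `name` is set on the resource and whether it equals `value`.

    `resource` is assumed to look like one item of `kubectl get ... -o json`.
    """
    metadata = resource.get("metadata") or {}
    annotations = metadata.get("annotations") or {}  # key may be missing or null
    exists = name in annotations
    matched = exists and annotations[name] == value
    return {
        "name": metadata.get("name", ""),
        "exists": exists,
        "value_matched": matched,
        "needs_update": not matched,  # update only when missing or different
    }


def build_merge_patch(name: str, value: str) -> dict[str, Any]:
    """Build a JSON merge patch body that sets a single annotation."""
    return {"metadata": {"annotations": {name: value}}}


if __name__ == "__main__":
    # Hypothetical sample data, not real cluster output.
    ingresses = [
        {"metadata": {"name": "minimal1-ingress", "annotations": {ANNOTATION: "false"}}},
        {"metadata": {"name": "minimal2-ingress"}},
    ]
    for ingress in ingresses:
        print(check_annotation(ingress, ANNOTATION, "true"))
```

The patch body could then be applied with something like `kubectl patch ingress <name> --type merge -p '<json>'`, and only to resources where `needs_update` is true. Skipping that last step gives the same kind of dry run that -WhatIf provides.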

Summary

By writing this post I hope to never forget this simple solution to a vexing problem. I also hope it helps others who are struggling with the same issue.