Several recent reports describe Kubernetes namespaces getting stuck in a terminating state. This article explains why this happens in the first place and walks through the steps you can take to resolve it. Let’s start with what the Kubernetes namespace stuck in terminating issue actually is.
What Is the Kubernetes Namespace Stuck in Terminating Issue?
To understand the issue, it is important to first get familiar with what a namespace is. A Kubernetes namespace is a logical partition of cluster resources that the control plane uses to group and isolate deployed applications. A namespace is usually created when a new application is first deployed to the cluster, and it stays in the “Active” state while the application uses it. When you delete a namespace, it switches to the “Terminating” state while the control plane removes every resource inside it; once everything has been cleaned up, the namespace itself is deleted. In some instances, however, the namespace gets stuck in the Terminating state indefinitely and never finishes deleting, even after multiple attempts to remove or re-create it. When this occurs, there are a few actions you can take to fix it. We’ll examine some of the most typical reasons for this problem and how to resolve them.
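You can confirm that a namespace is stuck by checking its status; a stuck namespace reports the Terminating phase (here $NAMESPACE stands for the affected namespace’s name):
kubectl get namespace $NAMESPACE
kubectl get namespace $NAMESPACE -o jsonpath='{.status.phase}'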
Why Does the Namespace Get Stuck in the Terminating State?
There are a few common reasons why a namespace might be stuck in a terminating state:
Reason 1: Common Operator Error
The most common cause is operator error, where an operator accidentally deletes or stops the service that keeps the namespace alive.
Reason 2: Improper Configuration
Another common reason is that the underlying cluster is not configured correctly. If the cluster is configured with multiple masters and one master is suddenly removed, the remaining components may lose their connection to a viable master, which can leave namespaces across the cluster stuck in the terminating state.
Reason 3: Network Connectivity Issues
Sometimes an underlying problem, such as a network connectivity issue, can cause the pods running inside the namespace to terminate abruptly and leave the namespace stuck in the terminating state. It is crucial to track a cluster’s metrics and inspect them frequently to make sure no underlying issue is causing downtime for your applications.
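A quick first check is to look for NotReady nodes and for pods that are stuck in the affected namespace ($NAMESPACE again stands for its name):
kubectl get nodes
kubectl get pods -n $NAMESPACE -o wide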
Reason 4: Finalizers
Finally, namespaces have finalizers, which are defined under the spec. A finalizer is a key that instructs Kubernetes to hold off on fully deleting a resource until a particular condition is met. So, when a command to delete a namespace is executed, Kubernetes checks its finalizers before removing it. If the cleanup a finalizer is waiting for can never complete, the namespace cannot finish terminating either, and it can sit in the terminating state for days, months, or even years.
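To see whether a finalizer is what is holding a namespace up, you can inspect both places finalizers can live (these commands assume $NAMESPACE is set to the stuck namespace):
kubectl get namespace $NAMESPACE -o jsonpath='{.spec.finalizers}'
kubectl get namespace $NAMESPACE -o jsonpath='{.metadata.finalizers}'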
How Can We Fix This Issue?
Here are some simple steps you can follow to fix the issue:
Update Your Kubernetes Nodes
First, ensure that your system is up to date by updating your K8s nodes to the latest release version. Some older versions have a flaw that can interfere with the functioning of the kubelet service and cause this failure.
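A quick way to see which version each node is running, and which version the control plane reports, is:
kubectl get nodes
kubectl version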
Restart the Kubernetes Master Process
If the issue persists after the previous step, you can try restarting the Kubernetes master process. This terminates any worker processes that might be stuck and lets them exit gracefully without causing problems for other pods.
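How you restart the control plane depends on how it was deployed. On a master node where the components run as systemd services, a minimal sketch looks like this (the service names are assumptions and vary between distributions and installers):
sudo systemctl restart kube-apiserver
sudo systemctl restart kube-controller-manager
sudo systemctl restart kube-scheduler
On a kubeadm-based cluster, the control-plane components run as static pods instead and are managed by the kubelet on that node rather than through systemd.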
Recreating the Stuck Pods
If the namespace remains stuck in this status after you restart the master process, the next step is to recreate the stuck pods. This involves copying them to a different namespace and deleting the broken pods in the original namespace. Once you have done this, verify that all of the recreated pods are running correctly in the target namespace; if any of them are not working properly, restore them. This helps to resolve the issue with the namespace in Kubernetes. Finally, confirm that all your containers are running correctly and that the broken pods are no longer running anywhere in the cluster.
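A minimal sketch of this, assuming $POD, $NAMESPACE (the stuck namespace), and $TARGET_NAMESPACE are set, is to export the pod spec, recreate it in the target namespace, and force-delete the original:
kubectl get pod $POD -n $NAMESPACE -o yaml > pod.yaml
# Edit pod.yaml: change metadata.namespace to the target namespace and strip
# server-generated fields such as status, resourceVersion, and uid.
kubectl apply -f pod.yaml
kubectl delete pod $POD -n $NAMESPACE --grace-period=0 --force
kubectl get pods -n $TARGET_NAMESPACE
If the pods are managed by a Deployment or another controller, it is usually simpler to redeploy that controller in the target namespace instead of copying individual pods.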
Ensuring Sufficient Disk Space Is Available on the Cluster
If that does not work either, check whether there is adequate disk space available for storage on the cluster by running the following command on one of the nodes hosting the cluster:
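A command that matches this description on Linux nodes is df -h, which lists every mounted filesystem along with its used and available space:
df -h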
As the name indicates, this command gives you a list of the filesystems mounted on your system, along with the amount of space used by each device. You can use it to identify the devices that are running short on space and free up additional space on them as needed.
Running an Apt-Get Update and a Complete System Reboot
If this doesn’t resolve the issue, try running an apt-get update followed by a complete system reboot. This forces the package manager to check for new updates and install them. After the system is rebooted, run the same command you used earlier to identify any devices that are running out of storage space. Once you have identified the culprit, free up as much space as possible on that device so the kubelet service has room to allocate to the namespace. You might also try a different storage solution for your cluster if the underlying hardware is underpowered.
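On Debian or Ubuntu nodes, a sketch of this step looks like the following; run it on each affected node and re-check disk usage afterwards:
sudo apt-get update && sudo apt-get upgrade -y
sudo reboot
df -h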
Force Deleting the Namespace
You can also force delete the namespace by doing the following:
kubectl proxy &
kubectl get namespace $NAMESPACE -o json | jq '.spec = {"finalizers":[]}' > temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json http://127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize
In this example, the finalizers section is removed programmatically using jq; you could also edit it manually. By default, kubectl proxy creates a listener at 127.0.0.1:8001. If you know the hostname or IP address of your cluster master, you may be able to call the API server directly instead of going through the proxy.
Removing the Finalizer
You can also remove the finalizer from the namespace spec so that the deletion can complete. To do that, follow these steps:
1. First, dump the namespace spec in JSON format:
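For example, assuming $NAMESPACE is the stuck namespace and the output is saved as namespace.json, the file edited in the next step:
kubectl get namespace $NAMESPACE -o json > namespace.json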
2. Next, edit namespace.json and remove the “finalizers” entry from the spec, changing it from something like:
"spec": {
    "finalizers": [
        "kubernetes"
    ]
},
to:
"spec": {
},
3. After that, patch the namespace by doing the following:
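One way to apply the patch is with kubectl replace against the namespace’s finalize subresource, using the edited file from step 2:
kubectl replace --raw "/api/v1/namespaces/$NAMESPACE/finalize" -f ./namespace.json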
Conclusion
We briefly explained the issue of a namespace getting stuck in a terminating state, pointed out the most common reasons why this happens, and walked through the steps you can take to fix it.