Troubleshoot the agentic CLI for Azure Kubernetes Service (AKS) (preview)

This article provides guidance on troubleshooting common issues with the agentic CLI for Azure Kubernetes Service (AKS).

Common troubleshooting steps

If you run into any issues when you use the agentic CLI for AKS, try the following troubleshooting steps:

  • If the output shows requests to /chat/completions being retried, the LLM deployment is probably throttling you at its tokens-per-minute (TPM) limit. Increase the TPM limit or apply for more quota.
  • If outputs vary between runs, the cause is likely LLM response variability or an intermittent connection to the Model Context Protocol (MCP) server.
  • Ensure that the deployment name matches the model name in your Azure OpenAI deployments.
  • If the aks-agent installation fails, uninstall the Azure CLI and reinstall the latest version.
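
The deployment name check in the list above can be scripted. The following is a minimal sketch, not an official tool: find_mismatched_deployments is a hypothetical helper, and the commented usage assumes that your Azure OpenAI resource name is stored in a placeholder variable $AOAI_NAME and that each deployment reports its model under properties.model.name.

```shell
# Reads "deployment<TAB>model" pairs on stdin and prints every pair that differs.
# Returns 0 when every deployment name equals its model name, 1 otherwise.
find_mismatched_deployments() {
  awk -F '\t' '
    $1 != $2 { printf "MISMATCH: deployment=%s model=%s\n", $1, $2; bad = 1 }
    END { exit bad + 0 }
  '
}

# Usage (requires a signed-in Azure CLI; AOAI_NAME is a placeholder):
# az cognitiveservices account deployment list \
#   --name "$AOAI_NAME" --resource-group "$RESOURCE_GROUP" \
#   --query "[].[name, properties.model.name]" --output tsv |
#   find_mismatched_deployments
```

Keeping the comparison separate from the Azure CLI call lets you try the helper on sample input before you point it at a real resource.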

Error: Docker daemon not running

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

  1. If you receive an error indicating that the Docker daemon isn't running, start the Docker service using the appropriate steps for your operating system:

    • macOS / Windows:

      • Launch Docker Desktop from your applications.
      • Wait for Docker to start.
    • Linux:

      • Start the Docker service using the following commands:

        sudo systemctl start docker
        sudo systemctl enable docker  # Enable Docker to start on boot
        
  2. Verify Docker is running using the following command:

    docker info
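
Docker Desktop and the Linux service can take a few seconds to become ready after they start, so a single docker info call might fail even though startup is in progress. The following is a minimal sketch of a polling wrapper. The function name wait_for_docker and its default attempt count and delay are illustrative assumptions, not part of the agentic CLI.

```shell
# Polls `docker info` until the daemon responds or the attempts run out.
# $1 = maximum attempts (default 10), $2 = seconds between attempts (default 3).
wait_for_docker() {
  attempts=${1:-10}
  delay=${2:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if docker info >/dev/null 2>&1; then
      echo "Docker daemon is running (attempt $i)."
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Docker daemon did not respond after $attempts attempts." >&2
  return 1
}

# Usage: wait_for_docker 20 2
```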
    

Error: Docker permission denied

Got permission denied while trying to connect to the Docker daemon socket

To resolve Docker permission issues, ensure your user has the necessary permissions to access the Docker daemon using the steps for your operating system:

  • macOS / Windows:

    • Restart Docker Desktop to ensure it has the necessary permissions.
  • Linux:

    • Add your user to the docker group to allow non-root access to Docker using the following commands:

      sudo usermod -aG docker $USER
      newgrp docker  # Apply group changes immediately
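
On Linux, you can confirm that the group change reached your current session before you retry. The following is a minimal sketch. in_docker_group is a hypothetical helper that only inspects a list of group names, so it makes no claims about how Docker itself is configured.

```shell
# Succeeds when the space-separated group list in $1 contains "docker".
in_docker_group() {
  case " $1 " in
    *" docker "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: `id -nG` without a user name lists the groups of the current session.
# if in_docker_group "$(id -nG)"; then
#   echo "This session has docker group membership."
# else
#   echo "This session isn't in the docker group yet."
# fi
```

Calling id -nG without a user name reports the groups of the running session, which is what matters here. A session that started before the usermod command ran doesn't pick up the new group until you run newgrp or sign in again.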
    

Error: Docker image pull failures

Error response from daemon: pull access denied for aks-agent, repository does not exist or may require 'docker login'

To resolve Docker image pull failures, try the following steps:

  • Ensure you have internet connectivity.
  • Check if corporate firewalls are blocking Docker registry access.
  • Try initializing the agent again with az aks agent-init.

Azure credentials issues

Error: Azure authentication failed

To resolve Azure authentication issues, ensure your Azure CLI is properly authenticated and has access to the necessary resources using the following steps:

  1. Verify your Azure credentials are properly configured using the az account show command.

    az account show
    
  2. If needed, sign in again using the az login command.

    az login
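
The two steps above can be combined into a single check. The following is a minimal sketch that assumes an interactive terminal is available for az login. The function name ensure_az_login is illustrative.

```shell
# Prints the active subscription name, signing in first when no session exists.
ensure_az_login() {
  if ! az account show --output none 2>/dev/null; then
    echo "No active Azure CLI session; running az login." >&2
    az login --output none || return 1
  fi
  az account show --query name --output tsv
}

# Usage: ensure_az_login
```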
    

Service account and RBAC issues

Error: Service account not found

Error: service account "aks-mcp" not found in namespace "default"

To resolve service account issues, ensure the Kubernetes service account is properly created and configured using the following steps:

  1. Verify the service account exists using the following command:

    kubectl get serviceaccount aks-mcp --namespace $NAMESPACE
    
  2. If the service account isn't found, create one using the steps in Create a service account and configure workload identity for the agentic CLI for Azure Kubernetes Service (AKS) (preview).

Error: Permission denied errors

Error: forbidden: User "system:serviceaccount:<namespace>:aks-mcp" cannot get resource "pods" in API group "" in the namespace "<namespace>"

To resolve permission denied errors, ensure the Kubernetes service account has the necessary RBAC permissions using the following steps:

  1. Verify RBAC permissions are correctly configured using the following commands:

    kubectl get role aks-mcp-role --namespace $NAMESPACE
    kubectl get rolebinding aks-mcp-rolebinding --namespace $NAMESPACE
    
  2. Check that the RoleBinding associates the correct service account with the Role using the following command:

    kubectl describe rolebinding aks-mcp-rolebinding --namespace $NAMESPACE
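
Beyond inspecting the Role and RoleBinding, you can ask the API server directly whether the service account is allowed to perform the action named in the error. The following is a minimal sketch that uses kubectl auth can-i with impersonation. It assumes your own user is permitted to impersonate service accounts, and can_aks_mcp is a hypothetical helper name.

```shell
# Asks the API server whether the aks-mcp service account can perform a verb
# on a resource. $1 = namespace, $2 = verb (default: get), $3 = resource (default: pods).
can_aks_mcp() {
  ns=$1
  verb=${2:-get}
  resource=${3:-pods}
  kubectl auth can-i "$verb" "$resource" \
    --as "system:serviceaccount:${ns}:aks-mcp" \
    --namespace "$ns"
}

# Usage: can_aks_mcp "$NAMESPACE" get pods
```

kubectl auth can-i prints yes or no and sets its exit status to match, so the helper can also be used as a condition in a script.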
    

Workload identity issues

Error: Workload identity not enabled

Error: workload identity is not enabled on this cluster

If you receive an error indicating that workload identity isn't enabled, verify that your AKS cluster has workload identity enabled using the following steps:

  1. Check if workload identity is enabled on your AKS cluster using the az aks show command.

    az aks show --resource-group $RESOURCE_GROUP --name $CLUSTER_NAME --query "securityProfile.workloadIdentity.enabled"
    
  2. If workload identity isn't enabled, follow the steps in Create a service account and configure workload identity for the agentic CLI for Azure Kubernetes Service (AKS) (preview) to enable workload identity on your cluster.
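
The check in step 1 can be wrapped so that a script can act on the result. The following is a minimal sketch. check_workload_identity is a hypothetical helper, and it treats any value other than true (including an empty result) as not enabled.

```shell
# Reports whether workload identity is enabled. $1 = resource group, $2 = cluster name.
# Returns 0 when enabled, 1 when not enabled, 2 when the az call itself fails.
check_workload_identity() {
  enabled=$(az aks show --resource-group "$1" --name "$2" \
    --query "securityProfile.workloadIdentity.enabled" --output tsv) || return 2
  case "$enabled" in
    true|True) echo "Workload identity is enabled on $2."; return 0 ;;
    *) echo "Workload identity is not enabled on $2."; return 1 ;;
  esac
}

# Usage: check_workload_identity "$RESOURCE_GROUP" "$CLUSTER_NAME"
```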

Error: Annotation missing

Error: service account does not have workload identity annotation

To resolve missing annotation errors, ensure the Kubernetes service account has the correct workload identity annotation using the following steps:

  1. Check if the annotation exists on the service account using the following command:

    kubectl describe serviceaccount aks-mcp --namespace $NAMESPACE
    
  2. If the annotation is missing, add it using the following command. Replace $CLIENT_ID with the client ID of the managed identity that the federated identity credential belongs to.

    kubectl annotate serviceaccount aks-mcp --namespace $NAMESPACE azure.workload.identity/client-id="$CLIENT_ID" --overwrite
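
To read only the annotation value instead of scanning the full describe output, you can use a JSONPath query. The following is a minimal sketch. get_workload_identity_client_id is a hypothetical helper, and the dots inside the annotation key are escaped because JSONPath otherwise treats them as path separators.

```shell
# Prints the client ID stored in the workload identity annotation on aks-mcp.
# $1 = namespace. Returns 1 when the annotation is absent, 2 when kubectl fails.
get_workload_identity_client_id() {
  client_id=$(kubectl get serviceaccount aks-mcp --namespace "$1" \
    --output jsonpath='{.metadata.annotations.azure\.workload\.identity/client-id}') || return 2
  if [ -z "$client_id" ]; then
    echo "aks-mcp in namespace $1 has no azure.workload.identity/client-id annotation." >&2
    return 1
  fi
  echo "$client_id"
}

# Usage: get_workload_identity_client_id "$NAMESPACE"
```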
    

Error: Federated credential propagation delay

If you receive errors related to the federated identity credential not being found or authentication failures, it might be due to propagation delays after creating the federated identity credential in Azure. To resolve this issue, try the following steps:

  1. Wait a few minutes for the federated identity credential to propagate across Azure services.
  2. Verify the federated identity credential exists using the az identity federated-credential list command.

    az identity federated-credential list --identity-name $IDENTITY_NAME --resource-group $RESOURCE_GROUP
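
The wait-and-verify steps above can be sketched as a polling loop. This is a minimal sketch under stated assumptions: wait_for_federated_credential is a hypothetical helper, the default of 10 attempts at 30-second intervals is arbitrary, and a credential that appears in the list might still need more time before authentication succeeds.

```shell
# Polls until at least one federated identity credential is listed for the identity.
# $1 = identity name, $2 = resource group, $3 = attempts (default 10), $4 = delay in seconds (default 30).
wait_for_federated_credential() {
  identity=$1
  group=$2
  attempts=${3:-10}
  delay=${4:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    count=$(az identity federated-credential list \
      --identity-name "$identity" --resource-group "$group" \
      --query "length(@)" --output tsv 2>/dev/null)
    if [ "${count:-0}" -gt 0 ]; then
      echo "Found $count federated identity credential(s) for $identity."
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "No federated identity credential listed for $identity after $attempts attempts." >&2
  return 1
}

# Usage: wait_for_federated_credential "$IDENTITY_NAME" "$RESOURCE_GROUP"
```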

Initialization issues

Error: Extension not found

ERROR: The command 'aks agent' is invalid or not supported. Use 'az aks --help' to see available commands

To resolve extension not found errors, ensure the aks-agent extension is properly installed and loaded using the following steps:

  1. Install the aks-agent extension using the az extension add command.

    az extension add --name aks-agent --debug
    
  2. Verify successful installation using the az extension list command.

    az extension list
    

    Your output should include an entry for aks-agent.
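
The install-and-verify steps above can be sketched as a single idempotent check. This is a minimal sketch. ensure_aks_agent_extension is a hypothetical helper, and it uses az extension show to test for the extension instead of scanning the full az extension list output.

```shell
# Installs the aks-agent extension only when it isn't already present.
ensure_aks_agent_extension() {
  if az extension show --name aks-agent --output none 2>/dev/null; then
    echo "aks-agent extension is already installed."
  else
    az extension add --name aks-agent || return 1
    echo "aks-agent extension installed."
  fi
}

# Usage: ensure_aks_agent_extension
```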