Troubleshooting Prometheus integration

Prometheus is unreachable

If a Prometheus instance installed with FSM can’t be reached, perform the following steps to identify and resolve any issues.

  1. Verify a Prometheus Pod exists.

    When FSM is installed with fsm install --set=fsm.deployPrometheus=true, a Prometheus Pod with a name like fsm-prometheus-5794755b9f-rnvlr should exist in the same namespace as the other FSM control plane components, which is fsm-system by default.

    If no such Pod is found, use helm to verify that the FSM Helm chart was installed with the fsm.deployPrometheus parameter set to true:

    $ helm get values -a <mesh name> -n <FSM namespace>

    If the parameter is set to anything but true, reinstall FSM with the --set=fsm.deployPrometheus=true flag on fsm install.

  2. Verify the Prometheus Pod is healthy.

    The Prometheus Pod identified above should be in the Running state with all of its containers ready, as in the following kubectl get output:

    $ # Assuming FSM is installed in the fsm-system namespace:
    $ kubectl get pods -n fsm-system -l app=fsm-prometheus
    NAME                              READY   STATUS    RESTARTS   AGE
    fsm-prometheus-5794755b9f-67p6r   1/1     Running   0          27m

    If the Pod is not Running, or not all of its containers are ready, use kubectl describe to look for the cause:

    $ # Assuming FSM is installed in the fsm-system namespace:
    $ kubectl describe pods -n fsm-system -l app=fsm-prometheus

    Once the Prometheus Pod is found to be healthy, Prometheus should be reachable.
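
The two checks above can also be scripted. The following is a minimal sketch, not part of the FSM tooling: the function names helm_flag_true and pods_healthy are hypothetical, and both only parse text piped to them, assuming the YAML layout printed by helm get values -a and the default table printed by kubectl get pods.

```shell
# Hypothetical helpers; neither is part of fsm, helm, or kubectl.

# helm_flag_true KEY
# Reads `helm get values -a` YAML on stdin and succeeds only if a line of the
# form "KEY: true" is present (at any indentation level).
helm_flag_true() {
  grep -Eq "^[[:space:]]*$1:[[:space:]]*true[[:space:]]*\$"
}

# pods_healthy
# Reads the default `kubectl get pods` table on stdin. Prints every Pod that
# is not Running or whose READY count is incomplete, and fails if any such Pod
# exists or if no Pods are listed at all.
pods_healthy() {
  awk '
    NR > 1 {
      seen = 1
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) {
        print "unhealthy: " $1 " (" $2 ", " $3 ")"
        bad = 1
      }
    }
    END { if (bad || !seen) exit 1 }
  '
}
```

With the functions loaded into the current shell, they could be combined with the commands from the steps above, for example:

    $ helm get values -a <mesh name> -n <FSM namespace> | helm_flag_true deployPrometheus && echo "Prometheus deployment enabled"
    $ kubectl get pods -n fsm-system -l app=fsm-prometheus | pods_healthy && echo "Prometheus Pod healthy"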

Metrics are not showing up in Prometheus

If Prometheus is found not to be scraping metrics for any Pods, perform the following steps to identify and resolve any issues.

  1. Verify application Pods are working as expected.

    If workloads running in the mesh are not functioning properly, metrics scraped from those Pods may not look correct. For example, if metrics showing traffic to Service A from Service B are missing, ensure the services are communicating successfully.

    To help further troubleshoot these kinds of issues, see the traffic troubleshooting guide.

  2. Verify the Pods whose metrics are missing have a Pipy sidecar injected.

    Only Pods with a Pipy sidecar container are expected to have their metrics scraped by Prometheus. Ensure each Pod is running a container from an image with flomesh/pipy in its name:

    $ kubectl get po -n <pod namespace> <pod name> -o jsonpath='{.spec.containers[*].image}'
    mynamespace/myapp:v1.0.0 flomesh/pipy:0.50.0

  3. Verify the proxy’s endpoint being scraped by Prometheus is working as expected.

    Each Pipy proxy exposes an HTTP endpoint that shows metrics generated by that proxy and is scraped by Prometheus. Check to see if the expected metrics are shown by making a request to the endpoint directly.

    For each Pod whose metrics are missing, use kubectl to forward the Pipy proxy admin interface port and check the metrics:

    $ kubectl port-forward -n <pod namespace> <pod name> 15000

    Go to http://localhost:15000/stats/prometheus in a browser to check the metrics generated by that Pod. If Prometheus does not seem to be accounting for these metrics, move on to the next step to ensure Prometheus is configured properly.

  4. Verify the intended namespaces have been enrolled in metrics collection.

    For each namespace that contains Pods which should have metrics scraped, ensure the namespace is monitored by the intended FSM instance with fsm mesh list.

    Next, check that the namespace carries the flomesh.io/metrics annotation with the value enabled:

    $ # Prints the value of the flomesh.io/metrics annotation, if any:
    $ kubectl get namespace <namespace> -o jsonpath='{.metadata.annotations.flomesh\.io/metrics}'

    If no such annotation exists on the namespace or it has a different value, fix it with fsm:

    $ fsm metrics enable --namespace <namespace>
    Metrics successfully enabled in namespace [<namespace>]

  5. If custom metrics are not being scraped, verify they have been enabled.

    Custom metrics are currently disabled by default and are enabled when the fsm.featureFlags.enableWASMStats parameter is set to true. Verify the current FSM instance has this parameter set for a mesh named <fsm-mesh-name> in the <fsm-namespace> namespace:

    $ helm get values -a <fsm-mesh-name> -n <fsm-namespace>

    Note: replace <fsm-mesh-name> with the name of the FSM mesh and <fsm-namespace> with the namespace where FSM was installed.

    If fsm.featureFlags.enableWASMStats is not set to true, reinstall FSM and pass --set=fsm.featureFlags.enableWASMStats=true to fsm install.
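
Steps 2 and 4 above lend themselves to a small script when many Pods or namespaces need checking. The following is a minimal sketch, assuming the jsonpath output shown in those steps is piped in unchanged; has_pipy_sidecar and metrics_annotation_enabled are hypothetical names, not fsm commands.

```shell
# Hypothetical helpers; neither is part of fsm or kubectl.

# has_pipy_sidecar
# Reads the space-separated image list produced by
#   kubectl get po ... -o jsonpath='{.spec.containers[*].image}'
# on stdin and succeeds if any image name contains "flomesh/pipy".
has_pipy_sidecar() {
  tr ' ' '\n' | grep -q 'flomesh/pipy'
}

# metrics_annotation_enabled
# Reads the annotation value printed by the jsonpath query in step 4 on stdin
# and succeeds only if it is exactly "enabled". An empty value (annotation
# missing) fails.
metrics_annotation_enabled() {
  value=$(cat)
  [ "$value" = "enabled" ]
}
```

With the functions loaded into the current shell, they could be used like this:

    $ kubectl get po -n <pod namespace> <pod name> -o jsonpath='{.spec.containers[*].image}' | has_pipy_sidecar || echo "no Pipy sidecar"
    $ kubectl get namespace <namespace> -o jsonpath='{.metadata.annotations.flomesh\.io/metrics}' | metrics_annotation_enabled || fsm metrics enable --namespace <namespace>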

