Set up Kubernetes monitoring using kube-state-metrics (KSM) and Prometheus Agent
A step-by-step guide to ingesting Kubernetes metrics with Prometheus Agent and sending them to Last9 via remote write.
Prerequisites
- Ensure that your kubectl configuration points to the right Kubernetes cluster
- Create a Last9 cluster by following the Quick start guide
What is kube-state-metrics (KSM)?
kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. It is not focused on the health of the individual Kubernetes components but on the health of the various objects inside, such as deployments, nodes, and pods.
By default, the metrics are served as plaintext over HTTP at the /metrics endpoint on port 8080. They are designed to be consumed either by Prometheus itself or by any scraper that can scrape a Prometheus client endpoint. You can also open /metrics in a browser to see the raw metrics. Note that the metrics exposed on the /metrics endpoint reflect the current state of the Kubernetes cluster: when Kubernetes objects are deleted, they are no longer visible on the endpoint.
The documentation for the metrics exposed by KSM can be found here.
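To make the plaintext format of the /metrics endpoint concrete, here is a minimal Python sketch that parses Prometheus-style exposition text into (name, labels, value) tuples. It is an illustration under assumptions, not part of KSM: the helper names parse_metrics and fetch_metrics are hypothetical, the sample values are made up, and fetch_metrics assumes the endpoint has been made reachable locally, for example with kubectl port-forward -n kube-system svc/kube-state-metrics 8080:8080.

```python
import re
from urllib.request import urlopen

# Illustrative sample only: the metric names follow KSM naming, but the
# label values and numbers are made up for this example.
SAMPLE = """\
# HELP kube_deployment_status_replicas The number of replicas per deployment.
# TYPE kube_deployment_status_replicas gauge
kube_deployment_status_replicas{namespace="kube-system",deployment="kube-state-metrics"} 1
kube_pod_status_phase{namespace="default",pod="web-0",phase="Running"} 1
"""

# metric_name, optional {label block}, then the sample value.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair such as namespace="default"; allows escaped characters.
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_PAIR.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:8080/metrics"):
    """Fetch raw metrics text; assumes the endpoint is reachable at `url`."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels.get("namespace"), value)
```

Against a live cluster, parse_metrics(fetch_metrics()) would return the current samples. Because KSM reports current state only, a deleted object simply stops appearing in the parsed list.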
Automated installation (Preferred)
Step 1: Copy the installation command
Step 2: Run the installation command
Before running the command, replace the <Your Token Key> placeholder with the write token of the Last9 cluster.
Running the command downloads the manifest YAML to the current working directory. It is strongly recommended that you check the manifest file into git so that it can be extended later.
You can also follow the video to see the end-to-end setup.
Manual Installation
- Clone the GitHub repo
  git clone https://github.com/kubernetes/kube-state-metrics.git
- Deployment steps
  To deploy this project, run the following command from the root of the cloned repository. It creates a Kubernetes service and deployment.
  kubectl apply -f examples/standard
  Read more details on deployment here.
- Validate corresponding deployment
  kubectl get deployments kube-state-metrics -n kube-system
  This is the sample output that you should see.
  NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
  kube-state-metrics   1/1     1            1           6d1h
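If you want to script this validation, the READY column can be checked programmatically. The following is a minimal Python sketch, assuming you pass it the plain-text table printed by the kubectl get deployments command above. The helper name deployment_is_ready is hypothetical and not part of kubectl.

```python
def deployment_is_ready(table_text, name="kube-state-metrics"):
    """Return True if deployment `name` has all desired replicas ready.

    `table_text` is the plain-text table printed by
    `kubectl get deployments`. Illustrative helper, not part of kubectl.
    """
    rows = [line.split() for line in table_text.strip().splitlines()]
    if len(rows) < 2:
        return False  # no header, or no deployments listed
    header, body = rows[0], rows[1:]
    ready_col = header.index("READY")
    for row in body:
        if row[0] == name:
            # READY is printed as "<ready>/<desired>", e.g. "1/1".
            ready, desired = (int(part) for part in row[ready_col].split("/"))
            return desired > 0 and ready == desired
    return False


# The sample output shown in this guide.
sample = """\
NAME READY UP-TO-DATE AVAILABLE AGE
kube-state-metrics 1/1 1 1 6d1h
"""
print(deployment_is_ready(sample))  # prints True
```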
Configure remote write to Last9
If you already have a running Prometheus setup, add the following scrape configs and remote write settings to your Prometheus config file to send data to Last9.
# prometheus.yaml
scrape_configs:
  - job_name: "node-exporter"
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: "node-exporter"
        action: keep
  - job_name: "kubernetes-apiservers"
    kubernetes_sd_configs:
      - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - source_labels:
          [
            __meta_kubernetes_namespace,
            __meta_kubernetes_service_name,
            __meta_kubernetes_endpoint_port_name,
          ]
        action: keep
        regex: default;kubernetes;https
  - job_name: "kubernetes-nodes"
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels:
          [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
  - job_name: "kube-state-metrics"
    static_configs:
      - targets: ["kube-state-metrics.kube-system.svc.cluster.local:8080"]
  - job_name: "kubernetes-cadvisor"
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
  - job_name: "kubernetes-service-endpoints"
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels:
          [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels:
          [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels:
          [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
remote_write:
  - url: <Levitate_cluster_remote_write_url>
    remote_timeout: 60s
    queue_config:
      capacity: 10000
      max_samples_per_send: 3000
      batch_send_deadline: 20s
      min_shards: 4
      max_shards: 200
      min_backoff: 100ms
      max_backoff: 10s
    basic_auth:
      username: <Levitate_Cluster_Id>
      password: <Levitate_Cluster_Write_Token>
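The queue_config values in the remote_write section bound how many samples Prometheus may buffer in memory while sending to Last9. As a rough worked example, the Python sketch below multiplies them out, assuming (as the Prometheus remote write documentation describes) that capacity is counted per shard. The class and function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueConfig:
    """The subset of queue_config fields used in this estimate."""
    capacity: int              # samples buffered per shard
    max_samples_per_send: int  # samples per remote write request
    min_shards: int
    max_shards: int


def buffered_samples(cfg, shards):
    """Upper bound on samples held in memory with `shards` active shards."""
    return cfg.capacity * shards


# Values copied from the remote_write block in this guide.
cfg = QueueConfig(
    capacity=10000,
    max_samples_per_send=3000,
    min_shards=4,
    max_shards=200,
)

print(buffered_samples(cfg, cfg.min_shards))  # prints 40000
print(buffered_samples(cfg, cfg.max_shards))  # prints 2000000
# Ratio of per-shard buffer to batch size.
print(round(cfg.capacity / cfg.max_samples_per_send, 2))  # prints 3.33
```

With these settings, a fully scaled-out queue could hold up to two million samples in memory, so size the memory of the Prometheus pod with that worst case in mind. Each shard buffers a little over three batches, which keeps sends flowing without holding much more than necessary.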
- Replace the cluster value in external_labels as described in the comment. The external_labels setting belongs under the global section of the Prometheus config file.
global:
  external_labels:
    # TODO - replace xyz.acme.io with a logical name for the cluster being scraped
    # by Prometheus, e.g. prod1.xyz.com
    cluster: "xyz.acme.io"
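The relabel_configs in the scrape configuration use the replace action to rewrite the scrape address and metrics path. The Python sketch below emulates that one step to show what the regexes do: Prometheus joins the source label values with ";", matches the regex against the whole string, and writes the expanded replacement to the target label, leaving the label unchanged when the regex does not match. This is an approximation for illustration only. Prometheus uses RE2 syntax rather than Python's re module, the function name relabel_replace is hypothetical, and the IP address, ports, and node name are made up.

```python
import re

# Regex and replacement copied from the kubernetes-pods job in this guide.
ADDRESS_REGEX = r"([^:]+)(?::\d+)?;(\d+)"
ADDRESS_REPLACEMENT = "$1:$2"


def relabel_replace(source_values, regex, replacement, current_value):
    """Emulate one Prometheus `replace` relabel step for a single target.

    Returns the new value of the target label, or `current_value`
    unchanged when the regex does not match.
    """
    # Source label values are joined with the default separator ";".
    joined = ";".join(source_values)
    # Prometheus anchors relabel regexes at both ends.
    match = re.fullmatch(regex, joined)
    if match is None:
        return current_value
    # Translate $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    return match.expand(template)


# Pod annotated with prometheus.io/port: "9102" (hypothetical values).
print(relabel_replace(["10.0.0.5:8080", "9102"],
                      ADDRESS_REGEX, ADDRESS_REPLACEMENT, "10.0.0.5:8080"))
# prints 10.0.0.5:9102

# No port annotation: the regex does not match, the address is kept.
print(relabel_replace(["10.0.0.5:8080", ""],
                      ADDRESS_REGEX, ADDRESS_REPLACEMENT, "10.0.0.5:8080"))
# prints 10.0.0.5:8080

# Node metrics path from the kubernetes-nodes job (hypothetical node name).
print(relabel_replace(["node-1"], r"(.+)",
                      "/api/v1/nodes/${1}/proxy/metrics", "/metrics"))
# prints /api/v1/nodes/node-1/proxy/metrics
```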
If you do not have a Prometheus setup, you can set up vmagent instead.
Steps to uninstall the KSM setup
For automated setup
To uninstall the Kubernetes resources that were created by the automated installation, use the kubectl delete command with the -f flag pointing to the same YAML file. This deletes all the resources defined in the file.
kubectl delete -f kube-state-metrics.yml
This command removes the namespaces, deployments, services, service accounts, and any other resources defined in the kube-state-metrics.yml file.
For manual setup
Delete the created kube-state-metrics objects.
kubectl delete -f examples/standard