
Declarative Alerting via IaC

Levitate supports configuring alerts and notifications declaratively using a Python-based SDK tool that keeps the alerting configuration in step with infrastructure changes.

Configurations for alerting and notifications for observability at scale are hard to start, maintain, and fix manually, just like provisioning infrastructure at scale. As infrastructure changes, the observability stack must keep up with it to avoid issues caused by gaps in observability or black swan events. Last9 has introduced the l9iac tool to solve exactly this problem.


Start by installing the IaC (Infrastructure as Code) tool, which automates creating entities and configuring alerts. The binary can be obtained by signing up for Levitate and contacting Last9 customer support.


It is highly recommended to install the IaC tool inside a virtual environment, which isolates it from the rest of the system and makes testing and development easier. Instructions on how to set up a virtual environment can be found here.


cd <your workspace>
python -m venv env # this will create a ./env dir
source ./env/bin/activate

Quick Start:

  1. Create a YAML file describing your alert rule configuration.

    Example: notification_service_am.yaml

# notification_service_am.yaml
- name: Notification Backend Alert Manager
  type: service_alert_manager
  data_source: prod-cluster
  entity_class: alert-manager
  external_ref: unique-slug-identifier
  description: Error Rate described as number of 5xx/throughput
  team: payments
  indicators:
    - name: availability
      query: >-
        count(sum by (job, taskid)(up{job !~ "ome.*"}) > 0) / count(sum by
        (job, taskid) (up{job=~".*vmagent.*", job !~ "ome.*"})) * 100
    - name: loss_of_signal
      query: 'absent(up{job !~ "ome.*"})'
  alert_rules:
    - name: Availability of notification service should not be less than 95%
      description: >-
        The error rate (5xx / total requests) is what defines the
        availability, lower value means more degradation
      indicator: availability
      less_than: 99.5
      severity: breach
      bad_minutes: 3
      total_minutes: 5
      group_timeseries_notifications: false
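Before handing a manifest to the tool, it can help to sanity-check it once parsed. A minimal sketch, assuming the YAML parses into a Python dict with `indicators` and `alert_rules` lists as described in the schema tables below (`validate_entity` and the required-field sets are illustrative helpers, not part of l9iac):

```python
# Sketch: sanity-check a parsed alert-manager entity before running l9iac.
# `validate_entity` and the required-field sets are illustrative, not part
# of the l9iac tool; the field names come from the schema tables below.

REQUIRED_ENTITY_FIELDS = {"name", "type", "external_ref"}
REQUIRED_RULE_FIELDS = {"name", "indicator", "bad_minutes", "total_minutes"}

def validate_entity(entity: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"entity missing required field: {f}"
                for f in sorted(REQUIRED_ENTITY_FIELDS - entity.keys())]
    indicator_names = {i["name"] for i in entity.get("indicators", [])}
    for rule in entity.get("alert_rules", []):
        problems += [f"rule missing required field: {f}"
                     for f in sorted(REQUIRED_RULE_FIELDS - rule.keys())]
        # every rule must point at an indicator defined on the same entity
        if rule.get("indicator") not in indicator_names:
            problems.append(f"rule {rule.get('name')!r} references an unknown indicator")
        if rule.get("bad_minutes", 0) > rule.get("total_minutes", 0):
            problems.append("bad_minutes must not exceed total_minutes")
    return problems

entity = {
    "name": "Notification Backend Alert Manager",
    "type": "service_alert_manager",
    "external_ref": "unique-slug-identifier",
    "indicators": [{"name": "availability", "query": "..."}],
    "alert_rules": [{"name": "availability breach", "indicator": "availability",
                     "less_than": 99.5, "bad_minutes": 3, "total_minutes": 5}],
}
print(validate_entity(entity))  # -> [] when the manifest is consistent
```

A check like this catches typos in indicator names before `l9iac plan` does.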
  2. Prepare the configuration file for running the IaC tool.

The configuration file is a JSON file with the following structure.

"api_config": {
"read": {
"refresh_token": "<LAST9_API_READ_REFRESH_TOKEN>",
"api_base_url": "",
"org": "<ORG_SLUG>"
"write": {
"refresh_token": "<LAST9_API_WRITE_REFRESH_TOKEN>",
"api_base_url": "",
"org": "<ORG_SLUG>"
"delete": {
"refresh_token": "<LAST9_API_DELETE_REFRESH_TOKEN>",
"api_base_url": "",
"org": "<ORG_SLUG>"
"state_lock_file_path": "state.lock"
  • The refresh_token can be obtained from the API Access page of the Last9 dashboard. You need refresh tokens for all 3 operations (read, write, and delete), as the l9iac tool performs all three actions while applying the alert rules.
  • The <ORG_SLUG> is your organization's unique slug in Last9. It can be obtained from the API Access page of the Last9 dashboard.
  • The default api_base_url points to the Last9 SaaS API. If you are on an on-premise setup of Last9, contact Last9 support to get the api_base_url.
  • The state_lock_file_path is the name of the file where l9iac stores the current alerting state (along the same lines as Terraform's state.lock).
  3. Run the following command to do a dry run of the changes.
l9iac -mf notification_service_am.yaml -c config.json plan
  4. Run the following command to apply the changes.
l9iac -mf notification_service_am.yaml -c config.json apply

We will provision a GitOps flow that runs the apply command once changes are merged to the master branch in the GitHub repo. Contact Last9 support for more details.
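A GitOps flow like the one described can be sketched as a GitHub Actions workflow. Everything below (the workflow file name, secret name, and file paths) is illustrative, not an official Last9 workflow; it assumes the l9iac binary is already available on the runner:

```yaml
# .github/workflows/l9iac-apply.yml (illustrative sketch)
name: Apply alerting config
on:
  push:
    branches: [master]   # apply only after merges to master
jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Render config.json from a repository secret
        run: echo "$L9IAC_CONFIG" > config.json
        env:
          L9IAC_CONFIG: ${{ secrets.L9IAC_CONFIG }}
      - name: Apply alert rules
        run: l9iac -mf notification_service_am.yaml -c config.json apply
```

Keeping the refresh tokens in a repository secret (rather than committing config.json) mirrors standard IaC practice.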


Here is the complete schema for generating the above .yaml file.


Entity

An entity here can be treated as an individual alert manager.

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| name | string | false | true | Name of the entity (alert manager) |
| external_ref | string | true | true | External reference for the entity; a unique slug-format identifier for each alert manager |
| type | string | false | true | Type of the entity |
| entity_class | string | false | optional | Denotes the class of the entity. Supported values: alert-manager. |
| description | string | false | optional | Description of the entity |
| data_source | string | false | optional | Data source |
| data_source_id | string | false | optional | The ID of the data source |
| team | string | false | optional | The team that owns the entity |
| tier | string | false | optional | Tier of the entity |
| workspace | string | false | optional | Workspace of the entity |
| namespace | string | false | optional | The namespace of the entity |
| tags | array | false | optional | List of tags for the entity |
| indicators | array | false | optional | List of indicators for the entity |
| alert_rules | array | false | optional | List of alert rules for the entity |
| notification_channels | string OR array | false | optional | List of notification channels applicable to the entity |
| links | array | false | optional | List of links associated with the entity |


Indicators

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| name | string | uniqueness enforced at an entity level | required | Name of the indicator |
| query | string | false | required | The PromQL query for the indicator |
| unit | string | false | optional | Unit of the indicator |
| data_source | string | false | optional | Data source of the indicator (Levitate) |
| description | string | false | optional | Description of the indicator |

Alert Rules

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| name | string | true | required | Rule name that describes the alert |
| description | string | true | optional | Description for an alert rule that is included in the alert payload |
| indicator | string | false | required | Name of the indicator |
| greater_than | number | false | optional | Alert triggers when the indicator value is greater than this |
| less_than | number | false | optional | Alert triggers when the indicator value is less than this |
| bad_minutes | integer | false | required | Number of minutes the indicator must be in a bad state before alerting |
| total_minutes | integer | false | required | Total number of minutes the indicator is sampled over |
| is_disabled | boolean | false | optional | Whether the alert is disabled or not |
| runbook | | false | optional | Runbook link that is included as part of the alert payload |
| severity | string | false | optional | It can be threat or breach. |
| label_filter | map/object | false | optional | A mapping of the variables present in the indicator query and their pattern for this alert rule |
| expression | string | false | optional | The alert rule expression. To be used only for pattern-based alerts |
| group_timeseries_notifications | boolean | false | optional | Whether multiple affected time series in the alert are grouped into one notification |
| mute | boolean | false | optional | Whether alert notifications are muted |
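For example, a rule that should only fire for a subset of the time series matched by its indicator can use label_filter. The indicator name, label values, and thresholds below are illustrative:

```yaml
alert_rules:
  - name: High error rate on checkout jobs
    indicator: error_rate          # illustrative indicator name
    greater_than: 5
    severity: breach
    bad_minutes: 3
    total_minutes: 5
    label_filter:
      job: "checkout-.*"           # pattern applied to the `job` label
    group_timeseries_notifications: true   # one notification for all matching series
```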


Runbook

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| link | string | false | required | Runbook link that is included as part of the alert payload |

Notification Channels

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| name | string | false | required | Name of the notification channel |
| type | string | false | required | The type of notification channel. Allowed values: Slack, Pagerduty, OpsGenie |
| severity | string | false | optional | The severity of the alerts sent through this channel. Allowed values: threat, breach |
| mention | string OR list(string) | false | optional | Only applicable to Slack. The user or list of users to tag in the alert message. |
Links

| Field | Type | Unique | Required | Description |
|---|---|---|---|---|
| name | string | false | required | Display name of the link |
| url | string | false | required | The actual URL of the link |
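Put together, the notification_channels and links sections of an entity might look like this. The channel names, Slack user, and URL are illustrative:

```yaml
notification_channels:
  - name: payments-pagerduty
    type: Pagerduty
    severity: breach               # page only on breach
  - name: payments-slack
    type: Slack
    severity: threat
    mention: "@payments-oncall"    # `mention` is Slack-only
links:
  - name: Service runbook
    url: https://wiki.example.com/runbooks/notification-service
```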