
Percona Monitoring and Management (PMM) recently introduced the Integrated Alerting feature as a technical preview. It was an eagerly awaited feature, because PMM no longer needs to integrate with an external alerting system. Recently we blogged about the release of this feature.
PMM includes some built-in templates, and in this post, I am going to show you how to add your own alerts.
Enable Integrated Alerting
The first thing to do is navigate to the PMM Settings by clicking the wheel on the left menu, and choose Settings:
Next, go to Advanced Settings, and click on the slider to enable Integrated Alerting down in the “Technical Preview” section.
While you’re here, if you want to enable SMTP or Slack notifications, you can set them up right away in the new Communications tab, which appears once you hit “Apply Changes” to turn the feature on.
The example below shows how to configure email notifications through Gmail:
You should now see the Integrated Alerting option in the left menu under Alerting, so let’s go there next:
Configuring Alert Destinations
After clicking on the Integrated Alerting option, go to the Notification Channels to configure the destination for your alerts. At the time of this writing, email via your SMTP server, Slack and PagerDuty are supported.
Creating a Custom Alert Template
Alerts are defined using MetricsQL, which is backward compatible with PromQL (the Prometheus query language). As an example, let’s configure an alert to let us know if MongoDB is down.
First, let’s go to the Explore option from the left menu. This is the place to play with the different metrics available and create the expressions for our alerts:
To detect that MongoDB is down, one option is the up metric. The following expression returns 1 for each MongoDB service that can be scraped and 0 for each one that cannot, which is the signal we need for the alert:
up{service_type="mongodb"}
To validate this, I shut down a member of a 3-node replica set and verified that the expression returns 0 when the node is down:
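The same check can also be scripted. The sketch below is a minimal example that assumes the query result arrives in the standard Prometheus-compatible instant-query JSON shape; the service names (mongo1, mongo2, mongo3) and the sample payload are made up for illustration, and fetching the response from PMM is left out because the endpoint and authentication depend on your setup.

```python
import json


def down_services(response_text):
    """Return the service_name labels whose sample value is 0."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    down = []
    for series in payload["data"]["result"]:
        # Each instant-vector sample is a [timestamp, "value"] pair;
        # the value is encoded as a string.
        _timestamp, value = series["value"]
        if float(value) == 0:
            down.append(series["metric"].get("service_name", "<unknown>"))
    return sorted(down)


# Hypothetical response for up{service_type="mongodb"} with one member stopped.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"service_name": "mongo1"}, "value": [1609459200, "1"]},
            {"metric": {"service_name": "mongo2"}, "value": [1609459200, "0"]},
            {"metric": {"service_name": "mongo3"}, "value": [1609459200, "1"]},
        ],
    },
})

print(down_services(sample))
```

This mirrors what the alert expression does once `== 0` is appended: only the series with a value of 0 are kept.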
The next step is creating a template for this alert. I won’t go into a lot of detail here, but you can check Integrated Alerting Design in Percona Monitoring and Management for more information about how templates are defined.
Navigate to the Integrated Alerting page again, and click on the Add button, then add the following template:
---
templates:
  - name: MongoDBDown
    version: 1
    summary: MongoDB is down
    expr: |-
      up{service_type="mongodb"} == 0
    severity: critical
    annotations:
      summary: MongoDB is down ({{ $labels.service_name }})
      description: |-
        MongoDB {{ $labels.service_name }} on {{ $labels.node_name }} is down
This is what it looks like:
Next, go to the Alert Rules and create a new rule. We can use the Filters section to add comma-separated “key=value” pairs to filter alerts per node, per service, per agent, etc.
For example: node_id=/node_id/123456, service_name=mongo1, agent_id=/agent_id/123456
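To make the filter format concrete, here is a small sketch that parses a comma-separated “key=value” string and checks it against an alert's labels. The function names are hypothetical, and exact-match semantics are an assumption made for illustration; PMM's own filter handling may differ.

```python
def parse_filters(text):
    """Split 'k1=v1, k2=v2' into a dict, ignoring surrounding whitespace."""
    filters = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a key=value pair: {pair!r}")
        filters[key.strip()] = value.strip()
    return filters


def matches(labels, filters):
    """True when every filter key is present in labels with the same value."""
    return all(labels.get(key) == value for key, value in filters.items())


filters = parse_filters(
    "node_id=/node_id/123456, service_name=mongo1, agent_id=/agent_id/123456"
)
alert_labels = {
    "node_id": "/node_id/123456",
    "service_name": "mongo1",
    "agent_id": "/agent_id/123456",
    "severity": "critical",
}
print(matches(alert_labels, filters))
```

Under this reading, an alert must carry every listed label for the rule to apply, so adding more pairs narrows the rule.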
After you are done, hit the Save button and go to the Alerts dashboard to see if the alert is firing:
From this page, you can also silence any firing alerts.
If you configured email as a destination, you should have also received a message like this one:
For now, a single notification is sent. In the future, it will be possible to customize the behavior.
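The text of the notification comes from the template's annotations, where placeholders such as {{ $labels.service_name }} are replaced with the labels of the firing alert. The sketch below only imitates that substitution with a regular expression to show the idea; it is not the real template engine, and the label values and the choice to render missing labels as an empty string are assumptions.

```python
import re

# Matches placeholders of the form {{ $labels.some_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*\$labels\.(\w+)\s*\}\}")


def expand(annotation, labels):
    """Replace each {{ $labels.x }} with labels['x'] (empty if missing)."""
    return PLACEHOLDER.sub(lambda m: labels.get(m.group(1), ""), annotation)


# Hypothetical labels attached to a firing MongoDBDown alert.
labels = {"service_name": "mongo1", "node_name": "node-a"}
description = "MongoDB {{ $labels.service_name }} on {{ $labels.node_name }} is down"

print(expand(description, labels))  # MongoDB mongo1 on node-a is down
```

This is why it pays to include identifying labels in the annotations: the message then says which service and node triggered the alert.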
Creating MongoDB Alerts
In addition to the obvious “MongoDB is down” alert, there are a few more things we should monitor. For starters, I’d suggest creating alerts for the following conditions:
- Replica set member in an unusual state
mongodb_replset_member_state != 1 and mongodb_replset_member_state != 2
- Connections higher than expected
avg by (service_name) (mongodb_connections{state="current"}) > 5000
- Cache evictions higher than expected
avg by(service_name, type) (rate(mongodb_mongod_wiredtiger_cache_evicted_total[5m])) > 5000
- Low WiredTiger tickets
avg by(service_name, type) (max_over_time(mongodb_mongod_wiredtiger_concurrent_transactions_available_tickets[1m])) < 50
The values listed above are for illustrative purposes only; you need to decide the proper thresholds for your specific environment(s).
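When picking a threshold, it can help to see what an expression like the connections check computes. The sketch below reproduces the "avg by (service_name) ... > threshold" logic on in-memory samples; the sample values and service names are invented, and it is only an approximation of how the query engine evaluates the expression.

```python
from collections import defaultdict


def avg_by_service(samples):
    """Group (labels, value) samples by service_name and average each group."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels["service_name"]].append(value)
    return {name: sum(values) / len(values) for name, values in groups.items()}


def over_threshold(samples, threshold):
    """Keep only the services whose average exceeds the threshold."""
    return {
        name: avg
        for name, avg in avg_by_service(samples).items()
        if avg > threshold
    }


# Hypothetical mongodb_connections{state="current"} samples.
samples = [
    ({"service_name": "mongo1", "state": "current"}, 6200.0),
    ({"service_name": "mongo1", "state": "current"}, 5400.0),
    ({"service_name": "mongo2", "state": "current"}, 120.0),
]

print(over_threshold(samples, 5000))  # {'mongo1': 5800.0}
```

Running the same calculation against your own recent values is a quick way to judge whether a candidate threshold would fire too often or not at all.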
As another example, let’s add the alert template for the low WiredTiger tickets:
---
templates:
  - name: MongoDB Wiredtiger Tickets
    version: 1
    summary: MongoDB Wiredtiger Tickets low
    expr: avg by(service_name, type) (max_over_time(mongodb_mongod_wiredtiger_concurrent_transactions_available_tickets[1m])) < 50
    severity: warning
    annotations:
      description: "WiredTiger available tickets on (instance {{ $labels.node_name }}) are less than 50"
Conclusion
Integrated Alerting is a very useful addition to PMM. Although it is still in technical preview, it already ships with a few built-in alerts you can test, and you can define your own as well. Make sure to check the Integrated Alerting official documentation for more information about this topic.
Do you have any specific MongoDB alerts you’d like to see? Given the feature is still in technical preview, any contributions and/or feedback about the functionality are welcome as we’re looking to release this as GA very soon!