Since I started using Scalr to manage my Amazon Web Services farms, I have wanted more monitoring: statistical information on services, traffic, disk usage, and uptime, to name a few. Scalr has built-in basic event notifications such as host up and host down, and it provides very basic load statistics via RRDtool. I have used Zabbix for most projects I have worked on in the past, so I wanted to be able to use it with Scalr.

I am still testing the setup described here, so please keep that in mind. This is NOT a howto, but more of a brainstorm of how I plan on getting Zabbix integrated into my Scalr setup.

The Zabbix documentation (PDF) covers a few ways to use auto-discovery (page 173). For example, you can have Zabbix scan a block of IPs to find new Zabbix Agents that are running. So here is what I will have my Zabbix Server do:
- Look for new Zabbix Agents on my AWS internal IP range.
- If system.uname contains “Scalr”, add the host to the Scalr server group
- Server must be up for 30+ minutes
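Taken together, the three conditions above amount to a simple predicate. The sketch below is only an illustration of that logic in Python, not actual Zabbix configuration. The `DiscoveredHost` type, the `10.0.0.0/8` range, and the function name are assumptions I made up for the example, and I am assuming uptime comes from the agent's `system.uptime` item.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Assumed values for illustration only; adjust to the real environment.
AWS_INTERNAL_RANGE = ip_network("10.0.0.0/8")
MIN_UPTIME_SECONDS = 30 * 60  # "up for 30+ minutes"


@dataclass
class DiscoveredHost:
    """Hypothetical record of what a discovery check learned about a host."""
    ip: str
    uname: str           # value reported for system.uname
    uptime_seconds: int  # value reported for system.uptime (assumed source)


def qualifies_for_scalr_group(host: DiscoveredHost) -> bool:
    """Return True only if all three discovery conditions hold."""
    in_range = ip_address(host.ip) in AWS_INTERNAL_RANGE
    is_scalr = "Scalr" in host.uname
    up_long_enough = host.uptime_seconds >= MIN_UPTIME_SECONDS
    return in_range and is_scalr and up_long_enough
```

In Zabbix itself these checks would be expressed as discovery rule conditions rather than code, but writing them out this way helps me see that all three have to be true before a host gets added.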
There will be other stipulations for getting a server added to Zabbix. I will have system templates for each of my Scalr AMI roles. Once a server is added to Zabbix, it will be placed in its respective group and monitored for the items and triggers listed in the system template. There will also be a rule to remove old instances from Zabbix 24 hours after the host down trigger is received. This way I will not have a bunch of old, once-monitored instances cluttering the Zabbix database. If you also have Windows AWS instances, you can add a rule to monitor these as well. The AMI just needs to have the Zabbix Windows Agent installed.
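The cleanup rule is easy to reason about as a small function. This is a rough Python sketch of the idea only: `hosts_to_remove` and the `down_since` mapping are hypothetical names of mine, not anything Zabbix provides, and I am assuming the time of the host down trigger is recorded somewhere for each host.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Retention window taken from the plan: remove 24 hours after host down.
RETENTION = timedelta(hours=24)


def hosts_to_remove(down_since: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of hosts that have been down for 24 hours or more.

    down_since maps a host name to the time its host down trigger fired
    (a hypothetical structure used only for this illustration).
    """
    return sorted(
        name for name, went_down in down_since.items()
        if now - went_down >= RETENTION
    )
```

A host that went down 25 hours ago would be listed for removal, while one that went down an hour ago would be left alone in case it comes back.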