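Straight to the point. Before the script, one note: building this on Zabbix auto-discovery is not a good fit here. A production cluster can easily have several thousand queues, and discovery would generate far too many Zabbix items, so all things considered it is not the right tool. The approach below instead writes the results to a file and uses a template that has the Zabbix agent scan that file for a keyword on a schedule. I won't go into the template setup here; on to our Python script.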
```python
# -*- coding: utf-8 -*-
import re
import subprocess

THRESHOLD = 100000  # report queues holding at least this many messages
SKIP_PATTERN = re.compile(r'bac?kup', re.IGNORECASE)  # matches "bakup" and "backup"


def run(cmd):
    # universal_newlines=True makes check_output return text instead of bytes on Python 3
    output = subprocess.check_output(cmd, shell=True, universal_newlines=True)
    return output.strip().split('\n')


vhosts = run("/sbin/rabbitmqctl list_vhosts | grep -v 'Listing vhosts'")
exception_count = 0  # number of queues at or above the threshold

with open('./results.txt', 'w') as f:
    for vhost in vhosts:
        list_queue_cmd = ("/sbin/rabbitmqctl list_queues -p {0} "
                          "| grep -Ev 'Listing queues|Timeout:|name\tmessages'").format(vhost)
        try:
            queues = run(list_queue_cmd)
        except subprocess.CalledProcessError:
            # grep exits non-zero when nothing is left after filtering,
            # e.g. when the vhost has no queues
            continue
        for q in queues:
            q_item = q.split('\t')
            if len(q_item) < 2:
                continue  # skip blank or malformed lines
            q_name, q_num = q_item[0], q_item[1]
            if SKIP_PATTERN.search(q_name):
                continue  # ignore backup queues
            if int(q_num) >= THRESHOLD:
                # "Queue_Exception" is the keyword the Zabbix agent scans for
                f.write("Queue_Exception: queue: " + q_name + ' messages: ' + q_num + '\n')
                exception_count += 1
    if exception_count == 0:
        # nothing over the threshold: leave a single blank line in the file
        f.write("\n")
```
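If you want to check the filtering rules without a live broker, the per-line logic from the script above can be pulled into a pure function. This is only a sketch: `find_backlogged_queues` and its parameters are hypothetical names that do not appear in the original script, and the sample text simply mimics the tab-separated `name<TAB>messages` lines that the script expects from `rabbitmqctl list_queues`.

```python
import re

# Hypothetical helper for illustration; not part of the original script.
SKIP_PATTERN = re.compile(r'bac?kup', re.IGNORECASE)  # "bakup" or "backup"


def find_backlogged_queues(output, threshold=100000):
    """Return (name, count) pairs for non-backup queues at or above threshold."""
    hits = []
    for line in output.splitlines():
        fields = line.split('\t')
        # Skip headers, blank lines, and anything without a numeric count.
        if len(fields) < 2 or not fields[1].strip().isdigit():
            continue
        name, count = fields[0], int(fields[1])
        if SKIP_PATTERN.search(name):
            continue  # backup queues are allowed to pile up
        if count >= threshold:
            hits.append((name, count))
    return hits


# Made-up sample data in the assumed tab-separated format.
sample = "name\tmessages\norders\t250000\norders_backup\t900000\nemails\t12\n"
print(find_backlogged_queues(sample))
```

Keeping the parsing separate from the `subprocess` call makes it easy to try different thresholds or skip patterns against saved output before touching production.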
Tip: the threshold above is set to 100000; adjust it to match your own business workload.
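One way to make the threshold adjustable without editing the script is to read it from the command line. The sketch below uses the standard `argparse` module; the `--threshold` and `--output` option names and the `parse_args` helper are assumptions made for illustration, not something the original script defines.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for the queue check."""
    parser = argparse.ArgumentParser(
        description="Report RabbitMQ queues with a large message backlog")
    # Default matches the value hard-coded in the original script.
    parser.add_argument("--threshold", type=int, default=100000,
                        help="report queues with at least this many messages")
    # Default matches the file the original script writes to.
    parser.add_argument("--output", default="./results.txt",
                        help="file that the Zabbix agent scans for the keyword")
    return parser.parse_args(argv)


# Example: override the threshold, keep the default output path.
args = parse_args(["--threshold", "50000"])
print(args.threshold, args.output)
```

The script would then compare each queue's message count against `args.threshold` instead of the literal 100000.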