V2CE – Serverless in Practice Series (7): "Custom" Business Alerting

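When using cloud products, some workloads need "customized" alerting. So how can you quickly build a customized alerting system? This article uses the Tencent Cloud API to monitor the number of backlogged Kafka messages (the general-purpose Cloud Monitor does not offer alerting on this metric) and, when the backlog exceeds a threshold, sends business alerts by email, WeCom (企业微信), and SMS.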

Fetching data with the Cloud API

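For fetching data through the Cloud API, I recommend a handy product: Explorer. It saves a lot of effort, and this article uses Explorer for both the authentication step and the retrieval of monitoring data: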

Authentication (my own SecretId and SecretKey have been removed; add yours before use, and take care not to leak them):

API 2.0 signature documentation: https://cloud.tencent.com/document/product/215/1693

def GetSignature(param):
    # Common request parameters
    param["SecretId"] = ""
    param["Timestamp"] = int(time.time())
    param["Nonce"] = random.randint(1, sys.maxsize)
    param["Region"] = "ap-guangzhou"
    # param["SignatureMethod"] = "HmacSHA256"

    # Build the string to sign: method + host + path + sorted parameters
    sign_str = "GETckafka.api.qcloud.com/v2/index.php?"
    sign_str += "&".join("%s=%s" % (k, param[k]) for k in sorted(param))

    # Compute the HMAC-SHA1 signature and Base64-encode it
    secret_key = ""
    if sys.version_info[0] > 2:
        sign_str = bytes(sign_str, "utf-8")
        secret_key = bytes(secret_key, "utf-8")
    hashed = hmac.new(secret_key, sign_str, hashlib.sha1)
    signature = binascii.b2a_base64(hashed.digest())[:-1]
    if sys.version_info[0] > 2:
        signature = signature.decode()

    # URL-encode the signature so it can go into the query string
    signature = urllib.parse.quote(signature)
    return signature
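
To make the signing step easier to check in isolation, here is a minimal sketch that separates building the string to sign from computing the HMAC. The helper names `build_sign_string` and `sign`, and the sample SecretId and SecretKey values, are hypothetical and not part of the original code. The string layout (method + host + path + sorted parameters) and the HMAC-SHA1 plus Base64 steps follow the function above; passing `safe=""` to `quote` so that `/` is also percent-encoded is my own assumption.

```python
import base64
import hashlib
import hmac
import urllib.parse


def build_sign_string(params, method="GET",
                      host="ckafka.api.qcloud.com", path="/v2/index.php"):
    """Join the sorted parameters into the string that gets signed."""
    query = "&".join("%s=%s" % (k, params[k]) for k in sorted(params))
    return "%s%s%s?%s" % (method, host, path, query)


def sign(params, secret_key):
    """Return the URL-encoded Base64 HMAC-SHA1 signature of the parameters."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      build_sign_string(params).encode("utf-8"),
                      hashlib.sha1).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # safe="" also percent-encodes "/" (assumption, see the note above)
    return urllib.parse.quote(encoded, safe="")


# Fixed, made-up values so that the result is reproducible.
demo_params = {
    "Action": "GetGroupOffsets",
    "Nonce": 12345,
    "Region": "ap-guangzhou",
    "SecretId": "AKIDexample",
    "Timestamp": 1500000000,
}
print(build_sign_string(demo_params))
print(sign(demo_params, "example-secret-key"))
```

With fixed inputs the signature is deterministic, which makes it easy to compare against the value produced by Explorer for the same parameters.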

Fetching the Kafka message backlog

CKafka product page: https://cloud.tencent.com/product/ckafka

API for fetching the backlog: https://cloud.tencent.com/document/product/597/30030

def GetGroupOffsets(max_lag, phoneList):
    param = {}
    param["Action"] = "GetGroupOffsets"
    param["instanceId"] = ""
    param["group"] = ""
    signature = GetSignature(param)

    # Build the request URL. "Action" is already in param, so it is not
    # repeated in the base URL.
    param["Signature"] = signature
    url = "https://ckafka.api.qcloud.com/v2/index.php?"
    url += "&".join("%s=%s" % (k, param[k]) for k in sorted(param))

    req_attr = urllib.request.urlopen(url)
    res_data = req_attr.read().decode("utf-8")
    json_data = json.loads(res_data)

    # Collect every topic whose total lag (summed over its partitions)
    # exceeds the threshold.
    result_list = []
    for eve_topic in json_data['data']['topicList']:
        temp_lag = 0
        for eve_partition in eve_topic["partitions"]:
            temp_lag = temp_lag + eve_partition["lag"]

        if temp_lag > max_lag:
            result_list.append(
                {
                    "topic": eve_topic["topic"],
                    "lag": temp_lag
                }
            )

    # Send one alert per channel covering all affected topics
    print(result_list)
    if len(result_list) > 0:
        KafkaLagRobot(result_list)
        KafkaLagSMS(result_list, phoneList)
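Integrating WeCom (企业微信)

First, here is the WeCom robot documentation: https://work.weixin.qq.com/api/doc#search

After configuring a WeCom robot you get a Webhook URL, and can then write the alert code:

(The WeCom webhook has been removed; add your own to url.)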

def KafkaLagRobot(content):

    # Webhook URL of the WeCom robot (removed; add your own)
    url = ""
    data = {
        "msgtype": "markdown",
        "markdown": {
            "content": content,
        }
    }
    # Supplying a request body makes urllib send a POST
    data = json.dumps(data).encode("utf-8")
    print(urllib.request.urlopen(urllib.request.Request(url, data)).read().decode("utf-8"))
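
As a rough sketch of how a list of alerts could be turned into the markdown message body, the helpers below build the payload and the request object without sending anything. The names `build_robot_payload` and `build_robot_request`, the `example.invalid` URL, and the explicit `Content-Type` header are assumptions for illustration; only the `msgtype`/`markdown`/`content` layout comes from the code above.

```python
import json
import urllib.request


def build_robot_payload(alerts):
    """Encode a list of {"topic", "lag"} dicts as a markdown robot message."""
    lines = ["Kafka consumer lag alert:"]
    for item in alerts:
        lines.append(
            '>Topic: <font color="comment">%s</font>, '
            'lag: <font color="warning">%d</font>' % (item["topic"], item["lag"])
        )
    body = {"msgtype": "markdown", "markdown": {"content": "\n".join(lines)}}
    return json.dumps(body).encode("utf-8")


def build_robot_request(url, payload):
    """Create (but do not send) the POST request for the webhook."""
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


payload = build_robot_payload([{"topic": "orders", "lag": 2400}])
request = build_robot_request("https://example.invalid/webhook", payload)
print(request.get_method(), payload.decode("utf-8"))
```

To actually deliver the message, pass the request to `urllib.request.urlopen` with your real webhook URL.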

Integrating Tencent Cloud SMS

(Some sensitive information has been removed.)

SMS product page: https://cloud.tencent.com/product/sms

def KafkaLagSMS(infor, phone_list):

    random_data = random.randint(1, sys.maxsize)
    # SMS API request URL (removed; fill in your own)
    url = ""
    strMobile = phone_list
    strAppKey = ""
    strRand = str(random_data)
    strTime = int(time.time())
    # sig is the SHA-256 digest of "appkey=...&random=...&time=...&mobile=..."
    sig = hashlib.sha256()
    sig.update(
        ("appkey=%s&random=%s&time=%s&mobile=%s" % (strAppKey, strRand, strTime, ",".join(strMobile))).encode(
            "utf-8"))

    # One entry per recipient, all with the mainland China country code
    phone_dict = []
    for eve_phone in phone_list:
        phone_dict.append(
            {
                "mobile": eve_phone,
                "nationcode": "86"
            }
        )

    data = {
        "ext": "",
        "extend": "",
        "params": [
            infor,
        ],
        "sig": sig.hexdigest(),
        "sign": "your sign",
        "tel": phone_dict,
        "time": strTime,
        "tpl_id": 0  # placeholder: replace with your template ID
    }
    data = json.dumps(data).encode("utf-8")
    print(urllib.request.urlopen(urllib.request.Request(url=url, data=data)).read().decode("utf-8"))
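
The two pieces of this request that are easiest to get wrong are the `sig` digest and the `tel` list, so here is a small sketch that computes them on their own. The helper names `sms_sig` and `build_tel_list`, and the sample app key and phone numbers, are hypothetical; the string layout and the SHA-256 digest are taken from the code above.

```python
import hashlib


def sms_sig(app_key, random_value, timestamp, mobiles):
    """SHA-256 hex digest of appkey, random, time, and comma-joined mobiles."""
    raw = "appkey=%s&random=%s&time=%s&mobile=%s" % (
        app_key, random_value, timestamp, ",".join(mobiles))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def build_tel_list(mobiles, nation_code="86"):
    """Build the "tel" field: one dict per recipient."""
    return [{"mobile": m, "nationcode": nation_code} for m in mobiles]


# Made-up values, only to show the shapes involved.
phones = ["13800000000", "13900000000"]
print(sms_sig("example-app-key", 42, 1500000000, phones))
print(build_tel_list(phones))
```

Because the digest depends on the random value and the timestamp, the same two values have to be reused for the rest of the request.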

Sending email alerts

You can refer to the earlier demo: https://cloud.tencent.com/developer/article/1419135

def sendEmail(content, to_user):
    sender = '[email protected]'
    receivers = [to_user]

    # Build an HTML message with UTF-8 encoded headers
    mail_msg = content
    message = MIMEText(mail_msg, 'html', 'utf-8')
    message['From'] = Header("Monitor", 'utf-8')
    message['To'] = Header("Webmaster", 'utf-8')

    subject = "Alert"
    message['Subject'] = Header(subject, 'utf-8')

    try:
        smtpObj = smtplib.SMTP_SSL("smtp.exmail.qq.com", 465)
        smtpObj.login('[email protected]', 'password')
        smtpObj.sendmail(sender, receivers, message.as_string())
    except smtplib.SMTPException as e:
        # Report the failure instead of silently ignoring it
        print("Failed to send email: %s" % e)
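
Building the message can be separated from sending it, which makes the content easy to inspect before any SMTP connection is opened. The sketch below is one way to do that; the function name `build_alert_email`, the HTML list layout, and the sample alert are assumptions for illustration, while `MIMEText` and `Header` are used the same way as in the code above.

```python
import html
from email.header import Header
from email.mime.text import MIMEText


def build_alert_email(alerts, subject="Alert"):
    """Return an HTML MIMEText message listing each topic and its lag."""
    rows = "".join(
        "<li>Topic: %s, lag: %d</li>" % (html.escape(a["topic"]), a["lag"])
        for a in alerts
    )
    message = MIMEText("<ul>%s</ul>" % rows, "html", "utf-8")
    message["Subject"] = Header(subject, "utf-8")
    return message


msg = build_alert_email([{"topic": "orders", "lag": 2400}])
# Decode the transfer-encoded body back to text for inspection
print(msg.get_payload(decode=True).decode("utf-8"))
```

The returned message can then be passed to `sendmail` via `message.as_string()`, as in the function above.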

Putting the code together

Now we only need to combine all of the code with a little glue logic:

# -*- coding: utf8 -*-
import json
import binascii
import hashlib
import hmac
import random
import sys
import ssl
import time
import urllib.parse
import urllib.request
import smtplib
from email.mime.text import MIMEText
from email.header import Header
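# WARNING: this disables HTTPS certificate verification for the whole process.
# Remove it unless your environment really requires it.
ssl._create_default_https_context = ssl._create_unverified_context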

def sendEmail(infor):

    # Build one text fragment per alerting topic
    temp_str = 'Topic: %s, backlog: %d; '
    content = ""
    for eve_infor in infor:
        content = content + temp_str % (eve_infor["topic"], eve_infor["lag"])

    sender = '[email protected]'
    receivers = ["[email protected]"]

    mail_msg = content
    message = MIMEText(mail_msg, 'html', 'utf-8')
    message['From'] = Header("Monitor", 'utf-8')
    message['To'] = Header("Webmaster", 'utf-8')

    subject = "Alert"
    message['Subject'] = Header(subject, 'utf-8')

    try:
        smtpObj = smtplib.SMTP_SSL("smtp.exmail.qq.com", 465)
        smtpObj.login('[email protected]', 'password')
        smtpObj.sendmail(sender, receivers, message.as_string())
    except smtplib.SMTPException as e:
        # Report the failure instead of silently ignoring it
        print("Failed to send email: %s" % e)

def KafkaLagRobot(infor):
    # Markdown message: a title line followed by one quoted line per topic
    base_str = "Kafka consumer monitoring alert:\n"
    temp_str = '>Topic: <font color="comment">%s</font>, backlog: <font color="warning">%d</font> messages;\n'
    content = ""
    for eve_infor in infor:
        content = content + temp_str % (eve_infor["topic"], eve_infor["lag"])

    content = base_str + content

    # Webhook URL of the WeCom robot (removed; add your own)
    url = ""
    data = {
        "msgtype": "markdown",
        "markdown": {
            "content": content,
        }
    }
    data = json.dumps(data).encode("utf-8")
    print(urllib.request.urlopen(urllib.request.Request(url, data)).read().decode("utf-8"))


def KafkaLagSMS(infor, phone_list):

    # Build one text fragment per alerting topic
    temp_str = 'Topic: %s, backlog: %d; '
    content = ""
    for eve_infor in infor:
        content = content + temp_str % (eve_infor["topic"], eve_infor["lag"])

    random_data = random.randint(1, sys.maxsize)
    # SMS API request URL (removed; fill in your own)
    url = ""
    strMobile = phone_list
    strAppKey = ""
    strRand = str(random_data)
    strTime = int(time.time())
    # sig is the SHA-256 digest of "appkey=...&random=...&time=...&mobile=..."
    sig = hashlib.sha256()
    sig.update(
        ("appkey=%s&random=%s&time=%s&mobile=%s" % (strAppKey, strRand, strTime, ",".join(strMobile))).encode(
            "utf-8"))

    phone_dict = []
    for eve_phone in phone_list:
        phone_dict.append(
            {
                "mobile": eve_phone,
                "nationcode": "86"
            }
        )

    data = {
        "ext": "",
        "extend": "",
        "params": [
            content,
        ],
        "sig": sig.hexdigest(),
        "sign": "",
        "tel": phone_dict,
        "time": strTime,
        "tpl_id": 0  # placeholder: replace with your template ID
    }
    data = json.dumps(data).encode("utf-8")
    print(urllib.request.urlopen(urllib.request.Request(url=url, data=data)).read().decode("utf-8"))


def GetSignature(param):
    # Common request parameters
    param["SecretId"] = ""
    param["Timestamp"] = int(time.time())
    param["Nonce"] = random.randint(1, sys.maxsize)
    param["Region"] = "ap-guangzhou"
    # param["SignatureMethod"] = "HmacSHA256"

    # Build the string to sign: method + host + path + sorted parameters
    sign_str = "GETckafka.api.qcloud.com/v2/index.php?"
    sign_str += "&".join("%s=%s" % (k, param[k]) for k in sorted(param))

    # Compute the HMAC-SHA1 signature and Base64-encode it
    secret_key = ""
    if sys.version_info[0] > 2:
        sign_str = bytes(sign_str, "utf-8")
        secret_key = bytes(secret_key, "utf-8")
    hashed = hmac.new(secret_key, sign_str, hashlib.sha1)
    signature = binascii.b2a_base64(hashed.digest())[:-1]
    if sys.version_info[0] > 2:
        signature = signature.decode()

    # URL-encode the signature so it can go into the query string
    signature = urllib.parse.quote(signature)
    return signature


def GetGroupOffsets(max_lag, phoneList):
    param = {}
    param["Action"] = "GetGroupOffsets"
    param["instanceId"] = ""
    param["group"] = ""
    signature = GetSignature(param)

    # Build the request URL. "Action" is already in param, so it is not
    # repeated in the base URL.
    param["Signature"] = signature
    url = "https://ckafka.api.qcloud.com/v2/index.php?"
    url += "&".join("%s=%s" % (k, param[k]) for k in sorted(param))

    req_attr = urllib.request.urlopen(url)
    res_data = req_attr.read().decode("utf-8")
    json_data = json.loads(res_data)

    # Collect every topic whose total lag (summed over its partitions)
    # exceeds the threshold.
    result_list = []
    for eve_topic in json_data['data']['topicList']:
        temp_lag = 0
        for eve_partition in eve_topic["partitions"]:
            temp_lag = temp_lag + eve_partition["lag"]

        if temp_lag > max_lag:
            result_list.append(
                {
                    "topic": eve_topic["topic"],
                    "lag": temp_lag
                }
            )

    # Send one alert per channel covering all affected topics
    print(result_list)
    if len(result_list) > 0:
        KafkaLagRobot(result_list)
        KafkaLagSMS(result_list, phoneList)
        sendEmail(result_list)

def main_handler(event, context):
    # Phone numbers that receive the SMS alerts
    phone_list = ["PhoneNumber"]
    GetGroupOffsets(2000, phone_list)
    return True
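
Since the three notification functions are called one after another, an exception in one channel stops the channels after it. As one possible refinement, the sketch below calls each channel inside its own try/except and records the outcome. The function `notify_all` and the fake notifier functions are hypothetical and exist only to demonstrate the idea; in the real handler you would pass `KafkaLagRobot`, `sendEmail`, and a small wrapper around `KafkaLagSMS`.

```python
def notify_all(alerts, notifiers):
    """Call every notifier with alerts; return {name: None or error text}."""
    results = {}
    for name, send in notifiers.items():
        try:
            send(alerts)
            results[name] = None
        except Exception as exc:
            # Keep going so that the remaining channels still fire
            results[name] = str(exc)
    return results


sent = []


def fake_robot(alerts):
    sent.append(("robot", len(alerts)))


def broken_sms(alerts):
    raise RuntimeError("SMS gateway unreachable")


def fake_email(alerts):
    sent.append(("email", len(alerts)))


outcome = notify_all(
    [{"topic": "orders", "lag": 2400}],
    {"robot": fake_robot, "sms": broken_sms, "email": fake_email},
)
print(outcome)
print(sent)
```

In this run the SMS channel fails, but the robot and email channels are still invoked, and the returned dict shows which channel needs attention.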

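Summary

Tencent Cloud Serverless Cloud Function (SCF) is a very interesting and valuable product. On an earlier project I needed to add a temporary campaign module without modifying the existing source code, so I used SCF to create, read, update, and delete records in the database, added a little logic, combined it with API Gateway, and shipped quickly. The development process was a pleasure.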

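In practice, using one product flexibly, or combining several, can be a lot of fun, and applying a product correctly can make your work far more efficient. This article used the Cloud API to fetch monitoring data, namely the Kafka message backlog, applied some simple logic to it, and then called the email, SMS, and WeCom functions to implement monitoring and alerting. Running it under a timer trigger worked well and delivered the basic alerting functionality.

I wrote this article in the hope that you can take this demo, apply it to your own projects, and get more value out of it by building alerting strategies that are more interesting, more useful, and more flexible.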

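[SCF in Practice Series] is a series of case-study guides on real-world SCF scenarios, planned by the Tencent Cloud Serverless team, intended to help developers understand where SCF can be applied and how. We also welcome your SCF-related technical practices, product feedback, and suggestions; selected submissions receive a gift. Feel free to leave a comment or submit by email.

[Submission email] [email protected]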
