How to read and process snappy-compressed data from the Log Service

Problem

The Log Service supports delivering logs to TOS, and the delivered data can be compressed with snappy. How do you read and process data compressed this way?

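Troubleshooting steps
  1. Confirm the compression type and delivery format the customer is using. For example, the customer selected snappy as the compression method and json as the delivery format.

[Image]

  2. Download the corresponding delivered file, for example the file with the json.snappy suffix shown in the image below.

[Image]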

Solution
  1. Install the python-snappy dependency:
pip3 install python-snappy==0.6.1
  2. Download the file with the json.snappy suffix from TOS. A demo of the processing code follows:
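import snappy

path = "example.json.snappy"  # placeholder: local path to the downloaded file
with open(path, "rb") as fh:
    compressed = fh.read()
# StreamDecompressor decodes snappy's framing (stream) format
d = snappy.StreamDecompressor()
# errors="ignore" silently drops bytes that are not valid UTF-8
f = d.decompress(compressed).decode(encoding="utf-8", errors="ignore")
print(f)

path: the local path to the file with the json.snappy suffix.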
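For large delivered files, reading the whole file into memory at once may be undesirable. The following is a minimal sketch of chunked decompression, assuming that `StreamDecompressor.decompress` can be called repeatedly with consecutive pieces of the file. The helper name `iter_decompressed_text`, the `chunk_size` default, and the file name in the usage comment are hypothetical and not part of the original demo.

```python
import codecs


def iter_decompressed_text(path, decompressor, chunk_size=64 * 1024):
    """Yield decoded text pieces from a compressed file, chunk_size bytes at a time.

    `decompressor` is any object with a decompress(bytes) -> bytes method,
    for example snappy.StreamDecompressor().
    """
    # An incremental decoder keeps multi-byte UTF-8 characters intact
    # even when they are split across two chunks.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield decoder.decode(decompressor.decompress(chunk))
    # Flush whatever the decoder is still holding at end of file.
    yield decoder.decode(b"", final=True)


# Usage (assumes python-snappy is installed and the file exists locally):
#   import snappy
#   for piece in iter_decompressed_text("example.json.snappy",
#                                       snappy.StreamDecompressor()):
#       print(piece, end="")
```

Passing the decompressor in as an argument keeps the helper independent of any one library, so the same loop could be reused if another streaming decompressor were needed.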

  3. The output after reading and processing looks like this:
{"__time__":"1695217674710","cluster_id":"ccbp43nnxxxx.xxxxxx","__pod_name__":"csi-tos-xxxx.xxxxxxphkj","__container_source__":"stderr","__container_ip__":"192.168.xxx.xxx","__namespace__":"kube-system","__image_name__":"vke-xxxx.xxxxxx/vke/livenessprobe:v2.2.0","__tag____client_ip__":"192.168.xxx.xxx","__pod_uid__":"9137f5ac-8e82-43d7-aaxxxxxxxxxx","__tag____receive_time__":"1695217678","__container_name__":"liveness-probe","host_ip":"192.168.xxx.xxx","__content__":"I0920 13:47:54.709637       1 connection.go:153] Connecting to unix:///csi/csi.sock","__source__":"172.27.xxx.xxx"}
{"__tag____client_ip__":"192.168.xxx.xxx","__pod_name__":"csi-tos-xxxxxxxxxx","__namespace__":"kube-system","__container_source__":"stderr","__container_name__":"csi-tos-driver","__container_ip__":"192.168.xxx.xxx","__tag____receive_time__":"1695217678","host_ip":"192.168.xxx.xxx","__image_name__":"vke-xxxx.xxxxxx/vke/tosplugin:v2.3","__pod_uid__":"9137f5ac-8e82-43d7-aa65-xxxxxxxxxx","cluster_id":"ccbp43nnqtoxxxxxxxxxx","__source__":"172.27.xxx.xxx","__time__":"1695217674710","__content__":"time=\"2023-09-20T13:47:54Z\" level=info msg=\"return {\\\"ready\\\":{\\\"value\\\":true}}\" event=grpc_response method=/csi.v1.Identity/Probe"}
{"__tag____client_ip__":"192.168.xxx.xxx","__pod_uid__":"68521f85-a4b0-4265-94c3-0ee3cc45e881","__container_source__":"stderr","__pod_name__":"csi-ebs-xxxxxxxxxx","__image_name__":"cr-xxxx.xxxxxx/vke/livenessprobe:v2.2.0","__namespace__":"kube-system","__content__":"I0920 13:47:56.438150       1 connection.go:153] Connecting to unix:///csi/csi.sock","__source__":"172.27.xxx.xxx","__container_ip__":"192.168.xxx.xxx","__time__":"1695217676438","host_ip":"192.168.xxx.xxx","__container_name__":"liveness-probe","__tag____receive_time__":"1695217680","cluster_id":"ccbp43nxxxx.xxxxxx"}
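Each line of the decompressed output is one JSON object, so the text can be parsed line by line for further processing. The following is a minimal sketch, assuming the decompressed text is already held in a string. The helper names `parse_log_lines` and `log_time` and the shortened sample record are hypothetical, and `__time__` is assumed to be a Unix timestamp in milliseconds, as the 13-digit sample values suggest.

```python
import json
from datetime import datetime, timezone


def parse_log_lines(text):
    """Parse newline-delimited JSON text into a list of dicts, skipping blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


def log_time(record):
    """Convert __time__ (assumed to be milliseconds since the epoch) to a UTC datetime."""
    return datetime.fromtimestamp(int(record["__time__"]) / 1000, tz=timezone.utc)


# Hypothetical, shortened record in the same shape as the output above
sample = (
    '{"__time__":"1695217674710","__namespace__":"kube-system",'
    '"__container_name__":"liveness-probe"}\n'
)

for rec in parse_log_lines(sample):
    print(log_time(rec).isoformat(), rec["__namespace__"], rec["__container_name__"])
```

Note that `json.loads` raises an error on a malformed line. If bytes were dropped during decoding with `errors="ignore"`, wrapping the call in a `try`/`except json.JSONDecodeError` may be worth considering.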