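I'm looking for an API that gives access to the Spark Streaming statistics shown in the "Streaming" tab of the history server.
I'm mainly interested in the batch processing time, but at least according to the documentation it is not directly available through the REST API: https://spark.apache.org/docs/latest/monitoring.html#rest-api
Any ideas on how to get this kind of information, such as what the "Streaming" tab shows, or details on running jobs in the history server?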
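There is a metrics endpoint available on the driver node, on the same port as the Spark UI: http://<host>:<sparkUI-port>/metrics/json/
Streaming-related metrics have .StreamingMetrics in their name.
Sample from a local test job: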
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_processingDelay: {
value: 30
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_processingEndTime: {
value: 1498124090031
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_processingStartTime: {
value: 1498124090001
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_schedulingDelay: {
value: 1
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_submissionTime: {
value: 1498124090000
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastCompletedBatch_totalDelay: {
value: 31
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastReceivedBatch_processingEndTime: {
value: 1498124090031
},
local-1498040220092.driver.printWriter.snb.StreamingMetrics.streaming.lastReceivedBatch_processingStartTime: {
value: 1498124090001
}
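To obtain the processing time, we need to take the difference StreamingMetrics.streaming.lastCompletedBatch_processingEndTime - StreamingMetrics.streaming.lastCompletedBatch_processingStartTime. In the sample above that is 1498124090031 - 1498124090001 = 30 ms, which matches the reported lastCompletedBatch_processingDelay.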
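As a rough illustration, the subtraction could be scripted along these lines. This is only a sketch under a few assumptions: the helper names (`fetch_metrics`, `gauge_value`, `last_batch_processing_time_ms`) are made up for this example, the JSON is assumed to list gauges under a top-level "gauges" key (the usual Dropwizard Metrics layout), and metrics are matched by suffix because the prefix (application ID, "driver", application name) differs per job.

```python
import json
from urllib.request import urlopen

# Suffixes of the two gauges from the sample output; the prefix varies per application.
END_SUFFIX = ".StreamingMetrics.streaming.lastCompletedBatch_processingEndTime"
START_SUFFIX = ".StreamingMetrics.streaming.lastCompletedBatch_processingStartTime"


def fetch_metrics(host, port):
    """Download the metrics JSON from the driver (hypothetical helper, needs a running job)."""
    with urlopen(f"http://{host}:{port}/metrics/json/") as resp:
        return json.load(resp)


def gauge_value(gauges, suffix):
    """Return the value of the first gauge whose name ends with the given suffix."""
    for name, entry in gauges.items():
        if name.endswith(suffix):
            return entry["value"]
    raise KeyError(f"no gauge ending with {suffix}")


def last_batch_processing_time_ms(metrics):
    """Processing time of the last completed batch: end time minus start time."""
    # Assumption: gauges sit under "gauges"; fall back to a flat mapping otherwise.
    gauges = metrics.get("gauges", metrics)
    return gauge_value(gauges, END_SUFFIX) - gauge_value(gauges, START_SUFFIX)


# Offline example using the values from the sample output above.
prefix = "local-1498040220092.driver.printWriter.snb"
sample = {
    "gauges": {
        prefix + END_SUFFIX: {"value": 1498124090031},
        prefix + START_SUFFIX: {"value": 1498124090001},
    }
}
print(last_batch_processing_time_ms(sample))  # 30
```

Against a live job, something like `last_batch_processing_time_ms(fetch_metrics("localhost", 4040))` should work, where the host and port are placeholders for wherever the driver's Spark UI is actually listening.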