This topic describes the monitoring metrics of Databricks DataInsight.

When you call CloudMonitor API operations, you need the Namespace and Period values of this cloud service. The values are as follows:

  • Namespace: acs_spark
  • Period: 60 seconds by default. It can also be an integer multiple of 60.
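As a minimal sketch of the two values above, the following Python helper checks a requested Period against the 60-second rule before it is used in a request. The names `NAMESPACE`, `DEFAULT_PERIOD`, and `validate_period` are hypothetical and are not part of any CloudMonitor SDK.

```python
NAMESPACE = "acs_spark"   # Namespace value listed above
DEFAULT_PERIOD = 60       # default Period, in seconds


def validate_period(period: int = DEFAULT_PERIOD) -> int:
    """Return period if it is a positive integer multiple of 60 seconds."""
    if isinstance(period, bool) or not isinstance(period, int):
        raise TypeError("period must be an integer number of seconds")
    if period <= 0 or period % DEFAULT_PERIOD != 0:
        raise ValueError("period must be a positive multiple of 60 seconds")
    return period


if __name__ == "__main__":
    print(NAMESPACE, validate_period())      # default period
    print(NAMESPACE, validate_period(300))   # 5-minute aggregation
```

Rejecting values such as 90 on the client side gives a clear error message instead of relying on the service to refuse the request.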

The following table lists the MetricName and Dimensions values of this cloud service.

| Metric | Unit | MetricName | Dimensions | Statistics |
| --- | --- | --- | --- | --- |
| Maximum percent used for all partitions | % | part_max_used | userId, clusterId, role | Average, Maximum, Minimum |
| Number of blocked processes | Count | procs_blocked | userId, clusterId, role | Average, Maximum, Minimum |
| Number of processes/threads created | Count | procs_created | userId, clusterId, role | Average, Maximum, Minimum |
| Number of running processes | Count | proc_run | userId, clusterId, role | Average, Maximum, Minimum |
| Total number of processes | Count | proc_total | userId, clusterId, role | Average, Maximum, Minimum |
| Amount of available swap memory | KB | swap_free | userId, clusterId, role | Average, Maximum, Minimum |
| Total amount of swap space | KB | swap_total | userId, clusterId, role | Average, Maximum, Minimum |
| Outbound network rate | bit/s | bytes_out | userId, clusterId, role | Average, Maximum, Minimum |
| 15-minute load average | Count | load_fifteen | userId, clusterId, role | Average, Maximum, Minimum |
| 5-minute load average | Count | load_five | userId, clusterId, role | Average, Maximum, Minimum |
| Inbound network rate | bit/s | bytes_in | userId, clusterId, role | Average, Maximum, Minimum |
| Number of allocated containers | Count | AllocatedContainers | userId, clusterId, role | Average |
| Number of pending containers | Count | PendingContainers | userId, clusterId, role | Average, Maximum, Minimum |
| Total number of containers allocated | Count | AggregateContainersAllocated | userId, clusterId, role | Average |
| Total number of containers released | Count | AggregateContainersReleased | userId, clusterId, role | Average |
| Number of jobs in the Active state | Count | ActiveApplications | userId, clusterId, role | Average, Maximum, Minimum |
| Number of active users | Count | ActiveUsers | userId, clusterId, role | Average, Maximum, Minimum |
| Memory currently available in the current queue | MB | AvailableMB | userId, clusterId, role | Average, Maximum, Minimum |
| 1-minute load average | Count | load_one | userId, clusterId, role | Average, Maximum, Minimum |
| Amount of buffered memory | KB | mem_buffers | userId, clusterId, role | Average, Maximum, Minimum |
| Amount of cached memory | KB | mem_cached | userId, clusterId, role | Average, Maximum, Minimum |
| Free memory | Byte | mem_free | userId, clusterId, role | Average, Maximum, Minimum |
| Amount of shared memory | KB | mem_shared | userId, clusterId, role | Average, Maximum, Minimum |
| Total memory | Byte | mem_total | userId, clusterId, role | Average, Maximum, Minimum |
| Inbound packet rate | Count/Second | pkts_in | userId, clusterId, role | Average, Maximum, Minimum |
| Outbound packet rate | Count/Second | pkts_out | userId, clusterId, role | Average, Maximum, Minimum |
| Percent of time since boot that the CPU has been idle | % | cpu_aidle | userId, clusterId, role | Average, Maximum, Minimum |
| CPU idle percentage | % | cpu_idle | userId, clusterId, role | Average, Maximum, Minimum |
| Percent CPU interrupt | % | cpu_intr | userId, clusterId, role | Average, Maximum, Minimum |
| Percent CPU nice | % | cpu_nice | userId, clusterId, role | Average, Maximum, Minimum |
| Percent CPU soft interrupt | % | cpu_sintr | userId, clusterId, role | Average, Maximum, Minimum |
| System-mode CPU utilization | % | cpu_system | userId, clusterId, role | Average |
| User-mode CPU utilization | % | cpu_user | userId, clusterId, role | Average |
| Percent CPU wait I/O | % | cpu_wio | userId, clusterId, role | Average, Maximum, Minimum |
| Free disk space | Byte | disk_free | userId, clusterId, role | Average, Maximum, Minimum |
| Total disk capacity | Byte | disk_total | userId, clusterId, role | Average, Maximum, Minimum |
| Number of completed jobs | Count | AppsCompleted | userId, clusterId, role | Average |
| Number of failed jobs | Count | AppsFailed | userId, clusterId, role | Average |
| Number of killed jobs | Count | AppsKilled | userId, clusterId, role | Average |
| Number of pending jobs | Count | AppsPending | userId, clusterId, role | Average |
| Number of running jobs | Count | AppsRunning | userId, clusterId, role | Average |
| Number of submitted jobs | Count | AppsSubmitted | userId, clusterId, role | Average, Maximum, Minimum |
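The following Python sketch shows one way the table values might be assembled into query parameters for a CloudMonitor metric query. It only builds a dictionary and does not call any SDK. The helper names `build_metric_query` and `supports_statistic` are hypothetical, the `METRIC_STATISTICS` mapping copies only a few rows of the table, and encoding Dimensions as a JSON list with Period as a string is an assumption that should be checked against the CloudMonitor API reference. The user ID, cluster ID, and role in the usage section are placeholders.

```python
import json

# Statistics available per metric, copied from a few rows of the table
# (a subset for illustration, not the full list).
METRIC_STATISTICS = {
    "cpu_idle": ("Average", "Maximum", "Minimum"),
    "cpu_user": ("Average",),
    "AppsRunning": ("Average",),
    "AppsSubmitted": ("Average", "Maximum", "Minimum"),
}


def supports_statistic(metric_name, statistic):
    """Return True if the table lists this statistic for the metric."""
    return statistic in METRIC_STATISTICS.get(metric_name, ())


def build_metric_query(metric_name, user_id, cluster_id, role, period=60):
    """Assemble Namespace, MetricName, Period and Dimensions for one metric."""
    if metric_name not in METRIC_STATISTICS:
        raise KeyError(f"unknown metric: {metric_name}")
    if period <= 0 or period % 60 != 0:
        raise ValueError("period must be a positive multiple of 60 seconds")
    # Assumption: Dimensions is sent as a JSON-encoded list of one object.
    dimensions = [{"userId": user_id, "clusterId": cluster_id, "role": role}]
    return {
        "Namespace": "acs_spark",
        "MetricName": metric_name,
        "Period": str(period),
        "Dimensions": json.dumps(dimensions),
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real account or cluster values.
    query = build_metric_query("cpu_idle", "example-user", "example-cluster",
                               "example-role", period=120)
    print(query)
    print(supports_statistic("cpu_user", "Maximum"))
```

Checking `supports_statistic` before reading a field such as Maximum avoids looking for values that the table does not list for metrics like cpu_user or AppsRunning, which report only Average.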