Metrics

What metrics to collect?

Static metrics:

Cluster configuration:
- Number of containers running
- Per-container memory and CPU allocation
- Number of partitions per stream and tasks per job

Dynamic metrics:

Cluster configuration:
- Failed-container metrics
- Container memory and CPU utilization (YARN metrics)
Per job and/or per container:
- Process calls
- Time per process call
- Message rate and throughput
- Input/output message size
Dependent on job type:
- Window calls
- Memory-related metrics for stateful operators
- Memory store get/set calls, data access/storage rate
JVM metrics:
- JVM heap metrics
- Thread metrics

Where are metrics collected and viewed?

Metrics are sent to a time-series database (Graphite or Prometheus) every 10 seconds and can optionally be monitored live through Graphite-Web or Grafana.
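
As a rough illustration of the Graphite path, the sketch below formats samples in Graphite's plaintext protocol (`<path> <value> <timestamp>`, one sample per line) and pushes them to a Carbon listener. The metric paths, the `streambench` prefix, the host, and the helper names are hypothetical and not taken from Stream-Bench itself. Port 2003 is Carbon's default plaintext port, and the 10-second interval mirrors the reporting period mentioned above.

```python
import socket
import time

# Reporting period described in the text above.
REPORT_INTERVAL_SECONDS = 10


def graphite_line(path, value, timestamp=None):
    """Format one sample as a Graphite plaintext line: '<path> <value> <timestamp>\\n'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


def send_samples(samples, host="localhost", port=2003):
    """Send (path, value, timestamp) tuples to a Carbon plaintext listener.

    host and port are assumptions; 2003 is Carbon's default plaintext port.
    """
    payload = "".join(graphite_line(p, v, t) for p, v, t in samples)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))


def report_forever(collect, host="localhost", port=2003):
    """Call collect() every REPORT_INTERVAL_SECONDS and forward its samples.

    collect is a hypothetical callable returning (path, value, timestamp) tuples.
    """
    while True:
        send_samples(collect(), host, port)
        time.sleep(REPORT_INTERVAL_SECONDS)
```

A call such as `graphite_line("streambench.job1.container0.process_calls", 1250)` would produce one line stamped with the current time. With Prometheus the flow is typically reversed (the server scrapes an HTTP endpoint), so this push-style sketch applies to the Graphite option only.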

Stream-Bench reports can be generated by querying the database offline. Analysis will primarily focus on understanding bottlenecks in the cluster configuration.
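
One way such an offline query could look, assuming Graphite is the backing store: the sketch below builds a request for Graphite's render API with `format=json` and reduces the returned series to min/mean/max values. The base URL, the target pattern, and the helper names are hypothetical; the response shape (a list of objects with `target` and `datapoints` as `[value, timestamp]` pairs) follows Graphite's JSON output.

```python
import json
from urllib.parse import urlencode


def render_url(base_url, target, start="-1h", until="now"):
    """Build a Graphite render-API URL that returns JSON for the given target."""
    query = urlencode(
        {"target": target, "from": start, "until": until, "format": "json"}
    )
    return f"{base_url.rstrip('/')}/render?{query}"


def summarize(render_json):
    """Return {target: (min, mean, max)} for each series, skipping null datapoints."""
    summary = {}
    for series in json.loads(render_json):
        values = [v for v, _ts in series["datapoints"] if v is not None]
        if values:  # a series with only nulls carries no information
            summary[series["target"]] = (
                min(values),
                sum(values) / len(values),
                max(values),
            )
    return summary
```

The response body could be fetched with `urllib.request.urlopen(render_url(...))` and passed to `summarize`. Comparing the resulting per-container figures (for example time per process call against CPU utilization) is one possible starting point for spotting bottlenecks.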

Reference

LinkedIn Engineering Blog - Operating Samza at Scale. Details about the metrics they collect on Samza are listed towards the end of the post.
