Metrics

WombatOAM has more than 100 built-in metrics, organised into metric categories on the dashboard (for example Memory, Runtime and I/O). See the description of all built-in metrics below.

WombatOAM also collects metrics from the folsom and exometer applications if they are running on the managed node. These are shown under the “folsom” and “exometer” metric categories on the dashboard.

Viewing metrics

On the Metrics tab, select a node or node family. This reveals the available metrics, grouped by type and category. Select a type (counter, gauge, meter, spiral, histogram or duration – these are shown as tabs) and a category to reveal individual metrics, and then select a metric to display it.

For individual nodes, each numeric metric is displayed as a line chart. If you select a node family, the metrics are displayed as stacked graphs showing all the nodes in the family. For node families, only one metric can be selected for stacked display, and only counters and gauges are supported (as they only have single datapoints).

You can superimpose metrics to view several metrics on the same graph. Each metric you select is added to the graph. To remove a metric, click it again to clear the checkbox next to it. To clear all metrics from the graph, click the “trash” icon on the right above the graph. You can also remove individual metrics in the “Configure metrics” window.

Other viewing options:

  • Refresh interval: To change the frequency at which the graph refreshes, click the “settings” icon on the right above the graph to open the “Configure metrics” window, and then select an option in the “Refresh interval” list.
  • Markers: Different metrics are distinguished on the graph by different colors and markers. To hide or show these markers, click the “settings” icon, and select or clear the “Enable markers” checkbox.
  • Delta: To view deltas instead of actual values, select the “Delta” (Δ) icon on the right above the graph.

By default, each metric is polled once a minute, i.e. the metric graphs will show a new data point once a minute. If you would like to change this setting, please refer to the Configuration reference in the “Configuration” section.
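
As a rough illustration of the Delta view, the sketch below derives deltas from successive polled values. It assumes one sample per polling interval and that a delta is simply the difference between neighbouring samples; the function name deltas and the sample values are made up for this example and are not part of WombatOAM.

```python
def deltas(samples):
    """Return the difference between each sample and the one before it."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# Hypothetical counter values, one per one-minute poll.
polled = [1000, 1250, 1250, 1900]
print(deltas(polled))  # [250, 0, 650]
```

A flat stretch in the actual values shows up as a zero delta, which is often easier to spot on a graph than a plateau in an ever-growing counter.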

Built-in metrics

I/O

Input I/O bytes

The total number of bytes received through ports.

  • Tags: dev, op

Output I/O bytes

The total number of bytes output to ports.

  • Tags: dev, op

TCP: Total bytes received

The total number of bytes that have been received by TCP.

  • Tags: dev, op

TCP: Packets received

The number of TCP packets that have been received.

  • Tags: dev

TCP: Average received packet size

The average size of TCP packets that have been received.

  • Tags: dev

TCP: Maximum received packet size

The maximum size of TCP packets that have been received.

  • Tags: dev

TCP: Total bytes sent

The total number of bytes that have been sent by TCP.

  • Tags: dev, op

TCP: Packets sent

The number of TCP packets that have been sent.

  • Tags: dev, op

TCP: Average sent packet size

The average size of TCP packets that have been sent.

  • Tags: dev

TCP: Maximum sent packet size

The maximum size of TCP packets that have been sent.

  • Tags: dev

UDP: Total bytes received

The total number of bytes that have been received by UDP.

  • Tags: dev, op

UDP: Packets received

The number of UDP packets that have been received.

  • Tags: dev

UDP: Average received packet size

The average size of UDP packets that have been received.

  • Tags: dev

UDP: Maximum received packet size

The maximum size of UDP packets that have been received.

  • Tags: dev

UDP: Total bytes sent

The total number of bytes that have been sent by UDP.

  • Tags: dev, op

UDP: Packets sent

The number of UDP packets that have been sent.

  • Tags: dev, op

UDP: Average sent packet size

The average size of UDP packets that have been sent.

  • Tags: dev

UDP: Maximum sent packet size

The maximum size of UDP packets that have been sent.

  • Tags: dev

Disk usage on x

The result of the latest disk check for each partition. Reports the disk usage (e.g. the percentage of disk space occupied) on a mounted partition.

  • Tags: op
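
The sketch below shows one way a usage percentage of this kind can be derived. It assumes the figure is used space divided by total space; the exact formula used by the disk check on the managed node may differ (for example in how reserved blocks are treated). The helper name percent_used is hypothetical.

```python
import shutil


def percent_used(total_bytes, used_bytes):
    """Percentage of a partition's space that is occupied."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


# Hypothetical partition: 200 GB in total, 50 GB used.
print(percent_used(200 * 10**9, 50 * 10**9))  # 25.0

# The same calculation for the partition holding the current directory.
usage = shutil.disk_usage(".")
print(round(percent_used(usage.total, usage.used), 1))
```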

Memory

Total memory

The total amount of memory currently allocated, which is the same as the sum of memory size for processes and system.

  • Tags: dev, op

Process memory

The total amount of memory currently allocated by the Erlang processes.

  • Tags: dev, op

Process memory used

The total amount of memory currently used by the Erlang processes. This memory is part of the memory presented as process memory.

  • Tags: dev

System memory

The total amount of memory currently allocated by the emulator that is not directly related to any Erlang process. Memory presented as processes is not included in this memory.

  • Tags: dev, op

Atom memory

The total amount of memory currently allocated for atoms. This memory is part of the memory presented as system memory.

  • Tags: dev

Atom memory used

The total amount of memory currently used for atoms. This memory is part of the memory presented as atom memory.

  • Tags: dev

Binary memory

The total amount of memory currently allocated for binaries. This memory is part of the memory presented as system memory.

  • Tags: dev

Code memory

The total amount of memory currently allocated for Erlang code. This memory is part of the memory presented as system memory.

  • Tags: dev

ETS memory

The total amount of memory currently allocated for ETS tables. This memory is part of the memory presented as system memory.

  • Tags: dev
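
The relationships stated in the memory descriptions above can be checked mechanically. The sketch below is only an illustration: the dictionary keys mirror those returned by erlang:memory/0, the byte values are invented, and the function name memory_breakdown_problems is hypothetical.

```python
def memory_breakdown_problems(mem):
    """Return a list of relationships from the descriptions that do not hold."""
    problems = []
    # Total memory is the sum of process memory and system memory.
    if mem["total"] != mem["processes"] + mem["system"]:
        problems.append("total != processes + system")
    # Used memory is part of the corresponding allocated memory.
    if mem["processes_used"] > mem["processes"]:
        problems.append("processes_used > processes")
    if mem["atom_used"] > mem["atom"]:
        problems.append("atom_used > atom")
    # Atom, binary, code and ETS memory are all part of system memory.
    parts = sum(mem[key] for key in ("atom", "binary", "code", "ets"))
    if parts > mem["system"]:
        problems.append("atom + binary + code + ets > system")
    return problems


# Invented sample values, in bytes.
sample = {
    "total": 14_000_000,
    "processes": 4_000_000,
    "processes_used": 3_900_000,
    "system": 10_000_000,
    "atom": 500_000,
    "atom_used": 480_000,
    "binary": 1_000_000,
    "code": 5_000_000,
    "ets": 800_000,
}
print(memory_breakdown_problems(sample))  # []
```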

System total memory

The amount of memory available to the whole operating system.

  • Tags: dev, op

Total memory available

The total amount of memory available to the Erlang emulator, allocated and free. May or may not be equal to the amount of memory configured in the system.

  • Tags: dev, op

Buffered memory

The amount of memory the system uses for temporarily storing raw disk blocks.

  • Tags: dev

Cached memory

The amount of memory the system uses for cached files read from disk.

  • Tags: dev

Free memory

The amount of free memory available to the Erlang emulator for allocation.

  • Tags: dev, op

Free swap memory

The amount of memory the system has available for disk swap.

  • Tags: dev

Swap memory used

The amount of memory the system is using for disk swap.

  • Tags: dev, op

Atoms

The total number of atoms in the system.

  • Tags: dev

DETS tables

The number of open DETS tables on the selected node.

  • Tags: dev, op

ETS tables

The number of ETS tables at the selected node.

  • Tags: dev, op

Low memory

The total amount of memory allocated in low memory areas that are restricted to less than 4GB even though the system may have more physical memory. The metric is available only on the 64-bit halfword emulator.

  • Tags: dev

Maximum memory

The maximum total amount of memory allocated since the emulator was started. The metric is available only when the emulator is run with instrumentation.

  • Tags: dev

Allocated atom_table area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated bif_timer area

Memory allocated for timers in bytes.

  • Tags: dev

Allocated bits_bufs_size area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated dist_table area

Memory allocated for the distribution table in bytes.

  • Tags: dev

Allocated ets_misc area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated export_list area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated export_table area

Memory allocated for the export table in bytes.

  • Tags: dev

Allocated fun_table area

Memory allocated for the function table in bytes.

  • Tags: dev

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated loaded_code area

Memory allocated for all the loaded code in bytes.

  • Tags: dev

Allocated module_refs area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated module_table area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated node_table area

Memory allocated for the table of nodes in bytes.

  • Tags: dev

Allocated process_table area

Memory allocated for the process table in bytes.

  • Tags: dev

Allocated register_table area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated static area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Allocated sys_misc area

The amount of allocated memory for this area in bytes.

  • Tags: dev

Mnesia System metrics

These metrics are collected by the Mnesia plugin.

checkpoints

The checkpoints currently active on the node.

db_nodes

The nodes which make up the persistent database.

dc_dump_limit

Controls how often disc_copies tables are dumped from memory. Lower values reduce CPU overhead but increase disk space and startup times.
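
To make the trade-off concrete, the sketch below models the rule given in the Mnesia documentation, where a table is dumped once its log file is larger than the table file size divided by dc_dump_limit. The function name should_dump and the sizes are illustrative assumptions, not Mnesia or WombatOAM code.

```python
def should_dump(log_size, table_size, dc_dump_limit):
    """True when the log has grown past table_size / dc_dump_limit."""
    return log_size > table_size / dc_dump_limit


# Hypothetical sizes in bytes: a 1000-byte table file and a 300-byte log.
print(should_dump(300, 1000, 4))  # True:  threshold is 250
print(should_dump(300, 1000, 2))  # False: threshold is 500
```

With the lower value the log is allowed to grow larger before a dump happens, so dumps are rarer (less CPU) while the log takes more disk space and takes longer to replay at start-up.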

dump_log_time_threshold

The time threshold for transaction log dumps in milliseconds.

dump_log_write_threshold

The write threshold for transaction log dumps as the number of writes to the transaction log.

extra_db_nodes

Extra db_nodes to be contacted at start-up.

held_locks

Locks held by the local Mnesia lock manager.

local_tables

Tables which are configured to reside locally.

lock_queue

Transactions that are queued for execution by the local lock manager.

master_node_tables

Tables with at least one master node.

no_table_loaders

The number of parallel table loaders during start. More loaders can be good if the network latency is high or if many tables contain few records.

running_db_nodes

Nodes where Mnesia is currently running. For more information, see mnesia:system_info/1.

subscribers

Local processes currently subscribing to system events.

tables

Locally known tables.

transaction_commits

A number that indicates how many transactions have terminated successfully since Mnesia was started.

transaction_failures

A number that indicates how many transactions have failed since Mnesia was started.

transaction_log_writes

A number that indicates the number of write operations that have been performed to the transaction log since start-up.

transaction_restarts

A number which indicates how many transactions have been restarted since Mnesia was started.

transactions

All currently active local transactions.

Nodes, modules and applications

Known nodes

The number of nodes that are known to the selected node; this includes not only visible nodes, but also hidden nodes and previously known nodes, etc.

  • Tags: dev, op

Connected nodes

The number of nodes that are connected to the selected node.

  • Tags: dev, op

Visible nodes

The number of nodes that are connected to the selected node through normal connections.

  • Tags: dev

Hidden nodes

The number of nodes that are connected to the selected node through hidden connections.

  • Tags: dev
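
The four node counts above are related: connected nodes are the visible and hidden nodes taken together, and known nodes include the connected ones plus others. The sketch below illustrates this with sets; the node names, the previously_known parameter and the function name node_counts are invented for the example, and the known count on a real node may include further nodes.

```python
def node_counts(visible, hidden, previously_known=()):
    """Derive the four node counts from sets of node names."""
    visible, hidden = set(visible), set(hidden)
    connected = visible | hidden                 # normal plus hidden connections
    known = connected | set(previously_known)    # connected plus remembered nodes
    return {
        "known": len(known),
        "connected": len(connected),
        "visible": len(visible),
        "hidden": len(hidden),
    }


counts = node_counts(
    visible={"a@host1", "b@host2"},
    hidden={"wombat@host3"},
    previously_known={"old@host4"},
)
print(counts)  # {'known': 4, 'connected': 3, 'visible': 2, 'hidden': 1}
```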

Traced nodes

The number of nodes traced from the current node by the Erlang dbg facility.

  • Tags: dev

Loaded modules

The number of loaded Erlang modules (current and/or old code), including preloaded modules.

  • Tags: dev

Old modules

The number of modules that have old code. For more details, see “Current and Old Code” on the following page: www.erlang.org/doc/man/code.html

  • Tags: dev

Module name clashes

The number of module name clashes. The function searches the entire code space for modules with identical names.

  • Tags: dev, op

Loaded applications

The number of applications that have been loaded into the application controller. This includes any included applications.

  • Tags: dev

Started applications

The number of processes started by the application controller process, which starts all other applications.

  • Tags: dev

Running applications

The number of applications that are currently running.

  • Tags: dev

Ports and sockets

Open ports

The number of ports currently existing on the local node.

  • Tags: dev, op

Ports with driver level locking

Number of ports with driver level locking. Driver level locking implies that all instances (ports) of the same port driver will use a global lock and only one emulator thread will execute code in the driver at a time. (As opposed to port level locking where each instance of the same port driver will use a per-instance lock and multiple emulator threads may execute code in the driver at the same time.) It might indicate a bottleneck if such a driver has many instances. (See erlang:port_info/2 and the ERL_DRV_FLAG_USE_PORT_LOCKING driver flag.) This metric is always zero on VMs with SMP support disabled.

  • Tags: dev

Alive ports total input in bytes

The total amount of data, in bytes, queued by all ports using the ERTS driver queue implementation.

  • Tags: dev

Alive ports total output in bytes

The total number of bytes written to all ports from Erlang processes using either port_command/2, port_command/3, or Port ! {Owner, {command, Data}}.

  • Tags: dev

Open TCP sockets

The number of TCP sockets that are connected.

  • Tags: dev, op

Open UDP sockets

The number of UDP sockets that are connected.

  • Tags: dev, op

Open SCTP sockets

The number of SCTP sockets that are connected.

  • Tags: dev, op

Open x ports

The number of open ports belonging to a specific type. This type is obtained from erlang:port_info/1 using the name key of the proplist.

  • Tags: dev, op

By default, only ports with type TCP/UDP/SCTP are displayed. The port_type_counters_mode option can be used to configure WombatOAM to show all ports, including ports opened by a running application.

Process notifications

Long GC

The number of “Long GC” triggers from the system monitor in the last period (minute or second). A “Long GC” trigger means that a garbage collection in the system took longer than expected.

  • Tags: dev

Long schedule

The number of “Long schedule” triggers from the system monitor in the last period (minute or second). A “Long schedule” trigger means that a process or port in the system has been running uninterrupted for a longer time than expected.

  • Tags: dev

Large heap

The number of “Large heap” triggers from the system monitor in the last period (minute or second). A “Large heap” trigger means that a garbage collection in the system resulted in the size of a heap being unusually large.

  • Tags: dev

Busy port

The number of “Busy port” triggers from the system monitor in the last period (minute or second). A “Busy port” trigger means that a process in the system was suspended because it was sending to a busy port.

  • Tags: dev, op

Busy dist port

The number of “Busy dist port” triggers from the system monitor in the last period (minute or second). A “Busy dist port” trigger means that a process in the system was suspended because it was sending to a process on a remote node whose inter-node communication was handled by a busy port.

  • Tags: dev, op

Processes

Processes

The number of processes currently existing at the selected node.

  • Tags: dev

Process limit

The maximum number of processes that can exist simultaneously on the selected node.

  • Tags: dev, op

Registered processes

The number of process names that have been registered.

  • Tags: dev

OS processes

Returns the result of calling the function cpu_sup:nprocs/0. Note that this function returns the number of LWPs (lightweight processes, also known as threads) that are alive in the system. This is similar to what you get when you run ps -eLF. It is a rudimentary way of measuring the system load that may be of interest in some cases.

  • Tags: dev, op

Processes traced

The total number of processes traced by Erlang’s tracing mechanism.

  • Tags: dev

Sum process dictionary size

The total size of the process dictionaries of all processes.

  • Tags: dev

Sum message queue length

The total number of messages currently in the message queue of all processes.

  • Tags: dev, op

Error logger message queue length

The number of messages currently in the message queue of the Erlang error logger.

  • Tags: dev, op

Memory size of all processes

The total size of all processes. This includes call stack, heap and internal structures.

  • Tags: dev

Number of process groups

The total number of known process groups.

  • Tags: dev

Shell history length

Sum of history length (the number of commands evaluated by a shell process) of all shell processes.

  • Tags: dev, op

Shell process size

Sum of the sizes of all shell processes. This includes call stack, heap, and internal structures.

  • Tags: dev

Processes in running state

The number of processes where the status of the process is running.

  • Tags: dev

Processes in runnable state

The number of processes where the status of the process is runnable (ready to run, but another process is running).

  • Tags: dev

Processes in exiting state

The number of processes where the status of the process is exiting.

  • Tags: dev

Processes in GC state

The number of processes where the status of the process is garbage_collecting.

  • Tags: dev

Processes in waiting state

The number of processes where the status of the process is waiting (for a message).

  • Tags: dev

Processes in suspended state

The number of processes where the status of the process is suspended (suspended on a “busy” port or by the erlang:suspend_process/[1,2] built-in function).

  • Tags: dev

Processes with max priority

The number of processes where the current priority level for the process is max.

  • Tags: dev

Processes with high priority

The number of processes where the current priority level for the process is high.

  • Tags: dev

Processes with normal priority

The number of processes where the current priority level for the process is normal.

  • Tags: dev

Processes with low priority

The number of processes where the current priority level for the process is low.

  • Tags: dev

Processes in app x

Every process is a member of some process group and all groups have a group leader process. Every application has a group leader. For each application, this metric shows the number of processes associated with the application group leader.

  • Tags: dev

Orphan processes

The number of processes that do not belong to any group leader.

  • Tags: dev

Unknown processes

The number of processes that belong to a non-application group leader process.

  • Tags: dev
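
The three group-leader metrics above partition the processes on a node. The sketch below shows that partitioning on invented data: group_leader_of maps each process to its group leader (None models a process with no group leader) and app_of_leader maps application group leaders to application names. All names here are hypothetical and only illustrate the classification, not how WombatOAM implements it.

```python
from collections import Counter


def classify(group_leader_of, app_of_leader):
    """Count processes per application, plus orphan and unknown processes."""
    per_app = Counter()
    orphans = 0
    unknown = 0
    for leader in group_leader_of.values():
        if leader is None:
            orphans += 1                          # no group leader at all
        elif leader in app_of_leader:
            per_app[app_of_leader[leader]] += 1   # leader belongs to an application
        else:
            unknown += 1                          # leader is not an application leader
    return dict(per_app), orphans, unknown


group_leader_of = {
    "p1": "gl_kernel", "p2": "gl_kernel", "p3": "gl_myapp",
    "p4": "gl_shell", "p5": None,
}
app_of_leader = {"gl_kernel": "kernel", "gl_myapp": "myapp"}
print(classify(group_leader_of, app_of_leader))
# ({'kernel': 2, 'myapp': 1}, 1, 1)
```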

Runtime

Context switch count

The total number of context switches since the system started.

  • Tags: dev

Scheduler run queue size

The total length of the run queues, that is, the number of processes that are ready to run on all available run queues.

  • Tags: dev, op

Reduction count for all processes

The total number of reductions executed by all processes.

  • Tags: dev

Reductions since last call

The number of reductions that happened since the last call of erlang:statistics/1. For more details, see statistics on the erlang module’s manual page in the Erlang/OTP documentation.

  • Tags: dev
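
The "since last call" semantics can be easy to misread, so here is a small model of it. In Erlang, erlang:statistics(reductions) returns both the running total and the amount accumulated since the previous call; the Python class below only imitates that behaviour, and its name and methods are invented for the example.

```python
class ReductionCounter:
    """Toy model of a counter that reports (total, since_last_call)."""

    def __init__(self):
        self.total = 0
        self._total_at_last_call = 0

    def add(self, reductions):
        self.total += reductions

    def statistics(self):
        since_last_call = self.total - self._total_at_last_call
        self._total_at_last_call = self.total  # the next call starts from here
        return self.total, since_last_call


counter = ReductionCounter()
counter.add(100)
first = counter.statistics()
counter.add(40)
second = counter.statistics()
third = counter.statistics()
print(first, second, third)  # (100, 100) (140, 40) (140, 0)
```

Because every call resets the reference point, a value read by one caller depends on when any other caller last asked.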

Reductions total

The total number of reductions performed by processes. This is an approximate measure of how much CPU time they have used.

  • Tags: dev

Garbage collections

The total number of garbage collections since the system started.

  • Tags: dev

Minor garbage collections

The total number of minor garbage collections that have happened so far for every process in the system.

  • Tags: dev

Bytes reclaimed by GC

The total number of bytes reclaimed through garbage collection.

  • Tags: dev

Average fullsweep after

The average value of the fullsweep_after parameter for all processes. Relates to garbage collection.

  • Tags: dev

Average min binary vheap size

The average of minimum binary virtual heap sizes (in words) for all processes.

  • Tags: dev

Average min heap size

The average of minimum heap sizes (in words) for all processes.

  • Tags: dev

CPU utilization total

The sum of the CPU utilization percentages of all the cores. With 4 cores, for example, the maximum is 400.

  • Tags: dev, op
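
A short sketch of the arithmetic, using invented per-core figures; the function names are hypothetical.

```python
def total_cpu_utilization(per_core_percent):
    """Sum the utilization percentages of all cores."""
    return sum(per_core_percent)


def max_total_cpu_utilization(core_count):
    """Upper bound of the summed value: 100 percent per core."""
    return 100 * core_count


per_core = [50.0, 25.0, 100.0, 0.0]  # hypothetical readings for 4 cores
print(total_cpu_utilization(per_core))           # 175.0
print(max_total_cpu_utilization(len(per_core)))  # 400
```

To compare machines with different core counts, divide the total by the number of cores to get an average per-core percentage.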

CPU load for 1 avg

The average system load in the last minute, as described at cpu_sup. 0 represents no load, 256 represents the load reported as 1.00 by rup.

  • Tags: dev, op

CPU load for 5 avg

The average system load in the last five minutes, as described at cpu_sup. 0 represents no load, 256 represents the load reported as 1.00 by rup.

  • Tags: dev, op

CPU load for 15 avg

The average system load in the last 15 minutes, as described at cpu_sup. 0 represents no load, 256 represents the load reported as 1.00 by rup.

  • Tags: dev, op
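
Since the three load metrics above use a scale where 256 corresponds to a load of 1.00, converting a reading to the familiar load-average figure is a single division. The function name and the readings below are made up for illustration.

```python
CPU_SUP_SCALE = 256  # a reading of 256 corresponds to a load of 1.00


def to_load_average(reading):
    """Convert a cpu_sup-style reading to a conventional load average."""
    return reading / CPU_SUP_SCALE


for reading in (0, 256, 640):
    print(reading, to_load_average(reading))
# 0 0.0
# 256 1.0
# 640 2.5
```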

CPU utilization - kernel on core x

CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in kernel mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.

  • Tags: dev, op

CPU utilization - user on core x

CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in user mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.

  • Tags: dev, op

CPU utilization - nice user on core x

CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in low priority (nice) user mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Linux only.

  • Tags: dev, op

CPU utilization - idle on core x

The percentage share of the CPU cycles spent in the idle state. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.

  • Tags: dev, op

Time

Active timers

The number of all timers (one-shot and interval timers) in the table holding timing requests and timer objects; this is an ETS table maintained by Erlang’s timer server.

  • Tags: dev

Interval timers

The number of interval timers in the timer interval table, an ETS table maintained by Erlang’s timer server.

  • Tags: dev

Runtime total

The sum of the runtime for all threads in the Erlang run-time system. This may be greater than the wallclock time.

  • Tags: dev

Runtime since last call

The total run time of all processes, in milliseconds, since the last call of erlang:statistics/1. For details, see erlang:statistics/1.

  • Tags: dev

Wallclock total

The time that has elapsed since the program started, measured in real time (as if checking with a clock on the wall).

  • Tags: dev

Wallclock since last call

The wall-clock time elapsed since the last call of erlang:statistics(wall_clock). For more details, see erlang:statistics/1.

  • Tags: dev

Scheduler x active wall time

Active time of scheduler x in terms of wall-clock time.

  • Tags: dev

Scheduler x total wall time

The wall-clock time elapsed since scheduler wall time measurement was activated for scheduler x.

  • Tags: dev
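
The two scheduler wall time metrics are most useful together. Following the approach described for erlang:statistics(scheduler_wall_time) in the Erlang/OTP documentation, scheduler utilization over an interval is the growth in active time divided by the growth in total time. The sketch below assumes two (active, total) samples for the same scheduler; the function name and numbers are illustrative only.

```python
def scheduler_utilization(first, second):
    """Fraction of the interval in which the scheduler was active.

    first and second are (active_wall_time, total_wall_time) samples
    for the same scheduler, taken in that order.
    """
    active_delta = second[0] - first[0]
    total_delta = second[1] - first[1]
    if total_delta <= 0:
        return 0.0  # no time elapsed between the samples
    return active_delta / total_delta


# Hypothetical samples for one scheduler.
print(scheduler_utilization((100, 1000), (350, 2000)))  # 0.25
```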