Metrics instrumentation guide
This guide describes how to develop Service Ping metrics using metrics instrumentation.
For a video tutorial, see the Adding Service Ping metric via instrumentation class.
Nomenclature
- Instrumentation class:
  - Inherits one of the metric classes: DatabaseMetric, RedisMetric, RedisHLLMetric, NumbersMetric, or GenericMetric.
  - Implements the logic that calculates the value for a Service Ping metric.
- Metric definition: The Service Data metric YAML definition.
- Hardening: Hardening a method is the process that ensures the method fails safe, returning a fallback value like -1.
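As a minimal sketch of what hardening means in practice, the snippet below wraps a calculation so that any error produces the fallback value instead of propagating. The with_fallback helper and the FALLBACK constant are hypothetical names used only for illustration; they are not part of GitLab's API.

```ruby
# Hypothetical helper, not GitLab's API: run a block and fall back on failure.
FALLBACK = -1

def with_fallback(fallback = FALLBACK)
  yield
rescue StandardError
  # Any standard error is swallowed and replaced by the fallback value.
  fallback
end

healthy = with_fallback { 40 + 2 }
broken  = with_fallback { raise "database timeout" }

puts healthy # prints 42
puts broken  # prints -1
```

A consumer of the report can then treat -1 as "could not be calculated" without the whole run failing.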
How it works
A metric definition has the instrumentation_class field, which can be set to a class.
The defined instrumentation class should inherit one of the existing metric classes: DatabaseMetric, RedisMetric, RedisHLLMetric, NumbersMetric, or GenericMetric.
The current convention is that a single instrumentation class corresponds to a single metric. On rare occasions, there are exceptions to that convention, such as Redis metrics. To use a single instrumentation class for more than one metric, reach out to one of the @gitlab-org/analytics-section/product-intelligence/engineers members to consult about your case.
Using the instrumentation classes ensures that metrics can fail safe individually, without breaking the entire process of Service Ping generation.
We have built a domain-specific language (DSL) to define the metrics instrumentation.
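To give a rough idea of how a class-level DSL of this kind can work, here is a toy sketch built on class macros. ToyMetric and ToyCountIssuesMetric are hypothetical classes, and a plain array stands in for an ActiveRecord::Relation; the real base classes are more involved.

```ruby
# Toy base class; ToyMetric is hypothetical and not one of GitLab's classes.
class ToyMetric
  class << self
    # Class-level macros store their configuration on the subclass.
    def operation(name = nil)
      @operation = name if name
      @operation
    end

    def relation(&block)
      @relation = block if block
      @relation
    end
  end

  # Evaluates the configured block and applies the configured operation.
  def value
    records = self.class.relation.call
    case self.class.operation
    when :count then records.size
    when :sum then records.sum
    else raise ArgumentError, "unsupported operation: #{self.class.operation.inspect}"
    end
  end
end

class ToyCountIssuesMetric < ToyMetric
  operation :count
  relation { %w[issue-1 issue-2 issue-3] } # array stands in for a relation
end

puts ToyCountIssuesMetric.new.value # prints 3
```

The subclass body only declares what to compute; the base class decides how, which is what lets shared behavior live in one place.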
Database metrics
You can use database metrics to track data kept in the database, for example, a count of issues that exist on a given instance.
- operation: Operations for the given relation, one of count, distinct_count, sum, and average.
- relation: Assigns a lambda that returns the ActiveRecord::Relation for the objects we want to perform the operation on. The assigned lambda can accept up to one parameter. The parameter is hashed and stored under the options key in the metric definition.
- start: Specifies the start value of the batch counting. The default is relation.minimum(:id).
- finish: Specifies the end value of the batch counting. The default is relation.maximum(:id).
- cache_start_and_finish_as: Specifies the cache key for start and finish values and sets up caching them. Use this call when start and finish are expensive queries that should be reused between different metric calculations.
- available?: Specifies whether the metric should be reported. The default is true.
- timestamp_column: Optionally specifies the timestamp column used to filter records for time-constrained metrics. The default is created_at.
Example of a merge request that adds a database metric.
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountIssuesMetric < DatabaseMetric
          operation :count
          # The lambda receives the options hash from the metric definition.
          relation ->(options) { Issue.where(confidential: options[:confidential]) }
        end
      end
    end
  end
end
Ordinary batch counters Example
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountIssuesMetric < DatabaseMetric
          operation :count
          # Explicit bounds for the batch counting.
          start { Issue.minimum(:id) }
          finish { Issue.maximum(:id) }
          relation { Issue }
        end
      end
    end
  end
end
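To illustrate what start and finish bound, the following sketch counts in-memory IDs in fixed-size ranges between the two values. The batch_count method and its batch_size parameter are hypothetical; this is a simplified stand-in for the idea, not GitLab's batch counter.

```ruby
# Hypothetical stand-in for batch counting; not GitLab's implementation.
# ids is an array of integer primary keys.
def batch_count(ids, start:, finish:, batch_size: 3)
  total = 0
  lower = start
  while lower <= finish
    upper = [lower + batch_size - 1, finish].min
    # Each iteration stands in for one bounded query over lower..upper.
    total += ids.count { |id| id.between?(lower, upper) }
    lower = upper + 1
  end
  total
end

issue_ids = [1, 2, 5, 8, 9, 13]
count = batch_count(issue_ids, start: issue_ids.min, finish: issue_ids.max)
puts count # prints 6
```

Because every step touches only a bounded ID range, no single step has to scan the whole table, which is the motivation for supplying cheap start and finish values.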
Distinct batch counters Example
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountUsersAssociatingMilestonesToReleasesMetric < DatabaseMetric
          operation :distinct_count, column: :author_id
          relation { Release.with_milestones }
          # Batch bounds follow the counted column, not the primary key.
          start { Release.minimum(:author_id) }
          finish { Release.maximum(:author_id) }
        end
      end
    end
  end
end
Sum Example
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class JiraImportsTotalImportedIssuesCountMetric < DatabaseMetric
          operation :sum, column: :imported_issues_count
          relation { JiraImportState.finished }
        end
      end
    end
  end
end
Average Example
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountIssuesWeightAverageMetric < DatabaseMetric
          operation :average, column: :weight
          relation { Issue }
        end
      end
    end
  end
end
Redis metrics
You can use Redis metrics to track events not kept in the database, for example, a count of how many times the search bar has been used.
Example of a merge request that adds Redis metrics.
The RedisMetric class can only be used as the instrumentation_class for Redis metrics with simple counter classes (classes that only inherit BaseCounter and set the PREFIX and KNOWN_EVENTS constants). If the counter class includes additional logic, you must create a new instrumentation_class that inherits from RedisMetric. This new class needs to include the additional logic from the counter class.
Required options:
- event: The event name.
- prefix: The value of the PREFIX constant used in the counter classes from the Gitlab::UsageDataCounters namespace.
Count unique values for the source_code_pushes event.
time_frame: all
data_source: redis
instrumentation_class: RedisMetric
options:
  event: pushes
  prefix: source_code
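To show how the event and prefix options relate to a simple counter class, here is an in-memory sketch. ToyCounter, its methods, and the key layout are hypothetical; a Hash stands in for Redis, and the real counter classes in the Gitlab::UsageDataCounters namespace may differ.

```ruby
# In-memory stand-in for a simple counter; names and key layout are hypothetical.
class ToyCounter
  PREFIX = "source_code"
  KNOWN_EVENTS = %w[pushes].freeze

  @store = Hash.new(0) # stands in for Redis

  class << self
    # Increments the counter for a known event.
    def count(event)
      raise ArgumentError, "unknown event: #{event}" unless KNOWN_EVENTS.include?(event)

      @store[key(event)] += 1
    end

    # Reads the current total for an event.
    def total(event)
      @store[key(event)]
    end

    private

    # Assumed key layout: prefix and event joined by an underscore.
    def key(event)
      "#{PREFIX}_#{event}"
    end
  end
end

3.times { ToyCounter.count("pushes") }
puts ToyCounter.total("pushes") # prints 3
```

A class like this one, with only a prefix and a list of known events, is the "simple counter" case that the generic RedisMetric class is meant to cover.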
Availability-restrained Redis metrics
If the Redis metric should only be available in the report under some conditions, then you must specify these conditions in a new class that is a child of the RedisMetric class.
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountRedisMetric < RedisMetric
          # The metric is reported only while the feature flag is enabled.
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
You must also use the class's name in the YAML setup.
time_frame: all
data_source: redis
instrumentation_class: MergeUsageCountRedisMetric
options:
  event: pushes
  prefix: source_code
Redis HyperLogLog metrics
You can use Redis HyperLogLog metrics to track events not kept in the database and incremented for unique values such as unique users, for example, a count of how many different users used the search bar.
Example of a merge request that adds a RedisHLL metric.
Count unique values for the i_quickactions_approve event.
time_frame: 28d
data_source: redis_hll
instrumentation_class: RedisHLLMetric
options:
  events:
    - i_quickactions_approve
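The difference from a plain counter is that repeated values are counted once. The sketch below uses Ruby's Set as an exact stand-in for a HyperLogLog structure, which in reality gives an approximate count in bounded memory. ToyUniqueCounter and its methods are hypothetical names for illustration.

```ruby
require "set"

# Exact, in-memory stand-in for HyperLogLog counting; names are hypothetical.
class ToyUniqueCounter
  def initialize
    @seen = Hash.new { |hash, event| hash[event] = Set.new }
  end

  # Records that a value (for example, a user ID) triggered an event.
  def track(event, value)
    @seen[event] << value
  end

  # Counts distinct values across all of the given events.
  def unique_count(events)
    events.map { |event| @seen[event] }.reduce(Set.new, :|).size
  end
end

counter = ToyUniqueCounter.new
counter.track("i_quickactions_approve", 1)
counter.track("i_quickactions_approve", 2)
counter.track("i_quickactions_approve", 1) # same user again
puts counter.unique_count(["i_quickactions_approve"]) # prints 2
```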
Availability-restrained Redis HyperLogLog metrics
If the Redis HyperLogLog metric should only be available in the report under some conditions, then you must specify these conditions in a new class that is a child of the RedisHLLMetric class.
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountRedisHLLMetric < RedisHLLMetric
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
You must also use the class's name in the YAML setup.
time_frame: 28d
data_source: redis_hll
instrumentation_class: MergeUsageCountRedisHLLMetric
options:
  events:
    - i_quickactions_approve
Aggregated metrics
The aggregated metrics feature provides insight into the number of data attributes, for example pseudonymized_user_ids, that occurred in a collection of events. For example, you can aggregate the number of users who perform multiple actions such as creating a new issue and opening a new merge request.
You can use a YAML file to define your aggregated metrics. The following arguments are required:
- options.events: List of event names to aggregate into metric data. All events in this list must use the same data source. Additional data source requirements are described in Database sourced aggregated metrics and Redis sourced aggregated metrics.
- options.aggregate.operator: Operator that defines how the aggregated metric data is counted. Available operators are:
  - OR: Removes duplicates and counts all entries that triggered any of the listed events.
  - AND: Removes duplicates and counts all elements that were observed triggering all of the listed events.
- options.aggregate.attribute: Information pointing to the attribute that is being aggregated across events.
- time_frame: One or more valid time frames. Use these to limit the data included in aggregated metrics to events within a specific date range. Valid time frames are:
  - 7d: The last 7 days of data.
  - 28d: The last 28 days of data.
  - all: All historical data, only available for database sourced aggregated metrics.
- data_source: Data source used to collect all events data included in the aggregated metrics. Valid data sources are:
Refer to merge request 98206 for an example of a merge request that adds an AggregatedMetric metric.
Count unique user_ids that occurred in at least one of the events: incident_management_alert_status_changed, incident_management_alert_assigned, incident_management_alert_todo, incident_management_alert_create_incident.
time_frame: 28d
instrumentation_class: AggregatedMetric
data_source: redis_hll
options:
  aggregate:
    operator: OR
    attribute: user_id
  events:
    - incident_management_alert_status_changed
    - incident_management_alert_assigned
    - incident_management_alert_todo
    - incident_management_alert_create_incident
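The OR and AND operators correspond to set union and set intersection over the attribute values seen for each event. The sketch below shows that relationship with made-up user IDs; the aggregate method is a hypothetical illustration, not the AggregatedMetric implementation.

```ruby
require "set"

# Made-up sample data: user IDs observed for each event.
user_ids_by_event = {
  "incident_management_alert_status_changed" => Set[1, 2, 3],
  "incident_management_alert_assigned" => Set[2, 3, 4]
}

# Hypothetical helper: counts distinct users across events.
def aggregate(user_ids_by_event, operator:)
  sets = user_ids_by_event.values
  return 0 if sets.empty?

  case operator
  when "OR" then sets.reduce(:|).size  # union: triggered any event
  when "AND" then sets.reduce(:&).size # intersection: triggered every event
  else raise ArgumentError, "unknown operator: #{operator}"
  end
end

puts aggregate(user_ids_by_event, operator: "OR")  # prints 4
puts aggregate(user_ids_by_event, operator: "AND") # prints 2
```

Both operators deduplicate, so a user who triggers the same event many times still counts once.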
Availability-restrained Aggregated metrics
If the Aggregated metric should only be available in the report under specific conditions, then you must specify these conditions in a new class that is a child of the AggregatedMetric class.
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountAggregatedMetric < AggregatedMetric
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
You must also use the class's name in the YAML setup.
time_frame: 28d
instrumentation_class: MergeUsageCountAggregatedMetric
data_source: redis_hll
options:
  aggregate:
    operator: OR
    attribute: user_id
  events:
    - incident_management_alert_status_changed
    - incident_management_alert_assigned
    - incident_management_alert_todo
    - incident_management_alert_create_incident
Numbers metrics
- operation: Operations for the given data block. Currently, only the add operation is supported.
- data: A block which contains an array of numbers.
- available?: Specifies whether the metric should be reported. The default is true.
# frozen_string_literal: true
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class IssuesBoardsCountMetric < NumbersMetric
          operation :add
          # Each element is the value of another metric for the same time frame.
          data do |time_frame|
            [
              CountIssuesMetric.new(time_frame: time_frame).value,
              CountBoardsMetric.new(time_frame: time_frame).value
            ]
          end
        end
      end
    end
  end
end
You must also include the instrumentation class name in the YAML setup.
time_frame: 28d
instrumentation_class: IssuesBoardsCountMetric
Generic metrics
You can use generic metrics for other metrics, for example, an instance's database version. Data of the observations type always uses a generic metric counter type.
- value: Specifies the value of the metric.
- available?: Specifies whether the metric should be reported. The default is true.
Example of a merge request that adds a generic metric.
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class UuidMetric < GenericMetric
          value do
            Gitlab::CurrentSettings.uuid
          end
        end
      end
    end
  end
end
Support for instrumentation classes
There is support for:
- count, distinct_count, estimate_batch_distinct_count, sum, and average for database metrics.
- Redis metrics.
- Redis HLL metrics.
- add for numbers metrics.
- Generic metrics, which are metrics based on settings or configurations.
There is no support for:
- add, histogram for database metrics.
You can track the progress to support these.
Create a new metric instrumentation class
To create a stub instrumentation for a Service Ping metric, you can use a dedicated generator:
The generator takes the class name as an argument and the following options:
- --type=TYPE Required. Indicates the metric type. It must be one of: database, generic, redis, numbers.
- --operation Required for database and numbers types.
  - For database, it must be one of: count, distinct_count, estimate_batch_distinct_count, sum, average.
  - For numbers, it must be: add.
- --ee Indicates if the metric is for EE.
rails generate gitlab:usage_metric CountIssues --type database --operation distinct_count
create lib/gitlab/usage/metrics/instrumentations/count_issues_metric.rb
create spec/lib/gitlab/usage/metrics/instrumentations/count_issues_metric_spec.rb
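The generator output above suggests a naming pattern: the class name is snake-cased and suffixed with _metric. The sketch below reproduces that mapping for the non-EE case only. The underscore and generated_paths helpers are hypothetical and inferred from the sample output, not taken from the generator's source.

```ruby
# Hypothetical helpers inferred from the sample output; not the generator's code.
def underscore(name)
  # Insert an underscore at each lowercase-to-uppercase boundary, then downcase.
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

def generated_paths(class_name)
  base = "#{underscore(class_name)}_metric"
  [
    "lib/gitlab/usage/metrics/instrumentations/#{base}.rb",
    "spec/lib/gitlab/usage/metrics/instrumentations/#{base}_spec.rb"
  ]
end

puts generated_paths("CountIssues")
```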
Migrate Service Ping metrics to instrumentation classes
This guide describes how to migrate a Service Ping metric from lib/gitlab/usage_data.rb or ee/lib/ee/gitlab/usage_data.rb to instrumentation classes.
- Choose the metric type:
- Determine the location of the instrumentation class: either under ee or outside ee.
- Fill the instrumentation class body:
  - Add code logic for the metric. This might be similar to the metric implementation in usage_data.rb.
  - Add tests for the individual metric in spec/lib/gitlab/usage/metrics/instrumentations/.
  - Add tests for Service Ping.
- Remove the code from lib/gitlab/usage_data.rb or ee/lib/ee/gitlab/usage_data.rb.
- Remove the tests from spec/lib/gitlab/usage_data.rb or ee/spec/lib/ee/gitlab/usage_data.rb.
Troubleshoot metrics
Sometimes metrics fail for reasons that are not immediately clear. The failures can be related to performance issues or other problems. The following pairing session video gives you an example of an investigation into a real-world failing metric.