Kora is a cloud-oriented server-side framework written in Java for building Java / Kotlin applications, with a focus on performance, efficiency, and transparency. It was created by former T-Bank / Tinkoff engineers.

Metrics

Module for collecting application metrics using Micrometer.

The private HTTP server module must be added for metrics to be exposed in Prometheus format.

Dependency

Dependency build.gradle:

implementation "ru.tinkoff.kora:micrometer-module"

Module:

@KoraApp
public interface Application extends MetricsModule { }

Dependency build.gradle.kts:

implementation("ru.tinkoff.kora:micrometer-module")

Module:

@KoraApp
interface Application : MetricsModule

Configuration

Example configuration of the HTTP server path for retrieving metrics, as described in the HttpServerConfig class (default values are shown):

HOCON:

httpServer {
    // Path for retrieving metrics in Prometheus format (if the HTTP server module is added)
    privateApiHttpMetricsPath = "/metrics"
}

YAML:

httpServer:
  # Path for retrieving metrics in Prometheus format (if the HTTP server module is added)
  privateApiHttpMetricsPath: "/metrics"
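To check that the endpoint responds, the configured path can be requested with any HTTP client. The following is a minimal sketch using the JDK's java.net.http.HttpClient. The class name MetricsScrape, the host localhost, and the port 8085 are assumptions made for illustration and are not taken from this page; use the private API port your HTTP server is actually configured with.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper (not part of Kora): reads the Prometheus text exposed by the private HTTP server.
final class MetricsScrape {

    // Builds the scrape URL from a host, a private API port and the configured privateApiHttpMetricsPath.
    static URI metricsUri(String host, int port, String metricsPath) {
        String path = metricsPath.startsWith("/") ? metricsPath : "/" + metricsPath;
        return URI.create("http://" + host + ":" + port + path);
    }

    // Sends a GET request and returns the response body, failing on any non-200 status.
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode() + " from " + uri);
        }
        return response.body();
    }
}
```

With a running application, MetricsScrape.fetch(MetricsScrape.metricsUri("localhost", 8085, "/metrics")) would return the scrape body as a string, assuming the host and port match your setup.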

Example of the complete configuration, as described in the MetricsConfig class (default values are shown):

HOCON:

metrics {
    // OpenTelemetry standard metrics format (available values: V120 / V123)
    opentelemetrySpec = "V120"
}

YAML:

metrics:
  # OpenTelemetry standard metrics format (available values: V120 / V123)
  opentelemetrySpec: "V120"

Configuration parameters for metrics collection are described in the documentation of each module that collects metrics, e.g. HTTP server, HTTP client, etc.

Usage

We follow the notation described in the specification and encourage you to use it as well.

Once the module is connected, a PrometheusMeterRegistry is registered in Metrics.globalRegistry and used by all components that collect metrics.

Customization

To change the PrometheusMeterRegistry configuration, add a PrometheusMeterRegistryInitializer to the container.

Important: PrometheusMeterRegistryInitializer is applied only once, when the application is initialized.

For example, to add a common tag to all metrics:

Java:

@Module
public interface MetricsConfigModule {
    default PrometheusMeterRegistryInitializer commonTagsInit() {
        return registry -> {
            registry.config().commonTags("tag", "value");
            return registry;
        };
    }
}

Kotlin:

@Module
interface MetricsConfigModule {
    fun commonTagsInit(): PrometheusMeterRegistryInitializer? {
        return PrometheusMeterRegistryInitializer {
            it.config().commonTags("tag", "value")
            it
        }
    }
}

Standard metrics expose some configuration options, such as ServiceLayerObjectives for DistributionSummary metrics. The configuration field names can be found in ru.tinkoff.kora.micrometer.module.MetricsConfig.

Standard

Metrics were originally provided in the OpenTelemetry V120 standard. Since Kora 1.1.0 it has also been possible to provide metrics in the OpenTelemetry V123 standard. A partial list of changes can be found in the OpenTelemetry documentation and the OpenTelemetry migration guidelines.

Metrics Reference

All Kora metrics use OpenTelemetry semantic conventions for naming and tags.

Micrometer metric types used:

  • DistributionSummary — used for collecting distributions of arbitrary values. This metric type enables efficient data visualization across buckets and percentile calculation.
  • Counter — monotonically increasing counter
  • Gauge — current metric value
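The _bucket series exported for a DistributionSummary are cumulative: each bucket counts the observations that are less than or equal to its upper bound. As a rough illustration of how a percentile can be derived from such buckets, the sketch below uses linear interpolation inside the matching bucket, similar in spirit to Prometheus' histogram_quantile function. The class BucketQuantile and the sample numbers in the usage note are hypothetical, and the lower bound of the first bucket is assumed to be 0 (reasonable for durations).

```java
// Hypothetical helper: estimates a quantile from cumulative histogram buckets.
final class BucketQuantile {

    // upperBounds: ascending bucket bounds; cumulativeCounts[i]: observations <= upperBounds[i].
    static double estimate(double q, double[] upperBounds, long[] cumulativeCounts) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be within [0, 1]");
        }
        if (upperBounds.length == 0 || upperBounds.length != cumulativeCounts.length) {
            throw new IllegalArgumentException("bounds and counts must be non-empty and of equal length");
        }
        long total = cumulativeCounts[cumulativeCounts.length - 1];
        if (total == 0) {
            return Double.NaN; // nothing observed yet
        }
        double rank = q * total;   // position of the requested quantile among all observations
        double lowerBound = 0.0;   // assumption: observations are non-negative
        long lowerCount = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            if (cumulativeCounts[i] >= rank) {
                long inBucket = cumulativeCounts[i] - lowerCount;
                if (inBucket == 0 || Double.isInfinite(upperBounds[i])) {
                    return lowerBound; // cannot interpolate inside an empty or unbounded bucket
                }
                double fraction = (rank - lowerCount) / inBucket;
                return lowerBound + (upperBounds[i] - lowerBound) * fraction;
            }
            lowerBound = upperBounds[i];
            lowerCount = cumulativeCounts[i];
        }
        return lowerBound;
    }
}
```

For buckets with upper bounds 10, 50, 100 and cumulative counts 50, 90, 100, the 95th percentile falls halfway into the last bucket, so the estimate is 75. The result is only as precise as the bucket boundaries allow.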

HTTP Server

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| http.server.request.duration | http_server_request_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | HTTP server request processing duration | http.request.method, http.response.status_code, http.route, url.scheme, server.address, error.type |
| http.server.active_requests | http_server_active_requests | Gauge | Number of active HTTP requests | http.request.method, http.route, server.address, url.scheme |
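The Prometheus column in the metric tables of this reference follows from the metric name in a regular way: dots become underscores, the base unit (for example milliseconds or bytes) is appended, and counters receive a _total suffix. The sketch below is a simplified illustration of that convention, not Micrometer's actual naming implementation. The class PrometheusNames and its parameters are hypothetical.

```java
// Hypothetical helper: approximates how a dotted metric name maps to its Prometheus name.
final class PrometheusNames {

    enum Type { COUNTER, GAUGE, DISTRIBUTION_SUMMARY }

    // meterName: dotted name such as "http.server.request.duration"; baseUnit may be null or empty.
    static String toPrometheusName(String meterName, String baseUnit, Type type) {
        String name = meterName.replace('.', '_');
        // Append the base unit unless the name already ends with it.
        if (baseUnit != null && !baseUnit.isEmpty() && !name.endsWith("_" + baseUnit)) {
            name = name + "_" + baseUnit;
        }
        // Counters are exposed with a "_total" suffix.
        if (type == Type.COUNTER && !name.endsWith("_total")) {
            name = name + "_total";
        }
        return name;
    }
}
```

For instance, "http.server.request.duration" with base unit "milliseconds" maps to http_server_request_duration_milliseconds, which matches the first row of the HTTP Server table; the _count, _sum, _bucket and _max series share that prefix.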

See HTTP Server module documentation for more details.

HTTP Client

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| http.client.request.duration | http_client_request_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | HTTP client request duration | http.request.method, http.response.status_code, server.address, url.scheme, http.route, error.type |

See HTTP Client module documentation for more details.

Database

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| db.client.request.duration | db_client_request_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Database operation/query duration | db.pool.name, db.statement, db.operation, error.type |

See Database module documentation for more details.

Kafka

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| messaging.receive.duration | messaging_receive_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Single message processing duration | messaging.system, messaging.destination, messaging.operation, error.type |
| messaging.publish.duration | messaging_publish_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Message send duration | messaging.system, messaging.destination, messaging.partition_id, error.type |
| messaging.process.batch.duration | messaging_process_batch_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Message batch processing duration | messaging.system, messaging.destination, error.type |
| messaging.kafka.consumer.lag | messaging_kafka_consumer_lag | Gauge | Consumer lag per partition | messaging.system, messaging.destination, messaging.partition_id, messaging.consumer_group |

See Kafka module documentation for more details.

gRPC Server

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| rpc.server.duration | rpc_server_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | gRPC server call processing duration | rpc.service, rpc.method, rpc.status, error.type |
| rpc.server.requests_per_rpc | rpc_server_requests_per_rpc_total | Counter | Number of requests received per RPC | rpc.service, rpc.method |
| rpc.server.responses_per_rpc | rpc_server_responses_per_rpc_total | Counter | Number of responses sent per RPC | rpc.service, rpc.method |

See gRPC Server module documentation for more details.

gRPC Client

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| rpc.client.duration | rpc_client_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | gRPC client call duration | rpc.service, rpc.method, rpc.status, error.type, server.address |
| rpc.client.requests_per_rpc | rpc_client_requests_per_rpc_total | Counter | Number of requests sent per RPC | rpc.service, rpc.method, server.address |
| rpc.client.responses_per_rpc | rpc_client_responses_per_rpc_total | Counter | Number of responses received per RPC | rpc.service, rpc.method, server.address |

See gRPC Client module documentation for more details.

SOAP Client

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| rpc.client.duration | rpc_client_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | SOAP client call duration | rpc.system, rpc.service, rpc.method, rpc.result, server.address, server.port |

See SOAP Client module documentation for more details.

Scheduling

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| scheduling.job.duration | scheduling_job_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Scheduled job execution duration | code.class, code.function, error.type |

See Scheduling module documentation for more details.

Cache

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| cache.duration | cache_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Cache operation duration (GET, SET, DELETE, etc.) | cache, operation, origin, status |
| cache.ratio | cache_ratio_total | Counter | Cache hit/miss counter | cache, origin, type |
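Because cache.ratio is a monotonically increasing counter, a hit ratio is usually computed from the difference between two scrapes rather than from the raw totals. The sketch below shows that calculation under the assumption that hits and misses are exported as separate series distinguished by the type tag; the exact tag values are not listed on this page, and the class CacheHitRatio is hypothetical.

```java
// Hypothetical helper: computes a cache hit ratio from counter values taken at two scrapes.
final class CacheHitRatio {

    // Returns hits / (hits + misses) for the window between the scrapes, or NaN if there were no requests.
    static double overWindow(long hitsBefore, long missesBefore, long hitsAfter, long missesAfter) {
        long hits = delta(hitsBefore, hitsAfter);
        long misses = delta(missesBefore, missesAfter);
        long total = hits + misses;
        return total == 0 ? Double.NaN : (double) hits / total;
    }

    // A counter that went down means the process restarted; the new value is then the whole increase.
    private static long delta(long before, long after) {
        return after >= before ? after - before : after;
    }
}
```

With 100 hits and 20 misses at the first scrape and 190 hits and 30 misses at the second, the window contains 90 hits and 10 misses, giving a ratio of 0.9.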

Standard Micrometer metrics are automatically registered when using Caffeine:

| Metric | Prometheus | Type | Description |
|---|---|---|---|
| cache.gets | cache_gets_total | Counter | Number of cache requests |
| cache.puts | cache_puts_total | Counter | Number of cache writes |
| cache.evictions | cache_evictions_total | Counter | Number of cache evictions |
| cache.size | cache_size | Gauge | Current cache size |

See Cache module documentation for more details.

Redis / Lettuce

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| lettuce.command.completion.duration | lettuce_command_completion_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Redis command completion duration | type, remote, local, command, error.type |
| lettuce.command.firstresponse.duration | lettuce_command_firstresponse_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Redis command first response duration | type, remote, local, command, error.type |

Resilience

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| resilient.circuitbreaker.state | resilient_circuitbreaker_state | Gauge | Circuit breaker state (0=CLOSED, 1=HALF_OPEN, 2=OPEN) | name |
| resilient.circuitbreaker.transition | resilient_circuitbreaker_transition_total | Counter | Circuit breaker state transitions | name, state |
| resilient.circuitbreaker.call.acquire | resilient_circuitbreaker_call_acquire_total | Counter | Circuit breaker call acquire attempts/rejections | name, state, status |
| resilient.retry.attempts | resilient_retry_attempts_total | Counter | Number of retry attempts | name |
| resilient.retry.exhausted | resilient_retry_exhausted_total | Counter | Number of exhausted retries | name |
| resilient.timeout.exhausted | resilient_timeout_exhausted_total | Counter | Number of timeouts | name |
| resilient.fallback.attempts | resilient_fallback_attempts_total | Counter | Number of fallback invocations | name, type |

See Resilience module documentation for more details.
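The resilient.circuitbreaker.state gauge encodes the state as a number (0=CLOSED, 1=HALF_OPEN, 2=OPEN). When reading the value in a dashboard exporter or a test, it can help to translate it back into a name. The enum below is a hypothetical illustration of that mapping and is not a Kora type.

```java
// Hypothetical enum: decodes the numeric value of the resilient.circuitbreaker.state gauge.
enum BreakerStateGauge {
    CLOSED(0),
    HALF_OPEN(1),
    OPEN(2);

    private final int code;

    BreakerStateGauge(int code) {
        this.code = code;
    }

    // Gauge samples are floating-point numbers, so the value is compared against each known code.
    static BreakerStateGauge fromGauge(double gaugeValue) {
        for (BreakerStateGauge state : values()) {
            if (gaugeValue == state.code) {
                return state;
            }
        }
        throw new IllegalArgumentException("Unknown circuit breaker state value: " + gaugeValue);
    }
}
```

BreakerStateGauge.fromGauge(2.0) yields OPEN, and any value outside 0, 1 and 2 is rejected instead of being silently rounded.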

JMS

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| messaging.receive.duration | messaging_receive_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | JMS message receive duration | messaging.system, messaging.destination.name, error.type |

S3 Client

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| s3.client.duration | s3_client_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | S3 HTTP request duration | aws.s3.bucket, aws.operation.name, error.type |
| s3.kora.client.duration | s3_kora_client_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Kora S3 client operation duration | aws.client.name, aws.s3.bucket, aws.operation.name, error.type |

See S3 Client module documentation for more details.

Camunda 7 BPMN

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| camunda.engine.delegate.duration | camunda_engine_delegate_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Camunda BPMN Java delegate execution duration | delegate, business.key, error.type |
| camunda.engine.delegate.active_requests | camunda_engine_delegate_active_requests | Gauge | Number of active delegate executions | delegate, business.key |

See Camunda 7 BPMN module documentation for more details.

Camunda REST

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| camunda.rest.server.request.duration | camunda_rest_server_request_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Camunda REST request duration | http.request.method, http.response.status_code, http.route, url.scheme, server.address, error.type |
| camunda.rest.server.active_requests | camunda_rest_server_active_requests | Gauge | Number of active Camunda REST requests | http.route, http.request.method, server.address, url.scheme |

See Camunda 7 REST module documentation for more details.

Camunda 8 Worker

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| zeebe.worker.handler.duration | zeebe_worker_handler_duration_milliseconds / _count / _sum / _bucket / _max | DistributionSummary | Zeebe worker job handler duration | job.name, job.type, status, error, error.code |
| zeebe.worker.handler | zeebe_worker_handler_total | Counter | Zeebe worker error counter | job.name, job.type, status, error.code |
| zeebe.client.worker.job | zeebe_client_worker_job_total | Counter | Number of activated/handled Zeebe jobs | action, type |

See Camunda 8 Worker module documentation for more details.

System

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| kora.up | kora_up | Gauge | Framework status indicator (value = 1) | version |
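A simple liveness check can read the kora_up sample from the scrape output and verify that its value is 1. The sketch below parses a single line of the Prometheus text format with regular expressions. The class PrometheusSample is hypothetical, the version value in the usage note is made up, and the parser is deliberately minimal: it does not unescape label values or handle special values such as +Inf.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses one sample line of the Prometheus text exposition format.
final class PrometheusSample {

    // name, optional {labels}, value, optional timestamp
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)(?:\\s+\\d+)?$");
    // one label pair: key="value", allowing backslash escapes inside the value
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    final String name;
    final Map<String, String> labels;
    final double value;

    private PrometheusSample(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    static PrometheusSample parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (m.group(2) != null) {
            Matcher l = LABEL.matcher(m.group(2));
            while (l.find()) {
                labels.put(l.group(1), l.group(2));
            }
        }
        return new PrometheusSample(m.group(1), labels, Double.parseDouble(m.group(3)));
    }

    // True when this sample is the kora_up gauge reporting the value 1.
    boolean isKoraUp() {
        return "kora_up".equals(name) && value == 1.0;
    }
}
```

For a line such as kora_up{version="1.2.3"} 1.0 (the version is an invented sample), parse returns a sample whose labels contain the version tag and for which isKoraUp() is true.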

JVM

Standard JVM metrics are collected automatically via Micrometer:

| Metric | Prometheus | Type | Description | Tags |
|---|---|---|---|---|
| jvm.gc.pause | jvm_gc_pause_milliseconds / _count / _sum / _max | DistributionSummary | GC pause duration | action, cause |
| jvm.gc.memory.allocated | jvm_gc_memory_allocated_bytes_total | Counter | Allocated memory size | |
| jvm.gc.memory.promoted | jvm_gc_memory_promoted_bytes_total | Counter | Memory promoted to old gen | |
| jvm.gc.max.data.size | jvm_gc_max_data_size_bytes | Gauge | Max old gen size | |
| jvm.gc.live.data.size | jvm_gc_live_data_size_bytes | Gauge | Old gen size after full GC | |
| jvm.memory.used | jvm_memory_used_bytes | Gauge | Used memory | area, id |
| jvm.memory.committed | jvm_memory_committed_bytes | Gauge | Committed JVM memory | area, id |
| jvm.memory.max | jvm_memory_max_bytes | Gauge | Max available memory | area, id |
| jvm.threads.live | jvm_threads_live_threads | Gauge | Number of live threads | |
| jvm.threads.daemon | jvm_threads_daemon_threads | Gauge | Number of daemon threads | |
| jvm.threads.peak | jvm_threads_peak_threads | Gauge | Peak thread count | |
| jvm.threads.states | jvm_threads_states_threads | Gauge | Thread count by state | state |
| process.cpu.usage | process_cpu_usage | Gauge | Process CPU usage | |
| system.cpu.usage | system_cpu_usage | Gauge | System CPU usage | |
| system.cpu.count | system_cpu_count | Gauge | Number of available processors | |
| logback.events | logback_events_total | Counter | Logging event count | level |
| jvm.classes.loaded | jvm_classes_loaded_classes | Gauge | Number of loaded classes | |
| jvm.classes.unloaded | jvm_classes_unloaded_classes_total | Counter | Number of unloaded classes | |
| process.files.open | process_files_open_files | Gauge | Number of open file descriptors | |
| process.files.max | process_files_max_files | Gauge | Max file descriptors | |
| process.uptime | process_uptime_milliseconds | Gauge | Process uptime | |