
OpenTelemetry: A modern observability standard


OpenTelemetry

Please check out our first article on observability to gain a fuller context for the topic we’re about to discuss. OpenTelemetry is currently the most actively developed standard in the field of observability. It has been accepted as a Cloud Native Computing Foundation (CNCF) incubating project. Born primarily as a merger of the former OpenTracing and OpenCensus standards, OpenTelemetry continues to gain popularity, with its supporters including representatives of Google, Microsoft, and Uber.

The goal of the OpenTelemetry project is to introduce a standardized open solution that lets any development team build a proper observability layer into its project. OpenTelemetry provides a standard protocol description for the collection of metrics, traces, and logs. It also gathers under one roof the APIs and instrumentation for different target languages and data infrastructure components.

Below is a visualization of the overall scope of OpenTelemetry (credits to CNCF):

The development of the specifications and all related implementations is run openly on GitHub, so anyone involved can propose changes.

Instrumentation implementations for different languages are in development. The current state of readiness can always be found on the relevant page of the official documentation (for example, PHP).

Logs

Logs are the oldest and best-known type of telemetry signal, and they carry a significant legacy. Log collection and storage is a well-understood task, with many established and widely adopted solutions. Examples include the well-known ELK (or EFK) stack, Splunk, and Loki, a project recently introduced by Grafana Labs as a lighter alternative to Elasticsearch.

The main problem is that logs are not integrated with other telemetry signals: few solutions offer a way to correlate a log record with a related metric or trace. Having the opportunity to do this can form a very powerful introspection framework.

OpenTelemetry specifications try to solve this problem with a logging format standard proposal. It allows correlating logs via execution context metadata, timing, or a log emitter source.

However, right now the standard is at an experimental stage and under heavy development, so we won’t focus on it here. The current specifications can be found here.

Metrics

As discussed previously, metrics are numeric data aggregates representing the software system’s performance. Through aggregation, we can combine individual measurements into exact statistics over a time window.

The OpenTelemetry metrics system is flexible. It was designed this way to cover existing metric systems without any loss of functionality. As a result, moving to OpenTelemetry is less painful than moving to other alternatives.

The OpenTelemetry standard defines three metrics models:

  • Event model — metric creation by a developer on the application level.

  • Stream model — metric transportation.

  • Time Series model — metric storage.

The metrics standard defines three metric transformations that can happen in between the Event and Stream models:

  • Temporal reaggregation reduces the number of high-frequency metrics being transmitted by lowering the time resolution of the data.

  • Spatial reaggregation reduces the number of high-frequency metrics being transmitted by removing unwanted attributes and merging the data points that differed only in those attributes.

  • Delta-to-cumulative lets the application emit only the changes between successive values (delta), which keeps client-side state small, and leaves the conversion into absolute totals (cumulative) to a downstream component.
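To make two of these transformations more concrete, here is a minimal sketch in plain Go. It does not use the OpenTelemetry SDK; the function names `DeltaToCumulative` and `ReaggregateTemporal` are our own illustrative assumptions, and real pipelines also have to deal with timestamps, resets, and attributes.

```go
package main

import "fmt"

// DeltaToCumulative converts per-interval changes (deltas)
// into running totals (cumulative values).
func DeltaToCumulative(deltas []int64) []int64 {
	cumulative := make([]int64, len(deltas))
	var total int64
	for i, d := range deltas {
		total += d
		cumulative[i] = total
	}
	return cumulative
}

// ReaggregateTemporal merges every `factor` consecutive delta points into
// one point, lowering the resolution (for example, 10s points into 30s points).
func ReaggregateTemporal(deltas []int64, factor int) []int64 {
	if factor <= 1 {
		return append([]int64(nil), deltas...)
	}
	var out []int64
	for start := 0; start < len(deltas); start += factor {
		end := start + factor
		if end > len(deltas) {
			end = len(deltas)
		}
		var sum int64
		for _, d := range deltas[start:end] {
			sum += d
		}
		out = append(out, sum)
	}
	return out
}

func main() {
	deltas := []int64{3, 1, 4, 1, 5, 9}
	fmt.Println(DeltaToCumulative(deltas))      // [3 4 8 9 14 23]
	fmt.Println(ReaggregateTemporal(deltas, 3)) // [8 15]
}
```

Note that temporal reaggregation keeps the total unchanged: six points become two, but their sum is still 23.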

We will talk about the Stream and Time Series models in the third part of our blog series, where we will discuss signal transportation and storage. For now, let’s focus on the Event model, which is related to instrumentation.

 

 

The process of creation for every metric in OpenTelemetry consists of three steps:

  • Creation of instruments that will generate measurements – particular data points that we evaluate.

  • Aggregation of measurements into a View – a representation of a metric to output from the instrumented software system.

  • Metric output – the transportation of metrics to storage using a push or pull model.

The OpenTelemetry metrics API defines six instrument types for recording measurements:

  1. Counter – a non-negative, monotonically increasing measurement that receives increments. For example, it may be a good fit for counting the overall number of requests the system has processed.

  2. UpDownCounter – the same as the Counter, but non-monotonic, allowing negative increments. It may be a good fit for reporting the number of requests currently being processed by the system.

  3. Histogram – multiple statistically relevant values distributed among a list of predefined buckets. For example, we may be interested not in a particular response time but in the percentile of the response time distribution that it falls into.

  4. Asynchronous Counter – the same as the Counter, but values are emitted via a registered callback function, not a synchronous function call.

  5. Asynchronous UpDownCounter – the same as the UpDownCounter, but values are emitted via a registered callback function, not a synchronous function call.

  6. Asynchronous Gauge – a specific type for values that should be reported as is, not summed. For example, it may be a good fit for reporting the usage of multiple CPU cores – in this case, you will likely want to have the maximum (or average) CPU usage, not summed usage.
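The difference between these instrument types can be sketched in a few lines of plain Go. This is a simplified model written for illustration, not the OpenTelemetry API: the types `Counter`, `UpDownCounter`, and `AsyncGauge` below are our own assumptions and only mimic the semantics described above.

```go
package main

import (
	"errors"
	"fmt"
)

// Counter is monotonic: it only accepts non-negative increments.
type Counter struct{ total int64 }

func (c *Counter) Add(n int64) error {
	if n < 0 {
		return errors.New("counter increments must be non-negative")
	}
	c.total += n
	return nil
}

// UpDownCounter accepts both positive and negative increments.
type UpDownCounter struct{ total int64 }

func (u *UpDownCounter) Add(n int64) { u.total += n }

// AsyncGauge reports a value through a registered callback, which is
// invoked only when the metric is collected.
type AsyncGauge struct{ callback func() float64 }

func (g *AsyncGauge) Collect() float64 { return g.callback() }

func main() {
	requests := &Counter{}
	_ = requests.Add(1)
	_ = requests.Add(1)

	inFlight := &UpDownCounter{}
	inFlight.Add(1)  // a request started
	inFlight.Add(1)  // another request started
	inFlight.Add(-1) // the first request finished

	// The callback stands in for reading a real value such as CPU usage.
	cpu := &AsyncGauge{callback: func() float64 { return 0.42 }}

	fmt.Println(requests.total, inFlight.total, cpu.Collect()) // 2 1 0.42
}
```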

Through Aggregations, OpenTelemetry combines measurements into the final metric values that are then transported to storage. OpenTelemetry defines the following Aggregations:

  • Drop – ignores all measurements.

  • Sum – a sum of measurements.

  • Last Value – only the last measurement value.

  • Explicit Bucket Histogram – a collection of measurements into buckets with explicitly predefined bounds.

  • Exponential Histogram (optional) – the same as the Explicit Bucket Histogram but with an exponential formula defining bucket bounds.

A developer can define their own aggregations, but in most cases, the default ones predefined for each type of measurement will suit the developer’s needs.
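As an illustration of how an Explicit Bucket Histogram aggregation works, here is a minimal sketch in plain Go. The `Histogram` type and the bucket bounds are hypothetical and chosen for the example; we assume each bound is an inclusive upper limit, with one extra overflow bucket for everything above the last bound.

```go
package main

import (
	"fmt"
	"sort"
)

// Histogram aggregates measurements into buckets with explicit bounds.
type Histogram struct {
	Bounds []float64 // sorted, inclusive upper bounds
	Counts []uint64  // len(Bounds)+1; the last entry is the overflow bucket
	Sum    float64
	Count  uint64
}

func NewHistogram(bounds []float64) *Histogram {
	sorted := append([]float64(nil), bounds...)
	sort.Float64s(sorted)
	return &Histogram{Bounds: sorted, Counts: make([]uint64, len(sorted)+1)}
}

// Record places v into the first bucket whose upper bound is >= v.
func (h *Histogram) Record(v float64) {
	i := sort.SearchFloat64s(h.Bounds, v)
	h.Counts[i]++
	h.Sum += v
	h.Count++
}

func main() {
	// Response times in milliseconds.
	h := NewHistogram([]float64{10, 50, 100})
	for _, ms := range []float64{5, 10, 42, 75, 250} {
		h.Record(ms)
	}
	fmt.Println(h.Counts, h.Count, h.Sum) // [2 1 1 1] 5 382
}
```

Only the bucket counts, the total count, and the sum leave the application, which is why a histogram is much cheaper to transport than the raw measurements.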

After all aggregations have been done, additional filtering or customization can be carried out on the View level. To summarize, here is an example of creating a simple metric (in Go):

import (
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric/instrument"
)

// Meter is assumed to be a metric.Meter obtained from the configured MeterProvider.
counter, err := Meter.SyncInt64().Counter(
    "test.counter",
    instrument.WithUnit("1"),
    instrument.WithDescription("Test Counter"),
)
if err != nil {
    panic(err)
}

// Synchronously increment the counter.
counter.Add(ctx, 1, attribute.String("attribute_name", "attribute_value"))

Here we create a simple metric consisting of a single counter. As you can see, many of the details we discussed are hidden but can be exposed if the developer needs them.

In the next part of our blog series, we will talk about metrics transportation, storage, and visualization.

Traces and spans

As we discussed previously, traces represent an execution path inside a software system. The execution path itself is a series of operations. A unit of operation is represented in the form of a span. A span has a start time, duration, an operation name, and additional context attached to it. Spans are interconnected via context propagation and can be nested (one operation can consist of multiple smaller operations inside itself). The resulting hierarchical tree structure of spans represents the trace – an entire execution path inside a software system.
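The parent-child relationship between spans can be sketched with a small data structure. The following plain Go example is an illustration under our own assumptions: the `Span` struct, its field names, and the sample span IDs are hypothetical and far simpler than the real OpenTelemetry data model.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Span is a simplified unit of operation inside a trace.
type Span struct {
	TraceID      string
	SpanID       string
	ParentSpanID string // empty for the root span
	Name         string
	Start        time.Time
	Duration     time.Duration
	Attributes   map[string]string
}

// Children groups spans by the ID of their parent span.
func Children(spans []Span) map[string][]Span {
	byParent := make(map[string][]Span)
	for _, s := range spans {
		byParent[s.ParentSpanID] = append(byParent[s.ParentSpanID], s)
	}
	return byParent
}

// Render walks the tree depth-first and returns an indented outline.
func Render(byParent map[string][]Span, parentID string, depth int) []string {
	var lines []string
	for _, s := range byParent[parentID] {
		lines = append(lines, strings.Repeat("  ", depth)+s.Name)
		lines = append(lines, Render(byParent, s.SpanID, depth+1)...)
	}
	return lines
}

func main() {
	now := time.Now()
	spans := []Span{
		{TraceID: "t1", SpanID: "a", Name: "GET /orders", Start: now, Duration: 120 * time.Millisecond},
		{TraceID: "t1", SpanID: "b", ParentSpanID: "a", Name: "auth", Start: now, Duration: 15 * time.Millisecond},
		{TraceID: "t1", SpanID: "c", ParentSpanID: "a", Name: "db.query", Start: now, Duration: 80 * time.Millisecond},
		{TraceID: "t1", SpanID: "d", ParentSpanID: "c", Name: "decode rows", Start: now, Duration: 20 * time.Millisecond},
	}
	for _, line := range Render(Children(spans), "", 0) {
		fmt.Println(line)
	}
}
```

Every span carries the same trace ID, and the parent span ID is what turns a flat list of spans into the tree.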

The internal span structure can be visualized like this:

Here is an example of the simplest span creation (in Go):

import (
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/trace"
)

var tracer = otel.Tracer("test_app")

// Create a span
ctx, span := tracer.Start(ctx, "test-operation-name",
    trace.WithSpanKind(trace.SpanKindServer))

testOperation()

// Add attributes
if span.IsRecording() {
    span.SetAttributes(
        attribute.Int64("test.key1", 1),
        attribute.String("test.key2", "2"),
    )
}

// End the span
span.End()

Now we have our first trace.

A trace can be distributed across different microservices. In this case, so as not to lose the interconnection, the OpenTelemetry SDK can automatically propagate context over the network according to the protocol being used. One example is the W3C Trace Context HTTP headers definition. However, not all language SDKs support automatic context propagation, so you may have to instrument it manually depending on the language you use.
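To show what is actually sent over the network, here is a minimal sketch in plain Go that parses a W3C Trace Context `traceparent` header. It handles version `00` only, and the `TraceParent` type and `ParseTraceParent` function are our own illustrative names. In a real service you would normally rely on the propagators shipped with the SDK instead of parsing the header by hand.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// TraceParent holds the fields of a version-00 traceparent header:
// version "-" trace-id "-" parent-id "-" trace-flags.
type TraceParent struct {
	TraceID  string // 32 lowercase hex characters
	ParentID string // 16 lowercase hex characters
	Sampled  bool   // lowest bit of trace-flags
}

func isLowerHex(s string) bool {
	for _, r := range s {
		if !(r >= '0' && r <= '9') && !(r >= 'a' && r <= 'f') {
			return false
		}
	}
	return true
}

// ParseTraceParent validates and splits a version-00 header value.
func ParseTraceParent(header string) (TraceParent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceParent{}, errors.New("unsupported or malformed traceparent")
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if len(traceID) != 32 || len(parentID) != 16 || len(flags) != 2 {
		return TraceParent{}, errors.New("unexpected field length")
	}
	if !isLowerHex(traceID) || !isLowerHex(parentID) || !isLowerHex(flags) {
		return TraceParent{}, errors.New("fields must be lowercase hex")
	}
	// All-zero IDs are invalid.
	if strings.Trim(traceID, "0") == "" || strings.Trim(parentID, "0") == "" {
		return TraceParent{}, errors.New("all-zero trace-id or parent-id")
	}
	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return TraceParent{}, err
	}
	return TraceParent{TraceID: traceID, ParentID: parentID, Sampled: f&0x01 == 1}, nil
}

func main() {
	tp, err := ParseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		panic(err)
	}
	fmt.Println(tp.TraceID, tp.ParentID, tp.Sampled)
}
```

A downstream service keeps the trace ID, uses the incoming parent ID as the parent of its own span, and forwards a new header carrying its own span ID.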

Detailed documentation about traces with format explanations can be found here.

Signal interconnections

The ability to interconnect different types of signals makes an observability framework powerful. For example, it allows you to identify a service response that took too long via metrics and, in one click, jump to the correlating trace of this response execution to identify what part of the system caused the slow processing.

Signals in OpenTelemetry can be interconnected in a couple of ways. One is the use of Exemplars – specific values supplied with traces, logs, and metrics. An Exemplar consists of a particular record ID, the time of observation, and optional filtered attributes, and it is specifically dedicated to allowing a direct connection between traces and metrics. Detailed documentation about Exemplars can be found here.

Another approach to signal interconnection is to associate the same metadata with different signals through Baggage and Context. Baggage is a set of user-defined key-value pairs supplied with traces, logs, and metrics that allows you to annotate them. By annotating corresponding metrics and traces with the same values in Baggage, the user can correlate them. Detailed documentation about Baggage can be found here.
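As a rough illustration, Baggage travels between services as a list of key-value pairs in a header. The sketch below, in plain Go, parses and serializes such a list in the spirit of the W3C Baggage format. The function names are our own assumptions, optional properties after `;` are simply ignored, and production code should use the baggage support in the OpenTelemetry SDK instead.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// ParseBaggage turns "tenant=acme,region=eu-west" into a map.
// Properties after ";" are dropped in this simplified sketch.
func ParseBaggage(header string) map[string]string {
	entries := make(map[string]string)
	for _, member := range strings.Split(header, ",") {
		if i := strings.Index(member, ";"); i >= 0 {
			member = member[:i]
		}
		key, value, ok := strings.Cut(member, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			continue // skip malformed members
		}
		decoded, err := url.PathUnescape(strings.TrimSpace(value))
		if err != nil {
			continue
		}
		entries[key] = decoded
	}
	return entries
}

// FormatBaggage serializes entries with sorted keys for stable output.
func FormatBaggage(entries map[string]string) string {
	keys := make([]string, 0, len(entries))
	for k := range entries {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	members := make([]string, 0, len(keys))
	for _, k := range keys {
		members = append(members, k+"="+url.PathEscape(entries[k]))
	}
	return strings.Join(members, ",")
}

func main() {
	baggage := ParseBaggage("tenant=acme, region=eu-west;ttl=60")
	fmt.Println(FormatBaggage(baggage)) // region=eu-west,tenant=acme

	// The same pairs can then be attached as attributes to both a metric
	// and a span, which is what makes the two signals correlatable.
	fmt.Println(baggage["tenant"], baggage["region"])
}
```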

Conclusion

We covered the pillars of OpenTelemetry and some details of application instrumentation. But we don’t just need to instrument our applications – we should also introduce tooling for the aggregation, storage, and visualization of the signals we supply. In the third part of this series, we will discuss tooling and the OpenTelemetry collector component in detail.

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

About Nord Security
The web has become a chaotic space where safety and trust have been compromised by cybercrime and data protection issues. Therefore, our team has a global mission to shape a more trusted and peaceful online future for people everywhere.

How to Not Fall Victim to Browser Vulnerabilities

JumpCloud’s Universal Chrome Browser Patch Management

Browsers are the gateway to online productivity. 

Without them, we would not be able to get work done. For that reason, they are also one of the biggest targets for bad actors. If we are not careful and do not make a conscious effort to maintain web browser security, attackers can easily exploit browser vulnerabilities. 

What makes browsers especially appealing to these individuals? Browsers access, collect, and hold lots of sensitive data, from personal credentials to company information, that attackers can sell on the dark web or use to blackmail companies.

According to Atlas VPN, Google Chrome, the world’s most popular browser, has the highest number of reported vulnerabilities (303) year to date. Google Chrome also has a cumulative total of 3,159 vulnerabilities since its public release. 

In this article, we’ll dive into the topic of browser vulnerabilities, the importance of patch management, and how to streamline protection.

Graph: top web browsers by number of vulnerabilities (image courtesy of Atlas VPN)

A Closer Look at Google Chrome’s Latest Vulnerabilities

On November 8, 2022, the Center for Internet Security (CIS) reported finding multiple vulnerabilities in Google Chrome. 

The most severe vulnerability within this group could potentially allow for arbitrary code execution in the context of the logged-on user. What does that mean? 

Depending on a user’s privileges, an attacker could install programs and view, change, or delete data. The bad actor could even create new accounts with full user rights! 

Of course, users whose accounts have minimal user rights on the system would be less impacted than those with administrative user rights.

Multiple operating systems were affected, including:

  • Google Chrome versions prior to 107.0.5304.110 for Mac
  • Google Chrome versions prior to 107.0.5304.110 for Linux
  • Google Chrome versions prior to 107.0.5304.106/.107 for Windows
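Checking whether an installed browser falls into one of these ranges means comparing version numbers component by component, not as plain strings (as text, "87" sorts after "110"). Here is a minimal sketch in Go; the function names are our own, and the installed version used in the example is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a dotted version such as "107.0.5304.110" into integers.
func parseVersion(v string) ([]int, error) {
	parts := strings.Split(v, ".")
	nums := make([]int, len(parts))
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return nil, fmt.Errorf("invalid version %q: %w", v, err)
		}
		nums[i] = n
	}
	return nums, nil
}

// IsPrior reports whether the installed version is older than the fixed one.
func IsPrior(installed, fixed string) (bool, error) {
	a, err := parseVersion(installed)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(fixed)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			return a[i] < b[i], nil
		}
	}
	// All shared components are equal; the shorter version counts as older.
	return len(a) < len(b), nil
}

func main() {
	vulnerable, err := IsPrior("107.0.5304.87", "107.0.5304.110")
	if err != nil {
		panic(err)
	}
	fmt.Println(vulnerable) // true
}
```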

First and foremost, CIS recommends applying appropriate updates provided by Google to vulnerable systems immediately after appropriate testing. See here for all the other CIS recommended actions. 

The Need for Browser Patching 

Here are the key reasons you should regularly update or patch your browsers:

  • Enhance Security: Prevent spyware, malware, and viruses that could give someone access to your data or trick you into handing it over.
  • Improve Functionality: Outdated browsers might not work well, or at all, with new apps or software.
  • Boost User Experience: Older browsers usually do not support the latest web technologies and will have trouble loading a website’s component files. This might cause a website to freeze, crash, or load very slowly.

For IT admins, security is probably the most important reason to patch browsers. Keeping browsers updated to the latest version (i.e., downloading and installing all provided patches) goes a long way toward preventing bad actors from exploiting known vulnerabilities. 

How to Create Default Chrome Browser Patch Policies

One of the easiest ways to stay on top of patches, and reduce browser vulnerability risk, is to use the JumpCloud Directory Platform. 

The latest capability addition to our Patch Management solution provides a universal policy to keep Google Chrome up to date for macOS, Windows, and Linux. 

A universal policy saves time by automatically scheduling and enforcing Chrome security patches on a large number of managed devices.

JumpCloud Policy Management Console 

The platform’s four universal preconfigured default Chrome browser patch policies allow admins to deploy browser updates with different levels of urgency. Admins also have the option to configure a custom universal policy; this feature allows for easy modification of existing policy settings to tailor update experiences to organizational needs. 

The four JumpCloud default Chrome browser patch management policies control how and when a Chrome update is applied. The recommended deployment strategies include:

  • Day Zero: Deploy automated upgrades inside your IT Department the first day an update is available.
  • Early Adoption: Deploy automated upgrades to early adopters outside of IT.
  • General Adoption: Deploy automated upgrades to general users in your company.
  • Late Adoption: Deploy automated upgrades to remaining users in your company.

Once you have created a Chrome browser patch policy, you can assign it to any devices, policy groups, or device groups. A policy group helps quickly and efficiently roll out existing policies to large numbers of similar devices. 

Capabilities of JumpCloud Browser Patch Management

JumpCloud’s new Browser Patch Management also introduces the following features:

  • Enforce Chrome updates and browser relaunch. 
  • Enforce or disable Chrome Browser Sign In Settings.
  • Restrict sign-in to a regex pattern to ensure users sign in via company email accounts.
  • Automate device enrollment into Google Chrome Browser Cloud Management, which unlocks limitless capabilities for browser and extension control within the Google Admin console. 
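To illustrate the idea behind restricting sign-in to a pattern, here is a small sketch in Go that checks an email address against a company-domain regular expression. The domain `example.com` and the function name are hypothetical, and the regex dialect used by the browser policy may differ from Go’s, so treat this as a way to reason about a pattern and not as the policy configuration itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// companyPattern matches any account in the (hypothetical) example.com domain.
// The anchors make sure nothing may follow the domain name.
var companyPattern = regexp.MustCompile(`^.*@example\.com$`)

// AllowedSignIn reports whether the account matches the company pattern.
func AllowedSignIn(email string) bool {
	return companyPattern.MatchString(email)
}

func main() {
	for _, email := range []string{
		"alice@example.com",
		"alice@gmail.com",
		"alice@example.com.evil.org",
	} {
		fmt.Println(email, AllowedSignIn(email))
	}
}
```

Escaping the dot matters: an unescaped `.` matches any character, so `example.com` written without the backslash would also accept a domain such as `exampleXcom`.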

Dive deeper into the new Universal Chrome Browser Patch Management Release by exploring the release notes for this feature in the JumpCloud Community. 

Learn More About JumpCloud

The good news? Browser patching and patch management are included in JumpCloud’s affordable A La Carte pricing package. 

Try JumpCloud for free for up to 10 devices and 10 users. 

Complimentary support is available 24×7 within the first 10 days of account creation.

About JumpCloud
At JumpCloud, our mission is to build a world-class cloud directory. Not just the evolution of Active Directory to the cloud, but a reinvention of how modern IT teams get work done. The JumpCloud Directory Platform is a directory for your users, their IT resources, your fleet of devices, and the secure connections between them with full control, security, and visibility.
