Observability With Google Cloud: A Strong Solution That’s Stronger With Proven Partners

Introduction
Observability is increasingly important to modern IT operations because it enables proactive collection, aggregation, and correlation of information about applications running on cloud infrastructure. It is no longer sufficient for organizations to use dashboards to monitor system performance or to view operational performance solely through point tools such as application performance monitoring or log management.

This is particularly true in an era of skyrocketing data volumes and the de facto standardization on multi-cloud architectures to support business- and mission-critical workloads. In fact, cloud observability is now widely accepted in both large enterprises and midmarket organizations: More than three-quarters of organizations surveyed by TechTarget’s Enterprise Strategy Group currently employ an observability practice.1

Figure 1. Majority of Organizations Are Actively Engaged in Observability

Source: Enterprise Strategy Group, a division of TechTarget, Inc.

Google Cloud operations suite offers a broad set of observability tools and resources and has helped a wide range of organizations gain essential insight into the behavior of their cloud infrastructure and applications, insight that is simply not attainable with traditional visibility techniques. Combined with the tools and expertise of Google Cloud’s many independent software vendor (ISV) partners, Google Cloud acts as an effective and efficient foundation for enterprise-class observability.

Driving Forces for Observability
As cloud observability continues to rise in alignment with the dramatic swing toward multi-cloud adoption, organizations find themselves awash in data, especially unstructured data, which has heightened the challenge of understanding how that data and its related workloads are behaving. The massive uptick in data volumes, combined with a wide span of data formats, makes traditional visibility techniques insufficient.

Research from Enterprise Strategy Group lays out this challenge starkly:

  • 72% agree that the number of tools used for monitoring and observability adds complexity to their environment.
  • 74% say troubleshooting systems for data and application behavior accounts for a substantial part of their organizational resources.
  • 69% point out that their observability data is growing at a concerning rate.

Organizations are also using observability techniques to pursue a number of other goals that accelerate and improve operational excellence and provide better insights for IT and business leaders. These include:

  • Delivering insights into application and infrastructure environments that assist with tracing, root cause analysis, accelerated fault isolation, and problem resolution.
  • Enhancing security, risk management, compliance, and governance postures.
  • Creating a pathway for improved and expanded automation that relieves the traditional burden of manual monitoring and intervention.

For these and other reasons, organizations are moving aggressively to adopt a more strategic view of observability into their IT operations. One approach is the use of full-stack observability, which involves monitoring and understanding the entire software stack, from the application layer down to the infrastructure layer. Enterprise Strategy Group research indicates that 43% of organizations already have deployed full-stack observability, with another 41% saying they either are in a proof-of-concept stage or are planning to implement it within the next 12 months.

There is also a change underway in how organizations approach visibility, moving from a point-solution mindset to a more broad-based observability framework across a wider range of applications. This latter approach does not yet have widespread adoption due to a number of organization-specific issues, such as budget limitations and a lack of in-house skills. Still, the benefits of a modernized and more comprehensive view of observability are numerous and tangible. According to Enterprise Strategy Group research, the most frequently cited benefits of well-executed observability strategies are improved security detection and response; upgraded service-level agreement (SLA) performance; better alignment among IT teams, developers, and security teams; and improved operational costs.

Figure 2. Top 5 Most Impactful Realized Observability Strategy Benefits

Source: Enterprise Strategy Group, a division of TechTarget, Inc.

Taking a more strategic, modernized, and automated approach to observability delivers another key benefit: freeing up IT and business teams to spend more time innovating. Organizations with well-defined and broadly deployed observability programs typically spend 4% more time on new software development, 3% more on modernizing applications to be cloud-ready, and 6% less time on maintaining and troubleshooting. While single-digit percentage improvements might not seem meaningful at first glance, those efficiencies can yield big dividends in the aggregate, typically in the form of improved sales, profitability, and customer satisfaction.

Observability Challenges
While observability adoption and utilization are on the rise, many organizations still have work to do in order to maximize the short- and long-term benefits of observability. Critical capabilities such as scalability, reliability, visibility into the edge and remote locations, preparing existing applications and infrastructure for observability, and insight into cloud-native and container-based applications all are potential bottlenecks that must be understood and addressed with the right observability solution, according to Enterprise Strategy Group research.

Figure 3. Top 5 Observability Solution Challenges

Source: Enterprise Strategy Group, a division of TechTarget, Inc.

The good news is that organizations are partnering with cloud service providers such as Google Cloud to put together plans for enterprise observability that is optimized for the cloud-first, cloud-native, and multi-cloud environments that are increasingly the norm.

But as application, workload, and infrastructure demands accelerate, today’s widely heterogeneous IT environments become increasingly complex to monitor and manage without the introduction of more sophisticated techniques and tools. Google Cloud, for instance, provides a wide range of native observability tools, such as monitoring, logging, and tracing. Organizations’ rapidly evolving multi-cloud frameworks, however, require an even wider array of capabilities.

Many organizations have sought to address this growing demand for more observability by amassing and implementing more observability tools. Goals such as coping with rapidly expanding storage requirements, optimizing log volumes, migrating data to lower-cost platforms, and shortening log retention periods are all important for IT teams to address in their next observability framework, and each has given rise to more and more specialized tools.

In fact, observability “tool sprawl” is a real and demanding issue for IT teams to overcome. According to Enterprise Strategy Group research, 85% of organizations indicate they are already using at least six different observability tools, with 56% using more than 10 tools.

Figure 4. Amount of Observability Tools Used by Organizations

Source: Enterprise Strategy Group, a division of TechTarget, Inc.

That’s why working with a leading cloud service provider that has a wide range of experienced, market-proven ISVs is essential. Those observability-focused ISVs provide unique capabilities to meet the needs of individual organizations, which can go a long way toward reducing complexity, cost, and the risk of “do-it-yourself observability puzzles.”

Google Cloud and Its Observability Partner Ecosystem
When planning observability strategies and evaluating observability solution partners, it’s important to keep in mind that observability usage goes hand in hand with multi-cloud environments. Enterprise Strategy Group research indicates that, among organizations using at least three different cloud service providers, 88% have modern observability practices in place.

This means that observability solutions must incorporate a wider range of functions, especially automated and intelligent analysis of, and alerting on, anomalous data and application behavior. This is a strength of Google Cloud, which offers the tools and resources to enable observability at scale, including:

  • Migrating virtual machines directly to Google Compute Engine (GCE) to optimize budgets and performance.
  • Building and deploying applications in Google Kubernetes Engine (GKE) and Anthos serverless landing zones.
  • Ensuring consistent development and operations experiences for hybrid cloud and multi-cloud environments.

Still, the growing demand for application and infrastructure performance in increasingly complex multi-cloud settings means that even more is needed.

Organizations working with Google Cloud as their strategic cloud development and deployment environment benefit from Google’s extensive third-party network of observability-proficient ISVs. These ISVs are highly experienced and certified in Google Cloud, understanding both the technical underpinnings of the platform and the best practices necessary to turn multi-cloud observability into a strategic advantage.

While Google Cloud works with a number of qualified ISVs in the observability space, three stand out: Dynatrace, Datadog, and Chronosphere. Working tightly in concert with Google Cloud’s teams of developers, engineers, architects, and application support staff, these and other ISVs help organizations achieve their ultimate goal of end-to-end application and infrastructure visibility in a multi-cloud world.

Dynatrace
Dynatrace offers what it calls a fully automated, AI-assisted observability solution across Google Cloud and hybrid cloud environments. Dynatrace’s observability platform has been engineered to address the full gamut of observability requirements, including infrastructure monitoring, applications and microservices, application security, digital experience, business analytics, and automation.

This span of capabilities and use cases makes it applicable to a wide range of audiences, including developers, operations teams, and line-of-business groups. The platform is built upon a technology foundation that includes its OneAgent technology, which needs to be deployed only once on a host in order to collect relevant metrics and to automatically discover new services and technologies that come online without additional instrumentation. It also includes Smartscape dynamic topology mapping to identify dependencies between applications and infrastructure.

Dynatrace also uses Davis AI, an AI engine that automates the delivery of precise answers. These and other related technology building blocks have helped Dynatrace deploy observability solutions that sharply reduce multi-cloud complexity for retailers such as Rack Room Shoes. By using Dynatrace in a Google Cloud environment to automatically monitor and optimize the user journey for a flawless digital experience, the retailer saw dramatic increases in converting shoppers into customers.

Working closely with Google Cloud, Dynatrace builds and delivers fully automated, AI-assisted observability across Google Cloud and hybrid environments. Dynatrace enables a single view across the full Google Cloud ecosystem, including GCE, GKE, Anthos, and a range of other Google Cloud environments.

Datadog
Datadog’s expertise is “cloud monitoring as a service” with an active, widely adopted solutions portfolio for Google Cloud environments. It offers full observability for dynamic Google Cloud environments, collecting and unifying all data streaming from Google Cloud environments and supporting Google Cloud services with easy-to-install integrations.

Datadog is optimized to take Google Cloud’s many native observability capabilities to an even higher level, offering a reliable, resilient, integrated, and cost-efficient solution. It provides insight into both known and unknown behavior, enabling organizations to gain a comprehensive view across applications, infrastructure, networks, services, and functions, including those essential to supporting DevOps and cybersecurity.

Whether an organization’s workload is a GKE container, an Anthos on-prem environment, or a hybrid configuration, Datadog makes migration planning flexible and efficient. Its ability to seamlessly integrate with Google Cloud services enables organizations to take full advantage of many Google Cloud resources, including Autopilot to reduce system overhead and enhance system-wide performance.

Datadog also provides features that enable anomaly detection, outlier detection, forecasting, and more, drawing upon substantial telemetry data captured from across the entire application stack and infrastructure. Finally, all data is consolidated, integrated, unified, and visualized for organizations in a single pane of glass.

Chronosphere
Chronosphere’s focus is on enhancing observability through data reduction, as well as a differentiating pricing model that charges only for post-process data stored. It also offers tools to help organizations rationalize their observability investment and reduce operational costs.

Chronosphere itself relies heavily on Google Cloud to run its own business. The company’s observability suite is a software-as-a-service solution that is hosted on Google Cloud and tightly aligned with GKE.

Chronosphere deliberately selected Google Cloud as the focus of its efforts because of its ability to offer reliable, highly resilient performance at scale. This capability is extremely important for cloud-native applications, because those container-based systems tend to generate very large amounts of monitoring data compared with more monolithic on-premises or VM-based environments.

Chronosphere’s observability platform is optimized for customized control of data used for DevOps requirements. The company has demonstrated a real-world capability to scale its solution to as much as 2 billion data points per second. In addition, new control plane enhancements allow Chronosphere to provide customers full visibility into what observability data is valuable so they can make the best possible optimization decisions, resulting in lower costs, improved developer productivity, and faster problem resolution.

Conclusion
The sustained march toward multi-cloud IT operations has turned into an all-out sprint, as organizations look to develop, deploy, and run more of their mission-critical and business-critical applications across multiple cloud environments. While this has helped organizations achieve such benefits as enhanced scalability, improved economics, and better use of existing in-house resources, it also has increased complexity, especially when it comes to ensuring that applications run smoothly and systems are resilient, secure, and available.

Making that possible requires a modernized, comprehensive view of cloud observability for applications, data, and infrastructure. Cloud service providers such as Google Cloud have built a wide range of observability features into their cloud platforms, such as monitoring, logging, and tracing. Still, organizations need and demand even more, which means Google Cloud’s ecosystem of experienced ISVs is an invaluable asset.

Google Cloud, in concert with leading ISV partners such as Dynatrace, Datadog, and Chronosphere, provides an end-to-end observability solution that helps reduce risk, improve operational efficiency, and deliver better cost efficiencies.

1 Source: Enterprise Strategy Group Complete Survey Results, Distributed Cloud Series: Observability and Demystifying AIOps, July 2023.
