Yellowbrick data warehouse update boosts workload management
Hybrid cloud data warehouse vendor updates platform with self-healing cluster capabilities and a "penalty box" feature to improve workload management.
Yellowbrick Data has updated its namesake hybrid cloud data warehouse platform with a series of workload management enhancements.
Yellowbrick Data, which is based in Palo Alto, Calif., provides a data warehouse platform that can run on-premises and in multi-cloud deployments. With the Release 5 update, the vendor has added self-healing clusters to improve workload management capabilities. Resource mapping also gets a boost, as does a rate-limiting feature the company refers to as "penalty boxing."
Yellowbrick Release 5 was introduced Nov. 16 and is generally available.
There has been a lot of interest in the cloud data warehouse market with the recent IPO of pure cloud data warehouse vendor Snowflake. While interest in cloud technologies is high, there's still a place for a high-performance data warehouse that can be deployed and managed on-premises and in a private cloud.
That is the niche Yellowbrick occupies, along with vendors such as Greenplum and Netezza that pioneered the niche more than a decade ago, according to Doug Henschen, an analyst at Constellation Research.
"I view the sweet spot as meeting the needs of those who want high scalability and performance at a competitive price point," he said.
Workload management updates
Henschen said the updated workload management features in Yellowbrick Release 5 are helpful, though he doesn't see them as revolutionary.
"Release 5 includes a long list of workload management upgrades, as well as support for user-defined functions," Henschen said. "These aren't groundbreaking capabilities, but they are healthy signs of product maturation and demanding customer requirements for Yellowbrick."
Henschen said workload management features are important for ensuring performance and balancing job priorities while managing costs. He added that the updated capabilities for user-defined functions will help support customer-specific requirements for data engineering and data science.
How self-healing clusters better workload management
Mark Cusack, CTO of Yellowbrick, noted that the self-healing feature in the update provides query fault tolerance within an individual Yellowbrick cluster.
Cusack explained that should a hardware component fail, or the resources needed to complete a query become unavailable and cause the query to fail, this failure is masked from the user. Instead, the query is automatically resubmitted with a changed set of resources assigned to it, if required.
"From the perspective of the end user, all they will see is a delay as the query is re-run, rather than a query failure in their BI [business intelligence] tool," Cusack said.
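The behavior Cusack describes can be pictured as a retry loop that hides the failure from the caller. The following is a minimal Python sketch of that general pattern, not Yellowbrick's implementation: `ResourceUnavailable`, `run_with_self_healing`, `flaky_execute` and the node names are all hypothetical, invented here for illustration.

```python
class ResourceUnavailable(Exception):
    """Hypothetical error: a hardware component or resource was lost mid-query."""


def run_with_self_healing(execute, query, resource_sets, max_attempts=3):
    """Run `query`, resubmitting it with a different resource set on failure.

    `execute(query, resources)` is a hypothetical stand-in for a cluster's
    query executor. Returns a (result, attempts_used) tuple.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        # Use the next resource set; reuse the last one if the list runs out.
        resources = resource_sets[min(attempt - 1, len(resource_sets) - 1)]
        try:
            return execute(query, resources), attempt
        except ResourceUnavailable as err:
            # The failure is masked from the caller; the query is resubmitted.
            last_error = err
    raise RuntimeError(f"query failed after {max_attempts} attempts") from last_error


def flaky_execute(query, resources):
    """Toy executor (an assumption): fails whenever 'node-2' is assigned."""
    if "node-2" in resources:
        raise ResourceUnavailable("node-2 is down")
    return f"rows for {query!r} from {sorted(resources)}"


result, attempts = run_with_self_healing(
    flaky_execute,
    "SELECT 1",
    resource_sets=[{"node-1", "node-2"}, {"node-1", "node-3"}],
)
print(result, attempts)
```

In this sketch the caller only ever receives a result. The first attempt fails on the simulated bad node, the second succeeds on a changed resource set, and the only visible effect is the extra time taken.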
Putting 'bad' queries in the penalty box
Named after the ice hockey area where penalized players must sit, the new penalty boxing feature in Yellowbrick Release 5 is another key part of the workload management update.
Cusack explained that when a query executes, it is assigned to a particular resource pool. These pools define limits such as the maximum number of concurrent queries, the amount of memory allocated, disk usage and priority. If a query is classified as a resource hog, it is moved to the penalty box, a pool with a much lower concurrency level and priority.
"The purpose is to provide administrators the ability to ensure high-priority short running queries continue to have system resources available, by penalizing longer running queries by moving them to alternative pools or penalty boxes," Cusack said.
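As a rough illustration of the idea, the Python sketch below models resource pools and a rule that moves an expensive query into a lower-priority pool. It is an assumption-laden toy, not Yellowbrick's rule syntax or API: the pool names, the thresholds and the `apply_penalty_rule` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePool:
    name: str
    max_concurrency: int  # maximum number of concurrent queries
    memory_mb: int        # memory allocated to the pool
    priority: int         # higher values are scheduled first


@dataclass
class RunningQuery:
    sql: str
    pool: ResourcePool
    elapsed_s: float = 0.0
    memory_mb: int = 0


# Hypothetical pools; the numbers are illustrative only.
DEFAULT_POOL = ResourcePool("default", max_concurrency=16, memory_mb=8192, priority=10)
PENALTY_BOX = ResourcePool("penalty_box", max_concurrency=2, memory_mb=2048, priority=1)


def apply_penalty_rule(query, max_elapsed_s=60.0, max_memory_mb=4096):
    """Move a resource-hog query to the penalty box. Returns True if it moved."""
    is_hog = query.elapsed_s > max_elapsed_s or query.memory_mb > max_memory_mb
    if is_hog and query.pool is not PENALTY_BOX:
        query.pool = PENALTY_BOX
        return True
    return False


short = RunningQuery("SELECT count(*) FROM orders", DEFAULT_POOL, elapsed_s=0.4, memory_mb=64)
long_running = RunningQuery("SELECT * FROM events", DEFAULT_POOL, elapsed_s=300.0, memory_mb=512)

moved = [apply_penalty_rule(q) for q in (short, long_running)]
print(short.pool.name, long_running.pool.name, moved)
```

The short query stays in its original pool, while the long-running one is reassigned to the pool with lower concurrency and priority, leaving resources free for high-priority short queries.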