Data Management & Analytics

  • Real-world SLAs and Availability Requirements

    ESG conducted a comprehensive online survey of IT professionals from private- and public-sector organizations in North America (United States and Canada) between March 20, 2020 and March 28, 2020. To qualify for this survey, respondents were required to be IT professionals responsible for data protection technology decisions, including those in place to ensure application SLAs are met.

    This Master Survey Results presentation focuses on real-world SLAs and availability requirements, including tolerance for downtime, downtime metrics, and actual data loss, all against the backdrop of the availability technologies and methods in use.

    (more…)

  • In this new edition of Data Protection Conversations, I speak to Molly Presley, Founder of the Active Archive Alliance.

  • A couple of weeks back, Google Cloud’s multi-week virtual event Next 20: OnAir started. There were a number of announcements, but the biggest was BigQuery Omni. By combining BigQuery and Anthos, BigQuery Omni enables organizations to embrace multi-cloud analytics by cost-effectively bringing Google Cloud’s data warehouse to where the data resides across public cloud environments.

    (more…)

  • Commvault Reaches A Milestone

    Execution. On many fronts.

    This is really what came to mind in the first few minutes of the Commvault FutureReady online event yesterday (7/21).

    First, the event itself was excellent: length, content, delivery, use of customers, and, from what I could tell, primarily live sessions. The platform worked without a hitch. To net it out: as a participant, you got engaging content and genuine engagement through the ability to interact with the speakers. Great job! Great execution.

    (more…)

  • Commvault Metallic: Pedal to the Metal

    I wrote some months ago that under Sanjay Mirchandani, Commvault was not just changing, it had already changed. Fast forward a few months and we now find ourselves in a totally different world. But one thing that hasn’t changed and stands out is the continued execution and focus of the Commvault team, even through these confusing times.

    (more…)

  • In this episode of Data Protection Conversations, I catch up with Justin Augat from iLand.

    First came the traditional enterprise data warehouse (EDW). Structured data is integrated into an EDW from external data sources using ETLs (check out my recent blog post on this). The data can then be queried by end-users for BI and reporting. EDWs were purpose-built for BI and reporting. But with the growing desire to incorporate more data, of different types, from different sources, and with different change rates, the traditional EDW has fallen short. It does not support unstructured data (e.g., video, audio, unstructured text), streaming is for the most part out of the question, no data science or machine learning can be done directly on the data, and because of their closed/proprietary nature, costs quickly skyrocket as organizations scale their deployments. Modern, cloud-based EDWs have looked to address several of these challenges and have done a good job of it, but some challenges remain, the most obvious being the lack of unstructured data support.
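
    For readers less familiar with the ETL step described above, here is a minimal, hypothetical sketch of the pattern (table, column, and database names are invented, and SQLite stands in for a real EDW): structured rows are extracted from a source system, transformed in code, and loaded into a warehouse table that BI and reporting queries can hit.

    ```python
    # Hypothetical ETL sketch: extract structured rows, transform them in code,
    # then load them into a warehouse table used for BI/reporting.
    import sqlite3  # SQLite stands in for a real enterprise data warehouse

    def extract(source):
        # Pull raw order rows from the operational source system
        return source.execute("SELECT id, amount, region FROM orders").fetchall()

    def transform(rows):
        # Normalize region codes and drop incomplete records
        return [(rid, amt, region.upper()) for (rid, amt, region) in rows if amt is not None]

    def load(warehouse, rows):
        # Append the cleaned rows into the warehouse fact table
        warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
        warehouse.commit()

    if __name__ == "__main__":
        source = sqlite3.connect(":memory:")
        source.execute("CREATE TABLE orders (id INTEGER, amount REAL, region TEXT)")
        source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                           [(1, 19.99, "east"), (2, None, "west"), (3, 7.50, "east")])
        warehouse = sqlite3.connect(":memory:")
        warehouse.execute("CREATE TABLE fact_orders (id INTEGER, amount REAL, region TEXT)")
        load(warehouse, transform(extract(source)))
        print(warehouse.execute("SELECT * FROM fact_orders").fetchall())
    ```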

    (more…)

  • Data Protection and COVID-19: Time To Double Down!

    The recent research ESG conducted on the impact of COVID-19 on knowledge workers and on IT spending intentions revealed a few interesting findings that directly affect data protection. Backup and recovery is hot! While some organizations are cutting their IT budgets, not all of them are, and some specific technologies are actually faring better than others. Among the top 3 technologies least likely to be reduced: cybersecurity, remote/telework solutions…and data protection! 26% of our IT executive respondents said their data protection budget is actually going to increase, and 54% will keep it steady. Cloud technologies and services fare very well across the board, as one could have guessed in the current climate.

    End-users also reported suffering from an intensification of cyber-attacks, making remediation strategies (backup and recovery) even more relevant.

    For end-users, this means that, more than ever, you should inspect your current backup and recovery infrastructure and its SLAs, and test its capabilities. Many new options for remote management now exist, and with many workloads migrating to the cloud, it may be time to revisit how you are protecting those data assets.

    What this means for vendors of backup and recovery solutions, especially cloud-focused offerings, is that it’s a great time to double down on your marketing efforts and investments. People are listening. They want to spend more, and in many cases need to modernize.

    Never a dull moment in this market! 

  • VeeamON Goes Digital

    VeeamON used to have 2,000 attendees. That was in the “old” world from a few months ago. The “new normal” has made the company pivot its popular event to digital, as is happening around the industry, with more or less success and execution prowess.

    Veeam nailed it.

    One interesting thing happened: it “democratized” the event, making it available to 20 times more people than the physical event could accommodate. So out of a “bad” thing, a good thing happened. Veeam will also hold localized versions of the event in various parts of the world, such as Europe, over the next few weeks. However, it was pretty clear that many did not want to wait and joined the event despite the time difference.

    (more…)

    Data integration is hard. Over the years, of all the technologies and processes that are part of an organization’s analytics stack/lifecycle, data integration has consistently been cited as a challenge. In fact, according to recent ESG research, more than 1 in 3 (36%) organizations say data integration is one of their top challenges with data analytics processes and technologies. The data silo problem is very real, but it’s about so much more than having data in a bunch of locations and needing to consolidate. It’s becoming more about the need to merge data of different types and change rates; the need to leverage metadata to understand where the data came from, who owns it, and how it’s relevant to the business; the need to properly govern data as more folks ask for access; and the need to ensure trust in data, because if there isn’t trust in the data, how can you trust the outcomes derived from it?

    Whether ETL or ELT, the underlying story is the same. At some point, you need to extract data from its source, transform it based on the destination tool and/or merging data set, and then load it into the destination tool, whether that be something like a data warehouse or data lake for analysis. While we won’t get into the pros and cons of ETL or ELT, the ETL process is still prevalent today. And this is due in part to the mature list of incumbents in the ETL space, like Oracle, IBM, SAP, SAS, Microsoft, and Informatica. These are proven vendors that have been in the market for multiple decades and continue to serve many of the largest businesses on the planet. There are also several new(ish) vendors looking to transform the data integration market. Companies like Google (via the Alooma acquisition), Salesforce (via MuleSoft), and Qlik (via the Attunity acquisition), along with newer entrants like Matillion, all have growing customer bases that are embracing speed, simplicity, automation, and self-service.
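
    To make the ETL-versus-ELT distinction concrete, here is a minimal, hypothetical ELT-style sketch (again with invented table names and SQLite standing in for the destination warehouse): the raw extract is loaded first, as-is, and the transformation runs inside the destination with SQL.

    ```python
    # Hypothetical ELT sketch: land raw rows in a staging table first,
    # then transform with SQL inside the destination warehouse.
    import sqlite3

    warehouse = sqlite3.connect(":memory:")

    # Load: copy the raw source extract into a staging table unchanged
    warehouse.execute("CREATE TABLE stg_orders (id INTEGER, amount REAL, region TEXT)")
    warehouse.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)",
                          [(1, 19.99, "east"), (2, None, "west"), (3, 7.50, "east")])

    # Transform: the cleanup happens in the warehouse, not before loading
    warehouse.execute("""
        CREATE TABLE fact_orders AS
        SELECT id, amount, UPPER(region) AS region
        FROM stg_orders
        WHERE amount IS NOT NULL
    """)

    print(warehouse.execute("SELECT * FROM fact_orders").fetchall())
    ```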

    Now, whatever your approach to addressing data integration, I keep hearing the same things from customers: “Vendor X is missing a feature” or “I wish I could…” or “I can’t get buy-in to try a new solution because the technology isn’t mature” or “that sounds great, but it’s a lot of work and we’re set in our ways” or “I’m just going to keep using Vendor Y because it’s too disruptive to change.” And every time I hear these common responses, I ask the same follow-up question: what’s your ideal tool? Everyone wants to ensure the technology is secure, reliable, scalable, performant, and cost-effective, but I wanted to understand the more pointed wants of the actual folks who are struggling with data integration challenges day in and day out.

    Without further ado, I present to you the top list of “wants” when it comes to an ideal data integration tool/product/solution/technology:

    1. Container-based architecture – Flexibility, portability, and agility are king. As organizations are transforming, becoming more data-driven, and evolving their operating environments, containers enable consistency in modern software environments as organizations embrace microservice-based application platforms.
    2. GUI and code – Embrace the diversity of personas that will want access to data. A common way I’ve seen organizations look at this is that (generally speaking) the GUI is for the generalists and the code behind is for the experts/tinkerers. By the way, this mentality is evolving as modern tools are looking to help the generalists and experts alike with more automation via no-code/low-code environments and drag-and-drop workflow interfaces.
    3. Mass working sets – Common logic or semantic layers are desired. The last thing an engineer or analyst wants to do is write unique code for each individual table. It doesn’t scale and becomes a nightmare to maintain (a minimal sketch of this idea follows the list).
    4. Historic and streaming – Supporting batch and ad hoc processing on both historic and streaming data will ensure relevant outcomes. Organizations increasingly want hooks to better meet the real-time needs of the business, and that means real-time availability of and access to relevant data without having to jump through hoops.
    5. Source control with branching and merging – Code changes over time. Ensure source control is in place to understand how and why code has changed. Going hand in hand with source control is the ability to support branching and/or merging of code to address new use cases, new data sources, or new APIs.
    6. Automatic operationalization – This is focused on the DevOps groups. Ensure new workflows can easily go from source control to dev/test or production. Deployment is the first priority, but do not lose sight of management and the iterative nature of data integration processes as users, third-party applications, and data changes/evolves. 
    7. Third-party integrations and APIs – The analytics space is massive and fragmented. The more integrations with processing engines, BI platforms, visualization tools, etc., the better. And ensure the future of the business is covered, too. That means incorporating more advanced technology that feeds data science teams, like AI and ML platforms and services.
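
    As a small illustration of item 3, here is a hypothetical, configuration-driven loader: one piece of shared logic applied across many tables instead of unique hand-written code per table. The table names and queries are invented, and SQLite again stands in for a real warehouse.

    ```python
    # Hypothetical sketch of a common logic layer: one parameterized loader
    # driven by configuration, reused across tables, instead of per-table code.
    import sqlite3

    TABLE_QUERIES = {
        "dim_customers": "SELECT id, UPPER(name) AS name FROM raw_customers",
        "fact_orders":   "SELECT id, amount FROM raw_orders WHERE amount IS NOT NULL",
    }

    def build_table(conn, target, query):
        # Shared logic: materialize each target table from its configured query
        conn.execute(f"DROP TABLE IF EXISTS {target}")
        conn.execute(f"CREATE TABLE {target} AS {query}")

    if __name__ == "__main__":
        conn = sqlite3.connect(":memory:")
        # Raw tables standing in for already-landed source data
        conn.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT)")
        conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO raw_customers VALUES (?, ?)", [(1, "acme"), (2, "globex")])
        conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", [(1, 10.0), (2, None)])
        for target, query in TABLE_QUERIES.items():
            build_table(conn, target, query)
        print(conn.execute("SELECT * FROM dim_customers").fetchall())
    ```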

    While this list is by no means complete or all-encompassing, it speaks to where the market is headed. Take it from the data engineers and data architects: they’re still primarily ETLing and ELTing their lives away, but they want change and recognize there are opportunities for vast improvement. And marginal improvement without massive disruption is the preferred approach. So a note for the vendors: it’s about meeting customers where they are today and minimizing risk as they continue on their data transformation journeys.

  • The recent announcement from IBM that it is withdrawing from all research, development, and offerings of facial recognition will not stop facial recognition from being used by law enforcement or government entities. There. I said it. Facial recognition will continue on its gray-area trajectory with or without IBM. But what IBM, and specifically Arvind Krishna, has done is bring attention to a growing concern that needs far more national and global attention. The use of facial recognition needs to be scrutinized for bias and privacy concerns. It needs oversight. It needs guardrails. Usage, especially by law enforcement and governing entities, needs to be transparent. And frankly, the technology needs to be better for it to work the way people envision.

    (more…)

  • Over the last several months, automation has seen a jump in interest. Operational efficiency has been a top priority for years, but as of late, it’s an even greater priority. For businesses, tasks or processes that used to be viewed as manageable but inefficient are now being scrutinized. The inefficiency aspect is being amplified, and organizations have no choice but to act. And one of those actions is to look into a trendy buzzword that is proving to be so much more: robotic process automation (RPA).

    So first off, what is it? RPA uses software and advanced technology like AI and ML to automate repetitive processes that are traditionally performed by a human. A software bot is configured to mimic structured actions to rapidly interact and transmit data based on established business workflows.

    We, as humans, require breaks. We have defined working hours. And whether we want to admit it or not, we’re predisposed to make a mistake here or there. This is especially true when executing a repetitive task over and over again. All it takes is a fat-finger typo entering information from a submitted form to create a ripple effect that could be catastrophic to a business or a customer.
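
    To make that a bit more tangible, below is a deliberately simple, hypothetical sketch of the pattern: a small “bot” that applies the same structured steps to every submitted form record instead of a person re-keying each one. The form fields, validation rule, and target-system function are all invented for illustration; a real deployment would drive an actual application or API.

    ```python
    # Hypothetical RPA-style sketch: replay the same structured steps for every
    # submitted form record (fields, rules, and target system are invented).
    import csv, io

    SUBMITTED_FORMS = io.StringIO(
        "name,account,amount\n"
        "Alice,AC-100,250.00\n"
        "Bob,AC-200,99.50\n"
    )

    def validate(record):
        # The same checks applied identically to every record -- no fat-finger typos
        return record["account"].startswith("AC-") and float(record["amount"]) > 0

    def enter_into_target_system(record):
        # Stand-in for the real workflow step (e.g., an API call or UI action)
        print(f"Entered {record['name']}: {record['amount']} into {record['account']}")

    for record in csv.DictReader(SUBMITTED_FORMS):
        if validate(record):
            enter_into_target_system(record)
    ```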

    While RPA bots don’t sleep, don’t stop, and (when programmed properly) don’t make mistakes, it’s easy to get lost in the potential of RPA. It’s important to not lose sight of what RPA is and isn’t. RPA is not a physical robot. It doesn’t think freely. It doesn’t have cognitive abilities. RPA does enable predictability and reliability in the time it takes to complete a task or execute a workflow from beginning to end. RPA does save humans countless hours completing mundane tasks, enabling them to focus on more important tasks and projects. RPA does improve operational and business process efficiency.

    So where are organizations today in their adoption of RPA? Enterprise Strategy Group research shows that nearly one-third of respondents report their organization currently utilizes bots in production environments to help automate tasks. But what is interesting is that when looking at this data based on level of digital transformation maturity, it shines a spotlight on the continued separation of the more digitally transformed businesses from their less digitally transformed peers. ESG recently published a research brief that highlights some of these key findings. It can be found here.

    For more RPA info, stay tuned over the coming weeks as I’ll be doing a double-click on the RPA market, as well as highlighting best practices on how to get started.