Living on the edge: Why IoT demands a new approach to data
Thanks to increasingly powerful devices and our hunger for more and more data, the internet of things is the latest and greatest innovation to capture our collective attention. Citing the incredible value presented by IoT data, organizations across a variety of industries — manufacturing, retail and medical, to name a few — are or will soon be investing in IoT initiatives. In fact, a recent 451 Research Voice of the Enterprise poll found that 65.6% of respondents plan to increase their IoT spending this year.
These IoT investments will have a tremendous impact on data architecture. According to Gartner principal research analyst Santhosh Rao, “Currently, around 10% of enterprise-generated data is created and processed outside a traditional centralized data center or cloud. By 2022, Gartner predicts this figure will reach 50%.” Where in the past IoT has been an “edge case” (pardon the pun), it is quickly going mainstream, with significant IT and business implications.
One of the byproducts of IoT is the need for “edge analytics,” or computation that occurs on the IoT device itself rather than within a data center or cloud computing platform.
Making the case for edge analytics
There are three factors driving edge analytics as opposed to traditional data center processing:
- IoT apps are often bidirectional;
- For some things, latency means death; and
- Edge analytics removes the latency.
Self-driving cars are a great example of these factors. Edge data collection occurs when sensors in cars transmit data that is aggregated and processed in the data center or cloud to determine how well the navigation system is working. After analyzing the data, the system manufacturer can adjust the navigation rule set to improve the safety, reliability or performance of the system, and push that new rule set out to the entire population of cars.
But sometimes immediate action must be taken on collected IoT data, and the time lag of routing data to the cloud or a data center can be disastrous. Take, for example, a mining company that operates expensive underground equipment. With IoT sensors on drilling equipment, data can be transmitted back to the data center or centralized cloud to monitor vibration patterns and engine hiccups that don’t need to be addressed immediately. But what about sensor readings that demand immediate action, such as those signaling a potential explosion or mine collapse? Waiting for the cloud or data center to determine there is a safety risk could lead to disaster. Edge analytics can reach a conclusion about the safety issue and initiate the necessary action immediately.
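To make the split between “act locally” and “send upstream” concrete, here is a minimal Python sketch of the kind of rule an edge node might run against drill sensor readings. The thresholds, field names and callback functions are illustrative assumptions, not details from any real deployment:

```python
# Minimal sketch of local edge evaluation for the mining example.
# Thresholds, sensor fields and the shutdown hook are assumptions for
# illustration, not values from a real system.

VIBRATION_CRITICAL_G = 4.0   # assumed critical vibration level (g)
METHANE_CRITICAL_PCT = 1.5   # assumed critical methane concentration (%)

def evaluate_reading(reading: dict) -> str:
    """Decide locally whether a reading needs immediate action."""
    if reading.get("vibration_g", 0.0) >= VIBRATION_CRITICAL_G:
        return "shutdown"            # act now, don't wait for the cloud
    if reading.get("methane_pct", 0.0) >= METHANE_CRITICAL_PCT:
        return "shutdown"
    return "forward"                 # routine data: send upstream for trend analysis

def handle(reading: dict, shutdown_equipment, forward_to_cloud) -> None:
    action = evaluate_reading(reading)
    if action == "shutdown":
        shutdown_equipment(reading)  # local actuation with millisecond latency
    else:
        forward_to_cloud(reading)    # non-urgent telemetry for central analysis

# Example (hypothetical callbacks):
# handle({"vibration_g": 5.2, "methane_pct": 0.3}, stop_drill, send_to_datacenter)
```

The design point is simply that the decision loop never leaves the device; only the routine telemetry takes the slow path to the data center.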
Faced with scenarios like this — as well as security and compliance issues in which decisions must be made in real time — organizations are beginning to look for ways to analyze and/or act on data at the edge.
Architecting for edge data
Historically, to move data from where it is created to where it is stored and analyzed, organizations have opted for some sort of do-it-yourself coding or data integration software. But unlike more traditional workloads, such as transactions in a database, IoT edge data analysis is a complex process requiring substantial orchestration and live operational visibility into performance, which is difficult to achieve with traditional approaches.
Furthermore, hand-coding requires scarce, specialized skills, and you don’t want those skilled people tied down indefinitely maintaining data pipelines, fixing or updating them whenever data drift (unexpected changes to data structure or semantics) occurs or requirements change. In short, the hard and soft costs of hand-coding become prohibitive in such a dynamic, real-time IoT environment.
Fortunately, new software tools offer a simpler and future-proof alternative to hand-coding IoT pipelines, and they hold many advantages. First, these tools combine design, testing, deployment and operation into a single process, which some vendors are calling DataOps or DevOps for data. This shortens initial project time and streamlines the pipeline maintenance process. In addition, once hand-coding is eliminated, time and money are saved, as programming errors become much less likely.
Second, tooling that provides instrumentation gives an organization greater control over the operational management of IoT pipelines. Throughput, latency and error rates can be monitored live to ensure analytics is based on complete and current data sets. Hand-coded data ingestion pipelines, by contrast, are usually opaque, offering little, if any, visibility into a pipeline’s performance.
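As a rough illustration of what “live visibility” means in practice, the sketch below shows the sort of throughput, latency and error-rate counters a pipeline, hand-coded or otherwise, would need to expose. The class and metric names are assumptions for illustration, not the API of any particular product:

```python
# Minimal sketch of the kind of instrumentation a pipeline tool surfaces.
# Metric names and the reporting window are illustrative assumptions.
import time

class PipelineMetrics:
    def __init__(self):
        self.records_in = 0
        self.records_out = 0
        self.errors = 0
        self.latencies = []          # seconds per successfully processed record

    def record(self, started_at: float, ok: bool) -> None:
        """Call once per record with its start time and success flag."""
        self.records_in += 1
        if ok:
            self.records_out += 1
            self.latencies.append(time.monotonic() - started_at)
        else:
            self.errors += 1

    def snapshot(self, window_s: float) -> dict:
        """Summarize the last reporting window for a dashboard or alert."""
        avg_latency = sum(self.latencies) / len(self.latencies) if self.latencies else 0.0
        return {
            "throughput_per_s": self.records_out / window_s,
            "error_rate": self.errors / self.records_in if self.records_in else 0.0,
            "avg_latency_s": round(avg_latency, 4),
        }
```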
Third, security and data privacy must be baked into data movement — it can’t be an afterthought. Tools often come with integrated security that makes it easier for organizations to comply with regulations like HIPAA and GDPR, which punish misuse of sensitive data with heavy penalties. While mining equipment probably doesn’t emit personal data, medical equipment, video cameras and smartphones do. It is best to detect sensitive data at the point it is gathered so it can be properly obfuscated or quarantined.
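One way to picture “detect at the point it is gathered” is a small scrubbing step that runs on the device before a record ever leaves it. The sketch below is a minimal example under assumed field names and a simple hashing scheme; it is not a compliance recipe for HIPAA or GDPR:

```python
# Minimal sketch of detecting and obfuscating sensitive fields at the edge
# before records leave the device. The field list, regex and hashing scheme
# are illustrative assumptions.
import hashlib
import re

SENSITIVE_FIELDS = {"patient_id", "email", "phone"}   # assumed schema
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def obfuscate(value: str) -> str:
    """Replace a sensitive value with a truncated one-way hash."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

def scrub(record: dict) -> dict:
    """Return a copy of the record with sensitive values obfuscated."""
    clean = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS or (isinstance(value, str) and EMAIL_RE.search(value)):
            clean[key] = obfuscate(str(value))
        else:
            clean[key] = value
    return clean

# scrub({"device": "cam-7", "email": "jane@example.com"})
# -> {"device": "cam-7", "email": "<16-character hash>"}
```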
Lastly, since edge analytics is part and parcel of many IoT projects, data pipelines should be able to launch native or third-party processing on the edge device when needed. An example of an edge analysis would be to monitor a camera or sensor for changes locally rather than send back fairly static data to the data center. Another common case is to aggregate data into summary statistics rather than send every single reading over the network. A third is to analyze for signs of transaction fraud at a point-of-sale terminal or ATM so that a suspect transaction can be blocked before it completes.
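As a concrete illustration of the aggregation case, the sketch below collapses a window of raw sensor readings into a single summary record before anything crosses the network. The window length and field names are assumptions for illustration:

```python
# Minimal sketch of edge-side aggregation: one summary record per window
# instead of every raw reading. Window length and fields are assumed.
from statistics import mean

def summarize(readings: list[float], window_start: float, window_s: float = 60.0) -> dict:
    """Collapse one window of raw readings into a single summary record."""
    return {
        "window_start": window_start,
        "window_s": window_s,
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 3),
    }

# Instead of sending hundreds of temperature samples per minute, the edge
# node sends one record, e.g.:
# summarize([21.4, 21.5, 21.7], window_start=1700000000.0)
```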
IoT presents an incredible opportunity for data generation that can be used to great benefit for organizations today. As the volume of data grows and use cases for edge analytics become more diverse, organizations’ technologies for managing this data must evolve as DIY systems give way to more sophisticated DataOps tooling.