Use network telemetry for improved IoT analytics
To date, IoT analytics has relied primarily on application instrumentation. This means the developer of the application inserts code that sends telemetry back to some kind of monitoring or analytics platform. These solutions are most often SaaS or live in the public cloud. These are great methods when you control the code and know what to instrument and how. Oftentimes, people don’t have this prior knowledge. Another approach has been to apply packet capture technologies to IoT. However, because so many IoT solutions rely on content delivery networks and public cloud, that approach leaves large visibility gaps and doesn’t work particularly well.
Some forward-thinking organizations have begun to use traffic data such as NetFlow, sFlow and IP Flow Information Export (IPFIX) to send back IoT information within a network flow. This has several advantages when used to capture IoT-specific data. First, the data is standardized into industry-accepted formats, which I will get into later. Second, once the data is captured from the gateway, it can be correlated with traffic data coming from the data center or cloud services in use. Today’s public cloud environments all have the ability to generate and export flow data, including the three examples listed below, which have been sorted by popularity.
- Amazon provides the Virtual Private Cloud (VPC) Flow Log service. The service exports network traffic summaries, such as traffic levels, ports, network communication and other network-specific data, across AWS services on user-defined VPCs to show how components communicate. The data is published in JavaScript Object Notation (JSON) to CloudWatch Logs or a Simple Storage Service (S3) bucket, or it can be fed to other services such as Kinesis. The data contains basic network data about the communication flow and is published every 10 to 15 minutes. Unfortunately, Amazon’s service is a bit behind the other major cloud providers.
- Microsoft Azure provides the Network Security Group Flow Logs. This service similarly publishes the logs in a JSON format to Azure storage. The one difference, which improves upon Amazon’s implementation, is that Microsoft publishes the data in real time, making it more useful operationally.
- Finally, Google is ahead of the pack on this data source. Google has created the VPC Flow Log service, which can be consumed by Stackdriver Logging. Google does everything the others do, but most importantly, it also embeds latency and performance data within the exported logs. The data is highly granular, which makes it more useful, but it also generates a lot of volume.
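To make the correlation idea concrete, here is a minimal sketch in Python of matching flows seen at the gateway against flows reported by a cloud flow log. It assumes both sources have already been parsed into dictionaries; the field names (`src_ip`, `dst_ip`, `src_port`, `dst_port`, `protocol`, `bytes`) and the `correlate` helper are hypothetical and would need to be mapped from whatever your exporter and cloud provider actually emit.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Five-tuple used to match one conversation seen at two vantage points."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IANA protocol number, e.g. 6 = TCP, 17 = UDP


def key_of(record: dict) -> FlowKey:
    # Hypothetical field names; map them from your real export format.
    return FlowKey(
        record["src_ip"],
        record["dst_ip"],
        int(record["src_port"]),
        int(record["dst_port"]),
        int(record["protocol"]),
    )


def correlate(gateway_flows: list, cloud_flows: list) -> dict:
    """Pair byte counts for flows that appear on both the gateway and cloud side."""
    cloud_bytes = defaultdict(int)
    for rec in cloud_flows:
        cloud_bytes[key_of(rec)] += int(rec["bytes"])

    matched = {}
    for rec in gateway_flows:
        k = key_of(rec)
        if k in cloud_bytes:
            entry = matched.setdefault(
                k, {"gateway_bytes": 0, "cloud_bytes": cloud_bytes[k]}
            )
            entry["gateway_bytes"] += int(rec["bytes"])
    return matched
```

Matching on the raw five-tuple only works when addresses and ports are the same at both vantage points. If the gateway sits behind network address translation, a real pipeline would need an additional mapping step before the keys line up.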
Tools for network-flow export
As you can see, there are many implementations. All of them provide a rich set of summarized data sets that are very useful for understanding how services interact, which services are most used and which applications consume network resources or answer requests. This data is valuable for countless operational and security use cases.
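As a rough illustration of the kind of question summarized flow data can answer, the sketch below totals bytes per value of a chosen field to find the most-used services or the busiest hosts. The record layout and the `top_by_bytes` helper are assumptions made for the example, not part of any provider’s API.

```python
from collections import Counter


def top_by_bytes(flows, field, n=3):
    """Return the n values of `field` that account for the most bytes."""
    totals = Counter()
    for rec in flows:
        totals[rec[field]] += rec["bytes"]
    return totals.most_common(n)


# Hypothetical, already-parsed flow summaries.
flows = [
    {"src_ip": "10.0.0.5", "dst_port": 8883, "bytes": 1200},
    {"src_ip": "10.0.0.6", "dst_port": 443, "bytes": 800},
    {"src_ip": "10.0.0.5", "dst_port": 8883, "bytes": 300},
]

print(top_by_bytes(flows, "dst_port", n=2))  # [(8883, 1500), (443, 800)]
```

Grouping by `src_ip` instead of `dst_port` answers the "which devices consume network resources" question with the same function.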
If you are implementing on a smaller device and want to collect data from the gateway or the IoT device itself, there are lightweight network flow-export tools that can provide a lot of additional context on the network traffic generated by the hardware. These agents can sit on Windows or Linux systems. Many of them will run on embedded Linux devices as well. Here are some options:
- nProbe has been around for a long time, and hence is very mature and heavily used. The company behind it has been tuning and expanding its capabilities for over a decade. nProbe was once free and now costs money, but in return it can classify over 250 types of applications using deep packet inspection. These application types and latency information are embedded in the exported flow, which makes the flow more valuable. The solution can operate in both packet capture mode and PF_RING mode to reduce the overhead on the operating system.
- kProbe is a Kentik product that does what nProbe does: it converts packet data from the network card to NetFlow or kFlow. While it doesn’t have as many application decodes, it’s free to use and highly efficient.
- SoftFlowd is a great open-source project, but it hasn’t had many updates recently. Similar to the solutions above, this small open-source agent converts packet data to flow data. It has been tuned over many years and is highly efficient. It does some application classification, but far less than the commercial options.
- NDSAD is a host-based agent that captures traffic from the interfaces and exports it as NetFlow v5. It also supports more advanced capture methods for lower-latency capture from the network card. This project doesn’t perform application classification, so the exported NetFlow is less rich.
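Several of these agents export NetFlow v5, which has a fixed binary layout: a 24-byte header followed by 48-byte flow records. The following is a minimal sketch of decoding one such datagram with Python’s standard library. The function name `parse_netflow_v5` and the output dictionary keys are my own choices for illustration, and only a few of the record’s fields are kept.

```python
import ipaddress
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes).
HEADER = struct.Struct("!HHIIIIBBH")

# NetFlow v5 record: srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets,
# first, last, srcport, dstport, pad, tcp_flags, prot, tos, src_as, dst_as,
# src_mask, dst_mask, pad (48 bytes).
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")


def parse_netflow_v5(datagram: bytes) -> list:
    """Decode the flow records in a single NetFlow v5 export datagram."""
    version, count, *_rest = HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"expected NetFlow v5, got version {version}")

    flows = []
    for i in range(count):
        offset = HEADER.size + i * RECORD.size
        (src, dst, _nexthop, _in_if, _out_if, packets, octets, _first, _last,
         src_port, dst_port, _tcp_flags, protocol, _tos,
         _src_as, _dst_as, _src_mask, _dst_mask) = RECORD.unpack_from(datagram, offset)
        flows.append({
            "src_ip": str(ipaddress.IPv4Address(src)),
            "dst_ip": str(ipaddress.IPv4Address(dst)),
            "src_port": src_port,
            "dst_port": dst_port,
            "protocol": protocol,
            "packets": packets,
            "bytes": octets,
        })
    return flows
```

In practice the datagrams would arrive over UDP on whatever port the exporter is configured to send to, and a collector would call a function like this once per received packet.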
Analyze flow data with these tools
Once these products are in place, there are many tools to analyze their output. Tracing tools on the software side lock you into a specific implementation because of protocol differences. The network data sources, by contrast, are standardized. This is the case for NetFlow, Simple Network Management Protocol (SNMP) and streaming telemetry, though streaming telemetry has fewer standards than the other two.
While each vendor that makes network devices has its own analytics and management platform, those platforms don’t support other vendors’ devices. Most environments are highly variable, with many vendors and open-source components deployed. Each device has a different format for NetFlow, but this is handled by Flexible NetFlow templates and IPFIX. SNMP is handled via management information bases. Streaming telemetry is a new data type, but it lacks data taxonomy standards, which is a step back from SNMP. Tools that ingest any of this network data will normalize it so the user doesn’t need to do that work. That means that even if you are using specific vendor implementations, you can avoid lock-in with these data sources, because access to the data is standard once it’s in a network-based analytics tool that isn’t made by a device vendor.
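The normalization step these tools perform can be pictured as mapping each source’s field names onto one shared record type. The sketch below is only an illustration of that idea. The `Flow` class and both converter functions are hypothetical, the NetFlow keys follow the v5 field names (`srcaddr`, `dstaddr`, `srcport`, `dstport`, `prot`, `dOctets`), and the cloud log keys are invented for the example rather than taken from any provider’s schema.

```python
from dataclasses import dataclass

# Small lookup for sources that report the protocol by name.
PROTOCOL_NUMBERS = {"ICMP": 1, "TCP": 6, "UDP": 17}


@dataclass(frozen=True)
class Flow:
    """One normalized flow record, independent of where it came from."""
    source: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    octets: int


def from_netflow_v5(rec: dict) -> Flow:
    # Keys follow NetFlow v5 field names.
    return Flow("netflow", rec["srcaddr"], rec["dstaddr"],
                rec["srcport"], rec["dstport"], rec["prot"], rec["dOctets"])


def from_cloud_log(rec: dict) -> Flow:
    # Hypothetical JSON keys; real cloud flow logs name these differently.
    return Flow("cloud", rec["sourceAddress"], rec["destinationAddress"],
                int(rec["sourcePort"]), int(rec["destinationPort"]),
                PROTOCOL_NUMBERS[rec["protocolName"].upper()],
                int(rec["byteCount"]))
```

Once every source is converted to the same record type, downstream queries and dashboards no longer need to know which vendor or cloud produced a given flow.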
Aside from the vendor tools, there are popular third-party tools, such as Kentik, as well as open-source options. Most of them can handle NetFlow and other network data, but few can also handle the cloud-based flow log data. In IoT, scale is an important consideration, and it causes problems for many of the older tools built on traditional databases. Common commercial tools to analyze flow data include those built by SolarWinds, ManageEngine, Plixer, Paessler and Kentik. I will highlight a few open-source analytics products that have been actively maintained within the last five years.
- ntopng was designed by the same folks who made nProbe and Ntop. This product can take in flow or packet data and presents similar visualizations in a nice web-based user interface. This tool has been around for a long time and works great. However, it isn’t meant as a scalable analytics platform beyond a small number of low-volume hosts. It’s still a useful tool for those managing networks. It’s also suitable for those looking to gather data about what’s happening on the network and which devices are speaking to one another.
- Cflowd is a project by the Center for Applied Internet Data Analysis, which is a non-profit focused on gathering and analyzing data on the internet. This project is a good foundation for building a DIY analytics solution and is still maintained.
- sflowtool captures sFlow coming from various sources and can output text or binary data, which can be used to feed data into another tool. It can also convert the incoming data to NetFlow v5, which can be forwarded elsewhere. sFlow is a great data source, but not the most common. Juniper, for example, generates it from many of its devices.
As you can see, many of these analytics tools are not full-featured. More often than not, an organization that wants a free or open-source analytics solution ends up using Elasticsearch, Logstash and Kibana (the Elastic Stack), which in turn runs into scalability issues when dealing with network data. This trend will progress quickly as the cloud creates unique requirements and constraints for organizations moving in that direction. We should see a lot more IoT projects using network data, as it’s a highly flexible and well-understood data source.
All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.