Bring AI to the edge with AWS Greengrass ML Inference
Learn the basics of machine learning inference with AWS Greengrass and how it can help tune models and increase the value of data with real-time insights.
Data scientists use machine learning to make decisions based on a virtual flood of data, but they must consider two distinct parts of the process: training and inference.
Training tells a machine learning model how to deal with existing data to provide the correct response to a given pattern or event. But training depends on hindsight, and the machine can only make the same decisions when a data pattern recurs. The real value of AI is in inference, which helps the machine learning model determine how to handle events and data patterns it has not previously encountered.
At the edge, AWS Greengrass Machine Learning (ML) Inference can help users make real-time decisions on connected devices and quickly tune models in the cloud with the results.
AWS Greengrass ML Inference basics
AWS Greengrass supports IoT deployments through compute, messaging, caching, synchronization and machine learning capabilities for devices connected to AWS. The addition of inference capabilities brings machine learning to the connected device level.
In a typical machine learning configuration, a data scientist creates, trains and transfers a model to a fleet of connected devices, which then use that model to make decisions about the data they encounter. At the same time, data is sent back to a central location and used to tune and refine the model, which is periodically redeployed to each connected device.
Inference essentially examines how variables can influence a response and can predict or draw conclusions via reasoning rather than explicit, predetermined statements or relationships. For example, training can teach a model to estimate the cost of a home based on a multitude of characteristics, and inference can then predict price variations based on additional or untrained data, such as the presence of additional rooms, a finished basement, additional yard space and so on.
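The split between training and inference in the home price example can be sketched in a few lines of Python. This is a minimal illustration, not how Greengrass or SageMaker builds models: it assumes a single characteristic (floor area), a straight-line relationship and made-up prices, and the function names train and infer are hypothetical.

```python
from statistics import mean

def train(sizes, prices):
    """Training: fit price = slope * size + intercept by ordinary least squares."""
    mean_size, mean_price = mean(sizes), mean(prices)
    covariance = sum((x - mean_size) * (y - mean_price) for x, y in zip(sizes, prices))
    variance = sum((x - mean_size) ** 2 for x in sizes)
    slope = covariance / variance
    intercept = mean_price - slope * mean_size
    return slope, intercept

def infer(model, size):
    """Inference: apply the fitted model to a home it was not trained on."""
    slope, intercept = model
    return slope * size + intercept

# Made-up training data (floor area in square feet, sale price in dollars)
sizes = [1000, 1500, 2000, 2500]
prices = [200_000, 275_000, 350_000, 425_000]

model = train(sizes, prices)
estimate = infer(model, 1800)  # 1800 sq ft never appeared in the training data
print(f"Estimated price: {estimate:,.0f}")
```

Training runs once over historical data, which is the expensive step. Inference is a single cheap calculation, which is why it is practical to run on a small connected device.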
The inference feature enables each AWS Greengrass connected device to reach real-time conclusions. Those decisions are then shared with the AWS Greengrass service, which can effectively tune the machine learning model in Amazon SageMaker. The tuned model can be updated on connected devices later.
AWS Greengrass ML Inference benefits
The major cloud providers all offer machine learning inference capabilities on their platforms, but AWS appears to be the only one that offers inference at the edge.
Inference is more efficient at the device level because it typically requires less bandwidth and compute power than full-scale data transfers and retraining. As a result, devices that run inference locally can respond faster and more reliably than devices that wait on the cloud. Greengrass ML Inference can also use the GPUs available on the endpoint device for additional computational assistance.
Greengrass software and the machine learning model enhance system reliability because they run on each Greengrass-enabled endpoint device, along with AWS Lambda functions. Greengrass software also supports messaging between endpoint devices on the local network, so devices can share data and events without cloud connectivity.
Greengrass -- though not Greengrass ML Inference specifically -- supports spotty connectivity through AWS IoT Device Shadow, which caches the state of each device and synchronizes its state with the cloud when connectivity is restored.
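The cache-then-synchronize behavior described here can be pictured with a small Python sketch. It is a conceptual stand-in only: the LocalShadow class, its methods and the plain dictionary that plays the cloud are all hypothetical, and none of this is the AWS IoT Device Shadow API.

```python
class LocalShadow:
    """Hypothetical stand-in for a device shadow cache; not the AWS IoT API."""

    def __init__(self):
        self.state = {}    # last known device state
        self.pending = {}  # changes the cloud has not received yet

    def update(self, **changes):
        """Record a state change locally, whether or not the cloud is reachable."""
        self.state.update(changes)
        self.pending.update(changes)

    def sync(self, cloud_state, connected):
        """Push pending changes to the cloud copy; return True if a sync happened."""
        if not connected:
            return False
        cloud_state.update(self.pending)
        self.pending.clear()
        return True

cloud = {}  # assumption: a dict stands in for the cloud-side copy of the state
shadow = LocalShadow()

shadow.update(temperature=21.5)
shadow.sync(cloud, connected=False)  # offline: the change stays cached on the device

shadow.update(temperature=22.0, fan="on")
shadow.sync(cloud, connected=True)   # back online: the latest state reaches the cloud
print(cloud)
```

Only the most recent value of each field is sent after the connection returns, which matches the idea of synchronizing state rather than replaying every intermediate reading.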
AWS Greengrass ML Inference supports a variety of deep learning frameworks and endpoint devices, including predefined packages for TensorFlow, Apache MXNet and Chainer. It also supports Caffe2 and Microsoft Cognitive Toolkit. Amazon SageMaker can create and train machine learning models, though data scientists may create models in any tool and store pretrained models in Amazon S3 buckets for deployment.
AWS Greengrass ML Inference limitations
Greengrass can operate locally, but it requires AWS cloud connectivity and access to one of the six regions where the AWS Greengrass ML Inference service is available: U.S. East (N. Virginia), U.S. West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Sydney), EU (Frankfurt) and EU (Ireland). Thus, some global users may not be able to use Greengrass or may experience latency when their devices need to connect to a suitable region.
Consider Greengrass and machine learning framework version discrepancies, too. For example, AWS Greengrass Core users can employ version 1.5.0, but to use the predefined library for Apache MXNet 1.2.1, users must employ AWS Greengrass Core version 1.6.0. In addition, MXNet does not ensure compatibility between versions, so use the same MXNet version for model training and model deployment.
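A deployment script could catch both kinds of mismatch before a model ships. The Python sketch below is one hedged way to do it: the check_deployment function and its lookup table are hypothetical, the only version pairing it knows is the one cited above, and it assumes the Greengrass Core requirement is a minimum rather than an exact match.

```python
def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

# Pairing taken from the article; treating it as a minimum is an assumption
MIN_CORE_FOR_PACKAGE = {("mxnet", "1.2.1"): "1.6.0"}

def check_deployment(core_version, framework, trained_with, deployed_with):
    """Return a list of human-readable version problems (empty if none found)."""
    problems = []
    required = MIN_CORE_FOR_PACKAGE.get((framework, deployed_with))
    if required and parse_version(core_version) < parse_version(required):
        problems.append(
            f"Greengrass Core {core_version} is older than the required {required}"
        )
    if trained_with != deployed_with:
        problems.append(
            f"{framework} {trained_with} was used for training "
            f"but {deployed_with} is deployed"
        )
    return problems

# Core 1.5.0 with the MXNet 1.2.1 package: flagged as too old
issues = check_deployment("1.5.0", "mxnet", "1.2.1", "1.2.1")
for issue in issues:
    print(issue)
```

Comparing versions as tuples of integers avoids the string-comparison trap in which "1.10.0" sorts before "1.6.0".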
And while Greengrass offers a measure of offline resilience, the overall application performance on a Greengrass endpoint device is hardly foolproof. Some applications deployed to Greengrass-enabled endpoint devices may make calls to the cloud to use AWS offerings, such as DynamoDB. Such calls are perfectly legitimate but will fail when cloud connectivity is disrupted.
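One common way to soften this failure mode is to buffer cloud-bound writes on the device and replay them once the connection returns. The Python sketch below shows the pattern under stated assumptions: BufferedWriter and FakeTable are hypothetical names, FakeTable merely simulates an unreachable cloud table, and a real application would replace it with its own client call and error handling.

```python
from collections import deque

class BufferedWriter:
    """Queue records locally and send them in order when the cloud is reachable."""

    def __init__(self, send):
        self.send = send        # callable that may raise ConnectionError
        self.backlog = deque()  # records not yet accepted by the cloud

    def write(self, record):
        self.backlog.append(record)
        return self.flush()

    def flush(self):
        """Send queued records oldest first; stop at the first failure."""
        while self.backlog:
            try:
                self.send(self.backlog[0])
            except ConnectionError:
                return False    # still offline; keep the record for later
            self.backlog.popleft()
        return True

class FakeTable:
    """Hypothetical stand-in for a cloud table; not a real AWS client."""

    def __init__(self):
        self.online = True
        self.rows = []

    def put(self, record):
        if not self.online:
            raise ConnectionError("cloud unreachable")
        self.rows.append(record)

table = FakeTable()
writer = BufferedWriter(table.put)

table.online = False
writer.write({"id": 1})  # fails quietly and stays in the backlog
writer.write({"id": 2})

table.online = True
writer.write({"id": 3})  # connection is back: all three records are delivered
print(table.rows)
```

The backlog here lives in memory, so a device restart would lose it. A production design would likely persist the queue and cap its size.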
Finally, consider how to handle the data that Greengrass ML Inference will produce. Ideally, that data is passed back to the Greengrass service in the AWS cloud for subsequent model tuning and training. This should work smoothly using integrated services, such as Amazon SageMaker, but consider ways to use inference results with model-building and training tools outside of the AWS platform.