How developers can use SageMaker for DevOps machine learning
SageMaker from AWS gives software developers a way to tackle AI and machine learning. But expert Torsten Volk said it will also require lots of experimenting.
We are only at the beginning of the ML/AI journey.
Machine learning and artificial intelligence will provide decision support and automation that increase product quality while lowering cost. The technologies are poised to disrupt IT and business as a whole. But those are big words considering the myriad problems we face in 2018 in applying ML/AI to DevOps and to business.
The next step in ML/AI
To get to the next step in DevOps machine learning, full-stack developers must recognize and evaluate opportunities where ML/AI will make a significant improvement. Today's standard cognitive APIs from Amazon, Google, IBM and Microsoft give developers a taste of how ML/AI could improve productivity and product quality by covering simple, standard scenarios, such as voice to text, language translation, object or facial recognition and image classification. However, once the requirements exceed the capabilities of these standard cognitive APIs, the next step is to fire up a dedicated DevOps machine learning or AI environment, with all the resource requirements and deployment complexity that come with it.
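To see how low the bar is for these standard scenarios, consider image classification: a few lines against a cognitive API are enough. Here is a minimal sketch using Amazon Rekognition through the boto3 SDK; the region, bucket and object key are placeholder values.

```python
import boto3

# Detect labels (objects, scenes) in an image stored in S3 via the
# Amazon Rekognition cognitive API -- no model training required.
rekognition = boto3.client("rekognition", region_name="us-east-1")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-photo-bucket",      # placeholder bucket
                        "Name": "photos/example.jpg"}},   # placeholder key
    MaxLabels=10,
    MinConfidence=80.0,
)

for label in response["Labels"]:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
```

Anything beyond what such a prebuilt API covers, though, means training your own model, which is where the deployment complexity begins.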
Enter Amazon SageMaker
Amazon launched SageMaker to lower the barrier to entry for ML/AI. SageMaker provides training wheels for developers to enable faster and cheaper DevOps machine learning experimentation and pilot projects. Developers are far more willing to engage with a project if the proof of concept takes an hour or two to create, rather than 10 or more, and that is exactly the gap SageMaker aims to fill.
SageMaker lowers the barriers to experimentation by addressing some of the ML/AI pain points that full-stack developers typically experience:
- Creates the environment: SageMaker automatically deploys and configures the required server, storage and software infrastructure.
- Creates algorithms for popular use cases: Amazon provides its own implementations of popular ML/AI algorithms, along with pointers on when and how to use them.
- Incorporates a software development kit: The included SDK for Python and Spark provides libraries for model training and deployment (see the training and deployment sketch after this list).
- Automatically sets hyperparameters for many algorithms: Setting model hyperparameters normally requires a certain level of experience; they determine, for example, the number of layers in a neural network, the number of passes over the training data (epochs) and the maximum amount by which each sample record can change the model's weights (the learning rate). SageMaker is rolling out automatic hyperparameter configuration, wherein the service trains the model under different configurations and selects the one that performs best (a sketch of the tuning API follows this list).
- Protects from overfitting: An ML/AI model overfits when it loses predictive power on new data because training tailored it too closely to the training set, including its noise and corner cases. SageMaker offers some protection against overfitting but cannot fully prevent the problem.
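To illustrate how little scaffolding the SDK requires, here is a minimal sketch that trains SageMaker's built-in XGBoost algorithm and deploys it to a managed endpoint. It assumes the current SageMaker Python SDK, whose parameter names have changed since the service launched; the IAM role and S3 paths are placeholders.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder IAM role

# Look up the container image for the built-in XGBoost algorithm
image_uri = sagemaker.image_uris.retrieve(
    "xgboost", session.boto_region_name, version="1.5-1"
)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",           # SageMaker provisions this for the job
    output_path="s3://my-bucket/models/",   # placeholder bucket
    sagemaker_session=session,
)

# Hyperparameters can be set explicitly ...
estimator.set_hyperparameters(objective="reg:squarederror", num_round=100)

# ... and training runs against data already staged in S3
# (built-in XGBoost expects the label in the first CSV column)
estimator.fit({"train": TrainingInput("s3://my-bucket/train/",  # placeholder path
                                      content_type="text/csv")})

# One call stands up a managed HTTPS inference endpoint
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```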
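The automatic hyperparameter configuration mentioned above surfaces in the SDK as automatic model tuning. As a sketch building on the estimator from the previous example: the tuner launches a series of training jobs across the declared parameter ranges and keeps the best one. The metric name matches what built-in XGBoost emits; the job counts are purely illustrative.

```python
from sagemaker.inputs import TrainingInput
from sagemaker.tuner import (ContinuousParameter, HyperparameterTuner,
                             IntegerParameter)

tuner = HyperparameterTuner(
    estimator=estimator,                      # XGBoost estimator from the previous sketch
    objective_metric_name="validation:rmse",  # metric emitted by built-in XGBoost
    objective_type="Minimize",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),  # learning-rate range to explore
        "max_depth": IntegerParameter(3, 10),
    },
    max_jobs=20,          # total training jobs to try (illustrative)
    max_parallel_jobs=2,  # how many run at once
)

# Tuning needs a validation channel so the objective metric can be computed
tuner.fit({
    "train": TrainingInput("s3://my-bucket/train/", content_type="text/csv"),
    "validation": TrainingInput("s3://my-bucket/validation/", content_type="text/csv"),
})
print(tuner.best_training_job())
```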
The limits of SageMaker
While SageMaker gives full-stack developers a chance to benefit from ML/AI, there are limitations. Developers still need to adopt the DevOps machine learning mindset and spend time and resources experimenting with ML/AI, and they need to nurture the ability to identify ML/AI use cases. Before taking on ML/AI development for a given project, be sure you can answer these questions:
- Which algorithm could make sense for what problem?
- Is there training data available?
- How much training data do I need?
- Do I have to clean it up first?
- Should I "feature engineer" to help the algorithm along or let the algorithm do its own magic?
- Do I need more training, or am I overfitting the model?
- Do the input variables need preprocessing, or should they be used as is?
- Will I get better results if I combine more than one algorithm and then average their predictions (see the sketch after this list)?
- How strongly should I penalize mistakes?
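The last two points, preprocessing and ensembling, are often cheap to test before committing to a full SageMaker pipeline. Here is a minimal, SageMaker-independent sketch, with scikit-learn and synthetic data assumed purely for illustration: scale the inputs, train two different regressors and average their predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for real training data
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing: scale inputs (helps linear models, is neutral for trees)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Two different algorithms ...
ridge = Ridge().fit(X_train_s, y_train)
forest = RandomForestRegressor(random_state=0).fit(X_train_s, y_train)

# ... whose averaged predictions can beat either model alone
avg_pred = (ridge.predict(X_test_s) + forest.predict(X_test_s)) / 2.0

print("ridge R^2:   ", r2_score(y_test, ridge.predict(X_test_s)))
print("forest R^2:  ", r2_score(y_test, forest.predict(X_test_s)))
print("ensemble R^2:", r2_score(y_test, avg_pred))
```

A few minutes with an experiment like this answers the preprocessing and ensembling questions empirically instead of by guesswork.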
While SageMaker is not a turnkey service, it still enables DevOps machine learning by wrapping deployment and training into a template that the developer can then configure in less time than it would take to build a standalone environment on EC2.