Machine learning in cybersecurity is coming to IAM systems
Machine learning in cybersecurity is increasingly being applied to identity management systems. But will algorithms be the best option for authenticating and authorizing users?
CHICAGO -- The next time you have trouble accessing a mission-critical application and need to prove your identity, you may be making your case not to network administrators or IT support, but to a machine learning algorithm.
The oft-discussed machine learning model has already taken root in the information security industry, as several vendors have embraced the technology to improve malware and threat detection and displace traditional signature-based detection. But now machine learning is making its way into identity and access management (IAM) to make rulings on authentication and authorization. Several experts at the 2017 Cloud Identity Summit this week discussed machine learning in cybersecurity applications for identity management systems, as well as the risks and rewards of such applications.
The appeal of machine learning in cybersecurity is straightforward: IAM increasingly relies on a growing number of factors -- from physical and behavioral biometrics to geolocation data -- to determine the identity and authorizations of an individual, and companies are turning to algorithms to process and judge those factors for IAM systems. Alex Simons, director of program management at Microsoft's Identity Division, said there will be so many interactions for authentication and identity confirmation that it won't be feasible for humans to manage the entire system. "You'll have to rely on machines to do this for you," Simons said during his Cloud Identity Summit keynote.
The bulk of the authentications will be performed by machine learning technology, while human judgment will be reserved for select cases that involve red flags, Simons said. Microsoft has already begun applying machine learning in cybersecurity to provide what Simons called "intelligent protection" for Active Directory. And to give an idea of the volume of activity that identity management systems are dealing with today, Simons said his company sees 115.5 million blocked login attempts and 15.8 million account takeover attempts for Microsoft accounts each day.
Like Microsoft, IBM already has machine learning applications for IAM. Eric Maass, director of cloud IAM strategy at IBM, delivered a keynote presentation on how cognitive computing is influencing the way enterprise IAM technology and programs will evolve. "We're going to start to see the transformation of authentication to recognition," he said. "The application of machine learning tied with biometric authentication mechanisms will allow us to do that."
Maass argued that pattern recognition for physical and behavioral biometrics will be able to provide continuous authentication; the model would be similar to human behavior, where trust is built up over time and through several different factors. "As a human, I don't authorize you to do something -- I trust you," Maass said. "Trust is established through behavior."
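Maass didn't describe an implementation, but the idea of trust accumulating from many signals over time, and decaying without fresh evidence, can be sketched in a few lines of Python. The signal names, weights and decay rate below are illustrative assumptions, not anything IBM has described:

```python
import time
from dataclasses import dataclass, field

# Illustrative sketch of continuous authentication: trust is not a one-time
# yes/no ruling but a score that rises with matching behavioral evidence
# and decays without it. Signal names and weights are hypothetical.
SIGNAL_WEIGHTS = {
    "keystroke_dynamics_match": 0.15,
    "mouse_movement_match": 0.10,
    "known_device": 0.30,
    "usual_geolocation": 0.20,
}

DECAY_PER_SECOND = 0.001  # trust slowly erodes between observations


@dataclass
class TrustSession:
    score: float = 0.0
    last_update: float = field(default_factory=time.time)

    def observe(self, signal: str, matched: bool) -> None:
        """Fold one behavioral observation into the running trust score."""
        self._decay()
        weight = SIGNAL_WEIGHTS.get(signal, 0.0)
        # Matching evidence raises trust; a mismatch lowers it more sharply.
        self.score += weight if matched else -2 * weight
        self.score = max(0.0, min(1.0, self.score))

    def _decay(self) -> None:
        now = time.time()
        self.score = max(0.0, self.score - DECAY_PER_SECOND * (now - self.last_update))
        self.last_update = now


session = TrustSession()
session.observe("known_device", True)
session.observe("keystroke_dynamics_match", True)
print(f"current trust: {session.score:.2f}")  # approximately 0.45
```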
But humans can't be removed from identity management systems completely, Maass said, because of the complexity of IAM. "So why do we still all have jobs here?" he asked the audience. "We're all here because these things are still tough. Authentication is still tough because it's hard to prove who you are to a machine. It's hard to write an infinite list of ACLs [access control lists] and different types of authorization policies and entitlements to define who can do what on what systems, when and where, in a finite or comprehensive fashion."
The risks of machine learning in cybersecurity applications
Maass made several references to artificial intelligence in popular culture, the most notable and relevant of which was the famous scene in "2001: A Space Odyssey" where the supercomputer HAL 9000 prevents astronaut Dave Bowman from entering the spaceship. That reference evokes the apprehension some have about turning over IAM systems to algorithms.
Rajiv Dholakia, vice president of products at Nok Nok Labs, worked on artificial intelligence research early in his career and believes machine learning will become more prevalent in the IAM space. But he said current identity management systems may not be ready for it.
Dholakia explained that many IAM systems rely on what he called "weak signals" -- usernames, passwords, security questions and others -- that can be stolen, guessed or spoofed, instead of strong signals such as a biometric template bound to an encryption key. "Calculations today are generally made with weak signals, and now we're trying to add more and more of them. When you keep grabbing all these weak signals to perform authentication, it becomes a mess," he said. "So instead of applying machine learning to all these weak signals, let's get stronger signals in the first place."
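Dholakia's distinction can be made concrete: a password is a shared secret that the server compares and that can therefore be phished or leaked and replayed, while a key-bound credential proves possession of a private key that never leaves the user's device. A minimal sketch using the third-party Python cryptography package, with the FIDO-style protocol details that Nok Nok Labs works with omitted:

```python
import os
import secrets
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


# Weak signal: a shared secret. Anyone who phishes or leaks the password
# can replay it from anywhere.
def check_password(supplied: str, stored: str) -> bool:
    return secrets.compare_digest(supplied, stored)


# Strong signal: a challenge-response signature. The private key stays on
# the user's device (ideally unlocked by a local biometric check), so the
# server only ever sees a one-time, unforgeable proof of possession.
device_key = Ed25519PrivateKey.generate()       # lives on the device
enrolled_public_key = device_key.public_key()   # registered with the server

challenge = os.urandom(32)                      # fresh per login attempt
signature = device_key.sign(challenge)          # computed on the device

try:
    enrolled_public_key.verify(signature, challenge)
    print("strong signal verified: device holds the enrolled key")
except InvalidSignature:
    print("verification failed")
```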
But the catch for identity and access management systems is that machine learning requires more data, not less, to be effective. Riddhiman Das, product architect at biometrics startup EyeVerify in Kansas City, said machine learning algorithms are only as good as the data they get to set baselines and patterns.
"If you look at Facebook and Google, their neural nets are extremely accurate because they have so much data," Das told SearchSecurity. "But for smaller companies like a startup out of Missouri that doesn't have nearly as much data, then the accuracy may not be as high."
To that end, data mining will become a critical part of IAM and machine learning, according to Maass. Instead of static identity profiles that rely on basic, unchanging data, he said, identity management systems will constantly mine data about users not only to authenticate them, but also to monitor their access and behavior for potential risks or threats.
But Maass acknowledged that predictive analytics about users might not always be accurate, and he cited another classic science fiction movie: "Minority Report," which is about predicting and stopping crime before it happens. "We're going to see that concept creep further and further into our access management systems," he told the audience. "But just like in 'Minority Report,' there are the minority reports."
Maass said if the baseline for good behavior is set incorrectly, then the identity management systems will learn incorrectly and make mistakes. In addition, they'll assign probabilities or confidence levels to authentication claims rather than making a "yes or no" decision, and IAM policies will have to take that into account and decide what confidence thresholds are acceptable for approving or rejecting access.
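Maass didn't offer concrete numbers, but a policy layer over probabilistic authentication might look something like the sketch below; the three-way decision bands and the threshold values are assumptions for illustration:

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    STEP_UP = "step_up"   # e.g. prompt for a second factor
    DENY = "deny"


# Hypothetical policy thresholds; real values would be tuned per resource
# and per organization's risk tolerance.
ALLOW_ABOVE = 0.90
DENY_BELOW = 0.40


def authorize(confidence: float) -> Decision:
    """Map a model's authentication confidence to a policy decision.

    Instead of a binary yes/no, the score falls into one of three bands:
    confident enough to allow, doubtful enough to deny, or an in-between
    zone where the system asks the user for more evidence.
    """
    if confidence >= ALLOW_ABOVE:
        return Decision.ALLOW
    if confidence < DENY_BELOW:
        return Decision.DENY
    return Decision.STEP_UP


print(authorize(0.95))  # Decision.ALLOW
print(authorize(0.70))  # Decision.STEP_UP, ask for another factor
print(authorize(0.20))  # Decision.DENY
```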
Dholakia cited another potential problem for machine learning-powered IAM: the continuous, possibly endless accumulation of data will make machine learning in cybersecurity applications increasingly complex and harder for human identity professionals to manage. As data about a user's typing patterns and mouse movements, for example, piles up alongside other behavioral analytics, it will become tougher to see where the actual bottom line is. "If you go with too many signals and too many calculations, then in some cases these can become black boxes where you're not sure what it's actually doing," he said. "If you can avoid complexity, then you're better off."
While Maass said machine learning in cybersecurity systems should mimic human behavior for recognition and trust, Dholakia suggested another aspect that should be explored for identity management systems: the human brain's ability to discard data it doesn't need so that it isn't overwhelmed.
"Just because you can get every bit of data and hold onto it forever doesn't mean you should," Dholakai said. "The human brain is very good at this, and I think machine learning should probably do that as well."