MIT Builds Cyber Security System That Can Detect 85% of Cyber Attacks…Really?

A couple of weeks ago, I visited a city that I hadn’t been to for a while. At a restaurant, I uncharacteristically decided to pay with my Visa card. However, when I entered my PIN into the card terminal, my transaction was refused. Of course, there ensued that somewhat embarrassing moment where I explained to the waitress that I had plenty of funds in my account and that there must be some other problem, but the transaction continued to be rejected. That’s when I realized that I had become… An Outlier.

Like many systems that have to deal with enormous amounts of data, Visa uses algorithms to detect unusual use of its cards. Each user generates a behavior pattern, and if this pattern varies beyond certain limits, the person using the card becomes an outlier: a possible criminal. Over the past few years, Visa’s algorithms have become more sensitive to variations in user behavior. Now, I realize that if I want to use my card in this city, I’ll have to call my bank or Visa and tell them I will be there. They will then reset the parameters. Yes, I’m glad they are erring on the side of caution, but it is still troublesome. The SMS verification code that some retailers are using will just make things even more complicated. And all of this because hackers have been able to outmaneuver every security defense put in their way.
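To make the idea of "variations beyond certain limits" concrete, here is a minimal sketch of one common outlier test, the z-score. It is only an illustration: I have no knowledge of how Visa's systems actually work, and the function name `is_outlier`, the threshold of 3 standard deviations, and the sample amounts are all assumptions made up for this example.

```python
from statistics import mean, stdev

def is_outlier(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` sample standard
    deviations from the mean of `history`. Hypothetical helper, not
    any card network's real rule."""
    if len(history) < 2:
        # Too little data to estimate a spread, so do not flag anything.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past values identical: anything different is unusual.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold

# Made-up restaurant bills for one cardholder.
history = [12.5, 9.0, 15.0, 11.0, 13.5, 10.0]
print(is_outlier(history, 14.0))   # an ordinary bill
print(is_outlier(history, 250.0))  # far outside the usual range
```

A real system would look at many more signals than the amount (location, merchant type, time of day), but the principle is the same: the tighter the threshold, the more legitimate customers end up flagged.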

However, this situation may change if companies begin to implement a new security system based on the work of MIT researchers. The claim, made in their paper, AI2: Training a big data machine to defend, is that their security architecture can detect 85% of all cyber attacks while reducing the number of false positives (benign events wrongly flagged as attacks) by a factor of five. So, how did they do it?

Many algorithms associated with preventing cyber attacks work like traditional firewalls and antivirus programs. The structure of known attacks is plugged into the software, which enables the program to recognize an attack when it sees it. New attack angles are not recognized because the software has nothing in its database to compare them to. Newer security algorithms, like those at banks or credit card companies, build a database of user metrics. Significant deviations from a user’s normal behavior trigger certain attack prevention procedures. Neural network programs are self-learning. If they get information that a certain deviation has led to an attack, that deviation becomes a new part of the database. They can even go beyond this by recognizing similar patterns, which could indicate that attackers are trying to fool them by making minor alterations in their attack strategies.

Security algorithms can be supervised or unsupervised. The best programs use the knowledge of human security experts to improve their detection skills. These are supervised programs. The algorithms detect what they determine to be irregular behavior and send it on to the humans to analyze. The problem comes when the algorithm begins to detect too much irregular behavior. It then tends to overwhelm its human partners, who have other things to do than look through millions of lines of logs every day. When this happens, the humans tend to start ignoring the queries from the algorithm. In the end, security suffers. In fact, the Target breach was apparently tied to security teams being overwhelmed with data that they had no time to respond to. In other words, the key to good security is for the algorithm to strike the right balance between caution and practicality.
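One simple way to picture that balance is an alert budget: the algorithm scores every event but passes only the highest-scoring few to the analysts. The sketch below is an assumption for illustration only, not a description of the MIT system or of any vendor's product; the event names, scores, and the function `select_for_review` are invented.

```python
import heapq

def select_for_review(scored_events, budget):
    """Return the `budget` events with the highest anomaly scores,
    highest first. Each event is a (name, score) pair.
    Hypothetical helper for illustration."""
    return heapq.nlargest(budget, scored_events, key=lambda event: event[1])

# Made-up events with made-up anomaly scores between 0 and 1.
events = [
    ("login-a", 0.20),
    ("login-b", 0.95),
    ("transfer-c", 0.70),
    ("login-d", 0.10),
]

# The analysts only have time to look at two events today.
for name, score in select_for_review(events, budget=2):
    print(name, score)
```

The trade-off is visible in the `budget` parameter: set it too high and the analysts drown, set it too low and a real attack may never reach a human.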

The MIT researchers claim to have solved this problem as their system “beat fully unsupervised outlier detection by a large margin” (3.41X). However, I would be surprised if it did not beat an unsupervised detector by a large margin. I would have the same criticism for the decrease in the number of false positives that the MIT algorithm returned. You would expect that an algorithm given feedback by human experts (it is not clear how many human experts were participating) would begin to get some focus on which outliers were potentially malicious and which were not. In other words, I would not get too excited about these findings.

I am also puzzled as to where the 85% detection rate comes from that is mentioned in the title of an article posted on the school’s Computer Science and Artificial Intelligence Laboratory website. This figure is never mentioned in the main research paper and seems to appear out of nowhere. By my calculations, based on the data given in the paper, the MIT algorithm detected, at the end of its 3 months of training, about 70% of attacks (218 of 311). Moreover, even this percentage is inflated due to the amount of historic data programmed into the detector to achieve this result. Without these “historic labels”, as they refer to them, detection rates are closer to 46% (143 of 311). If I’m missing something here, I hope someone will let me know.
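For anyone who wants to check my arithmetic, the calculation is just detected attacks divided by total attacks, using the counts quoted above. The function name `detection_rate` is mine, not the paper's.

```python
def detection_rate(detected, total):
    """Fraction of attacks that were detected."""
    return detected / total

TOTAL_ATTACKS = 311  # count quoted above

# With historic labels: 218 of 311.
print(f"{detection_rate(218, TOTAL_ATTACKS):.1%}")

# Without historic labels: 143 of 311.
print(f"{detection_rate(143, TOTAL_ATTACKS):.1%}")
```

Neither figure comes close to 85%.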

But the validity of these statistics is not really the point. The point is whether a 15% to 30% non-detection rate is acceptable. Let’s face it, the most skilled attackers are also going to figure out what is and is not getting through the defenses of their targets. Even those attacks that don’t get through give them feedback on how to alter their attack structure. Eventually, persistent attackers will make it through and get whatever it is they want. In other words, the MIT neural network cyber defense system may prevent the most common attack patterns and even recognize simple variations on them, but the most sophisticated attacks will mask themselves as normal traffic and make it past these pattern-recognizing networks. The human component is both the strongest and the weakest link. Human experts may be able to easily see through certain attempted breaches that baffle the network; however, more complex attacks may take the experts much more time to analyze and could slow down the system.

To a large extent, neural network security is based on the premise that human behavior, including the behavior of attackers, is predictable: a premise that is fundamentally flawed. My decision to visit a city I don’t often visit and to pay with a credit card when I normally don’t caused the Visa algorithm that monitors me to ‘panic’. Of course, there was no way it could interpret my behavior otherwise…or could it?

Google has been working on neural networks for some time now. “We developed a distributed computing infrastructure for training large-scale neural networks. Then, we took an artificial neural network and spread the computation across 16,000 of our CPU cores (in our data centers), and trained models with more than 1 billion connections.” Their goal was to see if such an extensive neural network could mimic the learning behavior of a newborn brain. To some extent, it did.

It is, therefore, disturbingly possible that these algorithms will learn that our behavior is more predictable than even we may believe. With all the data that Google has on all of us, our unpredictable behavior may, in fact, be predictable after all. In terms of cyber security, current outlier behavior may someday be analyzed as normal behavior under certain constraints. Perhaps Google will learn that I will occasionally use my credit card when certain conditions are met. After all, my behavior was not totally random. I did make the decision to use the card based on some underlying facts, didn’t I? Google, or Visa, or my bank could mine my data to determine what these conditions might be. How much cash had I withdrawn recently from an ATM? What types of restaurants am I likely to visit? Who else was in the restaurant while I was there and what was the chance that that person was there with me? How often, when I was with that person, did I use my credit card? Did I make any recent Twitter or Facebook posts to indicate I might be in this restaurant? The bank’s recent change to sending an SMS code to my phone to help with transactions could also give them the ability to see if my GPS coordinates matched those of the restaurant. In short, data-driven algorithms could make the restaurant transaction seem more likely than not.

So it seems that our security may depend on us giving up our privacy, something we’ve heard many times before. Certainly, if a neural network could not only respond to but also predict an individual’s behavior under certain conditions, it would go a long way towards stopping attacks and limiting false positives. Actually, hackers who use social engineering to attack individuals and companies have been using this vector for years. They can, for example, predict what sites a person will visit and then infect those sites with malware, as in a watering hole attack.

But better neural network security would force attackers to up their game as well. As has always been the case, security and those who try to compromise it will evolve together.

That’s why we should all take this claim by MIT with a very large grain of salt. However wonderful the results may seem for the present, it is only a temporary solution to a permanent problem, or, in the case in which all of our behaviors become ultimately predictable, the opening of a door to a whole new set of problems.



