The battle is set: on one side stands data – ever growing, ever more important; on the other stands analytics technology – also continuously gaining speed and capability. We, as machine learning and data analytics enthusiasts, want nothing more than to see the “tech” side win this battle. But as our datasets and problems grow larger and larger, our tools for analyzing and solving them must grow in stride, lest we let the untapped power of the data go to waste. It is a twisted, data-driven version of Frankenstein: our own creations, like the Internet of Things (IoT), are producing vast quantities of data that we can’t properly deal with. The waste and opportunity cost from unanalyzed data is out of control!
Technologies like high performance computing and cloud are racing to deal with problems in big data analytics, but even they are struggling to keep up with the sheer quantities of data being collected. It is estimated that by 2020, humans will have generated over 44 ZB (1 zettabyte = 1 trillion GB) of data, and that amount will continue to grow at a rate of 1.7 MB per person per second. In addition, crucial optimization problems like the traveling salesman problem, protein folding, and, most importantly to me and this crowd, the training phase in machine learning are proving extraordinarily difficult for even the best modern computing systems to tackle at large scale. So what do we do?
One thing we can do is improve computer speed. Most people have heard of Moore’s law – the observation that the number of transistors in top-of-the-line processors doubles approximately every two years. There is a great fear in the tech field that Moore’s law cannot continue much longer, and that this will lead to a great stagnation in our analytics progress (and other technologies). What most people don’t realize about our current technological progress, though, is that even if Moore’s law never ends, and we keep scaling up our tech at its current pace ad infinitum, there are still problems we will never solve at scale! They simply require too many calculations once they grow beyond a certain size. One class of problems in this category is combinatorial optimization, which includes the classic traveling salesman problem.
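To make the combinatorial explosion concrete, here is a minimal Python sketch (the function name is my own, for illustration) counting the distinct round-trip routes a brute-force traveling salesman solver would have to compare. Fixing the starting city and ignoring direction, a tour of n cities has (n−1)!/2 possible orderings – a number that outruns any conceivable doubling of transistor counts:

```python
from math import factorial

def tsp_route_count(n_cities: int) -> int:
    """Distinct round-trip routes through n cities,
    fixing the start city and ignoring travel direction."""
    return factorial(n_cities - 1) // 2

for n in (5, 10, 20, 30):
    # The count grows factorially, far faster than any
    # exponential improvement in hardware can keep up with.
    print(f"{n:>2} cities: {tsp_route_count(n):,} routes")
```

Even at 20 cities there are roughly 6 × 10^16 routes to check; by 30 cities the count dwarfs the number of microseconds since the Big Bang, which is why raw speed alone can never win here.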
Enter quantum computing, the coolest of all the alternative computing paradigms [no citation needed]. Quantum computing is a relatively new tool in the arsenal against big data and hard problems: it uses the quantum-mechanical properties of certain materials to perform calculations in ways that are impossible for traditional computers. It has been shown theoretically that machines of this type will be able to perform certain kinds of calculations at speeds unattainable by classical hardware – and this is quite exciting.
Quantum computing’s potential is enormous: vastly improved optimization, better search, increased security, advanced artificial intelligence/machine learning, and much more. Researchers all over the world are still working hard to figure out what a fully functional quantum computer could do – and, more importantly, how to build one.
Booz Allen is part of the growing field of quantum computing researchers looking to push the limits of existing technology and tip the scales of the data vs. analytics tech arms race in our favor. We are researching machine learning and optimization problems on specialized devices known as quantum annealers. Quantum computing is still very young and has a long way to go, but it shows great promise.
See you in the future.
- EMC.com (2014). The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, sponsored by EMC. Retrieved 11 December 2015.
- To be clear, Moore’s law is not an actual physical law. Rather, it is an observed trend that has continued for some time.
—Written by Joseph “JD” Dulny III