The idea of collecting data isn’t new. As far back as 18,000 BCE, Palaeolithic people used tally sticks to record trading activity and keep track of food supplies. The first known census was taken nearly 6,000 years ago, around 3800 BCE, by the Babylonians, and ancient merchants are known to have kept scrupulous records on their customers. Of course, the methods used to collect data have evolved through the ages. Today, data is being collected on each and every one of us from the moment we awake to the moment we turn off our devices at night. Actually, our devices continue to collect data as we sleep. What makes this latest iteration of data collection possible is artificial intelligence (AI), which has been a game-changer for collecting and analyzing data.
When most people think of AI, they think of technology like Alexa, machine vision or recommendation engines, but those are just a few of its applications. AI is actually a very broad umbrella term that draws on several branches of mathematics, including statistics and probability, as well as algorithms, computer optimization, data engineering, machine learning operations and software engineering. Like the foundations of a house, these components work in tandem to support the high-tech features of AI that people actually see.
Organizations are increasingly making use of AI-enabled devices to collect a whole array of data from various sources and then evaluate and harness the data in myriad ways that are increasing efficiencies, lowering costs and improving the consumer experience.
Today, AI-enabled devices are listening in or seeing what we’re doing and collecting and digitizing massive amounts of data. What underpins the entire enterprise is the less sexy work — keeping data lakes and warehouses that are able to store the data, performing basic data engineering tasks to add structure and using statistical analysis to make sense of it. Once the data has been processed, it can be made available to AI algorithms that use it to make automated “intelligent” decisions.
The underpinning was there before but in a different form. There have been enormous improvements in our ability to store big data and process it in near real time, and this has allowed us to unlock the true value of AI. Now it’s also consumer-facing, and end consumers can interact with the AI product — so it feels tangible.
Cracking the Five Senses
What’s extraordinary about the evolution of AI is that we’re well on our way to cracking all five senses.
Let’s look at optical character recognition (OCR) and intelligent document processing. Think about the way invoice processing worked 10 or 15 years ago: There were a lot of people in the mailroom opening envelopes and keying information into a computer. Now, invoices arrive electronically and vision AI can extract the relevant data and update systems through workflow automation. In the case of the invoice, there might be specific technical finance terms, so the machine learning algorithm hands the invoice off to a natural language processing algorithm that has been trained on finance data sets so it can understand finance language. We’ve also become adept at analyzing images. There are cameras on traffic systems that can detect speed and read license plates as cars go by at 60 miles per hour. And we’re able to transcribe speech by converting audio into digital data, the zeros and ones a machine can process, along with metadata.
Such advances have allowed us to crack two out of the five senses: vision and hearing. The benefit is obviously speed and labor efficiency, but it’s also accuracy. As the AI gets better and better, you’ll have fewer mistakes because fewer steps depend on manual work, where humans are naturally more prone to error.
“Machine vision” and “machine hearing” have become ubiquitous in our lives, and we are well on our way to developing the other three senses. It’s a progression; 15 or 20 years ago, if you’d said, “Hey, there’s going to be a device that can understand you and know exactly what you’re talking about,” nobody would have believed that. So, let’s fast forward another 20 years, and I’m sure we’ll come across a device that can smell and taste.
The Englishman and the Terminator
As AI has expanded the ways data can be collected, it’s also expanded how data is analyzed. There are two notable examples of how AI has enabled vast improvements in this area. The first involves a technical method called transfer learning. In the last three to four years, especially in deep learning and natural language processing, many companies have built models and made them open source for everybody to use. Transfer learning means taking one of these prebuilt models and fine-tuning it with data from your specific industry or use case, making it even more intelligent for that purpose. (Think of the Terminator movies when Skynet becomes self-aware.)
We’re not in that zone yet, but these AI algorithms are getting really good at adapting to new domains with relatively little additional data.
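To make the transfer learning idea a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the "pretrained" weights are made-up numbers standing in for an open-source model, the six-row data set is invented, and the names `extract_features` and `fine_tune` are hypothetical. The point is only the shape of the technique: the prebuilt layer stays frozen, and a small new layer on top is trained on your own data.

```python
import math

# Hypothetical "pretrained" layer. In practice these weights would come from an
# open-source model; here they are made-up numbers that are never updated.
PRETRAINED_WEIGHTS = [[1.0, 1.0], [1.0, -1.0]]


def extract_features(x):
    """Frozen feature extractor: one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def predict(head, bias, x):
    """Probability from the small task-specific head on top of the frozen features."""
    z = sum(h * f for h, f in zip(head, extract_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(examples, epochs=200, lr=0.5):
    """Train only the new head with gradient descent on the log loss."""
    head, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in examples:
            error = predict(head, bias, x) - y  # gradient of the log loss w.r.t. z
            feats = extract_features(x)
            head = [h - lr * error * f for h, f in zip(head, feats)]
            bias -= lr * error
    return head, bias


# Invented "industry" data: the label is 1 when the two inputs sum to a positive number.
DOMAIN_DATA = [([1.0, 0.5], 1), ([0.2, 0.9], 1), ([2.0, -0.5], 1),
               ([-1.0, -0.5], 0), ([-0.3, -0.8], 0), ([0.5, -2.0], 0)]

head, bias = fine_tune(DOMAIN_DATA)
accuracy = sum((predict(head, bias, x) > 0.5) == bool(y)
               for x, y in DOMAIN_DATA) / len(DOMAIN_DATA)
print(f"accuracy on the domain data: {accuracy:.2f}")
```

Real fine-tuning works on models with millions or billions of weights and often updates some of the pretrained layers too, but the division of labor is the same: reuse what someone else already trained, and spend your own data on the last step.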
The second example (which is more analysis and analytics than AI and predictive algorithms, but is equally important) is the rise of a branch of statistics called Bayesian statistics.
The English theologian and mathematician Thomas Bayes first developed Bayes’ theorem more than 250 years ago, and it’s taken that long for us to finally realize its potential. It’s a fundamentally different concept from the classical statistics we were taught in the 1980s and 1990s, and the reason we were taught that particular breed of statistics is that the machines at that time could only compute approximate variables or parameters. Furthermore, classical statistics was the better fit when we didn’t have massive troves of data but only a handful of data points (30–50). Now, because of the abundance of data, advances in compute capacity and better algorithms, we’re able to more easily do Bayesian statistical inference, which is a completely different and more intuitive way to think about data.
A concrete example is looking at a distribution and asking how likely it is that a particular variable of interest falls within a certain bound. The Bayesian approach lets us say we have a high degree of credibility that the unknown parameter we’re tracking is more likely to fall in this range than that range. Those are just two examples of how artificial intelligence has changed analysis. Because these algorithms have improved, and the compute power has increased exponentially, this type of analysis has really started coming to the forefront.
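As a rough illustration of the "this range versus that range" idea, the sketch below estimates an unknown rate (say, the share of customers who respond to an offer) from a small sample. The numbers are invented, the flat prior is an assumption, and the function names are hypothetical. It uses a simple grid approximation rather than any particular statistics library.

```python
def posterior_on_grid(successes, trials, grid_size=1001):
    """Posterior over a rate theta on an evenly spaced grid, assuming a flat prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Binomial likelihood up to a constant; the flat prior contributes a factor of 1.
    unnormalized = [t ** successes * (1 - t) ** (trials - successes) for t in grid]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


def probability_between(grid, posterior, low, high):
    """Posterior probability that theta lies in the half-open range [low, high)."""
    return sum(p for t, p in zip(grid, posterior) if low <= t < high)


def credible_interval(grid, posterior, mass=0.95):
    """Central interval holding roughly `mass` of the posterior probability."""
    tail = (1 - mass) / 2
    cumulative, low, high = 0.0, grid[0], grid[-1]
    found_low = False
    for t, p in zip(grid, posterior):
        cumulative += p
        if not found_low and cumulative >= tail:
            low, found_low = t, True
        if cumulative >= 1 - tail:
            high = t
            break
    return low, high


# Invented data: 12 responses out of 40 offers.
grid, posterior = posterior_on_grid(successes=12, trials=40)
in_20_to_40 = probability_between(grid, posterior, 0.2, 0.4)
in_40_to_60 = probability_between(grid, posterior, 0.4, 0.6)
interval = credible_interval(grid, posterior)
print(f"P(0.2 <= rate < 0.4) = {in_20_to_40:.3f}")
print(f"P(0.4 <= rate < 0.6) = {in_40_to_60:.3f}")
print(f"95% credible interval: {interval}")
```

The output is a direct statement about the parameter itself: how much credibility each range deserves given the data. That is the intuitive reading that classical confidence intervals do not strictly allow.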
Responsible Use of Data and AI
These machine learning algorithms are immensely powerful, and they can do a lot of good for society, but with great power comes great responsibility. It only takes one bad actor to do something nefarious, and many of these things can be weaponized.
The techniques, algorithms and mechanisms exist today to encrypt data and anonymize it. So it’s more a matter of discipline: implementing these safeguards and making that investment. At the end of the day, when a software vendor or a service provider gets hacked and the personal information of consumers gets out there, those consumers are going to vote with their feet. Consumer pressure is going to force companies to be more stringent about security.
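As one small illustration of what those mechanisms can look like, here is a sketch that scrubs a customer record before it reaches an analytics pipeline. The record fields, the function names and the in-memory key are all assumptions made for the example. Strictly speaking, this is pseudonymization (a keyed hash replaces the identifier) plus generalization (age becomes a band), which is weaker than full anonymization.

```python
import hashlib
import hmac
import secrets

# Assumption: in a real system the key would live in a key-management service,
# not be generated in the same process that handles the data.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, value.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record, key=SECRET_KEY):
    """Return a copy with no name, no raw email and only a coarse age band."""
    decade = (record["age"] // 10) * 10
    return {
        "customer_id": pseudonymize(record["email"], key),
        "age_band": f"{decade}-{decade + 9}",
        "purchase_total": record["purchase_total"],
    }


# Invented example record.
raw = {"name": "Jane Doe", "email": "Jane@example.com", "age": 37, "purchase_total": 129.50}
clean = scrub_record(raw)
print(clean)
```

Because the same email always maps to the same `customer_id` under a given key, analysts can still link a customer’s purchases together without ever seeing who the customer is.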
Better. Faster. Smarter.
We’ve seen that AI has enabled new ways of collecting and analyzing a wide array of data, which is enormously beneficial to organizations and consumers. The classic example is Alexa. As you talk to Alexa in real time, it’s taking the information, digitizing the data, putting it into a data pipeline and analyzing it so the AI algorithm becomes better in near real time. In effect, the AI algorithm is being fine-tuned specifically for you, your vocabulary and your world. What makes an AI algorithm really potent is when it can start making connections with other data sets and data points.
Amazon knows your shopping behavior, what you’re doing at home, what music you like listening to and what you like eating, and it even knows whether you live in an independent home or a condo, in suburbia or in the city. When you start collating all these pieces of information and feeding these multiple features or attributes to the AI engine, that’s when it really becomes intelligent and can cater to your specific needs.
It’s faster because you have devices that are constantly listening and streaming data. One of the biggest advances for AI has been big data architecture and lakehouses where you can process streaming data in near real time for billions of people on the planet.
As for smarter, AI algorithms are getting so good that they might eventually start writing the code that a programmer typically writes today. It doesn’t necessarily mean that programmers are going to start losing their jobs, but it means they will have to continuously learn and innovate and stay on top of these changes.
Just as the Palaeolithic people and their descendants had to evolve as the technology to collect data advanced, so too must we adapt as AI takes us to new frontiers.
Alexa, end this article.