How COVID-19 Is Affecting Data And AI Algorithms
Since the world was asked to lock down, we’ve seen some dramatic shifts in human behavior. For systems that monitor and interpret “how we move” and “what we do” via backend AI algorithms, otherwise good data is causing issues with various data models. The usually bustling streets in New York are relatively sparse, stores have far fewer patrons, and so on, all thanks to the pandemic.
Here, artificial intelligence (AI) and machine learning (ML) refer to analytics systems that use structured data to learn and complete tasks such as making various kinds of predictions. In these cases, AI relates to business systems that tie into financial, inventory, consumer-facing, and other apps to make predictions, while ML, a subset of AI, is sometimes used for unsupervised learning to aid in decision making.
Data currently being collected and interpreted by AI systems is reflecting fewer people doing what we normally do, whether it’s traveling, shopping, or simply visiting places we frequent. Let’s take a look at some examples then dive into ways that, if left unchecked, data collected during the pandemic will disrupt AI-driven predictions.
Examples of AI trends today
It might sound like hyperbole to say that AI is being used for everything today, but it’s not far from the truth. We use AI and ML algorithms (or rather, they’re incorporated into the apps we use) to infer relationships in large datasets collected by virtue of users simply engaging with an app.
When you use an app of any flavor, most backend systems pool data such as how long you spend on a certain screen, where you navigate to next, your physical location from a GPS, ads you might watch or click, things you might buy, and much more. This information is then processed by AI algorithms to accomplish various objectives.
For example, Google and Waze use non-identifiable information (i.e. non-PII data) collected from users as they travel to generate data models like heatmaps. When you Google a restaurant and look at the Popular Times section that shows how busy the location is expected to be, know that this trend is generated by an AI algorithm that uses historical information and some real-time anonymous location data to give you an idea of how busy a location might be at that very moment.
Waze works in a similar fashion to determine the best traffic routes, which is incredibly helpful in congested metropolitan areas. Trends are derived by AI algorithms that look at historical datasets and are used in conjunction with data about the roadways (e.g. speed limits, roadblocks, traffic stop duration averages, etc.) along with some real-time info to generate the most efficient route for users.
Since most people have had no choice but to observe lockdown to varying degrees, these models are getting not only less data than normal but also skewed data, as people’s movement patterns have dramatically changed. Data collected over the past few months is a little “off” compared to normal, which might result in big problems when life returns to normal and these algorithms start making predictions based on this data.
The problem with outlier data in AI algorithms
Let’s say you wanted to come up with your own system to propel a vessel into space, land on the moon, and make it back to Earth. You’ve masterfully designed a spacecraft that has the capability to do so because you’re a genius and a savvy engineer.
Now let’s assume NASA has opened up a publicly accessible data lake with information collected from every space voyage, all the way back to the original Apollo missions. With this data adjusted for the performance specifications of the craft you built, you could build an AI algorithm that would handle every detail of the voyage from escaping the Earth’s atmosphere, adjusting trajectory and speed while taking into account the shifting gravity and electromagnetic fields as it travels through space before making a comfortable landing on the surface of the moon.
Data from successful missions would be highly valuable but there are also missions that didn’t go so well, namely the missions that involved the Challenger and Columbia.
Data collected from the botched missions we just mentioned (and possibly that of non-Lunar missions) has the potential to mislead an algorithm because it exists as an outlier to what is considered “healthy data,” or data that aligns with the scenario to which it’s applied. Imagine, after all the mental and physical work you put in, your system decides to focus on the data from a mission where the spacecraft exploded. Instead of pushing through the atmosphere, your marvel of engineering veers off course and rockets laterally across the Earth, forcing an emergency landing.
Without some manual intervention or a specifically tailored (possibly secondary) AI algorithm that recognizes bad data, information pulled from datasets has the potential to be treated all the same. We’re seeing this now with the above-mentioned systems from Google and Waze. This bad data is affecting other systems as well, so let’s look at a couple of examples.
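As a rough illustration of what a secondary check for bad data could look like, here is a minimal Python sketch that flags readings sitting far from the typical value, using the median absolute deviation (MAD). The function name, the 3.5 cutoff, and the daily visit counts are our own assumptions for illustration. This is not how Google or Waze actually filter their data.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return one boolean per value, True where the value looks like an outlier.

    Uses the modified z-score, which measures distance from the median in
    units of the median absolute deviation (MAD). Unlike a mean and standard
    deviation, the median and MAD are barely moved by the outliers themselves.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical, so the score is undefined.
        # This sketch takes the conservative route and flags nothing.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical daily visit counts; the seventh day falls in a lockdown.
daily_visits = [980, 1010, 1005, 995, 1020, 990, 120, 1000]
flags = flag_outliers(daily_visits)
print(flags)
```

Flagged days can then be reviewed by a person or left out of the training data, rather than being treated the same as every other day.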
AI in retail. Most retail businesses operate around a model that looks at sales figures from the year prior to determine sales goals for a day, especially at brick-and-mortar locations. One of my marketing people describes the system used during his time at Lenscrafters locations, which essentially created the sales goal for a day by referencing the same day from the year prior and increasing the figure by a modest percentage.
This archaic system had obvious flaws, as it didn’t take into account the day of the week, among other pertinent factors such as staffing, stock availability for the 1-hour lab, and so on. Understandably, the amount of money made on any given day felt like a toss-up.
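To make the day-of-the-week flaw concrete, here is a small Python sketch of the year-over-year rule as described, next to a variant that looks back exactly 52 weeks so a Saturday is compared with a Saturday. The function names, the 3% uplift, and the sales figures are made up for illustration and are not taken from any real retail system.

```python
from datetime import date, timedelta


def naive_goal(sales_by_date, target, uplift=0.03):
    """Goal = same calendar date last year, raised by a fixed percentage.

    Note: a Feb 29 target has no match in the prior year and would need
    special handling that this sketch leaves out.
    """
    prior = date(target.year - 1, target.month, target.day)
    return sales_by_date[prior] * (1 + uplift)


def weekday_aligned_goal(sales_by_date, target, uplift=0.03):
    """Goal = same weekday 52 weeks earlier, raised by a fixed percentage."""
    prior = target - timedelta(weeks=52)
    return sales_by_date[prior] * (1 + uplift)


# Hypothetical history: a quiet Thursday and a busy Saturday in June 2019.
history = {
    date(2019, 6, 13): 5000.0,   # Thursday
    date(2019, 6, 15): 12000.0,  # Saturday
}
target = date(2020, 6, 13)  # a Saturday

print(round(naive_goal(history, target), 2))            # based on the Thursday
print(round(weekday_aligned_goal(history, target), 2))  # based on the Saturday
```

With these numbers, the calendar-date rule sets a Saturday goal from a Thursday’s sales, which is less than half of what the weekday-aligned rule produces.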
Today, retail is using more sophisticated AI algorithms to more accurately project sales figures, but the basis for calculating sales, especially at brick-and-mortar locations, is still quite similar. With the COVID-19 pandemic throwing a wrench into sales for virtually all retailers, their projections are going to be thrown off in the coming years.
If left unchecked, the data from the past couple of months of lockdowns will cause businesses to underproject sales for the following year. That could lead to easier goals to beat on a future day, but it will affect other areas like staffing, since most businesses tend to schedule around the dollar amount projected for the day. When the system demands less labor, there are fewer staff available to engage with customers, which will likely lead to fewer sales and missed opportunities.
Apparently, this was a frequent problem with the old Lenscrafters system, as lackluster sales the year prior would mandate fewer staff for the same day of the current year, causing opticians to get overwhelmed to the point where customers went unattended or the individuals in the lab would receive more orders than they could process in a timely fashion.
Hence, it will be important for analysts of retail systems to either modify their sales analytics systems to weed out outlier data or remove it manually. Otherwise, future projections will continue to be mismatched with actual sales.
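One simple way to weed out such data, sketched below in Python, is to mark the lockdown period as a known disrupted window and skip over it when picking a baseline, falling back to the same weekday from an earlier year. The function name, the window dates, and the sales figures are all hypothetical, and a real system would likely combine this with other signals.

```python
from datetime import date, timedelta


def baseline_sales(sales_by_date, target, disrupted, max_years_back=3):
    """Find the most recent same-weekday figure outside the disrupted window.

    Steps back 52 weeks at a time. Returns (baseline_date, sales), or None
    if no usable figure exists within `max_years_back` years.
    """
    start, end = disrupted
    for years in range(1, max_years_back + 1):
        candidate = target - timedelta(weeks=52 * years)
        if start <= candidate <= end:
            continue  # falls inside the lockdown window, so skip it
        if candidate in sales_by_date:
            return candidate, sales_by_date[candidate]
    return None


# Hypothetical figures: a lockdown Saturday in 2020, a normal one in 2019.
history = {
    date(2020, 4, 18): 900.0,
    date(2019, 4, 20): 11000.0,
}
lockdown = (date(2020, 3, 15), date(2020, 6, 30))  # assumed window
target = date(2021, 4, 17)

print(baseline_sales(history, target, lockdown))
```

Here the 2020 figure is ignored and the goal for April 2021 is built on the 2019 Saturday instead, so staffing is not scheduled around a day when the doors were barely open.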
AI for eCommerce and digital services. Because of the pandemic, our behavior has altered in other ways such as changing our buying behaviors and consuming more digital content than usual because, well, we need something to do during all this downtime. While this is good for some on-demand service providers, it’s also causing issues in other areas.
The shift in Amazon sales detailed in the last link is one of many major behavioral shifts we are currently witnessing. Companies that normally sold a modest amount of household items like toilet paper and cleaning supplies quickly overtook Amazon staple products like phone accessories and Legos that typically dominated sales charts.
At the end of the day, Amazon itself is affected marginally while these typically lower-cost items overtake more expensive items like toys and other entertainment goods. The real problem is how the backend systems and AI algorithms have responded to pandemic-driven purchases.
Inventory systems for major supply chains now think that buying these items in large quantities is the new normal. To the AI, humans are now in dire need of large amounts of toilet paper, cleaning supplies, and other “pandemic items” that suddenly became immensely popular as COVID-19 tightened its grip on the world.
It’s safe to assume that some warehouses right now are overwhelmed with trying to find places to put all the extra toilet paper that AI-driven inventory systems automatically ordered as if the entire world made Taco Bell breakfast a staple to its diet.
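To show how a short spike can inflate an automatic order, here is a toy Python reorder calculation that forecasts weekly demand either with a plain average or with a median. The function, the four weeks of cover, and the unit counts are invented for illustration. Real inventory systems are far more involved, and we are not claiming any retailer orders this way.

```python
import math
from statistics import mean, median


def reorder_quantity(weekly_units, on_hand, estimator=mean, weeks_of_cover=4):
    """Units to order so stock covers `weeks_of_cover` weeks of forecast demand.

    `estimator` turns recent weekly sales into a single weekly forecast.
    """
    forecast = estimator(weekly_units)
    needed = math.ceil(forecast * weeks_of_cover)
    return max(0, needed - on_hand)


# Hypothetical toilet paper sales: six normal weeks, then two panic-buying weeks.
weekly_units = [100, 105, 98, 102, 101, 99, 450, 520]

print(reorder_quantity(weekly_units, on_hand=200))                    # 588
print(reorder_quantity(weekly_units, on_hand=200, estimator=median))  # 206
```

The average is dragged up by the two spike weeks, so the order nearly triples. The median mostly ignores them, although it would also be slow to react if demand really had shifted for good, which is why this kind of choice still needs a person keeping an eye on it.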
Streaming services like Netflix, Disney+, Hulu, Amazon Prime, and others have all seen dramatic increases in subscribers. This is great for the market but the problem is that these sales are a direct result of the COVID-19 pandemic.
On the backend, these services require more resources to run content that’s delivered over the Internet to the consumer’s device. The biggest content streaming services, like Netflix, use specialized pricing models that take into consideration the expected usage of computing resources (Netflix uses AWS) against actual usage which is balanced out over time.
Increased usage will cause Netflix to pay more in the coming months. If left unchecked, the systems that calculate usage would cause Netflix to pay higher rates than usual, based on current trends, even though the amount of content consumption will dwindle as people return to work.
In the case of Netflix, this means AI algorithms should be adjusted to account for recent changes observed in purchasing and usage trends. However, if we see Netflix raise rates on its consumers in the near future, it’s almost certain that COVID-19 will be to blame.
Blue Label Labs works with AI to ensure the best results for our customers’ apps
Thankfully, AI isn’t at the point where, if left unchecked, robots will take up arms and begin to enslave the human race. What we are seeing are emerging problems with navigation systems, visitor predictions for brick-and-mortar locations, and predictive sales systems.
At Blue Label Labs, we understand the value of AI and ML but also know that the technology is still young and has the potential to learn from bad data. We realize data hygiene is important, meaning it’s crucial to manage outlier data and other aberrations to prevent otherwise good AI algorithms from producing bad results.
Get in touch with us to learn more about how the AI systems we use for apps can have remarkable effects for your business.