Not too long ago, I was writing a piece on the remaining useful life (RUL) of assets in modern industries. And, like any writer in 2023, I found myself turning to ChatGPT for answers. I asked the chatbot, quite politely, “Please find real-life examples of the positives of using RUL for companies.” A press of a button, and ChatGPT presented a list of major companies, data on their success stories, and even the names of their case studies, highlighted as the “source” for each example. The answers were perfect—maybe a little too perfect, even.
I decided to dig deeper and googled the names of these case studies. That’s when I ran into a tiny problem: none of these case studies actually existed! Yes, the companies were real. And yes, they most probably did leverage RUL estimates. But the case studies that the chatbot so conveniently mentioned as the “source” for each example were all fabricated.
A myriad of questions emerged. How many times have I been given a fabricated answer by these AI chatbots? How many times have I just accepted it as true? Are any of the answers true at all? And, more importantly, how can I make sure never to fall for these lies again?
As I delved deeper into this issue, it quickly became evident that these fabrications and inaccuracies were already well documented. One particularly prominent case was Roberto Mata’s lawsuit against Avianca Airlines. It started off like any other personal injury lawsuit filed in New York federal court; the case itself wasn’t very interesting. However, when Avianca moved to dismiss the case and Mata’s attorney, Peter LoDuca, opposed the motion by citing previous cases and legal decisions in favor of his client, things took a fascinating turn.
The attorneys representing Avianca Airlines claimed they couldn’t find a number of the legal cases that LoDuca had cited. This ultimately led to a surprising revelation: nobody could find any information on these cases because they had been fabricated wholesale by OpenAI’s popular AI marvel, ChatGPT. The case grabbed headlines, showcasing not only a comical turn of events but also the fatal inadequacies of our modern artificial intelligence.
The tech industry has collectively given these inaccuracies and falsifications the name “hallucinations.” Seeing sophisticated LLMs such as OpenAI’s ChatGPT and Google’s Bard spit out entirely fictitious facts and figures paints a rather unfortunate picture for modern AI tools. But before making a definitive statement on the capabilities of AI, it’s important to understand the hows and the whys.
How do we actually define a hallucination? Merriam-Webster defines it as “a sensory perception (such as a visual image or a sound) that occurs in the absence of an actual external stimulus.” But this isn’t really how an AI hallucinates. For starters, modern chatbots don’t have the ability to take in sensory information. Hence, a better word here would be “confabulation.”
A confabulation can be described as “the production or creation of false or erroneous memories without the intent to deceive.” This is a more accurate description of what modern LLMs do. Since generative AI tools such as ChatGPT and Bard fundamentally rely on complex algorithms that merely analyze human language and string words together to match your query, they overlook a critical aspect of any answer: logic. With logic and reasoning taking a back seat, generative AI produces answers that sound extremely realistic. But here’s the part to keep in mind: it never actually understands what it’s saying. So, unlike Johnny, AI isn’t really telling you lies; it simply can’t comprehend its own answers. Moreover, in the case of OpenAI’s ChatGPT, the training data only goes up to September 2021. Since the data it has is limited, and it can’t browse the web or carry out new research, ChatGPT fills in the gaps to give users seemingly ideal answers.
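To make that “stringing words together” idea concrete, here is a deliberately tiny Python sketch. It is a toy under my own assumptions, not any real model’s internals: the generation loop only ever asks “what word is statistically likely next?” and never “is this statement true?”

```python
import random

# Hypothetical toy "model": it maps a short context to plausible next
# words with probabilities learned purely from word co-occurrence in
# training text. Nothing here encodes whether a claim is true.
TOY_MODEL = {
    ("the", "case", "study"): [("titled", 0.6), ("published", 0.4)],
    # ... a real LLM encodes billions of such patterns, not a dict ...
}

def sample_next_word(context, model):
    """Pick the next word by likelihood alone; there is no
    'is this true?' step anywhere in the process."""
    candidates = model.get(tuple(context[-3:]), [("[unknown]", 1.0)])
    words, probs = zip(*candidates)
    return random.choices(words, weights=probs, k=1)[0]

def generate(prompt_words, model, length=4):
    """Generate text by repeatedly sampling the next likely word."""
    words = list(prompt_words)
    for _ in range(length):
        words.append(sample_next_word(words, model))
    return " ".join(words)

# The output always *sounds* plausible, because plausibility is the
# only criterion the loop ever applies.
print(generate(["the", "case", "study"], TOY_MODEL))
```

Scale that loop up to billions of learned patterns and you get fluent, confident prose whose only guarantee is plausibility, which is exactly the recipe for a confabulation.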
These shortcomings of generative AI pose a massive problem. Currently, ChatGPT is used by millions of people from a plethora of disciplines across the globe. People use it for a variety of functions and often even look at it as an online tutor. But with the reliability of AI coming into question and no method of ensuring the accuracy of its data, it’s difficult to justify its usefulness.
So, how should we go about using these modern AI marvels? Should we stop using them altogether? I, for one, don’t think so.
After gaining a clearer understanding of AI hallucinations, we can see that to justify the use of these chatbots, we need to be able to prevent the generation of false information. But how? Eliminating AI hallucinations can be a daunting task; however, there are a few ways we can minimize their occurrence.
AI has developed rapidly in the last few years, becoming the hottest topic in technology. Given its recent struggles with hallucinations, it is clear that fighting this issue is the next big step.
OpenAI has already started the battle by adopting a new reward-based approach called “process supervision.” According to researchers, the goal is to encourage models to adopt a more human-like chain of thought. In a recent discussion with CNBC, Karl Cobbe, a mathgen researcher at OpenAI, said, “Detecting and mitigating a model’s logical mistakes, or hallucinations, is a critical step towards building aligned AGI [or artificial general intelligence].” He went on to add, “The motivation behind this research is to address hallucinations in order to make models more capable of solving challenging reasoning problems.”
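To see why step-level feedback matters, here is a simplified Python sketch of the idea; the function names, checker, and numbers are my own illustrations, not OpenAI’s actual implementation. Outcome supervision scores only the final answer, while process supervision scores each reasoning step:

```python
def check_step(step: str) -> bool:
    """Toy validity check for an arithmetic step like '2 + 2 = 4'."""
    lhs, rhs = step.split("=")
    return eval(lhs) == eval(rhs)  # acceptable only in this toy setting

def outcome_reward(final_answer: str, correct_answer: str) -> float:
    """Outcome supervision: reward only the final answer. A model can
    score 1.0 even if every intermediate step was nonsense, and a 0.0
    says nothing about where the reasoning broke down."""
    return 1.0 if final_answer.strip() == correct_answer.strip() else 0.0

def process_reward(steps: list[str]) -> float:
    """Process supervision: reward each reasoning step individually,
    encouraging a coherent, human-checkable chain of thought."""
    return sum(check_step(s) for s in steps) / len(steps)

# Hypothetical chain of thought for "What is 3 * (2 + 2)?"
steps = ["2 + 2 = 5", "3 * 5 = 15"]  # the first step is wrong

print(outcome_reward("15", "12"))  # 0.0 -- the answer is wrong, but why?
print(process_reward(steps))       # 0.5 -- pinpoints the invalid step
```

The design intuition is that a step-level score tells the model where its chain of thought went wrong, rather than merely that the destination was wrong.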
Despite these adversities and the ongoing skepticism, I’m keeping an optimistic view. Without a shadow of a doubt, modern AI remains a testament to the massive leaps we have made in technological development. Like with any other technology, challenges and barriers exist, but they won’t be enough to keep us bound forever. We are witnessing a vibrant new era of innovation, and the wave of AI will continue to sweep across our industries.
____________
Written By: Emerald Tuladhar