AI’s Achilles Heel Exposed – AI Validation Industry Must Emerge ASAP
https://dynastywealth.com/ais-achilles-heel-exposed-ai-validation-industry-must-emerge-asap/
By Michael Markowski
Published January 9, 2024
2023 was, and perhaps the entire decade ending 2030 could be, defined by the emergence of AI (artificial intelligence). AI-related stocks powered the S&P 500 to a gain of 24.2% for 2023. Microsoft’s share price increased by 57%, and the chart below depicts the 238% gain for leading AI chip maker Nvidia in 2023.
Wedbush analyst Dan Ives stated on December 28, “We view this as Microsoft’s ‘iPhone moment’ with AI set to change the cloud growth trajectory in Redmond over the next few years.” He added, “We continue to believe this is a ‘1995 Moment’ with a transformational tech spending wave not seen since the start of the Internet in the mid 90’s with AI now hitting the shores of the tech sector.”
AI is widely considered to be the panacea that will change the ways that humans work and live for many lifetimes.
The sobering reality is that the output of unvalidated AI is fraught with hallucination risk. Sam Altman, the CEO of OpenAI, the leading AI company, could not have said it better in an ABC News interview:
“The thing that I try to caution people the most is what we call the ‘hallucinations problem,’
the model will confidently state things as if they were facts that are entirely made up.”
For AI to change the world, and commerce in particular, a radical change is required. An auxiliary industry, AI Validation, has to emerge quickly to support, validate and verify AI output. Otherwise, AI will become inapplicable and its growth will be stunted.
Uniquely positioned to fill the need and to be a leader of the new industry is EmotionTrac. The company has patented, automated technology that tests created content. My December 15, 2023, report “Two Paths For EmotionTrac Shares to Increase by 25X” covered EmotionTrac’s progress in selling its SaaS (Software as a Service) solutions into two of the nine verticals in which its technology is applicable, Advertisement Testing and Legal Case Management. The nine have an aggregate addressable market of $1 trillion.
The 10th vertical emerging for EmotionTrac, AI Validation, is projected to reach $158 billion by 2032. The link below is to a video clip from my 12/23/23 weekly “Markowski on the Market” session in which the hallucination problem was initially covered.
Hallucinations, The Elephant in the Room
The content that Generative AI produces, including output from ChatGPT, DALL-E and other generative AI apps, has the potential to be “hallucinative”. According to the NY Times, up to 27% of all AI-generated content is hallucinative. Without validation, AI-produced content can substantially damage a brand, product, business and even a person’s reputation. For this reason, these hallucinations are the Achilles Heel for generative AI.
Please note: Microsoft, a marketing partner of EmotionTrac, made the company aware of the AI hallucinations problem during a VIP meeting. Microsoft has made it clear to EmotionTrac multiple times that “The human element cannot be taken out of AI.” In the last several months there have been collaborations with Microsoft strategic team members on providing EmotionTrac as a solution for ChatGPT hallucinations.
Below are actual AI hallucinations that have occurred:
1. AI hallucination occurrences published on IBM.com. See “What are AI hallucinations?”…Some notable examples of AI hallucinations include:
- Google’s Bard chatbot incorrectly claiming that the James Webb Space Telescope had captured the world’s first images of a planet outside our solar system.
- Microsoft’s chat AI, Sydney, admitting to falling in love with users and spying on Bing employees.
- Meta pulling its Galactica LLM demo in 2022, after it provided users inaccurate information, sometimes rooted in prejudice.
2. “New ad from Transcend warns of potential brand damage caused by hallucinating AI” drum.com 11/06/23
“(The spot was based on an actual incident from earlier this year in which a New Zealand supermarket chain’s branded AI chatbot, which was supposed to generate creative meal ideas, instead provided a user with a recipe for chlorine gas.)”
3. “Users complained that such chatbots often seemed to pointlessly embed plausible-sounding random falsehoods within their generated content. By 2023, analysts considered frequent hallucination to be a major problem in LLM technology, with some estimating chatbots hallucinate as much as 27% of the time.” Wikipedia.
4. “Chatbots May ‘Hallucinate’ More Often Than Many Realize”, New York Times, 11/16/23:
“When Google introduced a similar chatbot several weeks later, it spewed nonsense about the James Webb telescope. The next day, Microsoft’s new Bing chatbot offered up all sorts of bogus information about the Gap, Mexican nightlife and the singer Billie Eilish. Then, in March, ChatGPT cited a half dozen fake court cases while writing a 10-page legal brief that a lawyer submitted to a federal judge in Manhattan. Now a new start-up called Vectara, founded by former Google employees, is trying to figure out how often chatbots veer from the truth. The company’s research estimates that even in situations designed to prevent it from happening, chatbots invent information at least 3 percent of the time — and as high as 27 percent.”
5. Dec 29 (Reuters) – Michael Cohen, Donald Trump’s former fixer and lawyer, said in court papers unsealed on Friday that he mistakenly gave his attorney fake case citations generated by an artificial intelligence program that made their way into an official court filing.
Cohen, who is expected to be a star witness against Trump at one of the former president’s criminal trials, said in a sworn declaration in federal court in Manhattan that he did not realize the citations generated by Google Bard were fictitious.
6. Popular AI Chatbots Found to Give Error-Ridden Legal Answers – Bloomberglaw.com, 01/15/2024
“Large language models hallucinate at least 75% of the time when answering questions about a court’s core ruling, the researchers found. They tested more than 200,000 legal questions on OpenAI’s ChatGPT 3.5, Google’s PaLM 2, and Meta’s Llama 2—all general-purpose models not built for specific legal use.”
According to a new report by Bloomberg Intelligence, the Generative AI portion of AI “…is poised to explode, growing to $1.3 trillion over the next 10 years from a market size of just $40 billion in 2022.” The report also states that Generative AI will expand at a CAGR of 42%. The chart below from the Bloomberg report depicts that the spend for Generative AI will increase from less than 1% of the total Technology spend in the world in 2020 to 12% by 2032.
The table below contains a breakdown of two AI segments which comprise the Total AI Market in 2032.
Sources: Bloomberg Intelligence | Market.US
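The Bloomberg growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The short Python sketch below is an illustration only: the function name `cagr` is hypothetical, and it assumes the 10-year period runs from the $40 billion market size for 2022 to the $1.3 trillion projection for 2032.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted from the Bloomberg Intelligence report, in billions of dollars.
GEN_AI_2022 = 40.0
GEN_AI_2032 = 1300.0

# Assumption: the growth period is exactly 10 years.
rate = cagr(GEN_AI_2022, GEN_AI_2032, years=10)
print(f"Implied Generative AI CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out at roughly 42% per year, in line with the CAGR stated in the report.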
The AI Validation Industry
New industries which are slated to change the world result in ancillary and supportive industries emerging. The Gold Rush created the durable apparel industry and Levi Strauss. The automobile created the tire industry and Goodyear. The smartphone created the social media industry and Facebook. AI-generated hallucinations have created the need for an industry to validate and verify AI to quickly emerge.
AI Validation has the potential to quickly become one of the world’s largest ancillary industries. It’s because all AI generated content will need to be checked or validated by humans before the content can be confidently utilized for commerce. A significant portion of this content, such as advertising and media, that will be intended for broad audience exposure will need comprehensive testing before it can be released. Additionally, because AI-generated content has the potential to be offensive, content will need to be validated to cover all demographics.
The 2023 Bud Light and the 2019 Peloton TV commercials are good examples of content being utilized for commerce before it was reviewed by a diverse focus group. The inadequately reviewed advertisements that were aired by both companies damaged their products and negatively impacted share prices and market valuations. Had both ads been tested by a $600 testing solution that is offered by EmotionTrac, neither would have aired. See “Peloton’s controversial ad goes viral and sparks backlash” and “Bud Light’s New Ad Met With Backlash on Social Media”.
The need for all AI to be validated and verified was identified in 2021 by SEBoK, an acronym for “Systems Engineering Body of Knowledge”. SEBoK is a community-based forum that regularly updates the baselines for systems engineering (SE) knowledge.
A co-authored white paper “Verification and Validation of Systems in Which AI is a Key Element” that was initially published on SEBoK in 2021 provided the rationale for all AI output to be verified and validated. The first two sentences of the first paragraph of the white paper:
“Many systems are being considered in which artificial intelligence (AI) will be a key element. Failure of an AI element can lead to system failure (Dreossi et al 2017), hence the need for AI verification and validation (V&V).”
The white paper also included the “Characteristics of AI Leading to V&V Challenges”. See excerpts below:
- Lack of an oracle: It is difficult or impossible to clearly define the correctness criteria for system outputs or the right outputs for each individual input.
- Imperfection: It is intrinsically impossible for an AI system to be 100% accurate.
- Uncertain behavior for untested data: There is high uncertainty about how the system will behave in response to untested input data, as evidenced by radical changes in behavior given slight changes in input (e.g., adversarial examples).
- High dependency of behavior on training data: System behavior is highly dependent on the training data.
Actual & Projected Global AI Validation markets:
The projections for the AI Validation and Verification (V&V) industry in the table below were calculated under the assumption that a user of AI to produce content would be willing to expend a percentage of the total spent on the AI application to have the AI and its output validated and/or verified. The assumption is conservative. The cost to validate or verify the output, especially of Generative AI, would likely be much higher than the cost of a subscription to a generative AI solution or app such as ChatGPT or DALL-E.
The table below contains the 2032 projections for AI and a breakdown of the total market into AI’s two primary segments, Generative AI and Traditional AI. The two combined comprise the projected $2.7 Trillion market for AI in 2032. The total market for AI Validation is projected to reach $158 billion by 2032. AI Validation projections are based on the following assumptions:
- Generative AI validation equivalent to 10% of $1.3 trillion Generative AI market for 2032.
- Traditional AI validation equivalent to 2% of $1.4 trillion Traditional AI market for 2032.
Sources: Bloomberg Intelligence | Market.US
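The arithmetic behind the $158 billion projection follows directly from the two assumptions listed above. The Python sketch below is only an illustration of that calculation; the variable and function names are hypothetical, and the inputs are the market sizes and percentages stated in this report.

```python
# Projected 2032 market sizes quoted above, in billions of dollars.
MARKETS_2032 = {"Generative AI": 1300.0, "Traditional AI": 1400.0}

# Assumed share of each segment's market that is spent on validation.
VALIDATION_SHARE = {"Generative AI": 0.10, "Traditional AI": 0.02}


def validation_market(markets: dict, shares: dict) -> dict:
    """Validation spend per segment = segment market size x assumed share."""
    return {segment: markets[segment] * shares[segment] for segment in markets}


projection = validation_market(MARKETS_2032, VALIDATION_SHARE)
total = sum(projection.values())

for segment, value in projection.items():
    print(f"{segment} validation: ${value:.0f} billion")
print(f"Total AI Validation: ${total:.0f} billion")
```

The Generative AI segment works out to $130 billion and the Traditional AI segment to $28 billion, which together give the $158 billion total.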
Generative AI Validation will grow faster and account for the largest proportion of the AI Validation market by 2032. The difference in proportions is because Generative AI and Traditional AI are completely different:
- Traditional AI is disciplined and performs specific tasks based on predefined rules and patterns. For this reason, validations and verifications of the AI are less of a priority and are less frequent.
- Generative AI has no limits and seeks to create data and content that mimics human-created content. For this reason, and most especially the high probability for hallucinative content to be created, all AI generated output to be utilized for commerce must be validated and/or verified.
A good analogy is that generative AI is consumer and traditional AI is industrial. For another comprehensive explanation of the difference between traditional and generative AI, the Forbes article below is highly recommended.
“The Difference Between Generative AI And Traditional AI: An Easy Explanation For Anyone”, July 24, 2023
The Traditional AI validation segment of the AI Validation industry is more similar to industrial software. Software for industry is purchased outright and can also be available via a license. The testing event only occurs after the software is installed. Since the frequency of testing the software is low, the recurring revenue stream that is received from testing is minimal.
Generative AI Validation is exactly the opposite. Testing is conducted every time content for commerce is generated from the generative AI application. Each and every output from the generative AI application that is used for commerce will have to be validated. Thus, the frequency of tests for generative AI output will be much higher than traditional AI.
The global software testing market is a conservative example for comparing revenue projections for the AI validation industry. For 2022 the global market for software was $589.6 billion. Global software testing for 2022 was $45 billion. Software testing was equivalent to 7.6% of the total software market for 2022.
Applying 7.6% to the $2.7 trillion projected for AI by 2032 equates to the AI Validation industry being at approximately $205 billion, also by 2032. Thus, an argument can be made that the $158 billion projection in the above table is conservative.
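The software testing comparison can be reproduced in a few lines. The Python sketch below is an illustration under stated assumptions: the names are hypothetical, both software figures are taken to refer to 2022, and the testing share is rounded to one decimal place of a percent before it is applied to the AI market, as the text does.

```python
# Figures quoted above, in billions of dollars.
SOFTWARE_MARKET_2022 = 589.6   # global software market (assumed to be 2022)
SOFTWARE_TESTING_2022 = 45.0   # global software testing market, 2022
AI_MARKET_2032 = 2700.0        # projected total AI market, 2032

# Share of the software market that went to testing.
testing_share = SOFTWARE_TESTING_2022 / SOFTWARE_MARKET_2022

# Round to one decimal place of a percent (e.g. 0.076), then apply to the AI market.
rounded_share = round(testing_share, 3)
implied_validation = rounded_share * AI_MARKET_2032

print(f"Software testing share of software market: {testing_share:.1%}")
print(f"Implied 2032 AI Validation market: ${implied_validation:.0f} billion")
```

The printed result can be compared with the benchmark figure quoted above and with the $158 billion projection; under either comparison the benchmark exceeds $158 billion, which is the basis for calling that projection conservative.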
Additionally, those utilizing generative AI applications to create marketing and promotional content will be willing to pay for their AI generated content to be tested to maximize the ROI on their expenditures to broadcast or distribute the ad or content that is created. The testing of the advertisements or promotional content will be a prerequisite prior to a significant amount being expended to broadcast or air content.
EmotionTrac™, which I have covered since 2016, is well positioned to be among the early leaders of the Generative AI segment of the AI Validation Industry, projected to reach $130 billion by 2032. The company’s patented content testing technology and solutions have proven to be effective for testing all types of content and media including:
- Video
- Billboards
- Animatics
- Story boards
- Visual media
- Radio
- Copy
EmotionTrac’s tests are automatically created and deployed to a minimum of 100 consenting panelists throughout 130 countries. Test results are generally available within an hour. The solutions can test approximately 200 demographic targets.
The probability is also high for EmotionTrac to become the standard for validating generative AI output:
- 80% minimum gross profit margin for all products and solutions since inception – EmotionTrac has proprietary technology and systems to support the low-price solutions or applications which will be needed to validate the output from generative AI that is slated for commerce. EmotionTrac’s lowest priced offering to date, its advertisement testing solution, is $600.
- Offerings have proven to add value – Repeat customers, including the largest U.S. personal injury law firm Morgan & Morgan since 2021 confirm that EmotionTrac’s automated content testing solutions provide sufficient value.
The probability is very high for EmotionTrac to become recognized as an early leader of the AI Validation Industry. Should this happen, EmotionTrac’s valuation could easily jump to $1 billion and become recognized as a unicorn almost instantly. There are now three ways for EmotionTrac to reach a billion-dollar valuation. To reiterate, my 12/15/23 report, “Two Paths For EmotionTrac Shares to Increase by 25X” was about the two verticals that EmotionTrac has been selling its solutions into and that either could enable EmotionTrac to reach a valuation of a billion.
The AI Validation vertical that was just discovered also provides the potential for EmotionTrac to reach a $10 billion valuation and for its share price to multiply by 250X pretty quickly. OpenAI, the developer and owner of ChatGPT, raised $1 billion from Microsoft in 2019 at a $14 billion valuation. OpenAI’s valuations since 2019 were not based on its financials. It produced revenue of $44,885 for 2022. Instead, the valuations were and continue to be based on its being recognized as the leader of AI. In December 2023, Bloomberg reported that OpenAI was in discussions to raise capital from investors at a $100 billion valuation.
EmotionTrac was discovered after I conducted research on the companies in the table below to find possible common denominators. My identification of the common traits of the four enabled the identification of EmotionTrac. The conceptual stage idea, which possessed similar traits as the four startups, was a finalist for Florida Atlantic University’s best business plan award. I initially recommended the shares (former name, Jinglz) on 11/30/2017. See “Jinglz: Killer App Investing Opportunity” report.
The value in 2017 for a $25,000 seed round investment into the startups in the above table which was made as early as 2008 and as late as 2012 ranged from $40,000,000 to $500,000,000. What was fascinating was the maximum time for the three in which $25,000 increased to over $100 million was nine years. My continuing research of the digital startups resulted in my discovery of the ongoing transformation of the global economy to digital from industrial. To understand why the transformation has and continues to create Dynasty Wealth building opportunities read my report “Third Transformation for Economy since 18th Century Creating Opportunities to Build Almost Instant Dynasty Wealth”. ( report link: https://dynastywealth.com/third-transformation-for-economy-since-18th-century-creating-opportunities-to-build-almost-instant-dynasty-wealth/ )
To understand how it’s possible for a digital company to almost instantly reach a $10 billion valuation, view the 3:47 video “Digital disruptor companies have the potential to get $10 billion valuations quickly” below:
Dynasty Wealth and its investor members have made investments in EmotionTrac at steadily increasing valuations and share prices since 2016. For information on how to invest in EmotionTrac click here.
Dynasty Wealth LLC (DW) has been paid to assist and advise EmotionTrac since 2016. Under the agreement DW has received cash, shares and stock options as compensation for the services it has rendered and continues to provide to EmotionTrac. For terms and conditions see Financial Relations Agreement.
Michael Markowski, a 47-year financial markets veteran, is the Director of Research for Dynasty Wealth. He conducts empirical research of the past, which he then utilizes to develop algorithms to predict the future. His research of Enron’s Financial Statements after its infamous bankruptcy led to the development of a Cash Flow Statement algorithm. The algorithm was utilized to predict a “day of reckoning” for Lehman, Bear Stearns, Merrill Lynch, Morgan Stanley and Goldman Sachs in a September 2007 Equities Magazine article. Michael’s research of prior market crashes led to the development of the Bull & Bear Tracker (BBT) algorithm. From 2018 to 2022, the BBT gained 177% vs. the S&P 500’s 50%. His predictions of all periods of heightened market volatility from 2008 to 2022, and his exact call that the S&P 500 had reached its bottom on March 23, 2020, are media verifiable.