Plus: How To Use Testing To Build Trust In AI
As technology advances and companies learn from AI, industrial leaders say it’s time to put their findings to work, bringing sophisticated AI systems online to leverage data and increase efficiency, productivity and output. Honeywell CTO Suresh Venkatarayalu shared his company’s progress and plans to do just that in a keynote address this week at Honeywell’s Future of Energy Summit in Washington, D.C. “I want to paint a picture on the breakthrough opportunities that we may have by leveraging data and AI in the industrial sector, and the way we potentially see that transforming,” Venkatarayalu said.

AI, he said, is likely to have a huge impact on the energy sector: reduced unplanned downtime, relief for a forecast shortage of qualified workers, and increased efficiency. Many energy plants have been moving toward automation for decades, and those moves have unlocked more access to data. Honeywell, which says its HVAC and other building infrastructure products are present in 25% of buildings worldwide, is using that data to train its AI-based knowledge systems on plant conditions, operations and outcomes. This allows the systems to be more predictive, letting employees know what may happen, where operations should shift, and how to make adjustments before a problem occurs.

Venkatarayalu said Honeywell has run a number of pilot projects over the last two years to figure out the best ways to deploy AI. The results have been promising: a 1% to 4% increase in production, a 10% decrease in unplanned downtime and a 4% boost in asset uptime.

With projects like these at Honeywell, as well as at industrial companies in other sectors, AI is on the cusp of transforming industry. Venkatarayalu predicted that 2025 and 2026 will be the years these big initiatives start making a difference, and the time may be near when the general public starts to understand that AI is capable of much more than chatbots.

Because generative AI systems tend to operate in black boxes, it makes sense to test them before they are deployed to make sure their responses are appropriate and unbiased and meet regulatory standards. I talked to Philipp Adamidis, CEO and cofounder of AI testing company QuantPi, about when and how to put your systems to the test. An excerpt from our conversation is later in this newsletter.
If you like what you read here, you can easily share it online and on your social media pages. This newsletter, and all previous editions of Forbes CIO, can be found on our website here.

In today’s CIO newsletter:
The economy as a whole may be precarious, but the tech sector is doing well. In the last week, Microsoft, Meta and Google have reported expectation-smashing earnings.

On Wednesday, Microsoft posted its best-ever quarterly revenues and profits: $70.1 billion in revenue and $25.8 billion in net income. Some of its biggest gains came from cloud offerings, with Azure and other services seeing revenue growth of 35% in constant currency. On the earnings call, CFO Amy Hood said much of that came from the non-AI parts of the business and represents more growth among large enterprise customers. The slowing economy and new tariffs weren’t discussed much on the call, but CEO Satya Nadella said software is a resource for fighting economic pressures to do more with less. Microsoft’s stock rose more than 8% on Thursday morning.

Meta saw $42.3 billion in revenue, a 16% year-over-year increase. The lion’s share of its revenue came from advertising, while its Reality Labs division posted a $4.2 billion loss. On the earnings call, CEO Mark Zuckerberg highlighted how AI is transforming Meta’s business, such as improvements to advertising and customer outreach tools. Meta raised its planned capital expenditures for the fiscal year to a range of $64 billion to $72 billion, up from $60 billion to $65 billion, which CFO Susan Li said is meant to meet the long-term demands of Meta’s AI services and give the company more flexibility. Meta’s stock traded more than 5% higher on Thursday morning.

Google parent Alphabet brought in $90.2 billion in revenue, a 12% year-over-year increase. On the earnings call, CEO Sundar Pichai declined to speculate about tariff impact, but said the lack of exemptions will “obviously cause a slight headwind to our ads business in 2025.” Big tech earnings continue today, with Amazon and Apple both reporting after markets close.
Updates are vital to keeping your enterprise secure from cyberattacks, but it can be hard to make sure everyone actually installs them. Most security patches require users to restart their programs or systems, actions that busy employees may put off until later, unknowingly leaving their workstations open to security risks. On July 1, Microsoft is launching a new “hotpatch” security update service, which will perform the necessary updates without forcing users to stop what they’re doing and restart, writes Forbes senior contributor Davey Winder. But the service won’t be free: It will cost enterprises $1.50 per CPU core per month, which may be enough to make companies weigh whether the change is worth it. The traditional restart-required security updates will remain available at no cost.
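For teams sizing up that decision, the per-core pricing makes the budgeting math straightforward. Here is a minimal back-of-the-envelope sketch; the fleet sizes and core counts below are entirely hypothetical examples:

```python
# Back-of-the-envelope cost of a hotpatch subscription priced at
# $1.50 per CPU core per month. Fleet numbers are hypothetical.
HOTPATCH_RATE = 1.50  # USD per core per month

servers = [
    {"name": "web-tier", "count": 40, "cores_each": 16},
    {"name": "db-tier", "count": 10, "cores_each": 64},
]

# Total licensed cores across the fleet.
total_cores = sum(s["count"] * s["cores_each"] for s in servers)
monthly = total_cores * HOTPATCH_RATE
print(f"{total_cores} cores -> ${monthly:,.2f}/month, ${monthly * 12:,.2f}/year")
# Output: 1280 cores -> $1,920.00/month, $23,040.00/year
```

That annual figure can then be weighed against the labor and risk costs of restart-required patch windows.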
Presented by Philip Morris International

We’re delivering a smoke-free future, today. At PMI, our mission is clear: to reduce smoking by replacing cigarettes with better smoke-free alternatives for legal-age adults who smoke. We’ve transformed to deliver a smoke-free world and are making significant progress.
Last week, Nvidia made its NeMo microservices generally available, writes Forbes senior contributor Janakiram MSV. The toolkit, which is meant to help build enterprise AI agents, includes five services: NeMo Customizer, which handles LLM fine-tuning; NeMo Evaluator, which assesses AI models against customized benchmarks; NeMo Guardrails, which implements safety controls for compliance and appropriateness; NeMo Retriever, which enables information access across systems; and NeMo Curator, which processes and organizes data for model training. Several large companies have already used Nvidia’s NeMo tools, which work with several data storage and enterprise partners.
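The microservices themselves are invoked over Nvidia’s deployment infrastructure, but the guardrails idea is easy to see in code. Below is a minimal sketch using the open-source NeMo Guardrails Python library, which is related to but distinct from the Guardrails microservice; the ./config path and the example prompt are placeholders, and a rails configuration is assumed to already exist:

```python
# A minimal sketch of the guardrails concept using the open-source
# NeMo Guardrails library (pip install nemoguardrails). This is the
# Python library, not the NeMo Guardrails microservice itself.
# The ./config path and the prompt are hypothetical placeholders.
from nemoguardrails import LLMRails, RailsConfig

# Load rail definitions (models, flows, safety checks) from disk.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# The rails wrap the underlying model call, applying the configured
# input and output checks around the exchange.
response = rails.generate(messages=[
    {"role": "user", "content": "Summarize our incident reports."}
])
print(response["content"])
```

The key design point is that the safety layer sits outside the model: tightening or loosening the rails means editing configuration, not retraining.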
QuantPi cofounder and CEO Philipp Adamidis. QuantPi

How Testing Can Make Sure AI Meets (And Surpasses) Expectations
Generative AI models often operate in a “black box,” making the specifics of their decision-making processes difficult for the everyday person to understand. Philipp Adamidis, who started QuantPi with Antoine Gautier and Artur Suleymanov in 2020, says his company’s initial plan was to build AI models, but they couldn’t explain to people how the models worked. They pivoted the company’s mission to testing AI models to figure out how those black boxes handle data, as well as testing those systems for bias, robustness and accuracy. I talked to Adamidis, QuantPi’s CEO, about the importance of testing and where the common pitfalls are. This conversation has been edited for length, clarity and continuity.

In general, is there anything you’ve seen that inherently makes a system more or less trustworthy?

Adamidis: It’s a lot about honesty and transparency. Many users don’t have a problem if a system doesn’t work perfectly everywhere, but you want to know about it. Reporting all the weaknesses of your model that you know about is the first step that builds up trust for everyone. Regulators will not shut your models down immediately when there is a small weakness, but if someone intentionally hides weaknesses, that’s another topic. Being open about the strengths and the weaknesses of models or systems is very important.

A lot of biases come from the data, which very often reflects the biases in our societies. It’s not always the model that has the bias. Sometimes the models make the bias worse, but some models reduce it. Running tests and analysis on the training data is important.

What are you asked to look for, and what do you find?

There are a lot of interesting findings, because you first start with what you already know, but there are a lot of unknowns. Maybe a customer comes and says, we saw a certain bias with respect to an age group, so we will start there. But with our technology, we have out of the box a ton of other tests, and we can cross-test certain attributes. We can look at a certain age group with a certain skin tone or ethnicity, and that allows you to look at a lot of different, even smaller subgroups and do analysis on the behavior of the model there. Customers get very, very deep insight.

It’s very typical that we produce a million test results. We automatically summarize them and aggregate them into strengths-and-weaknesses reports where we highlight certain things. The software also automates a lot of the analysis of the test results, but it still gives very deep insight for the data scientists to search manually if they want. Very often they start with concerns around ethical compliance and regulatory bias, and then they learn testing is actually extremely helpful for improving the model’s performance in other dimensions as well. This is why I always say the intrinsic motivation should be to test your models for a higher quality of your product: to satisfy the customers, and to make sure that you have a good relationship with your partners and your ecosystems by building high-quality products.

When a company starts putting together a system, are there any things it should concentrate on to reduce the issues that might come out in a test like yours?

I motivate everyone to start with the testing very early on in the development process. Very often, companies come just before a deployment deadline and say, two weeks before the system goes live, we have identified so many problems.

Start testing the models early on, when you have your first so-called candidate model, when you have the first data and try different data sets. This is how [you get] monitoring throughout the whole life cycle of such an AI model or AI system. For example, you are adding guardrails. Adding guardrails can make the model better in one dimension, but can reduce the quality somewhere else. So you want to test whenever you make changes. When the data changes, the model changes, the context or scope of application changes, when you think about new test scenarios that you have not thought of before, all of that triggers reassessment or retesting. Start early on, and the whole development process and the time to safe deployment will decrease dramatically, because it’s much easier and costs much less to correct mistakes at the beginning than at the very end.
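The cross-testing of attributes Adamidis describes is straightforward to prototype. Here is a minimal sketch, not QuantPi’s method; the age_group and skin_tone column names, the label column, and the predict function are hypothetical stand-ins for a real evaluation harness:

```python
# Minimal sketch of subgroup cross-testing for bias: measure model
# accuracy on every (age_group, skin_tone) combination and surface
# the weakest cells first. Column names and predict() are
# hypothetical placeholders for a real evaluation setup.
import pandas as pd

def subgroup_report(df: pd.DataFrame, predict) -> pd.DataFrame:
    df = df.copy()
    # Compare model output against ground-truth labels per row.
    df["correct"] = predict(df) == df["label"]
    report = (
        df.groupby(["age_group", "skin_tone"])["correct"]
        .agg(accuracy="mean", n="size")
        .reset_index()
    )
    # Weakest subgroups sort to the top of the report.
    return report.sort_values("accuracy")

# Usage sketch: subgroups whose accuracy lags the overall mean by a
# chosen margin become entries in the "weaknesses" section.
# report = subgroup_report(eval_df, model.predict)
# overall = report["accuracy"].mean()
# flagged = report[report["accuracy"] < overall - 0.05]
```

Running a report like this on every model, data or guardrail change is one way to operationalize the retesting triggers Adamidis lists.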
Send us C-suite transition news at forbescsuite@forbes.com.
In today’s enterprise environment, the most important point to secure is the browser, which is inherently difficult to protect. Here are some ways to rethink your company’s security initiatives to put browsers first.

It’s important to set a business goal, but reaching the desired outcome isn’t the only thing that should be rewarded. Leaders should keep tabs on the process, giving employees reinforcement and feedback the entire way through.
Supporting your favorite professional sports team is fun, but cybersecurity professionals recommend you don’t do it with your passwords. According to a new report by GlobalDots, which team was represented most often in hacked passwords?

A. Dallas Cowboys
B. Orlando Magic
C. New York Yankees
D. Boston Red Sox

Check if you got it right here.