Big announcements from OpenAI and Salesforce on Thursday highlight the tech industry's desire to give bots more decision-making capabilities, despite concerns about the technology's limitations.
Why it matters: Increasing generative AI's autonomy and reasoning capabilities could improve efficiency, but also increase risk.
Driving the news: OpenAI on Thursday announced o1 (previously code-named Strawberry), a new model that pauses to evaluate different ways of responding before starting to answer a question.
- The result, OpenAI says, is a model that handles complex queries much better, especially in math, science and coding (a minimal sketch of calling the model follows this list).
- Salesforce, meanwhile, debuted Agentforce, its effort to move beyond genAI copilots that boost human productivity and toward autonomous AI agents empowered to take action on their own, albeit within guardrails and limits.
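For developers, trying the new model should amount to little more than swapping the model name into a standard chat completions call. Here is a minimal sketch using the OpenAI Python SDK; the "o1-preview" identifier matches the launch naming, and the prompt is purely illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# o1 models deliberate before answering, so responses take longer
# than a typical chat completion (the tradeoff Heller describes below).
response = client.chat.completions.create(
    model="o1-preview",  # launch-era model identifier; assumed here
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd integers is even."}
    ],
)
print(response.choices[0].message.content)
```

The deliberation happens server-side; from the caller's perspective, the visible differences are latency and (per OpenAI) answer quality.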
Zoom in: Early customers say these more powerful AI systems are showing results.
Thomson Reuters, whose CoCounsel legal AI assistant had early access to o1, says it has seen the new model do better on tasks that require deeper analysis as well as strict adherence to instructions and to data in specific documents.
- "Its careful attention to detail and thorough thinking enables it to do a few tasks correctly where we have seen every other model so far fail," CoCounsel product head Jake Heller told Axios.
- Answers take longer, Heller said, but he added that most of the time "professionals want the most thorough, detailed and accurate answer — and they would much rather wait for it than get something wrong and quick."
Wiley, which has been using an early version of Agentforce, said the technology is allowing it to answer more questions without having to involve humans.
- "We've seen an over 40% increase in our case resolution when you compare the agent to our old chatbot," Kevin Quigley, a senior manager at Wiley, said during a Salesforce event on Thursday.
What they're saying: Executives at Salesforce and elsewhere say the key to ensuring safety is to impose strict limits on the purview and decision-making powers given to AI agents.
- "You don't want to just give AI unlimited agency," Salesforce chief ethical and humane use officer Paula Goldman told Axios. "You want it to be built on a set of guardrails and thresholds and tested processes. That's where you're going to get good results, and otherwise, you're inviting a lot of risk for your company."
- EqualAI CEO Miriam Vogel told Axios that using AI agents for low-stakes tasks is reasonable, but cautioned, "We do not want to move into AI agents prematurely in areas where advice could impact someone's benefits, safety, etc." To do so "is inviting liability and potential harms."
- "With AI agents having access to the enterprise data and having this intelligence with their reasoning and the planning capabilities, we feel that's going to be a revolution," ServiceNow VP of platform and AI innovation Dorit Zilbershot told Axios.
- ServiceNow announced its own AI agent push earlier this week. "But we know that with that power comes a lot of responsibility," she said. One of ServiceNow's key guardrails is that, by default, every action an AI agent plans must be approved by a human. Once a business is confident the agent is behaving properly, it can let the agent act autonomously, Zilbershot said.
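The approval-by-default pattern Zilbershot describes is easy to picture in code. The sketch below is a hypothetical illustration, not ServiceNow's implementation: every name in it is invented, and the agent's planned actions are routed through a human gate unless an autonomy flag has been deliberately switched on.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class PlannedAction:
    """A step the AI agent proposes before doing anything."""
    description: str
    execute: Callable[[], None]  # side-effecting step, run only after approval

@dataclass
class GuardedAgent:
    """Wraps an agent so planned actions need human sign-off by default."""
    autonomous: bool = False  # flipped on only once the business trusts the agent
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, action: PlannedAction) -> None:
        if not self.autonomous:
            answer = input(f"Approve '{action.description}'? [y/N] ")
            if answer.strip().lower() != "y":
                self.audit_log.append(("rejected", action.description))
                return
        self.audit_log.append(("executed", action.description))
        action.execute()

# A refund the agent wants to issue waits for a human unless the
# business has explicitly switched the agent to autonomous mode.
agent = GuardedAgent()
agent.run(PlannedAction("Refund order #1234", lambda: print("refund issued")))
```

The design choice worth noting is the default: autonomy is opt-in per agent, so the safe behavior is what you get when nobody remembers to configure anything.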
Yes, but: Autonomous bots are in danger of simply warring against each other.
- "Even if we get bias and hallucinations down to an acceptable level, many proposed AI agents' use cases don't make sense because they will set up arms-race conditions that drive up costs for everyone, but only benefit the arms dealers," Phil Libin, co-founder and former CEO of Evernote, told Axios.
- "LLMs are incomplete, but they can be an important part of a system that has other, non-LLM ways of grounding them to reality and values," Libin added.
Between the lines: Even OpenAI's use of the term "thinking" to describe what is happening before o1 responds is a misnomer, said Hugging Face CEO Clement Delangue.
- "An AI system is not 'thinking', it's 'processing,'" Delangue wrote on X. "Giving the false impression that technology systems are human is just cheap snake oil and marketing to fool you into thinking it's more clever than it is."
The bottom line: Before giving AI more autonomy, experts say the industry needs to address the technology's tendency to make up information and its propensity for bias.