
What exactly is an AI agent?

The answer depends on who you ask


Illustration of a robotic agent helping workers do their jobs.
Image Credits: girafchik123 / Getty Images

AI agents are supposed to be the next big thing in AI, but so far nobody can agree on exactly what constitutes one.

At its simplest, an AI agent is best described as AI-fueled software that does a series of jobs for you that a human customer service agent, HR person or IT help desk employee might have done in the past, although it could ultimately involve any task. You ask it to do things, and it does them for you, sometimes crossing multiple systems and going well beyond simply answering questions.

Seems simple enough, right? Yet it is complicated by a lack of clarity. Even among the tech giants, there isn’t a consensus. Google sees them as task-based assistants depending on the job: coding help for developers; helping marketers create a color scheme; assisting an IT pro in tracking down an issue by querying log data.

For Asana, an agent may act like an extra employee, taking care of assigned tasks like any good co-worker. Sierra, a startup founded by former Salesforce co-CEO Bret Taylor and Google vet Clay Bavor, sees agents as customer experience tools, helping people achieve actions that go well beyond the chatbots of yesteryear to help solve more complex sets of problems.

This lack of a cohesive definition leaves room for confusion over exactly what these things are going to do, but regardless of how they’re defined, agents are meant to help complete tasks in an automated way with as little human interaction as possible.

Rudina Seseri, founder and managing partner at Glasswing Ventures, says it’s early days and that could account for the lack of agreement. “There is no single definition of what an ‘AI agent’ is. However, the most frequent view is that an agent is an intelligent software system designed to perceive its environment, reason about it, make decisions, and take actions to achieve specific objectives autonomously,” Seseri told TechCrunch.

She says they use a number of AI technologies to make that happen. “These systems incorporate various AI/ML techniques such as natural language processing, machine learning, and computer vision to operate in dynamic domains, autonomously or alongside other agents and human users.”
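The perceive, reason, decide and act cycle Seseri describes can be illustrated with a toy example. This is only a minimal sketch: the thermostat scenario, the `ThermostatAgent` class and all of its method names are hypothetical and come from none of the companies quoted here, and a real agent would put a model where the hard-coded rules are.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Hypothetical toy agent that steers a room toward a target temperature."""
    target: float
    log: list = field(default_factory=list)

    def perceive(self, environment: dict) -> float:
        # Read the part of the environment the agent cares about.
        return environment["temperature"]

    def decide(self, temperature: float) -> str:
        # "Reasoning" here is a fixed rule; a real agent would use a model.
        if temperature < self.target - 1:
            return "heat"
        if temperature > self.target + 1:
            return "cool"
        return "idle"

    def act(self, action: str, environment: dict) -> None:
        # Change the environment and record what was done.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta
        self.log.append(action)

    def run(self, environment: dict, max_steps: int = 10) -> dict:
        # Repeat perceive -> decide -> act until the goal is met
        # or the step budget runs out.
        for _ in range(max_steps):
            action = self.decide(self.perceive(environment))
            if action == "idle":
                break
            self.act(action, environment)
        return environment


room = {"temperature": 17.0}
agent = ThermostatAgent(target=21.0)
agent.run(room)
print(room["temperature"], agent.log)
```

The `max_steps` budget is a design choice worth noting: it acts as a simple guardrail that stops the loop even if the goal is never reached.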

Aaron Levie, co-founder and CEO at Box, says that over time, as AI becomes more capable, AI agents will be able to do much more on behalf of humans, and there are already dynamics at play that will drive that evolution.

“With AI agents, there are multiple components to a self-reinforcing flywheel that will serve to dramatically improve what AI Agents can accomplish in the near and long-term: GPU price/performance, model efficiency, model quality and intelligence, AI frameworks and infrastructure improvements,” Levie wrote on LinkedIn recently.

That’s an optimistic take on the technology that assumes growth will happen in all these areas, when that’s not necessarily a given. MIT robotics pioneer Rodney Brooks pointed out in a recent TechCrunch interview that AI has to deal with much tougher problems than most technology, and it won’t necessarily grow in the same rapid way as, say, chips under Moore’s law have.

“When a human sees an AI system perform a task, they immediately generalize it to things that are similar and make an estimate of the competence of the AI system; not just the performance on that, but the competence around that,” Brooks said during that interview. “And they’re usually very over-optimistic, and that’s because they use a model of a person’s performance on a task.”

The problem is that crossing systems is hard, and this is complicated by the fact that some legacy systems lack basic API access. While we are seeing the steady improvements that Levie alluded to, getting software to access multiple systems while solving problems it may encounter along the way could prove more challenging than many think.

If that’s the case, everyone could be overestimating what AI agents should be able to do. David Cushman, a research leader at HFS Research, sees the current crop of bots more like Asana does: assistants that help humans complete certain tasks in the interest of achieving some sort of user-defined strategic goal. The challenge is helping a machine handle contingencies in a truly automated way, and we are clearly not anywhere close to that yet.

“I think it’s the next step,” he said. “It’s where AI is operating independently and effectively at scale. So this is where humans set the guidelines, the guardrails, and apply multiple technologies to take the human out of the loop, when everything has been about keeping the human in the loop with GenAI.” The key, in Cushman’s view, is to let the AI agent take over and apply true automation.

Jon Turow, a partner at Madrona Ventures, says this is going to require the creation of an AI agent infrastructure, a tech stack designed specifically for creating the agents (however you define them). In a recent blog post, Turow outlined examples of AI agents currently working in the wild and how they are being built today.

In Turow’s view, the growing proliferation of AI agents — and he admits, too, that the definition is still a bit elusive — requires a tech stack like any other technology. “All of this means that our industry has work to do to build infrastructure that supports AI agents and the applications that rely upon them,” he wrote in the piece.

“Over time, reasoning will gradually improve, frontier models will come to steer more of the workflows, and developers will want to focus on product and data — the things that differentiate them. They want the underlying platform to ‘just work’ with scale, performance, and reliability.”

One other thing to keep in mind here is that it’s probably going to take multiple models, rather than a single LLM, to make agents work, and this makes sense if you think about these agents as a collection of different tasks. “I don’t think right now any single large language model, at least publicly available, monolithic large language model, is able to handle agentic tasks. I don’t think that they can yet do the multi-step reasoning that would really make me excited about an agentic future. I think we’re getting closer, but it’s just not there yet,” said Fred Havemeyer, head of U.S. AI and software research at Macquarie US Equity Research.

“I do think the most effective agents will likely be multiple collections of multiple different models with a routing layer that sends requests or prompts to the most effective agent and model. And I think it would be kind of like an interesting [automated] supervisor, delegating kind of role.”
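The routing layer Havemeyer describes could look something like the following. This is a rough sketch under stated assumptions, not a description of any shipping product: the three "agents" are stand-in functions, not real models, the keyword matching is a placeholder for whatever classifier a production router would use, and every name in it is hypothetical.

```python
import re
from typing import Callable

Agent = Callable[[str], str]


# Hypothetical specialist agents; each would wrap a different model in practice.
def coding_agent(prompt: str) -> str:
    return f"[coding] {prompt}"


def support_agent(prompt: str) -> str:
    return f"[support] {prompt}"


def general_agent(prompt: str) -> str:
    return f"[general] {prompt}"


# Keyword sets are an assumption for illustration; a real router
# might use a small classifier model instead.
ROUTES: list[tuple[frozenset[str], Agent]] = [
    (frozenset({"bug", "code", "function"}), coding_agent),
    (frozenset({"refund", "order", "invoice"}), support_agent),
]


def route(prompt: str) -> Agent:
    """Pick the first specialist whose keywords appear in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    for keywords, agent in ROUTES:
        if words & keywords:
            return agent
    return general_agent  # fallback when no specialist matches


def supervise(prompt: str) -> str:
    """The supervisor role: delegate the prompt to the chosen agent."""
    return route(prompt)(prompt)


print(supervise("There is a bug in my function"))
print(supervise("Where is my refund?"))
print(supervise("Tell me a joke"))
```

Keeping the routing decision in `route` and the delegation in `supervise` means the matching logic can be swapped out without touching the agents themselves.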

Ultimately for Havemeyer, the industry is working toward this goal of agents operating independently. “As I’m thinking about the future of agents, I want to see and I’m hoping to see agents that are truly autonomous and able to take abstract goals and then reason out all the individual steps in between completely independently,” he told TechCrunch.

But the fact is that we are still in a period of transition where these agents are concerned, and we don’t know when we’ll get to this end state that Havemeyer described. While what we’ve seen so far is clearly a promising step in the right direction, we still need some advances and breakthroughs for AI agents to operate as they are being envisioned today. And it’s important to understand that we aren’t there yet.
