WTF is an Autonomous Agent?
If you’re not directly working in the AI space, chances are that your introduction to the concept of autonomous agents was through some influencer spouting a thread beginning with:
“Move over ChatGPT…”
“The tool you haven’t heard about that is bringing us one step closer to AGI”
“You are not going to believe what AI is doing now!!”
Engagement threads aside, autonomous agents began amassing significant mainstream traction in March 2023 upon the launch of AutoGPT. AutoGPT initially launched as an augmented version of ChatGPT that could assign itself tasks, browse the internet, store both long-term and short-term memory, summarise local files and (if you were lucky) execute the tasks it set itself once the user had initialised the overarching objective.
AutoGPT represented the first instance of what we can describe as a general autonomous agent. To avoid complication, for the rest of this article assume the definition of an ‘autonomous agent’ as being any non-human entity that:
- Has the ability to assign its own tasks
- Can operate independently of user input once provided an objective function (e.g. increase my newsletter subscribers by 10k this year)
- Has the ability to search for new information beyond what it was trained on
- Has capacity for both long-term and short-term memory
These functionalities all existed to some degree within AutoGPT. However, a number of additional capabilities will separate agents from ChatGPT-style chatbots going forward. Take these to include things like:
- Using or accessing personal tools (e.g. email, credit cards, CRMs, social media)
- Independently communicating, coordinating and collaborating with other autonomous agents to complete tasks
To get a practical understanding of some examples for how agents initialise, prioritise and execute these tasks, I recommend Matt Schlicht’s primer on agents.
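To make the four-part definition above a little more concrete, here is a minimal sketch of an agent loop in Python. Everything in it is hypothetical: `plan` and `search` are stubs standing in for what would be LLM and web-search calls in a real system, and the `Agent` class is an illustration of the definition, not a description of how AutoGPT is actually implemented.

```python
from collections import deque
from dataclasses import dataclass, field


def plan(objective: str) -> list[str]:
    """Hypothetical planner: a real agent would ask an LLM to break the objective down."""
    return [f"research: {objective}", f"draft plan: {objective}"]


def search(query: str) -> str:
    """Hypothetical search tool: a real agent would call a web search API here."""
    return f"(stub search result for '{query}')"


@dataclass
class Agent:
    objective: str
    tasks: deque = field(default_factory=deque)                        # self-assigned task queue
    short_term: deque = field(default_factory=lambda: deque(maxlen=3))  # most recent results only
    long_term: list = field(default_factory=list)                       # full record of finished work

    def run(self, max_steps: int = 10) -> list:
        # 1. The agent assigns its own tasks from the objective.
        self.tasks.extend(plan(self.objective))
        steps = 0
        # 2. It then works without further user input, up to a step budget.
        while self.tasks and steps < max_steps:
            task = self.tasks.popleft()
            # 3. It looks up information beyond what it was trained on.
            result = search(task)
            # 4. It keeps both a short-term and a long-term record of the outcome.
            self.short_term.append(result)
            self.long_term.append((task, result))
            steps += 1
        return self.long_term


agent = Agent("increase my newsletter subscribers by 10k this year")
history = agent.run()
```

The `max_steps` budget is a deliberate design choice in this sketch: without some cap, an agent that keeps generating tasks for itself has no natural stopping point.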
The Agent as ‘Employee’: A simple heuristic for understanding autonomous agents
Much has been made of ‘AI taking human jobs’ in media discourse.
Comparatively little time has been spent discussing the elevation in the role of humans in economic activity involving AI.
The key paradigm shift that will occur as artificial narrow intelligence (i.e. purpose-built agents) comes to occupy routine decision-making jobs like law, accounting, tutoring etc. will not be that humans become unemployed. The big change will be that all humans have the opportunity to become managers.
Say I were to start a company today: a social media platform for sober-curious people, for example. In the past, if the alpha version of this platform gained enough traction, I would be forced into hiring for a number of roles in order to help the platform scale: product managers to talk to users and define what they want, engineers to write the code to meet user needs, marketers to get the product in the hands of more users, legal counsel to ensure that the entire operation was compliant and so on and so forth.
In a world with effective agents that pass the ‘Employee test’ (i.e. the output of the agent in a specific role would appear identical to that of a human to the outside observer), I could feasibly run this entire operation solo.
Credit: Matt Schlicht, Octane AI
So if we are all going to become managers of these complex organisations, what tools need to be in place for us to do it effectively, safely and profitably?
The Autonomous agents space today: Upside & Issues
A Mini-Map of ready-to-use agent protocols
Common Problems
There has been a lot of progress in AI over the last few months. But there is so much further to go.
If you were to play around with these tools now, you would notice many shortcomings that would prevent them from meeting the ‘Employee test’.
Personalisation
Firstly, a lot of the written output generated by these agents tends to lack creativity or originality in ‘voice’. This matters little if the agent is designed to prepare legal documents or create a market analysis for a new product. However, it matters significantly if the agent is used for brand-building tasks or for elevating the reputation of its human ‘manager’.
In order to overcome this personalisation hurdle, agents will need an understanding of the intended voice of the manager. The best way to accomplish this is by training the agents on the manager’s outputs, both public (e.g. blog posts) and private (e.g. emails). The big barrier here then becomes trust in the agents to act responsibly on this data. What mechanisms need to be put in place to ensure this trust?
Reliability
Possibly the most cited drawback of this first generation of autonomous agents has been their tendency to hallucinate.
For the uninitiated, hallucination in the context of artificial intelligence refers to the machine’s tendency to confidently assert incorrect answers. In the current generation of agents, this pairs badly with their tendency to get stuck in loops, whereby they keep returning to previous tasks rather than progressing towards an output.
In order for agents to achieve any reasonable level of ubiquity among a mainstream audience (let alone be left to their own devices in a working context), they will need to achieve a degree of reliability that matches and eventually surpasses a human’s ability to get work done and fact-check it. For an example of the side effects of hallucination in practice, look no further than Steven Schwartz, a New York lawyer who cited precedents falsely and confidently asserted by ChatGPT as real cases in a brief for a real-world case.
In terms of opportunities in this realm, the first is obvious: create agents that don’t hallucinate. Secondly, however, there will be interim markets for i) reliability testing for agents and ii) products that can rigorously stress test these agent protocols against adversarial attacks. These concepts will help engineer agents that achieve the level of reliability and security necessary for widespread adoption.
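One small piece of the reliability testing described above can be sketched in code: flagging an agent that keeps returning to the same task. This is a simplified illustration, not how any particular agent framework does it; `LoopDetector` and its `max_repeats` threshold are hypothetical names chosen for the example.

```python
from collections import Counter


class LoopDetector:
    """Flags a task once it has been attempted more than max_repeats times."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def is_looping(self, task: str) -> bool:
        # Normalise so trivial differences in case or spacing do not hide a repeat.
        key = " ".join(task.lower().split())
        self.seen[key] += 1
        return self.seen[key] > self.max_repeats


detector = LoopDetector(max_repeats=2)
log = ["Search competitors", "Summarise findings", "search  competitors", "Search competitors"]
flags = [detector.is_looping(task) for task in log]
```

Here only the third attempt at "search competitors" is flagged. Exact-match counting will miss a loop in which the agent rephrases the same task each time, so a production check would likely need some notion of semantic similarity.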
Personalisation
As it stands, if you and I were both to type the same prompt into ChatGPT at the same time, we would get essentially the same output, with no tailoring to either of us. This is an extremely limiting feature for AI utility for a number of reasons. Firstly, universal answers to any given prompt will turn the centralised leaders in the AI space into monoliths of human knowledge and output. If everyone comes to depend on increasingly advanced tools for their work and play, everything will trend towards a uniform standard dictated by whatever the winning models were trained on. If this sounds worrying, it’s because it is.
More importantly, it makes things boring. For non-objective issues or for tasks that need to be performed in a certain style, it only makes sense that agents would leverage some understanding of the user in order to tailor their arguments and outputs. This will be pivotal to a future where we are comfortable allowing them to act on our behalf.
There needs to be some validation that our agents will provide a faithful representation of who we were before the agent existed.
The State of Personal AI in one image.
Security
The idea of having machines operate in your place, with your sensitive data and possibly even as a talking, typing replica of you sounds threatening to even the most progressive of technologists. Giving agents access to enterprise-level data is a whole other can of worms still.
So how can security & privacy risks be mitigated to the extent that i) individuals can trust agents to act on their behalf and ii) enterprises have watertight guarantees that their information and activities are safe from attacks, misappropriation or exploitation?
Some privacy mechanisms are already in place today and may be more basic than you think. Two-factor authentication will be table stakes for private agent usage. Access control mechanisms are already being built for enterprise chatbot usage. Persistent information protocols like Arweave are laying the foundations for decision-making traceability to reduce the ‘black box’ effect that machine outputs may suffer from.
As for agent-specific risks, more opportunities for protection against misbehaviour or manipulation are outlined in the RFS below.
Interoperability
Further extending the above analogy of ‘agent as employee’ – to form an effective organisation, these agents need to be able to coordinate with one another effectively. Beyond the organisational level, to entirely reshape business ecosystems and broader economies, organisations of agents are going to have to learn how to coordinate with other agentic organisations as well. The coordination problems from here begin getting complex (much as they have in human society).
There are already some early signs of promising agent-to-agent communications protocols, led by the work being done at CAMEL.
An example set-up of a conversation between a PA agent and Influencer Agent on CAMEL
Agent-to-agent communication is great, but it only represents the first baby step in terms of coordinating active agents.
A few examples of some more coordination problems to think about:
- How can my personal ‘influencer’ agent collaborate with Alice/Bob’s personal ‘influencer’ agent?
- How will my personal agents be able to judge the reputation of other agents? What kind of filters will be in place for them to gauge whether other agents may be acting suspiciously?
- What kind of standards are needed to allow my agents to interact with other people’s agents that may be utilising different software or operating under different regulatory strictures?
It is my belief that once this hodgepodge of issues is solved, humanity is well on its way to something resembling an ‘employee-less’ society. What is the point of doing basic, repetitive tasks once we are outright second best?
This loss of employee-ism is somehow often conflated with a loss of meaning when AI becomes ubiquitous. I believe almost the exact opposite to be the case. When AI “tAkEs oUR joBs”, it will be a paradigm shift that takes every individual from serf to manager. Everyone will have a suite of highly capable and reliable companions at hand to go out and bring what they want to see into the world. The key “job” of the human in this world is to manage these companions to execute on the vision that you want to see.
Hence the title of this post. The penetration of autonomous agents will result in a great promotion of all individuals from task monkeys to managers.
What is missing?
User-friendly agent deployment tools. The overarching principle of agents, and ultimately artificial intelligence as a whole, is that of task automation.
Using that definition, Zapier is technically the global leader in personal agent deployment and UiPath the global leader in enterprise deployment. But anyone who has used either of these tools would be able to tell you the amount of time and effort required to use them successfully. How can we abstract away all the pain, time and consulting fees spent on getting these systems to work?
Alex Lieberman of Morning Brew’s vision for user-friendly automation system deployment tools
Alex Lieberman’s idea above presents a starting point for thinking about the future of automation at both personal and enterprise levels. The problem with Zapier is that it requires users to reverse engineer their workflows. What about tools that learn user workflows intuitively or through training?
Excel macros are not actually a bad approximation of how user session recording can be used to capture repeatable sets of tasks that users can then execute with minimal thought. What if such user session recordings could be applied across the entire browser to operate across different apps and allow users to offload 50+% of their workday and eventually all non-creative tasks?
In order to effectively manage a personal universe of agents, the most basic opportunity to capitalise on is a CRM for individual agents. Such a CRM would act as a single source of truth for the instructions/prompts/intentions of each individual agent at an individual’s disposal. This could be integrated with separate dashboards for agent performance management.
For users who want even less involvement in the deployment or customisation of their agents, there is an opportunity for out-of-the-box generic agent packages. As agent productivity is proven, there is a strong likelihood of a market developing for people to purchase pre-trained sets of agents that can perform pre-defined sets of tasks depending on role description.
The design space for personal agent deployment has a good path to follow in the footsteps of existing developer deployment tools like Log10 or Superagent. As good as these tools are for managing agent deployments, they are not designed for end users with little knowledge of how agents or automations work.
Incentive mechanisms for agent resource allocation. Regardless of how superintelligent agents may become, they will always rely on resources to stay in operation. Machine brains need nutrition just as human brains do. And just as entire human economies developed around the need to put food on our plates, entire agent economies will be devoted to allocating scarce resources to where they are demanded most.
In agent economies, markets will be needed to allocate things like:
- Compute Power. Which agents get what quality CPU/GPU resource for what tasks at what time?
- Memory. How can agents with high demands rent or buy memory ‘real estate’ of less demanding agents?
- Energy. How can power supply best be allocated to agents based on where they are hosted and how much energy they consume? How is this billed – to the agent, the server, the operator or the individual?
- Connectivity. How is access to WiFi, cellular networks or other communication channels determined en masse?
Sensors and actuators are two more, extensible pieces of the puzzle that are not universal but will be in demand by agents. As such, they are covered in another section below.
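As a toy illustration of what such a market could look like, the sketch below allocates a fixed number of GPU slots to the highest-bidding agents using a uniform-price auction, in which every winner pays the highest losing bid. The mechanism, the `allocate_slots` function and the agent names are all assumptions made for the example; nothing here prescribes a particular auction design.

```python
def allocate_slots(bids: dict[str, float], slots: int) -> tuple[list[str], float]:
    """Give `slots` identical units to the highest bidders.

    Returns the winners and the clearing price (the highest losing bid,
    or 0.0 if there are enough slots for every bidder).
    """
    # Sort by bid descending; break ties by agent name so the result is deterministic.
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winners = [agent for agent, _ in ranked[:slots]]
    losers = ranked[slots:]
    price = losers[0][1] if losers else 0.0
    return winners, price


bids = {"research-agent": 5.0, "marketing-agent": 3.0, "legal-agent": 8.0, "support-agent": 1.0}
winners, price = allocate_slots(bids, slots=2)
```

With two slots, the legal and research agents win and each pays 3.0, the marketing agent's losing bid. Charging the highest losing bid rather than each winner's own bid is a common way to discourage agents from shading their bids, though whether that property matters for agent economies is an open question.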
I have long been of the opinion that crypto will form the means of exchange for these economies. Programmatic digital agents will want programmatic digital currency.
Off the back of this assumption, how can cryptographic networks be designed that best allocate and incentivise the sharing of these scarce, API-able resources between agents such that they are always being leveraged to their maximum potential? The market size for such a token that could feasibly make itself the means of exchange for a digital replica of the human economy is, understandably, enormous. As we have seen in the crypto space, there will be derivative opportunities in helping these networks scale – what is needed to bundle agent requests? Will we need new forms of settlement for non-human transactions?
Low-hanging fruit here is also the fact that agents will likely require agent-specific wallets. Just as protocols like Worldcoin race to develop proof-of-humanity, we may even see reverse proofs-of-machine in order to operate certain agent-specific protocols like digital wallets.
Many questions to be asked for the reinvention of a non-human economy.
Marketplaces for functional agents. We are at a stage in the lifecycle of autonomous agents now where efforts are extremely developer-focused. Eventually, the market will shift to providing off-the-shelf solutions that provide little in the way of customisation but are easy to deploy and manage.
This creates an immense opportunity for any early movers looking to build exchanges or marketplaces for autonomous agents. Not only will people be able to buy generic forms of agents to deploy quickly and cheaply, but developers can earn from the development of more advanced and specific agents for different user needs. Just as with any category of good or service in the past, we will see differing levels of price points for differing levels of prestige.
Developers (or developer agents) will find ways to charge premium pricing for extremely high-touch, white glove agent solutions to provide to people who want the best-in-class. Similarly, enterprises will be willing to pay more for agents with gold-standard guarantees of security & privacy.
There will also be thriving marketplaces for agent rental: people who are taking on temporary tasks may not want to spend market rates on new agents. Owners will be willing to rent out agents on the basis that they can be trained and improved upon as they are being utilised. We may see financial markets arise for people staking their agents as collateral for other parties to rent out in exchange for cash upfront and the possibility of agents being trained through utilisation over the rental period.
Just as in the current web, the marketplace space will not be limited to one dominant player. There will be a rich ecosystem of different markets for different users, different price points and different specifications. This makes the agent marketplace market an incredibly rich design space.
Universal task coordinators. This is almost an extension of the ‘New Zapier’ point above. With an army of agents at one’s disposal, how can we ensure that they are all on the same page?
Tools will be required in order to ensure that each individual agent’s initial goals and task prioritisation output matches overall ‘organisational’ or universe goals.
The vision for task coordinators in this sense may in the short run be a human-in-the-loop system that can monitor agent activity (possibly through a CRM as discussed above) to ensure this alignment and set agents back on the right track. Eventually, based on this human feedback, protocols can be designed that map agent priorities to broader organisational or individual aims with sufficient accuracy.
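A very rough sketch of the human-in-the-loop step might look like the following: each agent's proposed task is compared against the stated goals, and anything that matches none of them is queued for a human to review. Word overlap is used here purely as a crude stand-in for whatever similarity measure a real coordinator would use, and `needs_review`, the goals and the agent names are all hypothetical.

```python
def needs_review(task: str, goals: list[str], min_overlap: int = 1) -> bool:
    """Return True when a task shares too few words with every stated goal."""
    task_words = set(task.lower().split())
    # Best overlap across all goals; 0 if no goals have been set at all.
    best = max((len(task_words & set(goal.lower().split())) for goal in goals), default=0)
    return best < min_overlap


goals = ["grow newsletter subscribers", "keep platform compliant"]
proposed = {
    "marketing-agent": "draft newsletter referral campaign",
    "engineering-agent": "rewrite logo animation",
}
# Tasks that match no goal go to the human manager instead of being executed.
review_queue = [agent for agent, task in proposed.items() if needs_review(task, goals)]
```

In this example only the engineering agent's task is escalated. The human's accept or reject decisions on the queue are exactly the feedback that could later be used to replace the word-overlap heuristic with something learned.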
IoT Agents. I will not be the first or the last person to pose the question: Where do AI and spatial computing interconnect?
I already touched on this point briefly in my last article. As we progress towards entirely personalised user agents, it is inevitable that they will come to participate in our natural, lived environment (in addition to our augmented/virtual/extended ones).
The most obvious application of this will be in the internet of things. Cooking a steak and need it just the way your partner likes? Your agent can take care of it in the kitchen. Left the chicken in the freezer for too long? Issue of the past: your agent told you that 3 hours ago and got someone on the task. Rather than just being a notification ‘nudge’ system like the phone is today, these IoT agents will have an understanding of your personal context to act at the convenient and correct intervals.
Human-Agent Replicas. Character AI has already raised $100mm and amassed over 50k subscribers on the back of allowing you to simply talk with chatbots styled as celebrities. Talk to an LLM that reads bedtime stories as though it were Morgan Freeman. Ask LLM Elon Musk about his thoughts on diesel tractors. And so on and so forth.
As neat as this is as a toy, it only scratches the surface of what AI will enable people to do: putting yourself in someone else’s shoes, talking to whoever you want on demand or, with the help of agents, living almost exactly the way they do.
As autonomous agents begin to learn how to replicate human behaviour and train on lived experiences, they will become faithful stand-ins for the humans they represent. This presents a window into the possibility of being able to live in someone else’s body and mind.
For example, say Grimes wanted to go a step further than open sourcing her IP. Now she wants to give people the chance to live the way she does and experience things the way she does. If Grimes has been working with personal autonomous agents for a sufficient period of time, these agents can be replicated for others to use and experience ‘The Life of Grimes’.
Depending on how detailed a look at someone these agents can get through their access to biomarkers and the like, this kind of replication can also be applied to sharing lived experiences and feelings (i.e. qualia) with one another. Brave new world, indeed.
As a sidenote, it is an inevitability that as autonomous agents begin to gain prominence, some will learn to leverage this tool better than others. This will allow those who have put in the time to teaching and directing their agents to act in their best interests to create markets for their agents’ replication for other people to use, thus avoiding the learning curve and technicalities of training their own agents.
Agent-Specific Networks for Sensors and other Actuators. In order to realise their full executional potential, most agents will need access to some form of physical actuators to bring their intelligence to life.
Networks that create markets for different agents to gain access to different actuators at a given time will be required to smooth out inevitable demand for bringing this intelligence to the real world. This will become especially crucial once autonomous agents transition from the world of information work to the world of heavy industry.
There will be more autonomous agents than robots for the near future because the constraints on building software are less than those for building hardware. As such, there is a scarcity of physical ‘bodies’ to perform the physical work that these agents will want to conduct. This creates a de facto labour market for the agentic economy.
Which of the economic mechanisms of the human labour market will apply once this becomes the case? What will robot owners be able to charge agents for rental of their physical time? Will robots (or other actuators) need to be standardised or specialised based on the specifications of the agent? There are many open questions ripe for disruption as machine actuators governed by artificial intelligence come to occupy lived environments.
Just as in web2, data is a crucial commodity, and people are willing to pay for access to the channels that provide it (e.g. Google Ads). In the autonomous age, sensors will be a crucial ‘vendor’ of data. Cameras, GPS, LiDAR and a whole suite of other sensors will be relied upon to provide agents with real-time decision-making data. As such, we will need i) networks for facilitating data exchange between agents and ii) oracles that allow agents to communicate this data in real time.
Agent Standardisation and Interoperability Protocols. In order for autonomous agents to effectively coordinate with one another, universal standards are required to overcome digital ‘language barriers’ as it were. What needs to be in place for this to be achieved?
Middleware systems are one solution, whereby they act as intermediaries that ‘translate’ messages from one agent to another. Alternatively, people can build interoperability APIs or SDKs that simplify integration or translation processes. Thirdly, there is a lot of space for developing a new kind of market for bottom-up standards for new kinds of schemas that agents may just be beginning to encounter or which have proven troublesome for agent coordination in the past.
Context DAO presents a good example for how this is already being done in the web3 space.
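The middleware option mentioned above can be sketched as a simple field-renaming layer that sits between two agents using different message schemas. The schema names, field names and the `translate` function are all invented for illustration; real agent frameworks define their own message formats, and a real translator would need to handle differences in structure and meaning, not just field names.

```python
FIELD_MAP = {
    # (source schema, target schema) -> {source field: target field}
    ("schema_a", "schema_b"): {"sender": "from", "body": "text", "due": "deadline"},
}


def translate(message: dict, source: str, target: str) -> dict:
    """Rename a message's fields from the source agent's schema to the target's."""
    if source == target:
        return dict(message)
    mapping = FIELD_MAP.get((source, target))
    if mapping is None:
        raise ValueError(f"no translation registered from {source} to {target}")
    # Fields without a mapping are passed through unchanged.
    return {mapping.get(key, key): value for key, value in message.items()}


outgoing = {"sender": "pa-agent", "body": "Can you post on Friday?", "due": "2023-06-30"}
incoming = translate(outgoing, "schema_a", "schema_b")
```

Note that a registry keyed on schema pairs grows quadratically with the number of schemas, which is one practical argument for the shared, bottom-up standards described above.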
Agent Testnets for Advanced Applications. In order to fully trust agents with personal tools or information, individuals will create safe sandbox environments to understand how they work.
It is quite likely that testnets trend towards becoming public goods for AI safety, but it is an ambitious and impactful project to pursue nonetheless.
Public voting mechanisms for machine ethics. I discussed in a previous piece the continuing need for human-in-the-loop marketplaces to ensure that people can participate in the economic upside and voting procedures associated with responsible AI development.
Some more ideas:
- Reputation scoring systems for agents, à la Black Mirror’s ‘Nosedive’
- ‘Agent Resources’ software for managing agents
- Incentive networks or protocols for stress testing vulnerability to adversarial attacks
Some of my favourite resources on Agents
This piece would have been impossible to write without the inspiration of the following resources:
Toys
AgentGPT (personal agents in the browser)
Cognosys (personal agents in the browser)
AiAgent.app (personal agents in the browser)
CAMEL (agents that interact with one another)
Chirper (agents-only social network)
Newsletters & Podcasts
Matt Schlicht’s AI Newsletter
Lunar Society with Dwarkesh Patel
This article was originally published by Archie Whitford on Hackernoon.