The popularity of agents that run from users' devices is going to push sites that don't have logins to add them and sites with logins to add tougher captchas.
I think the underlying assumption here is an important question to consider: should we treat agents the way we've treated bots over the decades? I believe treating agents like the traditional bots of old misses an important distinction. Traditional bots act with the intent of serving some external entity's gain (scraping content, attacks, etc.). Agents, while leveraging similar systems, serve the site's end consumer. When I use an agent to shop, I'm still the customer of the shop. As the shop owner, I want to give the best experience, so it's in my best interest to provide an AX that helps the agent deliver a good experience to the end user. Because my target customer is now using an agent to help make a purchase, if I shut my door on their delegated system, I'm telling them to shop somewhere else that does support it.
We are early enough in this evolution to help direct the ship in a way that serves the end user, web owners/creators, and the agent.
I think economic incentives are going to get in the way of that, as is tradition. Amazon’s dev teams in charge of the retail web interface might want to make it easier to sell you more products regardless of interface but there’s always a competing VP with more influence that wants to juice their KPIs by stuffing more advertising down the user’s throat, so they drive top down decisions that impede agents.
It’s almost inevitable since everyone wants more growth and advertising is almost always seen as free money left on the table by decision makers.
I agree! That said, they won't turn down the money through affiliate systems and resellers either.
The economic incentives, the brand control needs, etc. are important dynamics and I don't think it's all in their court alone. It's a combination of where the market goes (the platforms and systems they prefer) and the capabilities unlocked by those platforms.
With that, this evolution will follow the propagation of agent usage. So we will see a lot more initial adoption of AX principles and patterns from developer tools, because the software industry has been the most permeated by the rise of agentic workflows. As that expands, the nature of markets and meeting user needs will drive adoption of AX.
Yes, but competing with that -- imagine how much easier it would be to phish an agent into buying a product on the user's behalf.
That's my reaction to the GP's comment. Shop owners will not optimize for agent ease of use. They will optimize for convincing agents to make a purchase. This will play out like SEO, with everyone other than the bad actors losing out.
There are a few layers to this worth considering.
- In this world, the information delivered to agents should align with the content delivered visibly on the human web. This is essentially how the bulk of SEO manipulation is detected. There needs to be a way to validate this and establish trust - completely solvable. These techniques penalize such schemes from the outset. (This is probably not the best forum to go too deep into that.)
- We're assuming agents have full buying authority here. I do not believe we will see that as commonplace for a long time. Even if we did, the same systems for PCI compliance are in play, and the interfaces pushed by both payment gateways and shopping carts protect against duplicate purchase attempts. Those attempting to abuse this fall more into the malicious-actor camp.
- Phishing and malicious actors are going to do what they have always done. There are some very important security, access control, and compliance measures we should put in place for the most sensitive actions - as we always have, and most existing ones still apply. The agent experience and the ecosystem in general will have to evolve verifiable trust patterns, so that when a human delegates something to an agent, the human has confidence and ways to validate the interactions.
I'll be the first to admit that I don't have all of the answers here but with agents becoming the new entry point or delegation tool for the next generation of digital users, these are questions we have to answer and solve for. It starts by focusing the industry around the domain of this problem, that is AX. How to do it effectively and what needs to evolve to achieve it... that's where the work is.
> Agents, while leveraging similar systems, serve the site's end consumer. When I use an agent to shop, I'm still the customer of the shop. As the shop owner, I want to give the best experience, so it's in my best interest to provide an AX that helps the agent deliver a good experience to the end user.
This is fine until the agent decides to order something the customer did not want. That risk is inherent to the concept of an agent: due to the probabilistic nature of LLMs, and the fact that no agent will ever be able to predict exactly what you want at the moment you want it, this scenario is inevitable.
As the shop owner, this would result in an increased number of returns. You could recommend that the user approve each purchase, but given that you do not define these agents, there is no way for you to ensure that the user is actually following your advice.
There are ways to ensure that the end user provides authorization. While the shop owner does not control the agent, it does control purchase authorization - primitively, that could look like requiring a PIN/CVV, confirming via a code sent by text, etc. You can recursively assume that an agent could do these things on the user's behalf too, but this is where limits, compliance regulations, etc. come in. It's not in the shop's or the agent's interest to integrate poorly with these flows. That said, this is where we should establish conventions so that we can enforce consistency and compliance as well as validate them. It wouldn't be hard to imagine that an agent must prove it is operating correctly before it can initiate actions such as purchase requests, so that the agent's authority is known and it can be held accountable for misuse.
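For illustration, here's a minimal sketch of what that kind of gate could look like on the shop's side, assuming a hypothetical checkout handler and a confirmation code delivered to the human out of band; the names and flow are made up, not any particular payment provider's API.

```typescript
// Hypothetical sketch: a checkout handler that requires human step-up
// confirmation when the caller is a delegated agent. The actor flag and
// code delivery mechanism are assumptions for illustration only.

interface CheckoutRequest {
  cartId: string;
  actor: "human" | "agent";   // hypothetically derived from the access token
  confirmationCode?: string;  // code the human received out of band (SMS/app)
}

interface CheckoutResult {
  status: "completed" | "confirmation_required" | "rejected";
  message: string;
}

function handleCheckout(
  req: CheckoutRequest,
  isValidCode: (code: string) => boolean,
): CheckoutResult {
  if (req.actor === "agent" && !req.confirmationCode) {
    // Pause the flow and push a confirmation prompt to the human's device.
    return {
      status: "confirmation_required",
      message: "Ask the account holder to approve this purchase.",
    };
  }
  if (req.actor === "agent" && !isValidCode(req.confirmationCode!)) {
    return { status: "rejected", message: "Confirmation code invalid or expired." };
  }
  // Human caller, or agent with a valid human-issued confirmation: proceed.
  return { status: "completed", message: `Order placed for cart ${req.cartId}.` };
}
```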
There are no websites that I visit now that don't have a login that I would still visit if they suddenly started putting up captchas
I cannot see the difference in the access mechanism between an agent and what we use today for API consumption. The agent, whatever it is, is basically a client, a P2P node, etc.
Exactly. I also believe the UI would become redundant. In fact, agents don't even need to make decisions by looking at visuals the way we use the web. Imagine your browser being an agent that makes decisions: it knows which GET requests to use to fetch data and how to make payments too.
Wouldn't the agent just send a notification to the user's phone and say "can you solve this please?"
Apparently this is how the "automated" solvers work. I would love to find a source describing how all of this works. One website I frequent uses DataDome, and their captcha has a timer on it. I'm assuming this is a factor in "human-ness". Are we all going to be tied to our phones solving captchas as fast as possible?
It's more likely that the user will need to ask the agent to solve the CAPTCHA, because right now AI bots are better at solving CAPTCHAs than humans are.
This is why I'm so bullish on OAuth for sites with logins - you get a strong real user identity to tie the agent's behavior back to. This means you have (some) proof that the agent is helping your end users consume more of your site, and you can also revoke access to agents that misbehave.
We might live in a world where vetted assistants get VIP access to use websites impersonating their owners without much of a second thought, as long as you're at least on the paid Flash Max Pro™ plan.
Ya, WebAuthn with a hardware requirement would kill it too. Gotta physically touch it. It'll be gross when someone starts to automate that too.
One time I duct taped a cooked sausage to a USB fan and arranged it so the sausage was continually slapping my passive touch two-factor authenticator. Is that the kind of gross you were talking about?
https://www.vice.com/en/article/this-piece-of-meat-just-swip...
It would also kill it for 99% of humans.
My entire extended family has two yubikeys: My key and my spare key.
Captcha solvers are already quite cheap. AI could make it cheaper, but for a single user, I don't think it would make a difference.
Not to take the bait on this bit of content marketing ("the future of agents is OAuth, says company that sells OAuth solution"), but: I disagree with the premise that agents should basically use the same APIs and auth mechanisms that humans & apps currently use.
I realize there's a strong impulse not to "reinvent the wheel," but what we have currently is unsustainable. Specifically, the fact that every service uses a slightly different REST API and its own unique authentication & authorization workflow. It worked fine in the days when application developers would spend a few weeks on each new integration, but it totally breaks down when you want to be able to orchestrate an agent across many user-defined services.
I think a simple protocol based on JSON and bog-standard public key encryption could allow agents to coordinate and spend credits/money based on human-defined budgets.
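As a rough sketch of that idea (not an existing standard), an agent could sign a JSON spend request with its own key, and a wallet service could verify the signature and enforce a human-defined budget before approving; all names here are invented.

```typescript
// Sketch of a signed JSON "spend request" checked against a human-set budget.
// Uses Node's built-in Ed25519 support; the message shape is hypothetical.
import { generateKeyPairSync, sign, verify } from "node:crypto";

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const spendRequest = {
  agentId: "travel-agent-01",
  merchant: "example-airline.com",
  amountCents: 12_500,
  currency: "USD",
  nonce: Date.now(), // prevents replay of an old approval
};

const payload = Buffer.from(JSON.stringify(spendRequest));
const signature = sign(null, payload, privateKey); // Ed25519 detached signature

// The wallet service would verify the signature and enforce the budget.
const humanBudgetCents = 50_000;
const signatureOk = verify(null, payload, publicKey, signature);
const withinBudget = spendRequest.amountCents <= humanBudgetCents;

console.log(signatureOk && withinBudget ? "approved" : "declined");
```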
We're finally putting the 'agent' in 'user agent'
And the agent actually works for a large corporation with zero fiduciary duty to the user.
Legit chuckle from me!
Back when REST was a new hot buzzword and people were debating its true meaning, I remember thinking that some of the arguments for HATEOAS only really made sense if your client apps were going to be some kind of AI graph navigators. So I wonder if being particular about HATEOAS makes more sense now?
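For reference, here's an illustrative HATEOAS-style response in the common HAL shape; the resource is made up, but it shows how an agent could discover its next possible actions from advertised links rather than hard-coded endpoints.

```typescript
// Illustrative HATEOAS-style resource: the client (human app or AI graph
// navigator) learns what it can do next from `_links` instead of out-of-band
// API docs. The resource and link names are invented for this example.
const orderResource = {
  id: "order-42",
  status: "cart",
  totalCents: 12_500,
  _links: {
    self: { href: "/orders/order-42" },
    addItem: { href: "/orders/order-42/items", method: "POST" },
    checkout: { href: "/orders/order-42/checkout", method: "POST" },
    cancel: { href: "/orders/order-42", method: "DELETE" },
  },
};

// An agent deciding what to do next only needs to pick from advertised links:
const nextActions = Object.keys(orderResource._links).filter((k) => k !== "self");
console.log(nextActions); // ["addItem", "checkout", "cancel"]
```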
Good read, thanks for sharing! I'd love for OAuth to be augmented with agent-friendly scopes. Completely agree that it's a standard that doesn't need to be reinvented. But as things stand today, there are two broad areas where OAuth doesn't quite cut it:
1) long tail of websites that don't have APIs, so the only way for an agent to interact with them on the user's behalf is to log in more conventionally, and
2) even if a website has APIs, there may be tasks to be done that are outside the scope of the provided APIs.
Thoughts?
author of the post here, yeah this is a really good point. I think we're going to see more people investing in building OAuth-compatible apps and more thorough APIs to support agent use cases. but of course, not every site is going to do so, so agents will in many cases effectively just be doing screenscraping. but I think over time, users will prefer applications that make it easier and more secure for agents to interact with them.
I was an early engineer at Plaid and I think it's an interesting parallel: financial data aggregators used to use more of a screenscraping model of integration, but over the past 5+ years it's moved almost fully to OAuth integrations. would expect the adoption curve here to be much steeper than that; banks are notoriously slow, so would expect tech companies to move even more quickly towards OAuth and APIs for agents.
another dimension of this is that it's quite easy to block AI agents from screenscraping. we're able to identify OpenAI's Operator, Anthropic's computer use API, Browserbase, etc. with almost 100% accuracy. so some sites might choose to block agents from screenscraping and require the API path.
all of this is still early too, so excited to see how things develop!
If websites haven't been able to make even consistent logins and forms for humans to use, what makes you think they will be able to make usable APIs for agents to use?
I've tried making a Firefox extension that fills web forms using an LLM, and the things website makers come up with that break their own forms for both humans and agents are just insane.
There are probably over a thousand different ways to ask for someone's address that an agent (and/or human) would struggle to understand, just to name an example.
I think agents will be able to get through them easily, but NOT because website makers are going to do a better job at being easier to use.
Interesting, what are the heuristics for blocking? User agent? Something Playwright does, metadata like resolution, or actual behavior?
The user agent is pretty low hanging fruit, but these days even your most standard captchas / bot detection algorithms are looking at things like mouse movement patterns - a simple bot controlling a mouse might be coded to move the cursor from wherever it is to the destination in the shortest path possible; a human might try for the shortest path, but actually do something that only approximates the most direct path based on their dexterity, where the cursor began, the mouse they’re using, etc.
Tools in this space rely a lot on human use of a computer being much slower, less precise, and more variable than machine use of a computer.
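A toy version of that heuristic, with invented thresholds, might compare the traveled cursor path to the straight-line distance between its endpoints; real detectors combine many such signals rather than relying on one.

```typescript
// Toy mouse-movement heuristic: a ratio near 1.0 (path length ≈ straight-line
// distance) looks scripted; human paths tend to curve, overshoot, and vary.
// The cutoff below is invented for illustration.
type Point = { x: number; y: number; t: number }; // position plus timestamp (ms)

function pathStraightness(points: Point[]): number {
  if (points.length < 2) return 1;
  let traveled = 0;
  for (let i = 1; i < points.length; i++) {
    traveled += Math.hypot(points[i].x - points[i - 1].x, points[i].y - points[i - 1].y);
  }
  const direct = Math.hypot(
    points[points.length - 1].x - points[0].x,
    points[points.length - 1].y - points[0].y,
  );
  return direct === 0 ? 1 : traveled / direct; // 1.0 = perfectly straight
}

function looksScripted(points: Point[]): boolean {
  // Hypothetical cutoff: near-perfect straightness is suspicious on its own.
  return pathStraightness(points) < 1.02;
}
```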
we're looking at signals from the network, device, and browser as well as patterns across requests to identify these agents. in some cases, like operator today, it's quite trivial to identify based on the user agent but that's quite easy to mask if they wanted to.
behavioral data like mouse movements, shortest-path heuristics, etc. is helpful, but likely to be a less deterministic signal than device intelligence based on where and how the request is being made.
we'll have a more in depth blog post on what we're seeing with this next week too.
I also think OAuth could be used to better serve AX in the age of agents, but before the whole industry finds the PMF, shall we not leave the humans (us) behind? So I made something to break the grip of the big IdPs and offer a more secure and easier authentication solution for humans [1].
You can find its dogfooding demo on the Show HN [2].
[1]: https://sign-poc.js.org
[2]: https://news.ycombinator.com/item?id=42076063
How do computer use APIs affect this? Isn't the whole idea that a UI a human can use should also be usable by an agent without a lot of special accommodation? For high-volume automation an API is a lot more efficient, but for lots of typical automation (automating what would take a human 20-30 minutes of work to do themselves) this doesn't matter too much.
I think OAuth can complement computer use. Imagine if an agent went through an OAuth flow to get an access token, and was able to use that access token to interact with the same UI that a human interacted with. You'd get a few benefits:
- The human wouldn't need to share their password information with the agent
- Services would be able to block or ask for approval when agents take sensitive actions. Maybe an e-commerce site is happy to let an agent browse and add items to a cart, but wants a human in the loop for checkout.
- Services would be able to attribute any actions taken to the agent on behalf of the user. Did Joe approve this expense report, or did Joe's agent approve this expense report?
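A minimal sketch of how a service might act on those benefits, assuming hypothetical scope names and token claims: gate sensitive actions by scope and record both the user and the acting agent for attribution.

```typescript
// Sketch of scope-gated actions and actor attribution, assuming the agent
// obtained an access token through a normal OAuth flow. Scope names and the
// token shape are hypothetical, not any specific provider's format.
interface AccessToken {
  subject: string;   // the human user (e.g. "joe")
  actor?: string;    // set when an agent acts on the user's behalf
  scopes: string[];  // e.g. ["catalog:read", "cart:write"]
}

function authorize(token: AccessToken, requiredScope: string): "allow" | "needs_human" {
  if (!token.scopes.includes(requiredScope)) {
    // e.g. the agent was never granted "checkout:confirm": ask the human.
    return "needs_human";
  }
  return "allow";
}

// Audit trail answers "Joe or Joe's agent?" by logging both identities.
function auditEntry(token: AccessToken, action: string): string {
  return token.actor
    ? `${action} by ${token.actor} on behalf of ${token.subject}`
    : `${action} by ${token.subject}`;
}

const agentToken: AccessToken = {
  subject: "joe",
  actor: "shopping-agent",
  scopes: ["catalog:read", "cart:write"],
};
console.log(authorize(agentToken, "cart:write"));        // "allow"
console.log(authorize(agentToken, "checkout:confirm"));  // "needs_human"
console.log(auditEntry(agentToken, "approve expense report"));
```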
hm. API stands for Application Programming Interface, which IMO is not the same as an Application Agentic Interface... similar to how it is not an Application's Human Interface. Maybe closer than that, though.
But parsing documentation? And believing it blindly? hah. Maybe resurrect the Semantic Web as well...
> Maybe resurrect the Semantic Web as well
This gave me a chuckle. I believe the current hype term along this line is "ontologies".
Yeah, interestingly, APIs in their current form are rarely very good for agents. In many cases tools like Operator, using a virtual browser and screenshotting, are better for agent interactions than API specs.
This shows we need to build better approaches to agent interactions that don't operate at the level of "run a virtual browser", but that encode much more of the available workflows than raw APIs do today.
For anything more complex than a single throw-this-data-there, a wizard-like workflow would probably be better. The client initiates it, but then the server leads it instead of being 100% passive, e.g. "enter (date|name)" >then> "enter (amount & currency)" >then> whatever else. I am not sure any such thing exists as a protocol; usual REST APIs are just an alphabet with client-driven alphabet-punching that can be applied combinatorially in any order; the server may very well know the correct order, but it cannot elegantly enforce it.
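As a sketch of that server-led idea (the message shapes are invented, not an existing protocol), each response could tell the client exactly which field it needs next, so the server rather than the client enforces the order.

```typescript
// Sketch of a server-led, wizard-like exchange: each response names the next
// required input. Field names, types, and tokens are illustrative only.
type WizardStep =
  | { state: "need_input"; field: string; type: "date" | "string" | "money"; token: string }
  | { state: "done"; receiptId: string };

// Server side: a tiny state machine over the fields still missing.
const flow = ["date", "name", "amount"] as const;

function nextStep(token: string, answers: Record<string, string>): WizardStep {
  const remaining = flow.filter((f) => !(f in answers));
  if (remaining.length === 0) return { state: "done", receiptId: "rcpt-001" };
  const field = remaining[0];
  return {
    state: "need_input",
    field,
    type: field === "date" ? "date" : field === "amount" ? "money" : "string",
    token,
  };
}

// Client side: the agent simply answers whatever the server asks for, in order.
const answers: Record<string, string> = {};
let step = nextStep("cont-abc", answers);
while (step.state === "need_input") {
  answers[step.field] = step.field === "amount" ? "19.99 USD" : "example";
  step = nextStep(step.token, answers);
}
console.log(step.receiptId); // the server enforced the sequence, not the client
```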
AI agents do not have agency. This is just another sloppy and disturbing way that AI people show their disrespect or incompetence about the nature of humans.
If you think AI has agency then you must think all software has agency. AI is just software.
To those of you who say humans are just software: try deactivating a human and see what happens. Note that this is a different experience than deactivating AI.
Thank you for sharing!