Interesting. In the video example it took a lot of steps and produced a lot of output for a simple task. I suspect this doesn't scale very well, and more complex tasks might run into performance problems. I do think it's the right direction for AI coding, though.
neversettles 48 minutes ago [-]
Yeah, to esafak's point, a benchmark for browser agent QA testing is probably needed.
esafak 3 hours ago [-]
Is there a benchmark for this? If not, you ought to (crowd?)start one for everybody's sake.
In Browser MCP, it looks like Cursor controls each action along the way, but what we actually wanted was a single browser agent with a high-quality eval that could perform all the actions independently (browser-use).
GreenGames 3 hours ago [-]
This is very cool! Does your MCP server preserve cookies/localStorage between steps, or would developers need to manually script auth handshakes?
neversettles 3 hours ago [-]
Between steps it preserves cookies, but at the moment the Playwright browser launches with a fresh browser state, so you'd have to go through OAuth to log in each time.
We're adding browser state persistence soon, so that once you sign in with Google, it can stay signed in on your local machine.
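For anyone wondering what that kind of persistence could look like, here is a rough sketch using Playwright's storage state (cookies plus localStorage saved to a JSON file). This is only an illustration, not how the MCP server does or will do it: the file name `browser_state.json` and the helpers `context_kwargs` and `run_with_persistent_state` are made up for the example.

```python
import json
from pathlib import Path

# Hypothetical location for the saved session; any writable path would do.
STATE_FILE = Path("browser_state.json")


def context_kwargs(state_file: Path) -> dict:
    """Build kwargs for browser.new_context().

    Reuses the saved cookies/localStorage if the file exists and parses as
    JSON; otherwise returns {} so the browser starts with a fresh state.
    """
    if not state_file.is_file():
        return {}
    try:
        json.loads(state_file.read_text())
    except json.JSONDecodeError:
        # A corrupt state file is ignored rather than crashing the launch.
        return {}
    return {"storage_state": str(state_file)}


def run_with_persistent_state(url: str, state_file: Path = STATE_FILE) -> None:
    # Imported lazily so context_kwargs() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context(**context_kwargs(state_file))
        page = context.new_page()
        page.goto(url)
        # ... sign in by hand on the first run, then run the test steps ...
        # Save cookies and localStorage so the next launch can reuse them.
        context.storage_state(path=str(state_file))
        browser.close()
```

On the first run the file doesn't exist, so you'd log in once. Later runs load the saved state and should skip the OAuth flow until the session expires.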
We found that Laminar came out with a better browser agent (and a better eval): https://www.lmnr.ai/ so we're looking to migrate over soon!
How does this compare to Browser MCP (https://browsermcp.io/)?