LLMs control Robots with ERC-8004 discovery | Composable and distributed systems group
Mon, 2026-03-02
Sharing our experimental call summaries.
AI-generated digests of Yak Collective study groups.
Key resources discussed:
YakRover Discovery Architecture
yakrover-8004-mcp GitHub repo
Stream Link:
Demo Goal and Overall Architecture
The session walked through a working implementation of ERC‑8004 for discovering and remotely controlling physical robots via LLMs, with a live demo of a small self‑balancing “tumbler” robot.
The core architecture:
A set of robots and sensors, each exposed over HTTP.
Each robot is wrapped as an MCP (Model Context Protocol) server, so that LLMs can call robot actions as “tools.”
A single FastAPI “gateway plane” aggregates MCP servers behind one HTTP(S) endpoint.
The FastAPI gateway is exposed to the public internet via an ngrok tunnel.
ERC‑8004 is used on‑chain to register and later discover these MCP endpoints without a centralized registry.
In effect, the system provides a way to discover robots on‑chain, add them as MCP tools into an LLM session, and teleoperate them from anywhere.
From Individual Robots to a Unified MCP Gateway
The implementation evolved through several stages:
Per‑robot MCP servers:
Initially, each robot had its own MCP server:
A self‑balancing robot with four HTTP endpoints: forward, backward, left, right.
A humidity/temperature sensor, exposed as an MCP tool that calls the robot’s readout endpoint.
A drone, controlled via a Python SDK wrapped in an MCP server.
Gateway consolidation with FastAPI:
Maintaining separate MCP servers became unwieldy, so everything was pulled behind a single FastAPI gateway:
The gateway loads multiple MCP servers internally.
It exposes them externally as a unified HTTP interface.
URL structure: /<robot-name>/mcp for each device.
Public exposure via ngrok:
Using the free ngrok tier:
The FastAPI server is tunneled under a random ngrok URL.
Each robot’s MCP endpoint hangs off that base URL (https://<random>.ngrok.app/<robot>/mcp).
Anyone with the URL can load the MCP tool into an LLM and talk to the robot.
There is also a “fake rover” that simulates the HTTP behavior of the physical self‑balancing robot, used for development and testing.
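The /<robot-name>/mcp URL scheme can be illustrated with a small standard-library sketch. This is not the gateway code from the repo (which uses FastAPI); the function names mcp_url and robot_from_path, the robot names, and the example.ngrok.app host are all hypothetical placeholders.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def mcp_url(base_url: str, robot: str) -> str:
    """Build the public MCP endpoint for one robot behind the gateway."""
    return f"{base_url.rstrip('/')}/{robot}/mcp"


def robot_from_path(path: str, known_robots: set[str]) -> str | None:
    """Return the robot name for a path shaped like /<robot-name>/mcp, else None."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 2 and parts[1] == "mcp" and parts[0] in known_robots:
        return parts[0]
    return None


# Hypothetical device names; the real gateway defines its own.
ROBOTS = {"tumbler", "fakerover"}

url = mcp_url("https://example.ngrok.app/", "tumbler")
print(url)
print(robot_from_path(urlsplit(url).path, ROBOTS))
```

Rejecting any path that is not exactly two segments ending in mcp keeps unknown devices and stray sub-paths from reaching a robot handler.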
On‑Chain Discovery with ERC‑8004
ERC‑8004 is used here as an on‑chain registry for autonomous agents / services (in this case, robots):
Registration scripts:
Scripts built on top of an “agent SDK” register the FastAPI gateway URL on chain.
Once registered, the robots become discoverable to anyone who knows how to query ERC‑8004.
Discovery scripts:
A complementary script scans the ERC‑8004 registry and discovers robots that have been registered.
After discovery, the script can:
Optionally merge the discovered endpoints into a local MCP JSON config.
So when you start Claude (or another LLM with MCP), those discovered MCP endpoints show up as tools in your session.
No centralized server:
The key design goal: discovery is entirely via ERC‑8004, not via a centralized directory or registry service.
This provides a composable way to publish arbitrary MCP‑enabled robots or agents and have others discover them on chain.
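The optional merge step described under “Discovery scripts” could look roughly like the sketch below. It assumes the discovery step has already produced a mapping of robot names to MCP URLs, and it assumes a client config shaped as a top-level "mcpServers" object with "type", "url", and "headers" fields; check the MCP client you use for the exact layout. The function name merge_mcp_config is hypothetical and is not taken from the repo.

```python
from __future__ import annotations

import copy


def merge_mcp_config(existing: dict, discovered: dict[str, str], token: str) -> dict:
    """Return a copy of `existing` with discovered endpoints added under "mcpServers".

    Assumed layout (not confirmed by the source): each server entry carries the
    endpoint URL plus an Authorization header for the gateway.
    """
    if not token:
        # Avoid writing entries with an empty or placeholder auth header.
        raise ValueError("missing bearer token; refusing to add unauthenticated entries")
    merged = copy.deepcopy(existing)  # leave the caller's config untouched
    servers = merged.setdefault("mcpServers", {})
    for name, url in discovered.items():
        servers[name] = {
            "type": "http",
            "url": url,
            "headers": {"Authorization": f"Bearer {token}"},
        }
    return merged
```

Entries already present in the local config are preserved unless a discovered robot uses the same name, in which case the discovered entry replaces it.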
LLM Integration and Live Teleoperation
Once the MCP JSON is set up correctly, the flow for a user is:
Clone the repo and place the provided JSON into the project folder.
Save it as .mcp.json (or similar), so Claude can pick it up at startup.
Start Claude; the MCP config is loaded, and the robot tools appear.
In the live demo:
Participants asked Claude to:
Check if the “tumbler” robot was online.
Move the robot forward and backward.
Turn it around twice.
Query the temperature and humidity.
Perform “funny dance”–style sequences (which sometimes got the robot to fall over).
There was some observable lag, but commands were reliably executed.
The presenter shared a camera view of the robot so remote participants could see the physical motions.
A toy concurrency experiment: two participants tried issuing conflicting commands (“move forward” vs “move back”) at the same time to see which would win. As currently implemented, there is no mutual exclusion; whoever’s command lands last effectively overrides the other. A future feature under consideration is some explicit mutual exclusion / “control lock” so only one user can command the robot at a time.
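One way the “control lock” idea could work is a lease: the first user to ask gets exclusive control for a short time, and the lease expires on its own if that user goes quiet. The sketch below is only an illustration of that idea, not something from the repo; the class name ControlLock, the ttl parameter, and the user names are hypothetical.

```python
from __future__ import annotations

import threading
import time


class ControlLock:
    """Lease-based lock: one holder at a time, expiring after `ttl` seconds."""

    def __init__(self, ttl: float = 10.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so the lease logic can be tested
        self._mutex = threading.Lock()  # protects the fields below
        self._holder = None
        self._expires = 0.0

    def acquire(self, user: str) -> bool:
        """Grant or renew control for `user`; return False if someone else holds it."""
        with self._mutex:
            now = self._clock()
            held_by_other = self._holder not in (None, user) and now < self._expires
            if held_by_other:
                return False
            self._holder = user
            self._expires = now + self._ttl
            return True

    def release(self, user: str) -> None:
        """Give up control; a release from a non-holder is ignored."""
        with self._mutex:
            if self._holder == user:
                self._holder = None
```

A gateway using this would call acquire before forwarding each movement command and reject the command when it returns False, so the “move forward” vs “move back” race resolves in favor of whoever holds the lease rather than whoever lands last.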
Setup Friction and Auth Token Issues
The demo also surfaced practical issues in configuring MCP + auth:
MCP JSON creation:
There is a script option (--add-mcp or similar) that is supposed to auto‑generate the MCP JSON config.
In practice, it failed to write the file; users had to create .mcp.json manually via nano and paste in the provided JSON.
It was suggested that the script should either write the file correctly or fail loudly with a clear error.
Bearer token for authorization:
The FastAPI gateway currently checks an HTTP Authorization: Bearer <token> header.
This is session‑level, not per‑user.
The initial MCP JSON lacked the auth header, so calls failed silently or confusingly.
Debugging revealed that:
The auth check must be done as early as possible in the request path.
If the auth token is missing, the gateway should error out immediately with an explicit, but not overly revealing, message.
Participants noted that the “correct” behavior here is:
Don’t send empty or placeholder auth headers.
If no bearer token is configured, fail fast with a clear “missing auth” error instead of letting the system half‑configure itself.
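The fail-fast behavior described above can be sketched as a single check that runs before any robot code. This is a framework-independent illustration, not the gateway’s actual implementation; the names AuthError and require_bearer are hypothetical, and a real FastAPI gateway would wire an equivalent check in as a dependency or middleware.

```python
from __future__ import annotations

import hmac


class AuthError(Exception):
    """Raised before any robot code runs; carries an HTTP status code."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


def require_bearer(headers: dict[str, str], expected_token: str) -> None:
    """Raise unless `headers` carries 'Authorization: Bearer <expected_token>'."""
    if not expected_token:
        # A gateway with no token configured should not start serving at all.
        raise RuntimeError("gateway misconfigured: no bearer token set")
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        raise AuthError(401, "missing auth")
    # Constant-time comparison; the message stays deliberately unspecific.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        raise AuthError(401, "invalid auth")
```

The error messages say what is wrong in general terms (“missing auth”, “invalid auth”) without echoing the token or describing how it is checked.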
Security and API Hygiene
The group then zoomed out to more general security and API design questions for this kind of publicly exposed robot service.
Key security recommendations (not specific to MCP, but aligned with its spec):
Least privilege:
Design endpoints and permissions so that each component can do only what it strictly needs to.
Strict input validation:
Validate all incoming parameters. This is especially critical when LLMs mediate user input, since they may generate unexpected payloads.
Encrypt everything:
Expose only HTTPS on the internet‑facing side.
Avoid HTTP→HTTPS redirects where possible; redirects can introduce odd edge cases and weaken security guarantees.
Internally, the current implementation uses plain HTTP between the FastAPI gateway and robots. Long term, the ideal is TLS end‑to‑end, but the group acknowledged that many physical robots have very limited compute and may not practically support TLS.
Auth checks up front:
Perform authentication and authorization checks at the start of the request pipeline.
Ensure that “bad bearer token” requests cannot traverse any meaningful code paths.
Follow MCP security guidance:
MCP’s spec itself includes a security section with recommendations (input validation responsibilities, etc.).
Not all security‑relevant suggestions are centralized in that section, so a thorough read and keyword search (“input validation”, “security”, etc.) is advised.
An important nuance: the most critical exposure is the public internet boundary (FastAPI + ngrok). In contrast, unencrypted internal links between gateway and robots might be acceptable in a controlled lab environment, though encrypting those too would be best‑practice if the hardware supports it.
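As a rough illustration of the “strict input validation” recommendation, the sketch below checks an LLM-generated movement command before it reaches a robot. The four action names come from the rover’s endpoints described earlier; the duration_s field, its default, and the MAX_DURATION_S limit are hypothetical and only show the pattern.

```python
from __future__ import annotations

ALLOWED_ACTIONS = {"forward", "backward", "left", "right"}
MAX_DURATION_S = 5.0  # hypothetical safety limit, not from the source


def validate_command(payload: object) -> tuple[str, float]:
    """Return (action, duration) for a well-formed command, else raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    unknown = set(payload) - {"action", "duration_s"}
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    action = payload.get("action")
    # Check the type first so unhashable values cannot break the set lookup.
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("action must be one of: forward, backward, left, right")
    duration = payload.get("duration_s", 1.0)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, (int, float)):
        raise ValueError("duration_s must be a number")
    if not 0 < duration <= MAX_DURATION_S:
        raise ValueError(f"duration_s must be in (0, {MAX_DURATION_S}]")
    return action, float(duration)
```

Using an allow-list for actions and rejecting unknown fields means an unexpected payload from the model fails at the gateway instead of being passed through to the hardware.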
ERC‑8004 and Reputation/Spam Challenges
Beyond the working demo, the group discussed broader concerns with ERC‑8004’s handling of reputation and spam.
Key points:
Reputation spam and Sybil attacks:
The ERC‑8004 security considerations mention that the reputation and verification layers are vulnerable to spam, but participants felt this was under‑emphasized:
Reputation systems can be overwhelmed by sock‑puppet identities.
Attackers could mass‑register low‑quality or malicious agents/robots.
Malicious verification URLs:
ERC‑8004 allows publishing verification URLs:
There is nothing inherent preventing someone from registering a URL that serves malicious content (e.g., JS‑heavy pages that try to exploit clients).
Hashing content doesn’t fully solve this, since the page may load additional content at runtime.
Openness vs. spam resistance trade‑off:
The more open and permissionless the registry, the easier it is to spam.
The spec hand‑waves at “sophisticated algorithms” and “trusted validators” for reputation scoring, but this was criticized as essentially punting the problem to others.
Economic friction as a partial solution:
A recurring suggestion in reputation systems is to charge a small fee per registration or per message, to destroy the economics of large‑scale spam.
But humans and organizations are typically very resistant to any per‑interaction fees, even tiny ones (e.g., decades of failed attempts to introduce per‑email fees to fight spam).
In crypto settings, this tension reappears:
On the one hand, gas fees serve as spam friction.
On the other, many ecosystems push for gasless or sponsor‑paid transactions, re‑introducing spam incentives.
Examples and partial mitigations:
Farcaster’s registration model was mentioned as a partial solution:
Requires some fee and non‑trivial setup friction.
Currently protected more by obscurity and poor UX than by deep anti‑spam design.
NFT airdrop spam was raised as an example: once a wallet address is public, it becomes a target for unsolicited tokens.
Overall, the group agreed that ERC‑8004’s current text acknowledges these issues but offers mostly hand‑waving about “future sophisticated solutions,” with no clear mechanism baked into the standard itself.
UX and Ideal Workflow for Robot Users
There was also a more user‑experience oriented discussion: if this were to be a public “serve your robots to the world” service, what would an ideal flow look like?
Current workflow for a new user:
Discover the robot via the ERC‑8004 discovery script.
Use script arguments to generate or augment a local MCP JSON file.
Manually fix the MCP JSON if the script fails.
Ensure the auth bearer token is correctly configured.
Start Claude and begin sending natural language commands.
Feedback and ideas for improving the flow:
The setup is conceptually clean—once Claude is running, “just talk to the robot”—but brittle in the early steps.
Scripts should:
Fail clearly if required environment variables (like auth tokens) are missing.
Either reliably write .mcp.json or explicitly tell users to copy/paste JSON into the right place.
Longer‑term, more user‑friendly front‑ends (web/iOS) would hide MCP config and auth details, with the backend orchestrating MCP + payments + discovery.
The group generally liked the overall flow once things were wired up, especially how natural it felt to teleoperate a robot via ordinary language.
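The two script behaviors asked for above (fail clearly on a missing environment variable, and either write the config or raise) might be sketched as follows. This is an assumption-laden illustration rather than the repo’s script: the helper names require_env and write_mcp_json and the ROBOT_TOKEN variable mentioned in the usage note are hypothetical.

```python
from __future__ import annotations

import json
import os
from pathlib import Path


def require_env(name: str, env=os.environ) -> str:
    """Return the variable's value, or exit with a clear message if it is unset."""
    value = env.get(name, "").strip()
    if not value:
        raise SystemExit(f"error: environment variable {name} is not set")
    return value


def write_mcp_json(config: dict, path: Path) -> Path:
    """Write `config` to `path`, then read it back to confirm the write landed.

    Any OSError from the filesystem propagates, so a failed write is never silent.
    """
    text = json.dumps(config, indent=2) + "\n"
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text, encoding="utf-8")
    os.replace(tmp, path)  # swap into place so readers never see a partial file
    if json.loads(path.read_text(encoding="utf-8")) != config:
        raise OSError(f"wrote {path} but its contents do not match the config")
    return path
```

A script would call require_env("ROBOT_TOKEN") before doing anything else, then write_mcp_json(config, Path(".mcp.json")), and print the returned path so the user can see where the file ended up.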
Reflections on ERC‑8004 Maturity and Ecosystem Fit
In the final minutes, the group stepped back to situate ERC‑8004 in the broader “agents + on‑chain registry” landscape:
Spec status:
ERC‑8004 is still immature, though at least some parts are live on Ethereum mainnet. There is a testnet environment where the “tumbler” robot registration can be inspected.
Legibility and aggregation concerns:
One participant noted that if the goal is legibility and reputation of agents, aggregation becomes crucial, and it isn’t obvious:
How aggregators will form.
How they will present and score thousands of agents.
How spammy or malicious agents will be filtered.
Documentation and discoverability pain:
A meta‑critique: even for technically inclined users, figuring out how ERC‑8004 works and what tooling exists (e.g., explorers listing tens of thousands of agents) required effort. This was framed as a broader ecosystem problem, not unique to 8004.
There was no strong convergence on solutions, but broad agreement that ERC‑8004 is interesting and usable today for concrete demos (like robots), while the harder questions of large‑scale reputation and spam resistance remain unresolved.
Wrap‑Up
Key takeaways:
A concrete ERC‑8004 + MCP + FastAPI + ngrok stack can expose physical robots as LLM tools discoverable via on‑chain registration.
Consolidating per‑robot MCP servers behind a single FastAPI gateway simplifies management and gives one internet‑facing surface to secure.
Live teleoperation via Claude works well once MCP configuration and auth tokens are correct; the primary pain is in initial setup and brittle scripts.
Security best practices—HTTPS‑only, strict input validation, early auth, least privilege—are essential when exposing robot control endpoints to the public internet.
ERC‑8004’s handling of reputation and spam is acknowledged but under‑specified; the spec largely defers to future “trusted validators” and external algorithms.
Economic friction (fees) remains the most obvious anti‑spam tool, but is politically and UX‑wise unpopular.
Open questions explicitly surfaced:
How should ERC‑8004‑style registries manage spam and Sybil attacks at scale without sacrificing openness?
What is the right user‑facing abstraction for multi‑user robot control (locks, queues, arbitration) over a shared MCP endpoint?
How far should encryption extend (end‑to‑end all the way to each robot) given hardware constraints on embedded devices?
What would a robust, low‑friction “ideal” workflow look like for non‑expert users wanting to discover and control robots via ERC‑8004 and MCP?
Interested in distributed and composable systems? We meet weekly on Mondays, at 1600 UTC: https://www.yakcollective.org/join
New here? Start here for some background context:
1) About Yak Collective
2) Online Governance Primer
Call chat on Yak Collective Discord:
https://discord.com/channels/692111190851059762/1477881485561303060


