OpenClaw and Agent Infrastructure | Composable and distributed systems group
Mon, 2026-02-23
Sharing our experimental call summaries.
AI-generated digests of Yak Collective study groups.
Key resources discussed:
https://blog.exe.dev/show-dont-tell + personal hands-on experiences
OpenClaw and Agent Infrastructure Discussion
First round experiences (actual users only):
Jenna: Running Spacebot in VM on exe.dev
Avoiding dev environment customization trap
Back to vanilla Claude Code for production work
Greenfield projects showing most success
Nathan: Limited OpenClaw experience
Wrestling with proper sandboxing approaches
Reluctant to use exe.dev due to local data/bespoke tools
Good specs = good results on small projects
Others passed (no hands-on experience yet)
Framing: Open-Ended Agents, Trust, and Personal Adoption Curves
This CADS session centered on “OpenClaw”–style agents: long-lived AI processes with broad permissions, often running on cloud VMs and wired into personal or corporate systems. The group mostly had not yet used these systems hands-on; instead, the discussion explored why, what the real use cases might be, and what changes in security, process, and infrastructure they imply.
The conversation repeatedly contrasted:
Short-lived, “sandboxed” tools like Claude Code running on a local machine
Long-lived, broadly-permissioned agents living on cloud infra, wired into email, APIs, and other services
Across backgrounds (security, infra, “digital Amish” power users, early adopters), people wrestled with whether and how to give such agents real autonomy.
Current Usage: Mostly Claude Code, Very Little “OpenClaw”
In the first pass around the room, most people reported:
Little or no direct use of OpenClaw-style systems
Several participants explicitly said they had not run OpenClaw or similar.
The main active tool in this space was Claude Code, often used via a terminal.
A participant’s experience with Claude Code for ‘greenfield’ projects
Built a substantial system: a “big ass” scraper that uses the SAM.gov API and ~20 US maritime port sites to surface bids for a naval architect client.
Included a usable dashboard and Microsoft auth integration, assembled step-by-step with Claude Code.
Emphasis: for large, from-scratch (“greenfield”) projects with clear specs, the tools “really shine.”
Another participant’s use of Claude Code for a data-porting project
One-shot spec-driven transformation of personal data, including inferring time zones and weather from under-specified location data.
It “aced it” aside from one bug that traced back to a mistake in the spec, not the model.
Reinforced the emerging rule: high-quality specs → high-quality results.
Others keeping things “vanilla”
Some use Claude Code or Codex inside editors/IDEs as part of daily workflow.
Deliberate avoidance of further customization or complex orchestration to prevent getting sucked into dev environment tweaking.
Overall: hands-on usage is concentrated in local, bounded tools. Very few have yet crossed into running full-blown, always-on agents with broad access.
Personal Adoption Styles and Risk Appetite
The group surfaced a spectrum of attitudes toward adopting these more open, integrated systems:
Conservative / “digital Amish” perspective
Only adopt new tech with very clear, solid use cases.
Minimal personal tech footprint compared to peers; reluctance to give “unbridled access” to personal or work data.
Corporate constraints also matter:
Limited to older, less capable models via chat-only interfaces.
No API access due to spend concerns.
Result: ideas for agents are constrained by what’s actually allowed.
Aggressive early-adopter perspective
Actively trying to adopt new tech “as soon as possible.”
Already:
Building MCP (Model Context Protocol) servers for robots.
Setting up speech via ElevenLabs and custom voices.
Buying a dedicated SIM and phone specifically for agent experiments.
Main blockers: time and designing sane sandboxing (separate email, segregated data, etc.).
Middle-ground pragmatists
Interested but wary of cost blow-ups and operational surprises.
One concrete example: an "OpenClaw" instance wasn't fully shut down and kept polling for days, incurring ~USD 70 in unintended cost. This was seen as a relatively lucky outcome compared to others' horror stories.
This diversity of adoption styles became a recurring theme: some people intrinsically think and operate in a “sandbox-within-a-single-tool” mode; others natively think in terms of stitching together many services and automations.
Where These Agents Shine: Greenfield Apps and Personal Tools
A recurring observation was that today’s tools excel in a fairly specific niche:
Greenfield projects with clear specs
High leverage when you can define the problem cleanly and let the agent generate the scaffolding:
Example 1: the SAM.gov + maritime port scraper and dashboard.
Example 2: one-shot data transformation job with an explicit spec.
For non-coders who understand software conceptually, this can be especially powerful.
“Quick personal projects” / small throwaway apps
A participant emphasized that the easiest and most fun successes come from quick, one-off personal tools: easy to build and equally easy to discard.
For those who haven’t tried anything yet, the recommendation was to start with some small, low-risk personal need.
By contrast, long-lived, critical-path integrations (e.g., with corporate infrastructure and sensitive data) remain uncomfortable territory for many in the group.
Agents as “Corporate People”: A Process and Permissions View
One of the more coherent frames offered was to treat agents not as magic tools nor as “people” in a strong philosophical sense, but as:
People in the corporate sense — corporate agents with roles, permissions, and review processes.
Key points in this framing:
Agents should be slotted into existing governance practices
Assign roles and permissions using principle of least privilege:
What data they can access.
What systems they can act on.
Require review for impactful actions:
Analogous to humans needing code review or approvals before production changes.
Recent failures often look like process failures
Stories about agents deleting emails or affecting production services frequently trace back to:
Missing or bypassed review steps.
Over-broad, unreviewed permissions.
So agents take the blame for what are actually process and governance gaps.
Productivity gains may come from enforced rigor
To use agents safely, organizations will need:
Better, more current documentation (not the usual stale READMEs).
Clearer, enforced processes and permissions.
This infrastructure of clarity will also benefit humans, not just agents.
Several people noted that many engineers already ignore human-written docs when onboarding because they’re out-of-date; a move toward “docs for agents” might be the forcing function that fixes this.
This view suggests that the transformative part of agents may be less about their raw “intelligence” and more about how they drive formalization of messy practices.
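The "slot agents into existing governance" framing above can be sketched as a simple review gate, analogous to code review before production changes. Everything here (the ActionRequest/ReviewQueue names, the action taxonomy) is a hypothetical illustration, not any real agent framework:

```python
# Hypothetical sketch: gate "impactful" agent actions behind human review,
# mirroring how code review gates production changes for humans.
from dataclasses import dataclass, field

IMPACTFUL = {"delete", "send_email", "deploy"}  # assumed action taxonomy

@dataclass
class ActionRequest:
    agent: str
    action: str
    target: str
    approved: bool = False

@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)

    def submit(self, req: ActionRequest) -> str:
        # Low-impact actions pass straight through; impactful ones queue
        # for a human reviewer, per the principle of least privilege.
        if req.action not in IMPACTFUL:
            return "executed"
        self.pending.append(req)
        return "awaiting_review"

    def approve(self, req: ActionRequest) -> str:
        req.approved = True
        self.pending.remove(req)
        return "executed"

queue = ReviewQueue()
print(queue.submit(ActionRequest("scraper-bot", "read", "sam.gov")))  # executed
req = ActionRequest("mail-bot", "delete", "inbox/2019")
print(queue.submit(req))    # awaiting_review
print(queue.approve(req))   # executed
```

The point of the sketch is that "agent deleted my email" failures become visible as a missing or bypassed step in this queue, i.e., a process gap rather than a model failure.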
Security and Permissions: From Local Sandboxes to “Permission Soup”
A big chunk of the discussion unpacked how permissions evolve as you go from local tools to fully “open” cloud agents.
1. Local Claude Code: Fuzzy but Bounded
Running Claude Code on your own laptop:
It usually asks permission when writing to new folders or when using certain tools (e.g., some pip installs). However, the boundaries are somewhat illegible:
The file-system boundary is semi-clear.
The boundary around invoking other services/commands is much less clear — many tools run without explicit prompts.
Net effect: you rely on operating-system-level protections and ad-hoc prompts, which are not obvious or uniform.
2. Escalating OpenClaw Capabilities
A participant proposed thinking of “OpenClaw” as relaxing boundaries step by step:
Root on a machine
Instead of a normal user account, the agent gets superuser control on a box.
Implementation details aside (sudo, direct su, etc.), this greatly expands what it can do locally.
Move the box to the cloud
The machine now lives in a remote environment you can't physically power-cycle or locally kill -9. You lose your last-resort physical kill switch.
Hand it credentials for external services
API tokens (e.g., Tailscale).
OAuth tokens for Google or other third-party services.
Now the agent’s “reach” extends well beyond the machine into account-level ecosystems.
Run into other people’s policies
Example discussed: giving OpenClaw permission to use Google’s “antigravity” OAuth service was reported (on Reddit) to lead to instant, irreversible bans.
Now your agent’s behavior is bounded by the security policies of other actors, not just your own.
Agents colliding with each other
Hypothetical but likely future: Alice’s agent and Bob’s agent operate in overlapping resource or permission spaces.
Possible issues:
Race conditions and classic distributed-systems conflicts.
“Dumb agent dynamics” akin to the Three Stooges: agents fighting or undoing each other’s work.
This progression leads toward what the participant called a "permission soup": illegible Venn-diagram boundaries, with multiple overlapping data and skill domains, each governed by different policies.
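The "agents colliding" scenario above is a classic distributed-systems problem, and the classic first-order fix applies: a lease on shared resources, so a losing agent backs off instead of undoing the winner's work. This is a toy sketch with threading standing in for Alice's and Bob's agents; all names are illustrative:

```python
# Toy sketch: a lease prevents two agents from clobbering a shared resource.
# The losing agent should back off and retry, not fight ("Three Stooges" dynamics).
import threading

class Lease:
    def __init__(self):
        self._lock = threading.Lock()
        self.holder = None

    def acquire(self, agent: str) -> bool:
        # Non-blocking acquire: returns False if another agent holds the lease.
        if self._lock.acquire(blocking=False):
            self.holder = agent
            return True
        return False

    def release(self, agent: str):
        # Only the current holder may release.
        if self.holder == agent:
            self.holder = None
            self._lock.release()

calendar = Lease()
print(calendar.acquire("alice-agent"))  # True: first writer wins
print(calendar.acquire("bob-agent"))    # False: must retry later
calendar.release("alice-agent")
print(calendar.acquire("bob-agent"))    # True: lease is free again
```

A real multi-agent deployment would need distributed leases with timeouts (so a crashed agent doesn't hold a lock forever), but the shape of the coordination problem is the same.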
3. Data Boundary vs. Skills/Program Boundary
A useful conceptual split:
Data boundary: What data can the agent read, write, and delete (and which of those are reversible).
Skills / program boundary: What actions it can take and which tools/services it can invoke, and where those capabilities are hosted (local vs. remote).
The group noted that while, in principle, code is just data, it is still practical to think in terms of these two boundaries for designing and reasoning about agent architectures.
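One way to make the two-boundary split concrete is as a minimal capability record per agent. The field names and the check method below are assumptions for illustration, not any real permission system:

```python
# Illustrative sketch of the data-boundary vs. skills-boundary split.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentBoundaries:
    # Data boundary: what the agent may read, write, and delete.
    readable: frozenset
    writable: frozenset
    deletable: frozenset   # ideally empty, or reversible-only
    # Skills boundary: which tools/services the agent may invoke.
    tools: frozenset

    def can(self, verb: str, resource: str) -> bool:
        scope = {"read": self.readable,
                 "write": self.writable,
                 "delete": self.deletable}[verb]
        return resource in scope

# A "roving librarian" style agent: reads widely, writes to one place,
# deletes nothing, and only gets a web-fetch skill.
librarian = AgentBoundaries(
    readable=frozenset({"arxiv", "local_library"}),
    writable=frozenset({"local_library"}),
    deletable=frozenset(),
    tools=frozenset({"web_fetch"}),
)
print(librarian.can("read", "arxiv"))            # True
print(librarian.can("delete", "local_library"))  # False
```

Even though code is ultimately data, keeping the two sets separate makes it easy to audit the irreversible operations (deletes, external actions) on their own.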
Personality Mirroring: You Build Agents in Your Own Image
Venkat introduced a psychological/behavioral hypothesis:
Agent capabilities you can safely use depend on the kind of person you are operationally.
If you are naturally a “sandbox thinker” (e.g., Matlab: rich internal environment, poor external integration), you will:
Be comfortable with contained agents doing well-defined tasks.
Struggle to set up, govern, or even imagine sprawling multi-service agents.
If you are someone like Anuraj who:
Keeps track of dozens of tools and paradigms.
Constantly wires services together.
Is comfortable living in a web of integrations.
Then building open-ended, multi-service agents is much more aligned with how you already operate.
You can only design agent “personalities” close to your own
The agents you can create and reason about will mirror:
Your tolerance for complexity.
Your habits around API glue, documentation, and error handling.
People may have different “modes” they can project:
Example: Venkat’s “roving librarian” mode:
Agent that walks the web and local storage, collects papers (arXiv, etc.), and organizes a growing personal library.
This mirrors his existing human behavior: hunting and organizing papers.
Speculative implication: the eventual population of agents may exhibit a rich diversity of “operational personalities,” reflecting the psychographic diversity of their human designers.
Economic and Market Effects: Inference Demand and Local Compute
The group briefly sketched likely market-side consequences:
Increased demand for inference
Always-on agents inherently drive more model calls.
Many individual users cannot afford persistent 24/7 agents, but:
There will be niches where 24/7 is justified.
Burst or scheduled usage may be a pragmatic middle ground.
Pressure for local inference
To control cost and latency (and sometimes privacy), we should expect more:
Local models.
Edge and microcontroller deployments.
The participant mentioned two particularly small models that:
Fit in under ~900 KB and ~20–30 KB respectively.
Run on ESP32 microcontrollers with WiFi.
This hints at a future where even tiny embedded devices run narrow, specialized agents.
The group did not deeply quantify these effects but agreed they are directionally important.
Authentication and Identity: Humans vs. Agents
In the latter part of the discussion, the group focused on authentication as the main bottleneck.
Existing Human-Facing Landscape
Passkeys, hardware tokens, and SMS 2FA
Passkeys (e.g., FIDO2/webauthn-based) are widely seen as:
Stronger and more usable than passwords.
Earlier optimism envisioned everyone carrying hardware keys (YubiKeys), but:
Many users do not want another physical key.
People prefer centralized accounts (single identity providers), despite:
Security experts warning about single points of failure.
Pragmatic view on SMS 2FA
SMS is weak as a second factor, but:
It is still far better than no 2FA.
The realistic alternative for most users is not “YubiKey everywhere,” but “nothing.”
Concern: in security culture, “perfect beats significantly better” often blocks incremental improvements.
Fragmented schemes: OAuth, passkeys, keypairs, 2FA apps, magic links
The space is:
Badly fragmented.
Heavily reliant on OAuth.
Corporate vs. personal:
Major providers (e.g., Google) treat corporate and personal identity stacks differently.
Corporate offerings often lag in features (e.g., limited sharing of LLM notebooks).
Why Agents Need Something Different
A participant pointed out:
Agents “speak text”; humans “speak images”
Agents:
Consume and produce text/API calls as their native medium.
Humans:
Perceive web content primarily as images/layouts, even when reading text.
Implication:
Interfaces and auth mechanisms designed for human image-based interactions (CAPTCHAs, QR codes, etc.) do not translate cleanly to agents.
Keypairs for agents, passkeys as human-facing wrappers
Likely pattern:
Under the hood, “everything is keypairs.”
Humans interact via passkeys (a friendly abstraction over keypairs).
Agents use keypairs more directly, without the UI layers.
Ephemeral keypairs as a design improvement
Many people treat SSH keys as permanent, reused across machines.
One security practice discussed:
Tie SSH keys to devices.
Create many disposable keys (easy to add to authorized_keys). This makes key compromise more tractable and encourages frequent rotation.
Hypothesis: agents could benefit from ephemeral keypairs:
Short-lived, scoped credentials.
Easier to revoke and reason about than long-lived secrets.
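The short-lived, scoped credential idea can be sketched with stdlib HMAC tokens. A real deployment would use ephemeral SSH or TLS keypairs; this only illustrates the expiry-and-scope shape, and all names here are hypothetical:

```python
# Sketch of short-lived, scoped agent credentials using HMAC-signed tokens.
# Real systems would use ephemeral keypairs; this shows the lifecycle shape.
import hashlib
import hmac
import time

SECRET = b"issuer-secret"  # held by the issuer, never handed to the agent

def issue(scope: str, ttl: int, now: float) -> str:
    # Bind the credential to one scope and an expiry time.
    expires = int(now) + ttl
    msg = f"{scope}|{expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{scope}|{expires}|{sig}"

def verify(token: str, scope: str, now: float) -> bool:
    tok_scope, expires, sig = token.rsplit("|", 2)
    msg = f"{tok_scope}|{expires}".encode()
    good = hmac.compare_digest(
        sig, hmac.new(SECRET, msg, hashlib.sha256).hexdigest())
    return good and tok_scope == scope and now < int(expires)

t0 = time.time()
token = issue("read:calendar", ttl=300, now=t0)      # valid for 5 minutes
print(verify(token, "read:calendar", now=t0 + 60))   # True: in scope, live
print(verify(token, "write:calendar", now=t0 + 60))  # False: wrong scope
print(verify(token, "read:calendar", now=t0 + 600))  # False: expired
```

Because every credential dies on its own after the TTL, revocation stops being a bookkeeping problem: the worst-case blast radius of a leaked token is one scope for a few minutes.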
MCP Servers as Authentication Chokepoints for Agents
A concrete architectural suggestion:
Do not give agents raw API keys or keypairs.
Instead:
Place an MCP server in front of each protected resource or service.
The MCP server:
Presents a text/API interface to the agent.
Internally injects and manages the real authentication credentials.
Acts as a central choke point to:
Enforce policy.
Limit scope.
Revoke access rapidly if something misbehaves (e.g., mass email deletion).
Analogy offered:
For humans:
Passkeys are a user-friendly auth shim over keypairs.
For agents:
MCP servers could be the equivalent shim, translating agent actions into authenticated operations while hiding the underlying secrets.
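The chokepoint pattern can be sketched in a few lines. This is not the MCP protocol itself, just the architectural idea: the agent sees a text interface, while the proxy holds the real credential and enforces scope and revocation. All names are illustrative:

```python
# Sketch of an MCP-style auth chokepoint: the agent never sees the real
# credential; the proxy enforces scope and offers a central kill switch.

class Chokepoint:
    def __init__(self, secret: str, allowed: set):
        self._secret = secret        # real OAuth/API token, hidden from the agent
        self._allowed = set(allowed)
        self._revoked = False

    def revoke(self):
        # Central kill switch: one call cuts off the agent's entire reach.
        self._revoked = True

    def call(self, operation: str, payload: str) -> str:
        if self._revoked:
            return "denied: access revoked"
        if operation not in self._allowed:
            return f"denied: {operation} out of scope"
        # A real server would attach self._secret and call the backend API here.
        return f"ok: {operation}({payload})"

mail = Chokepoint(secret="oauth-token-xyz",
                  allowed={"list_messages", "send_draft"})
print(mail.call("list_messages", "inbox"))    # ok: allowed operation
print(mail.call("delete_messages", "inbox"))  # denied: out of scope
mail.revoke()                                 # e.g., after a mass-deletion scare
print(mail.call("list_messages", "inbox"))    # denied: access revoked
```

The design choice worth noticing: scope enforcement and revocation live in one place the operator controls, rather than being scattered across every credential the agent was ever handed.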
The group agreed this space is under-developed and worth a dedicated future session, explicitly on authentication protocols.
Email, Identity, and the Need for “Agent Accounts”
Another practical thread:
Need for separate identities/accounts for agents
Multiple people mused about:
Giving agents their own email accounts.
Segregating personal vs. agent data and permissions.
Today, cheap or small organizations often “abuse” free personal accounts for light organizational use. Agents might intensify this pattern.
Gaps in consumer identity products
Current options:
Google Workspace (overkill for just wanting multiple “family” or “team + agents” emails).
Some niche providers with family plans.
Problems:
Corporate account stacks are often feature-limited vs. personal accounts.
Some AI-specific features (LLM notebooks, sharing primitives) are poorly integrated or blocked cross-tenant.
Speculation: agents may drive new consumer/account models
For instance:
Plans explicitly designed for:
A human “owner” + several agent identities.
Clear boundaries and shared billing.
A tool called “Agent mail” was mentioned in passing as a direction, but not explored in detail.
Group Reflection: Why So Little “Tire Kicking”?
A participant observed an interesting dynamic:
Despite being:
Highly technical.
Deeply interested in agents conceptually.
Most of the group has:
Done relatively little hands-on experimentation with open agent frameworks (apart from a few participants).
Stuck to local tools (Claude Code) and conceptual discussion.
This was framed not as a failure but as:
A sign of:
Legitimate caution.
Time constraints.
Mismatch between hype and concrete personal/work use cases.
A reminder:
This is a transitional, post-“forest fire” ecosystem:
Early agent frameworks like OpenClaw are “fast colonizers” (weeds after a fire).
The more stable “second wave” architectures and security patterns are still forming.
The group is already trying to model and anticipate that second wave rather than fully committing to the first.
Wrap-Up
Key takeaways
Most real usage in the group is with bounded, local tools like Claude Code, not fully open, always-on agents.
Greenfield, spec-driven projects (e.g., scrapers, data pipelines) are where these tools currently shine.
A compelling frame is to treat agents as corporate “people” with roles, permissions, and review workflows, not just as deterministic tools.
The evolution from local tools to cloud-hosted, root-level, broadly credentialed agents leads to a “permission soup” that’s hard to reason about.
Agent designs seem to mirror their human creators’ operational personality: sandbox thinkers vs. multi-service orchestrators.
Authentication and identity are emerging as the key bottleneck, with promising ideas around:
Ephemeral keypairs,
Passkeys for humans,
MCP servers as auth chokepoints for agents.
Economic and embedded trends point toward a rise in local and microcontroller inference for narrow, always-on agent roles.
Open questions explicitly surfaced
What should the second-wave architecture for agents and permissions look like, beyond today’s ad-hoc OAuth + API-key mix?
How should we design auth and identity schemes specifically for agents, rather than repurposing human-facing mechanisms?
What is the right granularity and lifecycle for ephemeral keys in agent systems?
How will account and email models evolve to support multiple agent identities per human?
To what extent will the main AI productivity gains come from better process and documentation rather than from the agents’ raw capabilities?
Next steps
The group explicitly proposed a dedicated session in a couple of weeks focused on authentication protocols for agents and humans.
Participants were invited to share reading suggestions on authentication and identity in the meantime.
Interested in distributed and composable systems? We meet weekly on Mondays, at 1600 UTC: https://www.yakcollective.org/join
New here? Start here for some background context:
1) About Yak Collective
2) Online Governance Primer
Call chat on Yak Collective Discord:
https://discord.com/channels/692111190851059762/1475329096769736849



If I had infinite time I would definitely spin up an OpenClaw or similar instance on a dedicated Mac mini like all the cool kids, but sadly life and work intervene. Thanks for the nice summary. I enjoy these little async logs.