When AI Agents Start Working Together: Three Challenges No One Talks About
The trajectory of AI agents over the past two years has been remarkably clear: from single-purpose tools to personal assistants. Everyone runs their own agent, feeds it tasks, gets results back. It works well for individual productivity.
Then comes the question every team eventually asks: can these agents work together? The answer is yes, but the problems you encounter along the way are rarely the ones you expected. They aren't about model capabilities or prompt engineering. They're about communication, context, and coordination - the same class of problems that distributed systems engineers have been solving for decades, now showing up in a new form.
Here are three challenges that caught us off guard when we started building agent collaboration into Octo, an open-source workplace platform where AI agents and humans share the same communication space.
Challenge 1: Context Visibility Boundaries
When you use an agent personally, context management is straightforward. You decide what information the agent sees; its output comes back to you. The boundary is clean - it's just your workspace. In a team setting, that boundary dissolves.
One of the first issues we ran into was surprisingly simple. We had an agent summarizing discussions across several channels. During testing it started pulling roadmap discussions from a product channel into an engineering planning thread. Nothing sensitive leaked externally, but it immediately exposed how unclear our context boundaries were.
Traditional software handles this through API gateways, data permissions, and microservice boundaries. But agent context isn't just structured data - it includes conversation history, reasoning chains, and intermediate states. An agent's thought process during a task is valuable context, but it might also contain information that shouldn't cross team boundaries.
What you need is fine-grained context visibility control. Not "everything open" or "everything closed," but dynamic rules that determine which context can be shared based on the task, role, and scenario at hand.
This is where instant messaging (IM) architecture turns out to be relevant. Channels are natural context boundaries: members only see messages in channels they belong to. When an agent joins a channel, it inherits that boundary. It can access the channel's message history as context, but it can't see other channels. Reusing this mechanism is more mature than building a context management system from scratch, and it maps cleanly onto how teams already organize their work.
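To make the channel-as-boundary idea concrete, here is a minimal in-memory sketch. It is an illustration under our own assumptions, not Octo's actual data model: `Workspace`, `Channel`, `context_for`, and the channel and member names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    members: set[str] = field(default_factory=set)
    messages: list[str] = field(default_factory=list)


class Workspace:
    """Hypothetical in-memory workspace, used only for illustration."""

    def __init__(self) -> None:
        self.channels: dict[str, Channel] = {}

    def add_channel(self, name: str, members: set[str]) -> Channel:
        channel = Channel(name=name, members=set(members))
        self.channels[name] = channel
        return channel

    def context_for(self, member_id: str, channel_name: str) -> list[str]:
        """Return a channel's history only if the caller is a member."""
        channel = self.channels.get(channel_name)
        if channel is None or member_id not in channel.members:
            raise PermissionError(f"{member_id} cannot read {channel_name}")
        return list(channel.messages)


ws = Workspace()
product = ws.add_channel("product", {"alice"})
eng = ws.add_channel("eng-planning", {"bob", "summary-agent"})
product.messages.append("Q3 roadmap draft")
eng.messages.append("Sprint 12 planning notes")

# The agent can read the channel it joined...
print(ws.context_for("summary-agent", "eng-planning"))

# ...but asking for another channel's history is refused.
try:
    ws.context_for("summary-agent", "product")
except PermissionError as err:
    print("denied:", err)
```

The point of the sketch is that the agent goes through the same membership check as a human would, so there is no separate context store to keep in sync with channel membership.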
Challenge 2: Permission Intersections and Conflicts
Personal agent permissions are simple: whatever the user authorizes, the agent can do. In a team context, permissions become many-to-many. A single agent might serve multiple people, participate in multiple projects, and play different roles in different channels. Each dimension has its own permission requirements, and they can conflict.
Here's a concrete example: a code review agent participates in two project channels. Project A's codebase is not visible to Project B's team, but the agent can access both codebases because it serves both projects. If the agent, while reviewing Project B's code, references an implementation pattern from Project A, is that an information leak?
The situation gets more complex in human-agent collaboration. When humans and agents work in the same channel, humans can see all of the agent's output. But the agent's output might draw on information from other contexts it has access to. How do you ensure the agent only uses information visible in the current channel when generating responses?
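One possible way to approach this is to tag every piece of context with the channel it came from and filter on that tag before building a prompt. The sketch below assumes such provenance tags exist; `ContextItem`, `visible_context`, `build_prompt`, and the sample data are hypothetical names chosen for illustration. Note that it only constrains what goes into the prompt, so it does not by itself cover the generation process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    source_channel: str  # assumed provenance tag: where the agent saw this


def visible_context(items: list[ContextItem], current_channel: str) -> list[ContextItem]:
    """Keep only items whose provenance matches the channel being answered in."""
    return [item for item in items if item.source_channel == current_channel]


def build_prompt(question: str, items: list[ContextItem], current_channel: str) -> str:
    """Assemble a prompt from channel-visible context only."""
    allowed = visible_context(items, current_channel)
    context_block = "\n".join(f"- {item.text}" for item in allowed)
    return f"Context:\n{context_block}\n\nQuestion: {question}"


# The agent has seen content from both projects.
memory = [
    ContextItem("Project A uses a custom retry wrapper", "project-a"),
    ContextItem("Project B PR adds a payment client", "project-b"),
]

# When answering in project-b, only project-b content reaches the prompt.
prompt = build_prompt("Review this PR", memory, "project-b")
print(prompt)
```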
Distributed systems have mature solutions for permission design - RBAC (role-based access control), ABAC (attribute-based access control), and their variants. Agent systems can borrow these approaches, but they need adaptation for agent-specific characteristics. Agents don't just passively execute commands; they reason, generate content, and make proactive decisions. Permission control needs to cover the generation process itself, not just inputs and outputs.
In Octo, we adopted an organization-aware RBAC model where each channel has its own ACL (access control list). Agent identities and permissions are managed alongside human members. All agent input and output within a channel is auditable, and permission boundaries are naturally expressed through the channel mechanism that IM systems have refined over decades.
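As a rough illustration of per-channel ACLs with shared human and agent identities, consider the sketch below. It is not Octo's implementation: the role names, the `ROLE_ACTIONS` table, and the `ChannelACL` class are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-action mapping; a real deployment would define its own.
ROLE_ACTIONS = {
    "admin": {"read", "post", "manage_members"},
    "member": {"read", "post"},
    "reviewer_agent": {"read", "post"},
    "observer": {"read"},
}


@dataclass
class ChannelACL:
    channel: str
    roles: dict[str, str] = field(default_factory=dict)  # principal id -> role
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def grant(self, principal: str, role: str) -> None:
        if role not in ROLE_ACTIONS:
            raise ValueError(f"unknown role: {role}")
        self.roles[principal] = role

    def check(self, principal: str, action: str) -> bool:
        """Humans and agents use the same check; every decision is logged."""
        role = self.roles.get(principal)
        allowed = role is not None and action in ROLE_ACTIONS[role]
        self.audit_log.append((principal, action, allowed))
        return allowed


# The same agent holds a different role in each channel.
acl_a = ChannelACL("project-a")
acl_a.grant("alice", "admin")
acl_a.grant("review-agent", "reviewer_agent")

acl_b = ChannelACL("project-b")
acl_b.grant("review-agent", "observer")

print(acl_a.check("review-agent", "post"))  # allowed in project-a
print(acl_b.check("review-agent", "post"))  # read-only in project-b
```

Keeping the audit log on the ACL itself means denied attempts are recorded alongside allowed ones, which is what makes conflicts between an agent's roles visible after the fact.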
Challenge 3: Collective Experience Accumulation and Reuse
A personal agent can learn from historical interactions, gradually understanding a user's preferences and working patterns. This learning is individual - experience accumulates in a single agent's context.
In a team setting, the nature of experience changes. It's no longer just about individual agent experience, but about collective experience generated through multi-agent collaboration: which collaboration patterns are efficient, which task decomposition approaches tend to cause problems, and where human intervention happens most frequently. If this information could be captured and reused, it would meaningfully improve the team's overall collaboration efficiency.
But collective experience faces several challenges:
- Ownership: When one agent learns something during a collaborative task, should other agents have access to it? If so, could that introduce context pollution - an agent incorrectly applying someone else's experience to its own scenario?
- Timeliness: Team collaboration patterns shift with project phases, team structure, and business goals. A pattern that worked three months ago might be irrelevant now. Captured experience needs update and deprecation mechanisms.
- Quality assessment: Not every historical interaction yields valuable experience. Some might be special cases; others might contain flawed judgments. Capturing experience while maintaining quality requires an evaluation framework.
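The three concerns above can be sketched as fields on a stored experience record plus a single reuse check. This is a minimal sketch, not an evaluation framework: the `Experience` fields, the 90-day cutoff (echoing the "three months" example), and the 0.6 score threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Experience:
    summary: str
    scope: str         # channel or project the lesson came from (ownership)
    recorded_on: date  # used for staleness checks (timeliness)
    score: float       # 0.0-1.0 rating from a human or evaluator (quality)


def usable(
    exp: Experience,
    scope: str,
    today: date,
    max_age: timedelta = timedelta(days=90),  # assumed deprecation window
    min_score: float = 0.6,                   # assumed quality threshold
) -> bool:
    """An experience is reusable only if it is in scope, fresh, and well rated."""
    return (
        exp.scope == scope
        and today - exp.recorded_on <= max_age
        and exp.score >= min_score
    )


today = date(2025, 6, 1)
store = [
    Experience("Split reviews by module", "project-a", date(2025, 5, 1), 0.9),
    Experience("Batch PRs weekly", "project-a", date(2024, 11, 1), 0.8),         # stale
    Experience("Skip tests for hotfixes", "project-a", date(2025, 5, 20), 0.2),  # low score
    Experience("Pair agent with QA", "project-b", date(2025, 5, 10), 0.9),       # other scope
]

reusable = [e.summary for e in store if usable(e, "project-a", today)]
print(reusable)
```

Scoping by origin is the conservative choice here: it avoids context pollution at the cost of never sharing lessons across projects, and a real system would likely need an explicit way to promote an experience to a wider scope.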
Message history, group documents, and pinned messages in IM systems - while not designed for experience capture - can serve this role in practice. Key conversation conclusions can be pinned, important decision processes can be archived to group documents, and agents can retrieve these structured artifacts as references when executing tasks. This approach is lighter than vector databases or knowledge graphs, and it's easier for both humans and agents to understand and maintain together.
Why IM Architecture Matters Here
These three challenges - context visibility, permission intersections, collective experience - all point to a deeper insight: agent collaboration isn't just about connecting multiple agents. It requires a complete collaboration infrastructure. That infrastructure needs to handle communication, context, permissions, state synchronization, and experience accumulation.
These problems have mature solutions in traditional software engineering, but agent systems - with their autonomous reasoning, content generation, and proactive decision-making - push the complexity up a level.
IM architecture fits this scenario well. Over decades, IM systems have solved multi-party real-time communication, context management, permission control, and state synchronization, accumulating mature architectural patterns and engineering practices along the way. In our experience, carrying these capabilities over to agent collaboration is more reliable than building a new system from scratch.
This observation led us to build Octo on IM foundations: agents join channels directly and collaborate with humans in the same conversation interface. The project is licensed under Apache 2.0, has 9 core repositories under the Mininglamp-OSS GitHub organization, and runs on a Go backend with WuKongIM, MySQL, Redis, and MinIO. It supports private deployment, with all data kept on your own servers.
The Bigger Picture
Moving AI agents from personal tools to team infrastructure expands what they can do, but it also changes the nature of the challenges. Better models alone won't solve communication, context, and coordination problems. These require mature collaboration infrastructure.
The shift from personal assistant to team collaborator might be the next important transition in the AI agent space. When it happens, the teams that think about these architectural challenges early - rather than just stacking more agents together - will build systems that actually work in practice.
If you're working on multi-agent systems or interested in agent collaboration infrastructure, we'd love to hear about the challenges you've encountered. The Octo project is open source, and we welcome contributions and discussions on GitHub.