
Building Real-Time Communication for Enterprise-Grade Agentic AI Automation

At Opnova, we're building an enterprise-grade agentic AI platform that eliminates rework in IT, Security, and GRC operations. One of the key aspects of that vision is enabling a fleet of AI agents to execute complex tasks autonomously while maintaining full transparency and human oversight. This post explores how we've architected real-time communication to deliver on our promise of trustworthy agentic AI for enterprise automation.


When an enterprise deploys agentic AI to automate workflows, two requirements are paramount: reliability and visibility. Operations teams need to trust that automated tasks will execute consistently while maintaining complete oversight of the process. This means being able to monitor task execution in real-time, intervene when necessary, and maintain a comprehensive audit trail of all actions taken by our AI agents. Building this level of automation presents significant technical challenges.


Our AI agents need to coordinate with multiple system components while executing tasks - gathering environmental context, making decisions, and taking actions - all in a fluid, real-time manner that mirrors human operation. Additionally, we need to ensure that operations teams can observe and interact with these automated processes as they happen, maintaining the control and visibility enterprises require.

To address these challenges, we've designed an architecture with four core components:

    1. The Orchestrator is our core component that coordinates every aspect of the platform, from integrations with enterprise systems of record to managing the fleet of agents. It ensures precise workflow management and maintains the overall state of automated operations.

    2. The Worker Agents are specialized AI Agents that operate directly on enterprise devices - whether they’re computers, virtual machines, containers, or other supported systems. Using both Human Interface Device (HID) emulation and API integrations, they can seamlessly interact with any enterprise application, while continuously collecting information about the environment.

    3. The AI Engine represents our advanced agentic AI “brain”, managing a sophisticated pipeline of multimodal pre-processing, task execution memory, planning, reasoning, and action verification. It explains every step of execution and intelligently determines when human intervention is needed, providing full transparency into its decision-making process.

    4. The Console serves as the central command center for the platform, enabling teams to configure the system, manage their fleet of Worker Agents (registration, credentials, and authorizations), develop and edit task automations, and monitor task executions in real time. It maintains comprehensive audit trails of all automated operations, providing the visibility and control enterprises require.

Opnova Platform Deployment Overview


Enabling seamless real-time communication between these components requires a robust and secure infrastructure. We’ve implemented this using two key technologies: WebSockets and a message broker.


The backend components keep persistent connections to an internal broker throughout their lifecycle. These connections enable efficient Remote Procedure Calls (RPC) between components and provide the load balancing necessary for enterprise-grade operations. This architecture also gives us platform observability, since it lets us monitor the messages flowing through the system.
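
To make this more concrete, here is a minimal sketch of the request/reply pattern this kind of broker-backed RPC typically follows. It assumes a RabbitMQ broker accessed through the pika library; the queue names and payload shapes are illustrative, not our actual interfaces.

    import json
    import uuid

    import pika


    class BrokerRpcClient:
        """Publishes a request to a component's queue and waits for the correlated reply."""

        def __init__(self, amqp_url="amqp://localhost"):
            self.connection = pika.BlockingConnection(pika.URLParameters(amqp_url))
            self.channel = self.connection.channel()
            # Exclusive, auto-named queue where replies to this client are delivered.
            result = self.channel.queue_declare(queue="", exclusive=True)
            self.callback_queue = result.method.queue
            self.channel.basic_consume(
                queue=self.callback_queue,
                on_message_callback=self._on_reply,
                auto_ack=True,
            )
            self.response = None
            self.correlation_id = None

        def _on_reply(self, ch, method, props, body):
            # Accept only the reply that matches the outstanding request.
            if props.correlation_id == self.correlation_id:
                self.response = json.loads(body)

        def call(self, routing_key, payload):
            self.response = None
            self.correlation_id = str(uuid.uuid4())
            self.channel.basic_publish(
                exchange="",
                routing_key=routing_key,
                properties=pika.BasicProperties(
                    reply_to=self.callback_queue,
                    correlation_id=self.correlation_id,
                ),
                body=json.dumps(payload),
            )
            while self.response is None:
                self.connection.process_data_events(time_limit=1)
            return self.response


    # Illustrative call: one backend component asking another for task state.
    # client = BrokerRpcClient()
    # reply = client.call("orchestrator.requests", {"type": "get_task_state", "task_id": "t-123"})

The correlation ID lets many concurrent requests share one reply queue, and because every exchange flows through named queues, the same traffic can be spread across multiple consumers for load balancing and tapped for observability.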


For secure AI agent communication, each Worker Agent establishes a WebSocket connection to our platform when deployed. This connection, maintained throughout the agent's lifecycle, enables secure bi-directional communication through enterprise firewalls. Because the agent dials out, Worker Agents can be deployed behind a firewall or NAT without any special network configuration.
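
As an illustration, the sketch below shows the general shape of such an agent-side connection, assuming Python's websockets library; the endpoint URL, message format, and handle_command helper are placeholders rather than our actual protocol.

    import asyncio
    import json

    import websockets

    PLATFORM_URL = "wss://platform.example.com/agent"  # placeholder endpoint


    async def handle_command(command: dict) -> dict:
        # Placeholder for the agent's real work: HID emulation, API calls, metadata collection.
        return {"status": "ok", "handled": command.get("type")}


    async def run_agent(agent_id: str):
        while True:  # reconnect loop: re-dial if the connection ever drops
            try:
                async with websockets.connect(PLATFORM_URL) as ws:
                    # The agent dials out, so no inbound firewall or NAT rules are needed.
                    await ws.send(json.dumps({"type": "register", "agent_id": agent_id}))
                    async for raw in ws:
                        command = json.loads(raw)
                        result = await handle_command(command)
                        await ws.send(json.dumps({"type": "result", "payload": result}))
            except (websockets.ConnectionClosed, OSError):
                await asyncio.sleep(5)  # back off briefly, then re-establish the connection


    # asyncio.run(run_agent("agent-001"))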


When a task execution starts, the Orchestrator schedules a Worker Agent to execute the task and requests metadata from the agent, routed via the broker and over the agent's WebSocket connection. When the agent responds, the message is processed by the Orchestrator and routed to the AI Engine. The AI Engine processes the message, runs its pipeline, and responds with the next action command, which the Orchestrator routes back to the agent. This feedback loop continues until the task execution finishes. If a user is connected to the Console, these message exchanges are also delivered to the Console over its WebSocket connection for live monitoring.
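
In condensed form, that loop looks roughly like the sketch below. The broker, agents, ai_engine, and console objects and their method names are assumptions made for illustration, not the platform's real interfaces.

    def run_task(task, broker, agents, ai_engine, console=None):
        # Pick a Worker Agent and ask it (via broker + WebSocket) for initial metadata.
        agent = agents.schedule(task)
        observation = broker.rpc(agent.queue, {"type": "collect_metadata", "task_id": task.id})

        while True:
            # The AI Engine runs its pipeline and decides on the next action.
            decision = ai_engine.next_action(task, observation)

            if console is not None:
                console.push(task.id, decision)  # mirror the exchange for live monitoring

            if decision["type"] == "finish":
                return decision["result"]

            # Route the action to the agent; its response feeds the next iteration.
            observation = broker.rpc(agent.queue, decision)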


The real-time capabilities of our platform enable sophisticated human-in-the-loop interactions that strike the right balance between automation and human oversight. Operations teams can seamlessly participate in task execution through approvals, take direct control of agents when needed, or provide additional context to guide agent decision-making. This flexibility ensures that enterprises maintain appropriate control while maximizing the benefits of automation.
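
One way to picture this is as an approval gate inside the loop sketched above: when the AI Engine flags a step as requiring human intervention, execution pauses until an operator responds through the Console. The method and field names here are again illustrative assumptions.

    def gate_on_approval(decision, task, console):
        if decision.get("needs_approval"):
            console.push(task.id, {"type": "approval_request", "action": decision})
            verdict = console.wait_for_approval(task.id)  # blocks until an operator responds
            if not verdict.approved:
                raise RuntimeError("Action rejected by operator")
            # Operators can attach extra context that is fed back to the AI Engine.
            decision["operator_context"] = verdict.context
        return decision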


All of these communication channels run over TLS, ensuring that sensitive enterprise data remains protected both within and outside the platform infrastructure. Moreover, our architecture supports customer-native deployment, meaning all sensitive operations and data remain within the enterprise's own infrastructure boundaries.
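
On the agent side, for example, enforcing this boils down to connecting over wss:// with an explicit SSL context, sketched here with Python's ssl module and the websockets library; the CA bundle path is a placeholder.

    import ssl

    import websockets

    # Verify the platform's certificate against an enterprise-provided CA bundle (placeholder path).
    ssl_context = ssl.create_default_context(cafile="/etc/enterprise/ca-bundle.pem")


    async def connect_secure():
        # wss:// plus an explicit SSL context: traffic is encrypted in transit and the
        # server identity is verified before any task data is exchanged.
        return await websockets.connect("wss://platform.example.com/agent", ssl=ssl_context)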


Real-time communication has been a foundational principle of our architecture from day one. By leveraging these key technologies in our distributed architecture, we provide the foundation for enterprise-scale automation while ensuring reliability and security in every interaction. Each component can scale independently to meet growing demands, supporting larger agent fleets, more complex workflows, and increasing numbers of concurrent automations. As enterprises expand their use of agentic AI to eliminate rework, our platform will grow with them—maintaining the speed, security, and control that organizations can trust with their critical operations.


Cheers,

Tiago Melo

Co-Founder and Chief Architect