Hermes Agent by Nous Research: the AI agent that actually cares about security

April 6, 2026·16 min readAI

I ran OpenClaw for about a month before I started paying attention to what it actually had access to.

The pitch was compelling. An open-source AI agent that connects to your messaging apps, runs on your own hardware, automates tasks around the clock. I set it up on a small VPS, linked it to Discord, installed a handful of skills from ClawHub, and let it handle some recurring tasks. It worked. It was fun. I didn't think much about what was happening underneath.

Then the security reports started dropping.

The OpenClaw problem

OpenClaw grew faster than almost anything in open-source history. Over 300,000 GitHub stars by early April 2026, a massive skill marketplace in ClawHub, tutorials everywhere. Peter Steinberger built something people genuinely wanted: an AI agent that lives in your messaging apps and does things for you.

The problem is what "does things for you" actually means. OpenClaw runs shell commands. It reads files. It accesses browser data. It stores your API keys in plaintext at ~/.openclaw/credentials/. And by default, it has no command allowlist, no approval requirements, and no restrictions on what it can execute. You install it, and it can do anything your user account can do.

I didn't fully appreciate this until January 2026, when the ClawHavoc campaign hit. Between January 27 and 29, a single threat actor uploaded 341 malicious skills to ClawHub. These weren't subtle. They included keyloggers and Atomic Stealer malware. One skill had over 340,000 installs before anyone caught it. It silently exfiltrated credentials and installed a cryptominer.

That was just the beginning. By February, researchers had found 1,467 malicious skills total on ClawHub. Snyk's ToxicSkills audit found that 36% of all ClawHub skills contained detectable prompt injection. In March, OpenClaw disclosed 9 CVEs in 4 days, including CVE-2026-32922, a privilege escalation flaw rated CVSS 9.9 that allowed full system access through token scope misuse. There was also CVE-2026-25253, a WebSocket hijacking vulnerability rated CVSS 8.8. Clicking a single malicious link while OpenClaw was running on your machine gave the attacker full control of the agent, including shell access.

Security researchers found between 63,000 and 135,000 exposed OpenClaw instances on the public internet.

I looked at my own setup. My API keys were sitting in a plaintext file. The agent had unrestricted shell access. I had installed skills from ClawHub without checking what they did. I was exactly the kind of user these attacks were designed for.

That's when I started looking for alternatives.

Enter Nous Research

If you've spent any time running open-source models locally, you've probably used a Hermes model without thinking about it. Nous Research has been training instruction-tuned models since the early days of open-weight LLMs, and their Hermes line has become one of the most widely used model families in the local AI space.

Hermes 3 is available on Ollama in sizes from 3B up to 405B, built on Llama 3.1 (with the smaller variants using other bases like Llama 3.2). These models are designed to be neutrally aligned and highly steerable, meaning they follow the system prompt faithfully rather than imposing heavy-handed content filters. You tell the model what kind of assistant it should be, and it listens.

Hermes 4, released in August 2025, added hybrid reasoning. The family spans multiple base architectures: the 14B is built on Qwen 3, the 70B and 405B on Llama 3.1. All of them can toggle between standard responses and deeper chain-of-thought reasoning by including or omitting a <think> tag. The training dataset expanded to roughly 19 billion tokens across 5 million samples, with reasoning traces up to 16,000 tokens long. These are serious models built by people who understand what makes a good foundation for downstream applications.

That matters because in February 2026, Nous Research released something different: an agent built on top of that foundation.

What Hermes Agent actually is

Hermes Agent is a self-improving AI agent. It launched as v0.1.0 on February 25, 2026, and by early April it was already at v0.7.0. The tagline is "the agent that grows with you," which sounds like marketing until you see it working.

The basics are familiar. It's a CLI-first agent that can also connect to Telegram, Discord, Slack, WhatsApp, Signal, and email through a gateway process. It runs shell commands, searches the web, reads and writes files, and operates on a loop. You give it tasks, it executes them.

Where it diverges from OpenClaw is in three areas: how it learns, how it handles security, and how opinionated it is about defaults.

Installation

Getting started is genuinely simple:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc
hermes

The installer detects your system, pulls dependencies (Python, Node.js, ripgrep, ffmpeg), clones the repo, sets up a virtual environment, and drops a global hermes command in your path. The only prerequisite is Git. The whole process takes about 5 minutes.

Running hermes drops you into an interactive CLI conversation. The first time, it walks you through model configuration. I pointed it at OpenRouter and set Claude Sonnet 4.6 as my default model. Later I switched between models depending on the task:

hermes model    # choose your LLM provider and model
hermes tools    # configure which tools are enabled
hermes gateway  # enable messaging platforms
hermes setup    # configure everything at once

The CLI is the primary interface, and it's good. Clean, responsive, no unnecessary chrome. After a week I added Discord as a secondary channel with hermes gateway, mostly so I could send it tasks from my phone. The multi-channel architecture is smart: one gateway process receives messages from all platforms and routes them into the same session store. A conversation started in the CLI can continue on Discord with full context.

Model flexibility

Hermes Agent doesn't lock you into Hermes models. It supports over 200 models through OpenRouter, plus direct connections to the Nous Portal, OpenAI, and custom endpoints. I used Claude Sonnet 4.6 for complex reasoning tasks and the latest GLM model for general conversation, both via OpenRouter. For lighter tasks, I pulled models down through Ollama and ran them locally.

If you've read my post on running local LLMs on a consumer GPU, the setup is the same. Ollama handles the model serving, Hermes connects to it as a provider. If you want a fully local, fully private agent, you can run the entire stack on your own hardware without any external API calls.

The learning loop

This is what separates Hermes from everything else I've tried.

Most AI agents are stateless between sessions. You close the conversation, the context is gone. The next time you start a task, you're starting from scratch. OpenClaw has some session persistence, but it's basic. Hermes takes a fundamentally different approach.

graph LR
    A[Complex Task] --> B[Skill Created]
    B --> C[Skill Used on Similar Task]
    C --> D[Skill Refined]
    D --> C
    E[Periodic Nudge] --> F[Memory Persisted]
    F --> C

Skills as procedural memory

I wrote about how agent skills work recently in the context of coding assistants. Hermes takes the concept further. After you complete a complex task (typically one involving 5 or more tool calls), Hermes can autonomously create a skill. A skill is a structured markdown document that captures the procedure, the pitfalls encountered, and the verification steps. It's procedural memory, written down.

The next time you ask for something similar, Hermes doesn't start from scratch. It finds the relevant skill, follows the procedure, and skips the mistakes it already made. And here's the part that surprised me: skills improve during use. If Hermes follows a skill and discovers a better approach or a new edge case, it updates the skill document. The procedure gets tighter over time.

I noticed this after about a week. I had it set up a monitoring check for one of my services. The first time was a standard back-and-forth. The second time I asked for something similar, it was noticeably faster and asked fewer clarifying questions. It had written itself a playbook and was following it.

Cross-session memory

Hermes separates memory into four layers, each stored on disk and loaded at specific moments. The implementation uses FTS5 full-text search with LLM summarization for cross-session recall. The agent can search its own past conversations and surface relevant context without you having to re-explain things.

There's a mechanism called a periodic nudge that drives this. At set intervals during a session, the agent receives an internal prompt asking it to evaluate whether anything from the current conversation is worth persisting. It scans recent activity and writes to memory files if something crosses the threshold. You don't manage this. It happens in the background.

The result is an agent that builds a model of who you are across sessions. Not in a creepy way. In a practical way. It remembers your preferences, your project structure, your common requests. After three weeks, my Hermes instance felt noticeably more useful than it did on day one.

User profiling with Honcho

Hermes uses something called Honcho's dialectic approach for user modeling. In practice, this means it doesn't just remember facts about you. It builds a structured understanding of how you work. What kind of tasks you delegate, how much detail you provide, what level of autonomy you expect. This feeds back into how it communicates and what it assumes.

I'm not sure I'd call it a personality, but it's more than a preference file. After a few sessions, the agent's responses started matching my communication style without me having to tune a system prompt. It was picking up patterns.

Security done right

This is where the comparison gets stark.

OpenClaw's security model is "personal assistant." One trusted operator, potentially many agents. The problem is the defaults. Out of the box, there are no guardrails. You have to actively harden it, and most people don't. The security documentation exists, but the default install is wide open.

Hermes inverts this. The defaults are secure, and you opt out of protections rather than opting in. Here's what that looks like in practice:

Seven-layer defense

Hermes implements what it calls a defense-in-depth security model. Seven layers, each addressing a different attack surface:

User authorization. Unknown users receive a one-time pairing code (8 characters, cryptographic randomness, 1-hour TTL). The bot owner approves via the CLI. Rate-limited to prevent brute force: 10-minute cooldown, max 3 pending codes, 5-attempt lockout.
Dangerous command approval. The agent checks commands against a curated list of dangerous patterns: recursive deletes, permission changes, filesystem formatting, SQL destructive operations, piping remote content to shells. If a match is found, you have to explicitly approve. The default is manual mode, where every dangerous command requires human confirmation.
Container isolation. When running in Docker, the agent drops ALL Linux capabilities and selectively adds only three (DAC_OVERRIDE, CHOWN, FOWNER). "No-new-privileges" blocks escalation. Process limit: 256. The container becomes the security boundary, and dangerous command checks are skipped inside it because destructive commands can't escape.
MCP credential filtering. Error messages are sanitized to strip GitHub PATs, OpenAI-style keys, bearer tokens, and any parameters containing "token," "key," "password," or "secret." Credentials are mounted read-only in Docker containers.
Context file scanning. Hermes scans AGENTS.md, .cursorrules, and similar files for prompt injection attempts, credential leakage patterns, and invisible Unicode characters before processing them.
Cross-session isolation. Sessions can't access each other's data. Cron job storage paths are hardened against path traversal attacks.
Input sanitization. Working directory parameters are validated against an allowlist to prevent shell injection.

SSRF protection

This is always active. Hermes blocks requests to private networks (RFC 1918), loopback addresses, link-local, CGNAT spaces, cloud metadata hostnames, and reserved addresses. DNS failures are treated as blocked. Redirect chains are re-validated at each hop. This is fail-closed by design.

The fail-closed philosophy

The approval timeout is 60 seconds by default. If you don't respond to a dangerous command prompt, the command is denied. Not approved. Denied. This is the opposite of how most tools handle timeouts, and it's the right call. An agent with shell access that defaults to "yes" when you're not paying attention is a liability.

There is a YOLO mode (--yolo) for when you want to disable approval prompts. It's a per-session toggle, not a permanent setting. You have to actively choose to lower the guardrails every time.

HermesHub vs. ClawHub

The skill marketplace tells the whole story. ClawHub is an open marketplace where anyone can upload skills. As of early 2026, 36% of them contained detectable prompt injection. The ClawHavoc campaign uploaded 341 malicious skills in 3 days before anyone noticed.

HermesHub provides security-scanned skills. Skills go through a vetting process before distribution. The Hermes agent itself includes a Skills Guard that scans skill content for suspicious environment access patterns before installation. Is it perfect? Probably not. But it's a fundamentally different approach from "upload whatever you want and hope users read the code."

The CVE scoreboard

As of early April 2026:

OpenClaw: 9 CVEs disclosed in a 4-day window in March 2026. Maximum severity: CVSS 9.9. The WebSocket hijack (CVE-2026-25253) could be triggered by clicking a single malicious link.
Hermes Agent: Zero agent-specific CVEs.

Hermes is younger, so this comparison isn't entirely fair. A two-month-old project has had less time to accumulate vulnerabilities. But the architectural approach suggests this gap isn't just about age. Hermes was designed with security as a core concern, not bolted on after incidents forced the issue.

Side-by-side comparison

Here's how they compare on the features that matter:

Feature	OpenClaw	Hermes Agent
Default security posture	Open (no restrictions)	Locked down (fail-closed)
Command approval	None by default	Manual approval for dangerous commands
Credential storage	Plaintext files	Read-only mounts in containers, redacted in logs
Skill marketplace	ClawHub (unvetted)	HermesHub (security-scanned)
Self-improving skills	No	Yes, autonomous creation and refinement
Cross-session memory	Basic persistence	Four-layer memory with FTS5 search
User modeling	None	Honcho dialectic profiling
Messaging platforms	WhatsApp, Discord, Telegram, Slack, more	Telegram, Discord, Slack, WhatsApp, Signal, email
Model support	Any via API	200+ via OpenRouter, plus local via Ollama
GitHub stars	300,000+	~22,000 (growing fast)
First release	November 2025 (as Clawdbot)	February 25, 2026

The honest downsides

I'm recommending Hermes, but I'm not going to pretend it's perfect after three weeks.

The ecosystem is smaller. 22,000 stars versus 300,000. That means fewer community tutorials, fewer third-party integrations, fewer people on forums who've hit the same problem you're hitting. When I ran into an issue with Discord message formatting, I couldn't find a Stack Overflow answer. I had to read the source code.

It's two months old. The pace of development is impressive (v0.1.0 to v0.7.0 in under two months), but that's also a warning. APIs change. Configuration formats shift. A skill you write today might need adjustments after the next release. This is early-adopter territory.

Fewer pre-built skills. HermesHub has fewer skills than ClawHub. That's partly a security feature (the vetting process slows things down), but if you need a very specific integration, OpenClaw might have it and Hermes might not. The flip side is that the skills on HermesHub are less likely to steal your credentials.

Some messaging integrations are less mature. Discord worked well for me. I've heard mixed reports about WhatsApp and Signal setup compared to OpenClaw's more battle-tested implementations. If your primary use case is a WhatsApp bot, do your own testing.

The learning loop takes time. The self-improving skills system is Hermes's strongest feature, but you don't see it on day one. It took about a week of regular use before the skill creation and memory system started producing noticeable improvements. If you install it, run one task, and judge it, you're missing the point.

Coming from OpenClaw

If you're already running OpenClaw and want to switch, Hermes includes a built-in migration command:

hermes claw migrate

This imports your memories, skills, API keys, messaging settings, and persona files from an existing OpenClaw installation. I didn't use this since my OpenClaw setup was minimal, but it's there for people who've invested more heavily in the ecosystem.

The verdict

OpenClaw solved a real problem. It proved that people want AI agents that live in their messaging apps and do real work. The project grew faster than almost anything in open source history, and for good reason. The core idea is sound.

The execution, specifically around security, is not. An agent that has shell access and your API keys needs to be secure by default, not secure after you read the hardening guide. The ClawHavoc campaign, the CVEs, the exposed instances: these weren't edge cases. They were predictable consequences of shipping fast without guardrails.

Hermes Agent is built differently. The defaults are secure. The approval system is fail-closed. The skill marketplace is vetted. The container isolation drops capabilities instead of granting them. These are deliberate architectural choices, not patches applied after something went wrong.

And beyond security, the learning loop is genuinely new. After three weeks, my Hermes instance is measurably better at the tasks I give it than it was on day one. It created skills from my workflows, remembers context across sessions, and adapted to how I communicate. OpenClaw doesn't do any of that.

If you're starting fresh with an AI agent in 2026, start with Hermes. If you're on OpenClaw and the security reports have you nervous, the migration path exists. The ecosystem is smaller and the project is younger, and I'd still pick it every time.

Hermes is the agent I trust with my terminal. After everything I've seen this year, that's not something I say lightly.

Sources

July 5, 2026·10 min readAI

Why I built solidifai: parametric CAD through Claude Code

Starting every 3D printed part from a blank Fusion 360 viewport got old, so I built CAD for Claude Code.

April 11, 2026·15 min readAI

What agentic coding actually looks like

Agentic coding changed how I build software. Not in the way the hype suggests.

March 29, 2026·10 min readAI

How I would design an ad platform for LLMs

A technical breakdown of how a middleware ad layer for LLM APIs could work, why the economics demand it, and whether it should exist at all.

Enjoying the blog? Subscribe via RSS to get new posts in your reader.

Subscribe via RSS