Webinar

Live Demo: Claude Code autonomously investigates Cobalt Strike infection via LimaCharlie MCP

Eric Capuano, Founder of Digital Defense Institute

The most revealing moment in Eric Capuano's demonstration is not anything the AI does. It is the one thing it is not allowed to do. Capuano, founder of the Digital Defense Institute, connects Claude Code to LimaCharlie's MCP server, asks it to surface recent detections worth investigating, and then steps out of the way. The agent runs a full Cobalt Strike investigation across two Windows endpoints, live and unscripted, and reaches the point of recommending that both hosts be isolated. There it stops, because Capuano has drawn a line it cannot cross alone. That line, between the actions an agent may take on its own and the ones a human must approve, is what the session is actually about. Everything else is evidence for why getting that line right matters more than the model behind it.

The model is cheap, the context is the moat

Capuano is candid that the language model by itself is unremarkable. Run it cold, he says, with no prior instructions and no context, and it is "not terrible," even "pretty decent." That is faint praise, and he means it as such. What turns a competent reasoner into something that does real investigative work is not a smarter model but two things he supplies around it.

The first is reach. The MCP layer is what lets the agent touch the platform at all. With LimaCharlie's MCP server connected, Claude Code can pull detections, write and iterate on LCQL, enumerate live processes, fetch loaded modules, and check the current state of an endpoint rather than trusting historical data alone. Capuano validates the connection before he begins and notes that the agent has already read in the full set of exposed tools and their arguments, so it understands what is possible before it acts instead of discovering it mid-case.

The second is knowledge, encoded as layered instruction files. Claude Code reads a CLAUDE.md from the directory it launches in, and Capuano builds a hierarchy on top of that: universal instructions in his home directory, a project file, and nested files holding LCQL examples, a corpus of detection rules, and a collection of sample LimaCharlie events so the agent knows what real telemetry looks like before it writes a query. He keeps a chat archive of investigations that went well, exporting the strong runs so the agent can replay the most effective steps. The piece that matters most is the feedback loop. When a live case shows the agent veering somewhere strange or missing something, Capuano does not just correct it in the moment. He tells it to write the lesson back into the instruction files so that the guidance is not limited to this one investigation but carries into every future one. The model is interchangeable. The accumulated context is what no vendor ships you.
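The layering described above can be pictured as a walk from the most general instruction file down to the most specific one. What follows is a minimal sketch under stated assumptions, not a description of how Claude Code resolves its files internally: the function names `collect_instruction_files` and `build_context` are hypothetical, and the sketch assumes each layer lives in a file named CLAUDE.md inside the home directory or a directory beneath it.

```python
from pathlib import Path


def collect_instruction_files(start: Path, home: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return instruction files ordered from most general to most specific.

    Only `home` and directories beneath it are searched (an assumption of
    this sketch), so stray files higher up the tree are ignored.
    """
    layers = []
    # Walk from the top of the path down to the launch directory.
    for directory in [*list(start.parents)[::-1], start]:
        if directory != home and home not in directory.parents:
            continue
        candidate = directory / name
        if candidate.is_file():
            layers.append(candidate)
    return layers


def build_context(files: list[Path]) -> str:
    """Concatenate the layers so later, more specific files come last."""
    return "\n\n".join(f.read_text(encoding="utf-8") for f in files)
```

Ordering the layers from general to specific means a lesson written back into a nested file, such as one holding LCQL examples, is read after the universal guidance it refines.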

What autonomous triage actually buys you

The investigation is the proof that the stack does more than answer questions. The agent forms an early hypothesis of Cobalt Strike, reasoning that rundll32 making outbound network connections without the DLL you would expect in its command line is classic beacon behavior. Then it behaves like an analyst who refuses to take historical data at face value. It identifies the external IP as Google Cloud infrastructure, flags rundll32 running as SYSTEM as a possible sign of privilege escalation, pulls the OS versions (Server 2022 and Windows 10 Pro), and confirms against the live state of the host that the suspect process, PID 3788, is still running.
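The agent's opening hypothesis amounts to a simple rule: rundll32 with outbound connections and no library named in its command line. A rough sketch of that rule follows. The event is a plain dictionary with hypothetical field names (`file_path`, `command_line`, `outbound_connections`), not LimaCharlie's actual event schema, and the extension list is an assumption chosen for illustration.

```python
import re

# Extensions rundll32 is normally pointed at (an illustrative, incomplete list).
LIBRARY_ARG = re.compile(r"\.(dll|cpl|ocx)\b", re.IGNORECASE)


def looks_like_beacon_host(event: dict) -> bool:
    """Flag rundll32 with network activity but no library argument."""
    image = event.get("file_path", "").lower()
    if not image.endswith("rundll32.exe"):
        return False
    command_line = event.get("command_line", "").lower()
    # Keep only what follows the executable name, i.e. its arguments.
    arguments = command_line.split("rundll32.exe", 1)[-1]
    has_library = bool(LIBRARY_ARG.search(arguments))
    return event.get("outbound_connections", 0) > 0 and not has_library


suspect = {
    "file_path": r"C:\Windows\System32\rundll32.exe",
    "command_line": r"C:\Windows\System32\rundll32.exe",
    "outbound_connections": 3,
}
benign = {
    "file_path": r"C:\Windows\System32\rundll32.exe",
    "command_line": "rundll32.exe shell32.dll,Control_RunDLL",
    "outbound_connections": 1,
}
```

A rule like this is only a starting hypothesis. In the demonstration the agent goes on to check it against the live state of the host rather than treating the match as a verdict.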

Because the process is still active, it runs a find-strings against process memory looking for reflective loader artifacts and default beacon named pipes. That command times out, and the agent recovers without help, pivoting to enumerate the process's loaded modules instead. There it finds wininet.dll, a strong indicator of HTTP-based C2. When a thirty-minute query comes back empty, it widens the window to an hour and catches the beaconing, then chases parent-process lineage to start running down persistence. Capuano, who knows the ground truth in this lab, calls the final summary "100% accurate." The value here is not that the agent is brilliant. It is that the unglamorous, time-consuming work of triage, the iterating on queries and the recovering from dead ends, happened without an analyst babysitting it.
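The recovery from an empty query can be sketched as a loop that retries with a progressively wider time window. This is an illustration under assumptions: `run_query` is a hypothetical callable standing in for whatever executes the query, and the 240-minute step is invented for the example. The session itself only mentions widening from thirty minutes to an hour.

```python
from typing import Callable, Sequence


def query_with_widening(
    run_query: Callable[[int], Sequence[dict]],
    windows_minutes: Sequence[int] = (30, 60, 240),
) -> tuple[int, list[dict]]:
    """Try each window in order and return the first one that yields rows.

    Returns (window_used, rows). If every window is empty, returns the
    widest window tried together with an empty list.
    """
    for minutes in windows_minutes:
        rows = run_query(minutes)
        if rows:
            return minutes, list(rows)
    return windows_minutes[-1], []


def fake_query(minutes: int) -> list[dict]:
    """Stand-in data source: the beaconing is only visible in an hour or more."""
    return [{"dst_port": 443, "count": 12}] if minutes >= 60 else []
```

Capping the list of windows keeps the retry bounded, so a query that truly has no matches ends with an explicit empty result instead of widening indefinitely.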

The line a human draws, and why it scales

The case ran on its own precisely because Capuano had authorized it to. He lets the agent run read-only commands without asking, such as LCQL queries and endpoint enumeration, where the worst outcome is a query that pulls too much data, which the agent handles fine. What he deliberately withholds is anything that changes the environment. So when the agent reached high confidence and aggressively recommended isolating both hosts, it prepared the command and waited. The permission model is per-command and customizable, and Capuano stresses that this is the part that lets you "really make it yours."
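One way to picture a per-command permission model is an allowlist of read-only tools, with everything else routed to a human. The sketch below is illustrative only: `PermissionPolicy` and the tool names are hypothetical, and they are not the actual names exposed by LimaCharlie's MCP server or the configuration format Claude Code uses.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionPolicy:
    # Hypothetical read-only tools the agent may run without asking.
    auto_approved: set[str] = field(
        default_factory=lambda: {
            "run_lcql_query",
            "list_processes",
            "get_loaded_modules",
        }
    )

    def decide(self, tool_name: str) -> str:
        """Return "run" for allowlisted tools and "ask_human" for all others."""
        return "run" if tool_name in self.auto_approved else "ask_human"


policy = PermissionPolicy()
```

Because the default is to ask, a tool the policy has never seen is treated like a state-changing one. That matches the behavior in the demonstration, where the isolation command was prepared and then held for approval.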

That control is also what makes the whole approach defensible at scale, which is where Maxim picks it up in the closing exchange. A single product with one web console that works beautifully for an enterprise, he argues, is a non-starter for a service provider with five thousand customers. The unlock is treating prompts and agents as infrastructure as code, with agents operating on variables, so an operator can automate a large share of the job without standing up new infrastructure for every client. For a provider, he says plainly, that kind of scaling is "life and death." The demonstration shows why ownership of the context and control over the actions are not separate problems. One lets you encode your expertise everywhere at once. The other lets you trust the result enough to act on it.
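Treating prompts as infrastructure as code, with agents operating on variables, can be sketched as one template rendered once per tenant. This is a hedged illustration rather than LimaCharlie's mechanism: the prompt wording, the variable names (`tenant_name`, `org_id`, `escalation_contact`), and the tenant records are all invented for the example.

```python
from string import Template

# One prompt definition, kept in version control alongside other configuration.
TRIAGE_PROMPT = Template(
    "Investigate recent detections for tenant $tenant_name (org $org_id). "
    "Ask $escalation_contact for approval before any action that changes an endpoint."
)


def render_for_tenants(template: Template, tenants: list[dict]) -> dict[str, str]:
    """Render the same template for every tenant, keyed by org id.

    Template.substitute raises KeyError when a variable is missing, so an
    incomplete tenant record fails loudly instead of producing a bad prompt.
    """
    return {tenant["org_id"]: template.substitute(tenant) for tenant in tenants}


tenants = [
    {"tenant_name": "Acme", "org_id": "org-001", "escalation_contact": "soc@acme.example"},
    {"tenant_name": "Globex", "org_id": "org-002", "escalation_contact": "ir@globex.example"},
]
prompts = render_for_tenants(TRIAGE_PROMPT, tenants)
```

Adding a customer then means adding a record, not standing up new infrastructure, which is the scaling property the closing exchange describes.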

