CAI with MCP in Action

The use case

Modern web application security faces escalating challenges: complex JavaScript frameworks, intricate API ecosystems, and increasingly sophisticated attack vectors that demand rapid assessment. Security professionals work under tight deadlines to identify vulnerabilities in applications that evolve continuously, with manual testing often struggling to keep pace with development cycles. The need for automated, intelligent security assessment has never been more critical.

To address these demands, security teams are adopting CAI, an advanced AI security framework that integrates with web browsers via MCP through Developer Tools. CAI's autonomous agents perform comprehensive security assessments by analyzing application behavior, identifying vulnerabilities, and executing exploitation strategies that would typically require hours of manual testing. The result is accelerated vulnerability discovery, consistent testing coverage, and enhanced ability to identify complex security flaws.

This case study demonstrates how CAI, integrated with Google Chrome, enables AI-assisted web application security assessments that are thorough, efficient, and reproducible. We showcase real workflows in which CAI autonomously analyzes the OWASP Juice Shop application, identifies security weaknesses, and successfully exploits vulnerabilities, all under human supervision. These capabilities represent a significant advancement in automated security testing, particularly for complex web applications where traditional scanning tools often fall short.
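
For readers unfamiliar with MCP (Model Context Protocol), the sketch below shows roughly what a single tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `navigate_page`, the target URL, and the helper `make_tool_call` are assumptions made for illustration; they are not taken from the CAI sessions described here.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects one id per request.
_ids = count(1)


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and local test URL, for illustration only.
request = make_tool_call("navigate_page", {"url": "http://localhost:3000"})
print(request)
```

An agent framework would send a message like this to the browser-side MCP server and read the tool result from the matching JSON-RPC response.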

CAI with MCP in Action

In this demonstration, CAI autonomously identifies and exploits two major security flaws in the OWASP Juice Shop application by interacting directly with Chrome's Developer Tools. It first uncovers poor error handling by analyzing input validation, crafting malformed requests, and triggering unhandled exceptions that reveal sensitive system details. It then exploits a directory traversal weakness in the file download feature, manipulating file paths to bypass access controls and retrieve a restricted document. Together, these outcomes illustrate how inconsistent error management and broken authorization can expose critical data, and demonstrate CAI's ability to detect and validate such vulnerabilities rapidly through MCP.
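
To make the first finding concrete, here is a minimal sketch of how a response body could be checked for leaked internals such as stack frames or server file paths. The patterns, the function `find_error_leaks`, and both sample bodies are hypothetical; they are not output captured from Juice Shop or from CAI.

```python
import re

# Illustrative patterns only; a real assessment would tune these per target.
LEAK_PATTERNS = {
    # A Node.js-style stack frame line, e.g. "at fn (/path/file.js:12:7)".
    "stack_frame": re.compile(r"^\s*at .+:\d+:\d+\)?$", re.MULTILINE),
    # An exception class name followed by its message.
    "error_class": re.compile(r"\b\w*Error: "),
    # An absolute server-side path to a JavaScript source file.
    "filesystem_path": re.compile(r"(?:/[\w.-]+){2,}\.js\b"),
}


def find_error_leaks(body: str) -> list[str]:
    """Return the sorted names of all leak patterns found in a response body."""
    return sorted(name for name, pattern in LEAK_PATTERNS.items() if pattern.search(body))


# Made-up response bodies for illustration.
leaky_body = (
    "TypeError: Cannot read properties of undefined\n"
    "    at handle (/app/routes/handler.js:12:7)\n"
)
clean_body = '{"status": "error", "message": "Invalid request"}'

print(find_error_leaks(leaky_body))
print(find_error_leaks(clean_body))
```

A response that matches any of these patterns after a malformed request is a candidate for the kind of poor error handling described above, and would still need human review before being reported.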

Cybersecurity AI (CAI), the framework for AI Security

CAI is the leading open-source framework that democratizes advanced security testing through specialized AI agents. Backed by the EU and used by thousands of researchers, CAI brings autonomous analysis, vulnerability identification, and exploitation capabilities to complex environments. When integrated with web browsers via MCP, CAI enables fast, reproducible, AI-assisted security assessments, supporting security professionals as they tackle advanced web application challenges.

As cybersecurity moves toward autonomous operations by 2028, CAI's human-supervised, AI-powered approach becomes essential for scaling security assessments across both traditional web applications and emerging technologies where complexity and attack surfaces are rapidly expanding.

About OWASP Juice Shop

OWASP Juice Shop is an intentionally insecure web application written in JavaScript/Node.js, designed specifically for security testing and education. It encompasses the entire OWASP Top Ten and numerous other security vulnerabilities, making it an ideal benchmark for security testing tools and methodologies. Running within a Docker container, it provides a consistent, isolated environment that accurately represents modern web application architectures while allowing safe exploitation of vulnerabilities.

Security professionals using Juice Shop confront challenges such as identifying complex business logic flaws, bypassing authentication mechanisms, exploiting client-side vulnerabilities, and navigating intricate API endpoints. By leveraging AI-driven frameworks like CAI within their testing workflows, security teams demonstrate a forward-thinking approach, expanding the boundaries of automated security testing and enabling more thorough assessments of web applications in less time.

Time for the exercise

Human: 600 min vs CAI: 3.5 min

x171 FASTER



Cost

Human: 154.80 € vs CAI: 0.17 €

x911 CHEAPER
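
The two multipliers follow directly from the figures above. The short calculation below reproduces them; the implied hourly rate is derived from the stated human cost and time, and is not a figure given separately in this case study.

```python
# Figures as stated in the comparison above.
human_minutes, cai_minutes = 600, 3.5
human_cost_eur, cai_cost_eur = 154.80, 0.17

speedup = human_minutes / cai_minutes        # about 171.4
cost_ratio = human_cost_eur / cai_cost_eur   # about 910.6

# Hourly rate implied by the human figures (derived, not stated in the source).
implied_hourly_rate = human_cost_eur / (human_minutes / 60)

print(f"x{round(speedup)} faster, x{round(cost_ratio)} cheaper")
print(f"implied human rate: {implied_hourly_rate:.2f} EUR/h")
```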

🎯 THE CHALLENGE

Penetration testers and security researchers face persistent bottlenecks in web application security assessments:

  • Manual analysis of complex application architectures (e.g., JavaScript frameworks, SPAs, and microservices), requiring deep expertise to map attack surfaces
  • Time-consuming vulnerability discovery in large codebases, where business logic flaws and authorization bypasses evade automated scanners
  • Iterative exploit development for multi-step attacks (e.g., chaining XSS with CSRF or IDOR), demanding manual debugging and payload refinement
  • Reconnaissance overhead to catalog endpoints, APIs, and hidden functionality before identifying viable attack vectors
  • Expertise-dependent tasks like crafting context-specific payloads (e.g., SSTI templates or deserialization gadgets) or analyzing client-side vulnerabilities

These challenges intensify in modern web ecosystems, where applications evolve rapidly, APIs proliferate, and security flaws manifest in subtle interactions between frontend, backend, and third-party components. The question arose: could AI autonomously accelerate the most labor-intensive steps of web application testing while maintaining the rigor of human analysis?

🛡️ THE SOLUTION

Security teams integrated CAI with Google Chrome via MCP (DevTools) to automate the most technically demanding and repetitive aspects of web application testing. Under human supervision, CAI executed:

  • Autonomous application mapping, identifying endpoints, APIs, and hidden functionality through dynamic DOM analysis and network traffic inspection
  • Intelligent vulnerability discovery, probing for inconsistent error handling, authorization flaws, and business logic weaknesses across the OWASP Juice Shop's JavaScript frontend and Node.js backend
  • Automated exploit development, crafting payloads to trigger unhandled errors (e.g., malformed API requests) and bypass access controls (e.g., directory traversal via path manipulation)
  • Iterative payload refinement, adjusting parameters, headers, and request structures in real time based on application responses (e.g., tuning file paths to evade filters)
  • Validation and documentation, confirming successful exploitation (e.g., capturing stack traces or exfiltrated files) and generating structured reports

Each challenge was solved in minutes instead of hours, demonstrating how AI-driven browser automation accelerates vulnerability validation while maintaining the precision of manual testing.
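
The iterative refinement step listed above can be sketched as a simple loop: generate path variants, send each one, and stop at the first that the target accepts. This is a simplified illustration, not CAI's actual logic. The names `path_candidates`, `refine`, and `fake_target` are hypothetical, and the target is a local stand-in function so the example runs without any network access.

```python
from typing import Callable, Iterable, Iterator, Optional
from urllib.parse import quote


def path_candidates(directory: str, filename: str) -> Iterator[str]:
    """Yield progressively altered request paths for one target file."""
    yield f"/{directory}/{filename}"                    # plain request
    yield f"/{directory}/./{filename}"                  # redundant path segment
    encoded_up = quote("../", safe="")                  # percent-encoded "../"
    yield f"/public/{encoded_up}{directory}/{filename}"


def refine(candidates: Iterable[str], send: Callable[[str], int]) -> Optional[str]:
    """Return the first candidate that `send` answers with HTTP 200, else None."""
    for path in candidates:
        if send(path) == 200:
            return path
    return None


def fake_target(path: str) -> int:
    """Stand-in for an application that blocks the plain path but not one variant."""
    return 200 if "/./" in path else 403


print(refine(path_candidates("files", "report.md"), fake_target))
```

Passing `send` in as a parameter keeps the loop independent of transport. In a setup like the one described here, that callable would wrap a browser action issued through MCP rather than a local function.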

🔬 KEY ARTIFACTS

  • Vulnerability Analysis Reports: Detailed documentation of the Error Handling and Confidential Document flaws, including CAI's methodology for identifying inconsistent error responses and directory traversal vectors
  • Automated Exploit Payloads: HTTP request templates and JavaScript snippets crafted by CAI to trigger unhandled errors (e.g., malformed API payloads) and bypass file access controls (e.g., path-traversal sequences)
  • Browser Interaction Logs: Full MCP/DevTools session logs capturing CAI's real-time DOM manipulation, network traffic inspection, and JavaScript execution within Chrome
  • Exploitation Workflow Documentation: Step-by-step breakdowns of CAI's reasoning loops, including how it iteratively refined payloads based on application feedback (e.g., adjusting file paths to evade filters)
  • Proof of Exploit Evidence: Video clips and screenshots showcasing successful exploitation: stack traces exposed via the Error Handling challenge and the exfiltrated confidential document from the restricted directory
  • Remediation Guidance: Initial recommendations generated by CAI for mitigating identified vulnerabilities, such as input sanitization and consistent error-handling implementation

All findings were consolidated into a single continuous-assessment dashboard accessible to security teams, technical leads, and operations managers.
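
The remediation themes mentioned above (input sanitization and consistent error handling) can be illustrated with a short sketch. It is written in Python for consistency with the other examples, whereas Juice Shop itself runs on Node.js, and it is not the guidance CAI generated. The function names and messages are hypothetical.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def resolve_download(base_dir: str, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, rejecting paths that escape it."""
    base = Path(base_dir).resolve()
    target = (base / requested).resolve()
    # Path.is_relative_to requires Python 3.9 or later.
    if not target.is_relative_to(base):
        raise PermissionError("requested path is outside the download directory")
    return target


def safe_error_body(exc: Exception) -> dict:
    """Log full details server-side and return a uniform, detail-free error body."""
    logger.error("request failed", exc_info=exc)
    return {"status": "error", "message": "The request could not be processed."}
```

Resolving the path before checking it means that sequences such as `../` are collapsed first, so the containment check operates on the location that would actually be read. Returning the same generic body for every failure keeps stack traces and file paths out of client responses.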

✅ RESULTS ACHIEVED

  • Rapid vulnerability resolution: Both the Error Handling and Confidential Document challenges in OWASP Juice Shop were solved in under 10 minutes total, demonstrating CAI's ability to accelerate exploit development by 10-15x compared to manual testing
  • Automated exploit generation: CAI autonomously crafted working payloads for inconsistent error handling (triggering stack trace exposure) and directory traversal (accessing restricted files), eliminating hours of manual trial-and-error
  • End-to-end workflow validation: Successful integration of CAI with Chrome's MCP/DevTools proved the feasibility of AI-driven browser automation for real-world web security assessments
  • Human-supervised safety: All exploits were executed under human oversight, ensuring auditable reasoning loops and responsible vulnerability disclosure practices
  • Scalability demonstrated: The approach validated CAI's adaptability to modern web architectures (JavaScript/Node.js), with clear applicability to broader domains like API security and cloud-native applications
  • Foundation for expansion: Results confirmed CAI's potential to extend beyond Juice Shop to complex enterprise web apps, where similar vulnerabilities (e.g., IDOR, insecure error handling) remain prevalent

KEY BENEFITS

🤖 Accelerated Exploit Discovery
⚡ Automated Complex Testing
🎯 Human-Supervised Security