
AgentDojo-Inspect

Metadata Updated: March 14, 2025

AgentDojo-Inspect is a codebase created by the U.S. AI Safety Institute to facilitate research into agent hijacking and defenses against it. Agent hijacking is a type of indirect prompt injection [1] in which an attacker inserts malicious instructions into data that may be ingested by an AI agent, causing it to take unintended, harmful actions.

AgentDojo-Inspect is a fork of the original AgentDojo repository [2], which was created by researchers at ETH Zurich [3]. This fork extends the upstream AgentDojo in four key ways:

1. It adds an Inspect bridge that allows AgentDojo evaluations to be run using the Inspect evaluations framework [4].
2. It fixes some bugs in the upstream AgentDojo's task suites (most of these fixes have been merged upstream). It also removes certain tasks that are of low quality.
3. It adds new injection tasks in the Workspace environment that involve mass data exfiltration (these have since been merged upstream).
4. It adds a new terminal environment and associated tasks that test for remote code execution vulnerabilities in that environment.

[1] Greshake K, Abdelnabi S, Mishra S, Endres C, Holz T, Fritz M (2023) Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection (arXiv), arXiv:2302.12173. https://doi.org/10.48550/arXiv.2302.12173
[2] Edoardo Debenedetti (2025) ethz-spylab/agentdojo. Available at https://github.com/ethz-spylab/agentdojo.
[3] Debenedetti E, Zhang J, Balunović M, Beurer-Kellner L, Fischer M, Tramèr F (2024) AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents (arXiv), arXiv:2406.13352. https://doi.org/10.48550/arXiv.2406.13352
[4] UK AI Safety Institute (2024) Inspect AI: Framework for Large Language Model Evaluations. Available at https://github.com/UKGovernmentBEIS/inspect_ai.

Access & Use Information

Public: This dataset is intended for public access and use. License: https://www.nist.gov/open/license

Downloads & Resources

References

https://www.nist.gov/news-events/news/2025/01/technical-blog-strengthening-ai-agent-hijacking-evaluations

Dates

Metadata Created Date March 14, 2025
Metadata Updated Date March 14, 2025
Data Update Frequency irregular

Metadata Source

Harvested from NIST

Additional Metadata

Resource Type Dataset
Metadata Created Date March 14, 2025
Metadata Updated Date March 14, 2025
Publisher National Institute of Standards and Technology
Maintainer
Identifier ark:/88434/mds2-3690
Data First Published 2025-02-18
Language en
Data Last Modified 2025-02-06 00:00:00
Category Information Technology:Cybersecurity
Public Access Level public
Data Update Frequency irregular
Bureau Code 006:55
Metadata Context https://project-open-data.cio.gov/v1.1/schema/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 5b325865-e5bd-42b8-8bf9-9cec1a70d72a
Harvest Source Id 74e175d9-66b3-4323-ac98-e2a90eeb93c0
Harvest Source Title NIST
Homepage URL https://data.nist.gov/od/id/mds2-3690
License https://www.nist.gov/open/license
Program Code 006:052
Related Documents https://www.nist.gov/news-events/news/2025/01/technical-blog-strengthening-ai-agent-hijacking-evaluations
Source Datajson Identifier True
Source Hash 3e1d2fa4ae2ff673a1a442690b8ca71b3743bae405b294706ec30e1ef60307f8
Source Schema Version 1.1
