The Trojan Detection Challenge 2023 (LLM Edition) is a NeurIPS 2023 competition focused on advancing methods for detecting hidden functionality in large language models (LLMs). The competition features two tracks: Trojan Detection and Red Teaming. The Trojan Detection Track challenges participants to identify triggers for hidden behaviors (trojans) in LLMs. The Red Teaming Track tasks participants with developing automated methods to elicit specific undesirable behaviors. There's a $30,000 prize pool, and winning teams will be invited to co-author a publication and present at a NeurIPS workshop. The competition uses open-source LLMs (Pythia and Llama-2-chat) and encourages participants to share their methods.
• red teaming
• competition
• llm safety
• trojan detection
The rules are available [here](index.html#rules).
The rules may change, with participant consent, if a change is urgently needed.
Contact us at tdc2023-organizers@googlegroups.com
The competition is open to the public.
You can register anytime during the competition.
Teams can have any number of members. Solo teams are allowed.
See the [Getting Started](start.html) page.
Each team may make 5 submissions per day during the validation phase and 5 submissions in total during the test phase. Only one account is allowed per team.
We encourage participants to share their methods; winning teams must share theirs with the organizers.
Details of the Trojan Detection Track are [here](tracks.html#trojan-detection).
Details of the Red Teaming Track are [here](tracks.html#red-teaming).
The baselines are well-known text optimization and red teaming methods from the academic literature.
The Trojan Detection Track uses open-source Pythia LLMs to allow broader participation; the Red Teaming Track uses Llama-2-chat because of its robustness.
We use the simplest trojan attack for its resemblance to the red teaming task, fostering connections between communities.
Both are used in the literature; "trojans" is used for better flow.