
Automation framework

This section introduces our automation framework and explains how to use it.

Data Loader

Our framework loads challenges in two ways: from Docker containers that act as challenge servers, or from local challenge files. Details of how our dataset is organized can be found here.

At the start of challenge setup, the framework scans the challenge information to determine whether the challenge uses a Docker container. If it does, the framework loads the configuration from the docker-compose.yml file, pulls the image, and starts the container. Once the framework solves a CTF challenge, it stops all Docker containers and removes the loaded Docker images from the environment.
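The lifecycle described above could be sketched roughly as follows. This is an illustrative outline, not the framework's actual loader: the function names (find_compose_file, setup_commands, teardown_commands, run_all) are hypothetical, and it assumes that a challenge "has a container" whenever a docker-compose.yml file sits in the challenge directory. The Docker commands themselves (docker compose pull, up -d, and down --rmi all) are standard Docker Compose CLI calls.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

COMPOSE_FILE = "docker-compose.yml"


def find_compose_file(challenge_dir: Path) -> Optional[Path]:
    """Return the compose file for a challenge, or None for file-only challenges.

    Assumption: the presence of docker-compose.yml marks a server challenge.
    """
    candidate = challenge_dir / COMPOSE_FILE
    return candidate if candidate.is_file() else None


def setup_commands(compose_file: Path) -> List[List[str]]:
    """Commands that pull the challenge image and start it in the background."""
    base = ["docker", "compose", "-f", str(compose_file)]
    return [base + ["pull"], base + ["up", "-d"]]


def teardown_commands(compose_file: Path) -> List[List[str]]:
    """Command that stops the containers and removes the images they used."""
    base = ["docker", "compose", "-f", str(compose_file)]
    return [base + ["down", "--rmi", "all"]]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command (dry run) or execute it, failing on the first error."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

With dry_run left at its default, run_all(setup_commands(path)) only prints the commands, which is a convenient way to inspect what would be executed before touching the local Docker environment.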

External Tools

Our framework provides models with access to domain-specific tools that improve their ability to solve CTF challenges. For example:

* run_command: Enables the LLM to execute commands within an Ubuntu 22.04 Docker container equipped with essential tools
* createfile: Generates a file inside the Docker container
* disassemble and decompile: Use Ghidra to disassemble and decompile a specified function in a binary
* check_flag: Allows the LLM to verify the correctness of a discovered flag in a CTF challenge
* give_up: Allows the LLM to stop its efforts on a challenge
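To make the idea of a tool set concrete, here is a minimal sketch of how tools like these might be registered and dispatched by name. It is not the framework's implementation: the Tool and ToolRegistry classes and the make_check_flag helper are hypothetical, and only two of the simpler tools (check_flag and give_up) are modeled, with simplified behavior.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """A named capability exposed to the model (hypothetical structure)."""
    name: str
    description: str
    handler: Callable[..., Any]


class ToolRegistry:
    """Maps tool names to handlers and dispatches calls requested by the model."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, **kwargs: Any) -> Any:
        # Unknown tools return an error payload instead of raising, so the
        # message can be fed back to the model as an observation.
        if name not in self._tools:
            return {"error": f"unknown tool: {name}"}
        return self._tools[name].handler(**kwargs)


def make_check_flag(real_flag: str) -> Callable[[str], Dict[str, bool]]:
    """Build a check_flag handler bound to one challenge's flag."""
    def check_flag(flag: str) -> Dict[str, bool]:
        return {"correct": flag.strip() == real_flag}
    return check_flag


def give_up() -> Dict[str, bool]:
    """Signal that the model wants to stop working on the challenge."""
    return {"give_up": True}


registry = ToolRegistry()
registry.register(Tool("check_flag", "Verify a candidate flag", make_check_flag("flag{demo}")))
registry.register(Tool("give_up", "Stop working on the challenge", give_up))
```

A call such as registry.call("check_flag", flag="flag{demo}") then returns a small dictionary that can be serialized and handed back to the model as the tool result.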

For LLMs that do not support built-in function calling, a formatting module transforms the prompt information into a format suitable for function calling (XML and YAML). The tools are located in the tools folder.
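As a rough illustration of what such a formatting layer has to do on the parsing side, the sketch below extracts one XML-formatted tool call from free-form model output. The tag layout (function_call, name, arguments) and the parse_tool_call function are assumptions made for this example and may differ from the framework's actual format. Only the XML variant is shown, since it can be handled with the Python standard library.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

# Assumed wire format (hypothetical):
# <function_call><name>TOOL</name><arguments><arg>value</arg></arguments></function_call>
CALL_PATTERN = re.compile(r"<function_call>.*?</function_call>", re.DOTALL)


def parse_tool_call(model_output: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (tool_name, arguments) for the first call found, or None."""
    match = CALL_PATTERN.search(model_output)
    if match is None:
        return None
    try:
        root = ET.fromstring(match.group(0))
    except ET.ParseError:
        # Malformed XML from the model is treated as "no call".
        return None
    name = root.findtext("name")
    if not name or not name.strip():
        return None
    args: Dict[str, str] = {}
    args_node = root.find("arguments")
    if args_node is not None:
        for child in args_node:
            args[child.tag] = (child.text or "").strip()
    return name.strip(), args
```

Returning None for missing or malformed calls lets the caller decide whether to re-prompt the model or to treat the output as plain text.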