Exercise: Finding Bugs with Atheris in Python
Goal
In this exercise, you will use Atheris, a coverage-guided fuzzer for Python, to find a bug in a small date conversion utility.
Unlike ordinary unit tests, Atheris repeatedly mutates inputs and uses coverage feedback to explore new execution paths. Your job is to build a fuzz target, run it, and use the discovered failure to debug the program.
Setup
Install Atheris:
1 | pip install atheris |
Create a working directory for this exercise and place the files below inside it.
Starter Code
Create a file called date_utils.py:
1 | from datetime import datetime |
Warm-Up
Before fuzzing, manually try a few examples:
1 | from date_utils import normalize_date |
Now try this:
1 | from date_utils import normalize_date |
What will happen? Can you fix the bug?
Now let’s use Atheris to try to expose more issues.
Atheris Primer
Atheris fuzzes a function usually named TestOneInput(data: bytes).
It repeatedly generates and mutates byte strings, then feeds them to your target.
A typical Atheris fuzzer has this shape:
1 | import sys |
In many cases, raw bytes are not directly useful.
Atheris provides FuzzedDataProvider to convert bytes into structured values such as integers, strings, and choices.
Task 1: Build a Minimal Fuzzer
Create a file called fuzz_date_utils_basic.py. You need to fill in the missing parts to make it work.
1 | import sys |
Run it like this:
1 | mkdir -p corpus |
You may notice this fuzzer spend a lot of time exploring obviously invalid strings.
We’ll fix it in the next step. See if the current version generates any interesting inputs
that crash your program.
Task 2: Write a Structured Fuzzer
The previous fuzzer is a good start, but it is not very targeted.
Now write a better fuzzer that generates structured date inputs and checks a semantic property.
Create a file called fuzz_date_utils_structured.py and implement this fuzzer.
We’ve already provided some helper functions.
1 | import sys |
Run it:
1 | mkdir -p corpus_structured |
Task 3: Investigate the Failure
If your structured fuzzer is effective, it should eventually report a failure.
When it does:
- inspect the crashing input
- reproduce it manually
- explain why the property failed
Task 4: Fix the Bug
Modify date_utils.py so that it correctly supports all three documented formats:
YYYY-MM-DDMM/DD/YYYYDD-MM-YYYY
After fixing it, rerun the structured fuzzer.
Expected behavior examples
| Input | Output |
|---|---|
2024-03-01 |
2024-03-01 |
03/01/2024 |
2024-03-01 |
01-03-2024 |
2024-03-01 |
25-12-2024 |
2024-12-25 |
After your fix, does the structured fuzzer still find failures?
Task 5: Apply the fuzzer to real-world python applications
Could you write a fuzzer and find real-world bugs in these popular python packages?
If you find any, create a PR and contribute to the open-source community!