Fuzzing101 with LibAFL - Part I: Fuzzing Xpdf

Nov 7, 2021 | 19 minutes read

Tags: fuzzing, libafl, rust, xpdf

Twitter user Antonio Morales created the Fuzzing101 repository in August of 2021. In the repo, he has created exercises and solutions meant to teach the basics of fuzzing to anyone who wants to learn how to find vulnerabilities in real software projects. The repo focuses on AFL++ usage, but this series of posts aims to solve the exercises using LibAFL instead. We’ll be exploring the library and writing fuzzers in Rust in order to solve the challenges in a way that closely aligns with the suggested AFL++ usage.

Since this series will be looking at Rust source code and building fuzzers, I’m going to assume a certain level of knowledge in both fields for the sake of brevity. If you need a brief introduction to (or refresher on) coverage-guided fuzzing, please take a look here. As always, if you have any questions, please don’t hesitate to reach out.

This post will cover fuzzing Xpdf in order to solve Exercise 1. The companion code for this exercise can be found at my fuzzing-101-solutions repository.

Other posts in the series:
  • Part I.V: Speed Improvements to Part I
  • Part II: Fuzzing libexif

Quick Reference

This is just a summary of the different components used in the upcoming post. It’s meant to be used later as an easy way of determining which components are used in which posts.

  "Fuzzer": {
    "type": "StdFuzzer",
    "Corpora": {
      "Input": "InMemoryCorpus",
      "Output": "OnDiskCorpus"
    "Input": "BytesInput",
    "Observers": [
      "ConstMapObserver": {
        "coverage map": "StdShMemProvider::new_map",
    "Feedbacks": {
      "Pure": ["MaxMapFeedback", "TimeFeedback"],
      "Objectives": ["MaxMapFeedback", "TimeoutFeedback"]
    "State": {
      "StdState": {
        "FeedbackStates": [
    "Stats": "SimpleStats",
    "EventManager": "SimpleEventManager",
    "Scheduler": "IndexesLenTimeMinimizerCorpusScheduler",
    "Executors": [
    "Mutators": [
      "StdScheduledMutator": {
        "mutations": "havoc_mutations"
    "Stages": ["StdMutationalStage"]

LibAFL Background

LibAFL, the Advanced Fuzzing Library, is a collection of reusable fuzzer components written in Rust. It is fast, multi-platform, no_std compatible, and scales well over cores and machines. LibAFL is written and maintained by Andrea Fioraldi and Dominik Maier (they also maintain AFL++). You can learn more about the motivation behind LibAFL and the different components in their rC3 talk.

Without further ado, let’s get started.

Exercise 1 Setup

Our first step is walking through the setup steps for Rust, Xpdf, and AFL++. If you’re wondering why we’ll need AFL++ when we plan to use LibAFL, it’s because we’ll use AFL++’s compiler for instrumentation once or twice before we try out some different instrumentation backends.

As far as setup goes, my assumption is that you’re on some flavor of Linux. Any Linux package manager commands will be given as Debian-flavored commands (apt).

Install Rust

This one’s easy.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Further information is here in case you need it.

Install AFL++

I’m taking these steps directly from Exercise 1’s setup instructions.

Install dependencies

sudo apt-get update
sudo apt-get install -y build-essential python3-dev automake git flex bison libglib2.0-dev libpixman-1-dev python3-setuptools
sudo apt-get install -y lld-11 llvm-11 llvm-11-dev clang-11 || sudo apt-get install -y lld llvm llvm-dev clang 
sudo apt-get install -y gcc-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-plugin-dev libstdc++-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-dev

Checkout and build AFL++

cd $HOME
git clone https://github.com/AFLplusplus/AFLplusplus && cd AFLplusplus
export LLVM_CONFIG="llvm-config-11"
make distrib
sudo make install

Test your installation

afl-fuzz -h

afl-fuzz++3.15a based on afl by Michal Zalewski and a large online community

afl-fuzz [ options ] -- /path/to/fuzzed_app [ ... ]

Required parameters:
  -i dir        - input directory with test cases
  -o dir        - output directory for fuzzer findings

Execution control settings:
  -p schedule   - power schedules compute a seed's performance score:
                  fast(default), explore, exploit, seek, rare, mmopt, coe, lin
                  quad -- see docs/
  -f file       - location read by the fuzzed program (default: stdin or @@)

Project Directory Setup

In order to keep us on track for the upcoming Rust code/build setup, we’re going to deviate from the directory structure recommended by the Fuzzing101 README. Since we know that we’ll be creating multiple Rust projects (one for each exercise), we’ll start out with a Rust workspace.

The first step in creating a workspace for our Rust project directories is to simply make a directory.

cd $HOME
mkdir fuzzing-101-solutions
cd fuzzing-101-solutions

fuzzing-101-solutions will be the top level directory that houses all of our different exercise specific projects. In the top level directory, we need to create a Cargo.toml file that tells Rust and cargo that this is a workspace. Since all of our projects in the workspace are going to be fuzzers, we’ll globally modify the release profile settings as well.



[workspace]

members = [
    "exercise-1",
]

[profile.release]
lto = true
codegen-units = 1
opt-level = 3
debug = true

  • lto=true :: perform link-time optimizations across all crates within the dependency graph
  • codegen-units=1 :: controls how many “code generation units” a crate will be split into; higher codegen-units MAY produce slower code (max value 256)
  • opt-level=3 :: controls the level of optimizations; 3 == “all optimizations”
  • debug=true :: controls the amount of debug information included in the compiled binary; true == “full debug info”

After that, we can create our first solution project.

cargo new exercise-1

Having run the commands above, we’re left with a directory structure that looks like this.

├── Cargo.toml
└── exercise-1
    ├── Cargo.toml
    ├── .git
    │   ├── description
    │   -------------8<-------------
    ├── .gitignore
    └── src

Now we can proceed with the fuzz target setup.

Install Xpdf

Again, these steps are pulled directly from Exercise 1’s README. We’ll modify them a bit to take our slightly different folder structure into account.

Download Xpdf 3.02

The following set of steps will download Xpdf version 3.02 into a folder ultimately named xpdf.

cd fuzzing-101-solutions/exercise-1
tar xvf xpdf-3.02.tar.gz
rm xpdf-3.02.tar.gz
mv xpdf-3.02 xpdf

Our directory structure now looks like this.

├── Cargo.toml
├── exercise-1
│   ├── Cargo.toml
│   ├── .git
│   │   ├── config
│   │   -------------8<-------------
│   ├── .gitignore
│   └── src
│       └──
└── xpdf
    ├── aclocal.m4
    ├── aconf2.h

The Fuzzing101 README recommends building Xpdf with gcc at this point as a test. Feel free to do so if you like.

Fuzzer Setup

The Goal

In order to write our fuzzer, we should take a look at our goal.

According to Fuzzing101:

the goal is to find a crash/PoC for CVE-2019-13288 in XPDF 3.02.

CVE-2019-13288 is a vulnerability that may cause an infinite recursion via a crafted file.

Since each called function in a program allocates a stack frame on the stack, if a function is recursively called so many times it can lead to stack memory exhaustion and program crash.

As a result, a remote attacker can leverage this for a DoS attack.

Alright, we know what we want to accomplish and all of the dependencies have been gathered. Let’s go!


Our first stop will be the Cargo.toml file in the exercise-1 directory. We’ll begin by telling cargo that we plan to use a build script, as well as adding LibAFL as a dependency.


[package]
name = "exercise-one-solution"
version = "0.1.0"
edition = "2021"
build = "build.rs"

[dependencies]
libafl = "0.6.1"

Next up, we’ll add a build.rs file to our exercise-1 directory, resulting in a directory structure like what’s shown below.

├── Cargo.toml
├── exercise-1
│   ├── Cargo.toml
│   ├── build.rs

Build scripts are useful when one needs to perform some set of actions at build time. Typical uses are building/linking to external C libraries or doing some sort of code generation before building a project. By placing a file named build.rs in the root of our project directory, we’re telling cargo to compile and execute build.rs just before building our project.

The build.rs file is where we’ll configure and build Xpdf using AFL++’s compiler. More specifically, we’ll use afl-clang-fast, since that’s what Fuzzing101 recommends for this exercise. build.rs is essentially a program unto itself, so we’ll begin with its imports and its main function.

use std::env;
use std::process::Command;

fn main() {
    // todo
}

Within the main function, we can configure when the build script should be run (after the initial build). We’ll do that with the rerun-if-changed directive. These directives tell cargo to re-run the build script if any of the files at the given paths have changed, that is, if their mtime timestamp has been updated.
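As a minimal sketch of how such directives might be emitted: cargo reads lines of the form cargo:rerun-if-changed=PATH from the build script's standard output. The helper names below and any paths passed to them are assumptions made for illustration.

```rust
/// Format a `rerun-if-changed` directive for one path.
/// (`rerun_directive` is a hypothetical helper, not part of cargo.)
fn rerun_directive(path: &str) -> String {
    format!("cargo:rerun-if-changed={}", path)
}

/// Print one directive per watched path; cargo parses these from stdout.
fn emit_rerun_directives(paths: &[&str]) {
    for path in paths {
        println!("{}", rerun_directive(path));
    }
}
```

Calling emit_rerun_directives(&["build.rs"]) at the top of the build script's main function would ask cargo to re-run the script only when that file changes.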


After that, we’re going to programmatically execute the same commands that we would have run if we were to manually build Xpdf from the command line. Those commands are listed below to give you an idea of what build.rs is trying to accomplish.

# these are example commands that will be executed automatically by build.rs
# and were taken almost verbatim from Fuzzing101's README
cd fuzzing-101-solutions/exercise-1/xpdf
make clean
rm -rf install 
export LLVM_CONFIG=llvm-config-11 
CC=afl-clang-fast ./configure --prefix=./install
make install

Here’s what the first of the commands above looks like when written as a Command in Rust.

let cwd = env::current_dir().unwrap().to_string_lossy().to_string();
let xpdf_dir = format!("{}/xpdf", cwd);

// make clean; remove any leftover gunk from prior builds
Command::new("make")
    .arg("clean")
    .current_dir(&xpdf_dir)
    .status()
    .expect("Couldn't clean xpdf directory");

The rest of the commands follow the same kind of pattern. The entirety of build.rs is shown below.


use std::env;
use std::process::Command;

fn main() {
    // only re-run this script when it changes
    println!("cargo:rerun-if-changed=build.rs");

    let cwd = env::current_dir().unwrap().to_string_lossy().to_string();
    let xpdf_dir = format!("{}/xpdf", cwd);

    // make clean; remove any leftover gunk from prior builds
    Command::new("make")
        .arg("clean")
        .current_dir(&xpdf_dir)
        .status()
        .expect("Couldn't clean xpdf directory");

    // clean doesn't know about the install directory we use to build, remove it as well
    Command::new("rm")
        .arg("-rf")
        .arg(&format!("{}/install", xpdf_dir))
        .status()
        .expect("Couldn't clean xpdf's install directory");

    // export LLVM_CONFIG=llvm-config-11
    env::set_var("LLVM_CONFIG", "llvm-config-11");

    // configure with afl-clang-fast and set install directory to ./xpdf/install
    Command::new("./configure")
        .arg(&format!("--prefix={}/install", xpdf_dir))
        .env("CC", "/usr/local/bin/afl-clang-fast")
        .current_dir(&xpdf_dir)
        .status()
        .expect("Couldn't configure xpdf to build using afl-clang-fast");

    // make && make install
    Command::new("make")
        .current_dir(&xpdf_dir)
        .status()
        .expect("Couldn't make xpdf");

    Command::new("make")
        .arg("install")
        .current_dir(&xpdf_dir)
        .status()
        .expect("Couldn't install xpdf");
}
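One thing worth knowing about this pattern: calling expect on the result of status() only panics when the process could not be started at all. A command that starts and then exits with a non-zero code still yields an Ok value. Below is a small sketch of a stricter wrapper; run_step is a hypothetical helper name, not something from the build script above.

```rust
use std::process::Command;

/// Run `program` with `args` inside `dir` and fail on a non-zero exit code.
/// (`run_step` is a hypothetical helper name used for illustration.)
fn run_step(dir: &str, program: &str, args: &[&str]) -> Result<(), String> {
    let status = Command::new(program)
        .args(args)
        .current_dir(dir)
        .status()
        .map_err(|e| format!("couldn't spawn {}: {}", program, e))?;

    if status.success() {
        Ok(())
    } else {
        Err(format!("{} {:?} exited with {}", program, args, status))
    }
}
```

With a helper like this, a failed make would stop the build immediately instead of surfacing later as a missing binary.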

If everything is configured correctly up to this point, we should be able to run

cargo build

After which, the xpdf/install directory should look like this.

├── bin
│   ├── pdffonts
│   ├── pdfimages
│   ├── pdfinfo
│   ├── pdftops
│   └── pdftotext
├── etc
│   └── xpdfrc
└── man
    ├── man1
    │   ├── pdffonts.1
    │   ├── pdfimages.1
    │   ├── pdfinfo.1
    │   ├── pdftops.1
    │   └── pdftotext.1
    └── man5
        └── xpdfrc.5

The binaries in the xpdf/install/bin folder were all compiled with afl-clang-fast, and as such, are instrumented for fuzzing!


I promise, we’re getting to the fuzzer soon, but before we do, we need a few sample PDF files to populate our input corpus. It’s just a few wget commands, so nothing too onerous.

cd fuzzing-101-solutions/exercise-1
mkdir corpus
cd corpus

That’s it! These three PDF files will make up the initial input for our fuzzer.

Writing the Fuzzer

Ok, now we’re closing in on the good stuff. src/main.rs will house our fuzzer’s logic. Ultimately, the fuzzer is going to be put together piece-by-piece using different components from LibAFL. The majority of this code was derived from the forkserver_simple example in the LibAFL repo. We’re using a forkserver along with afl-clang-fast to keep somewhat in line with what one would expect to see when following along with Fuzzing101’s recommendation.

As we go through, we’ll attempt to dig into why certain components were chosen and how they map to different fuzzing concepts.

Ok, I think that’s enough preamble, let’s get started.

Components: Corpus + Input


We’ll start building our fuzzer by creating our corpora (yes, I had to google the plural of corpus, don’t judge me). First up is the input corpus. The input corpus holds all of our current testcases. We’re using an InMemoryCorpus to prevent reads/writes to disk. Keeping our corpus in memory and preventing disk access should improve the speed at which we manipulate testcases.

While creating the input corpus, we need to define the Input type. The Input represents data received from some external source. In this case, we’re using the BytesInput, which means that our input corpus should contain items that can be represented as arrays of bytes. Those byte arrays will eventually be mutated by the fuzzer before being passed to the program being fuzzed.

let corpus_dirs = vec![PathBuf::from("./corpus")];

let input_corpus = InMemoryCorpus::<BytesInput>::new();

After that, we’ll move on to the output corpus. Testcases from the input corpus that cause a timeout are considered “solutions”. Our output corpus, which is of the type OnDiskCorpus, is the corpus in which we store those solutions. Said another way: any generated PDF that causes the program to hang will be stored in the output corpus.

let timeouts_corpus = OnDiskCorpus::new(PathBuf::from("./timeouts")).expect("Could not create timeouts corpus");

Component: Observer


The next component for our fuzzer is the Observer. An Observer can be thought of as something that provides information about the current testcase to the fuzzer. We’ll start with a simple Observer: the TimeObserver. The TimeObserver simply keeps track of the current testcase’s runtime. For each testcase, the TimeObserver will send the time it took the testcase to execute to the fuzzer by way of a Feedback component, which we’ll discuss shortly.

let time_observer = TimeObserver::new("time");
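To make the idea concrete, here is a toy stand-in for a time observer written with only the standard library. It is not LibAFL's implementation; the type name ToyTimeObserver is made up, and the pre_exec/post_exec hooks only loosely mirror the idea of running code before and after each execution.

```rust
use std::time::{Duration, Instant};

/// Toy stand-in for a time observer: remembers how long the last run took.
struct ToyTimeObserver {
    started: Option<Instant>,
    last_runtime: Option<Duration>,
}

impl ToyTimeObserver {
    fn new() -> Self {
        Self { started: None, last_runtime: None }
    }

    /// Called right before the target runs.
    fn pre_exec(&mut self) {
        self.started = Some(Instant::now());
        self.last_runtime = None;
    }

    /// Called right after the target returns.
    fn post_exec(&mut self) {
        if let Some(start) = self.started.take() {
            self.last_runtime = Some(start.elapsed());
        }
    }
}
```

A Feedback would then read last_runtime after each run, which is the role the TimeFeedback plays later in this post.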

While the TimeObserver was simple, the next Observer is less so. In addition to execution time, we also want to keep track of the coverage map (this is a coverage-guided fuzzer after all). In order to build a coverage map, we need some shared memory. This piece of shared memory, AKA the coverage map, will be shared between the HitcountsMapObserver and the Executor (we’ll discuss the Executor a little later in the post).

First, we create a new instance of a StdShMemProvider, which provides access to shared memory mappings. We then use the StdShMemProvider to create a new shared memory mapping that is 65536 bytes.

const MAP_SIZE: usize = 65536;
let mut shmem = StdShMemProvider::new().unwrap().new_map(MAP_SIZE).unwrap();

After that, we need to save the shared memory id to the environment, so that the Executor knows about it.

shmem.write_to_env("__AFL_SHM_ID").expect("couldn't write shared memory ID");

Next, we get a mutable reference to the memory map. The reference we create is of type &mut [u8].

let mut shmem_map = shmem.map_mut();

Finally, we create our Observer, passing in the reference to the shared memory map and giving it the name shared_mem.

A HitcountsMapObserver needs a base object passed in as part of its constructor. The base we’re using is a ConstMapObserver. A ConstMapObserver is an optimization layer over a MapObserver. It allows for some performance gains by using a map size that’s known at compile time when deciding if a testcase is “interesting” (more on this in the Feedback section).

let edges_observer = HitcountsMapObserver::new(ConstMapObserver::<_, MAP_SIZE>::new(
    "shared_mem",
    &mut shmem_map,
));
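As the name hints, a hit-counts observer post-processes the raw counters in the map. The sketch below assumes the classic AFL bucket boundaries and uses a const generic for the compile-time map size; it illustrates the idea and is not LibAFL's code.

```rust
/// Map a raw edge hit count to its AFL-style bucket.
fn bucket(hits: u8) -> u8 {
    match hits {
        0 => 0,
        1 => 1,
        2 => 2,
        3 => 4,
        4..=7 => 8,
        8..=15 => 16,
        16..=31 => 32,
        32..=127 => 64,
        128..=255 => 128,
    }
}

/// Bucket every entry of a fixed-size coverage map in place.
/// `N` is known at compile time, much like MAP_SIZE in the post.
fn classify_counts<const N: usize>(map: &mut [u8; N]) {
    for entry in map.iter_mut() {
        *entry = bucket(*entry);
    }
}
```

Bucketing means that looping 5 times versus 6 times through the same edge looks identical to the fuzzer, while 5 versus 20 does not.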

Phew! We’re done with Observers for now, let’s see what’s next.

Component: Feedback


After the Observers comes our Feedback. The purpose of a Feedback component is to classify whether or not the outcome of a testcase is interesting. When a Feedback determines that a testcase is interesting, the input that was used in that testcase is typically added to the Corpus. Feedback components may have a FeedbackState component tied to them as well. FeedbackState components represent the state of the data that the Feedback wants to persist in the Fuzzer’s State (both of which are coming up later).

For our fuzzer, we need to create a few different Feedback components. We’ll start with the Feedbacks that will keep track of our coverage map and execution time.

The MapFeedbackState tracks the cumulative state of the coverage map while getting updates from its associated Observer.

let feedback_state = MapFeedbackState::with_observer(&edges_observer);

The MapFeedbackState and its HitcountsMapObserver (discussed earlier) are passed into a MaxMapFeedback. The MaxMapFeedback determines if there is a value in the HitcountsMapObserver’s coverage map that is greater than the current maximum value for the same entry. If a new maximum is found, the Input is deemed interesting.
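The "new maximum" check can be sketched in a few lines. This is a simplified illustration of the idea described above, with history playing the role of the feedback state and current the role of the observer's map; the function name is made up.

```rust
/// Toy version of a "max map" check: `history` holds the largest value seen
/// so far for every map entry.
fn is_interesting(history: &mut [u8], current: &[u8]) -> bool {
    let mut interesting = false;
    for (max_seen, &value) in history.iter_mut().zip(current) {
        if value > *max_seen {
            *max_seen = value; // remember the new maximum
            interesting = true;
        }
    }
    interesting
}
```

Running the same input twice reports it as interesting only the first time, since the second run cannot beat the maxima it already set.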

After creating the MaxMapFeedback, we also create a new TimeFeedback, which is then tied to the TimeObserver we saw earlier. You may be wondering how the TimeFeedback component helps to decide if an Input is interesting… Well, it doesn’t. TimeFeedback never reports an Input as interesting. However, it does keep track of testcase execution time by way of its TimeObserver. Due to the fact that it can never classify a testcase as interesting on its own, we need to use it alongside some other Feedback that has the ability to perform said classification.

With both of the new Feedback components created, we use the feedback_or macro to compile both Feedbacks into a single CombinedFeedback, which is joined together with a logical OR.

The feedback variable below is essentially saying, if the current testcase’s input triggered a new code path in the coverage map, we should probably save that input to the corpus.

The true passed to new_tracking says that we want to track indexes. The false says we do not want to track novelties.

let feedback = feedback_or!(
    MaxMapFeedback::new_tracking(&feedback_state, &edges_observer, true, false),
    TimeFeedback::new_with_observer(&time_observer)
);

Alright, we’ve got the Feedbacks for coverage and time, now we’re going to add a second set of Feedbacks. These upcoming Feedbacks will also be used to determine if a testcase is interesting, but can be thought of more as a testcase’s solution or objective. They’ll be passed into our Fuzzer component later on in the code.

Similar to the first set, we need to create a MapFeedbackState. However, instead of tying it to an Observer, this one will create its own memory map of MAP_SIZE, consisting of u8’s.

const MAP_SIZE: usize = 65536;
// -------------8<-------------
let objective_state = MapFeedbackState::new("timeout_edges", MAP_SIZE);

Also similar to before, we combine two Feedbacks using a boolean operator, but this time it’s a logical AND. We’ll combine them using the feedback_and_fast macro. The Feedbacks in question are the TimeoutFeedback and our old friend MaxMapFeedback from earlier.

Recall that our goal is to find an infinite recursion bug. Since we’re looking for something that causes infinite recursion, we’re mostly interested in looking for testcases that make the target program hang. The objective variable below is essentially saying, if the given input triggered a new code path in the coverage map, AND, if the time to execute the fuzz case with the current input results in a timeout, our testcase meets our objective.

let objective = feedback_and_fast!(
    TimeoutFeedback::new(),
    MaxMapFeedback::new(&objective_state, &edges_observer)
);
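The "fast" in feedback_and_fast refers to short-circuiting: if the first condition fails, the second is never evaluated. Here is a toy model of the objective's logic under that assumption; RunOutcome and is_solution are hypothetical names, and the counter exists only to show when the second check runs.

```rust
/// Outcome of one execution, reduced to the two facts the objective cares about.
struct RunOutcome {
    timed_out: bool,
    new_coverage: bool,
}

/// "Fast" logical AND: the coverage check only runs if the run timed out.
fn is_solution(outcome: &RunOutcome, coverage_checks: &mut u32) -> bool {
    outcome.timed_out && {
        *coverage_checks += 1; // counts how often the second check ran
        outcome.new_coverage
    }
}
```

Since the vast majority of executions do not time out, skipping the map comparison for them keeps the objective check cheap.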

That’s it for Feedbacks, moving on to…

Component: State


Our next component is the State component, specifically the StdState. A State component takes ownership of each of our existing FeedbackState components, a random number generator, and our corpora.

let mut state = StdState::new(
    StdRand::with_seed(current_nanos()),
    input_corpus,
    timeouts_corpus,
    tuple_list!(feedback_state, objective_state),
);

Component: Stats


The Stats component defines how the fuzzer’s stats are reported. For now, we’ll use the simplest Stats representation: SimpleStats. SimpleStats will call println with SimpleStats::display as input in order to send a report to the terminal.

let stats = SimpleStats::new(|s| println!("{}", s));

Component: EventManager


The EventManager component handles the various Events generated during the fuzzing loop. Some examples of an Event are finding an interesting testcase, updating the Stats component, and logging. Once again, we’ll use the simplest type available for our current Fuzzer.

let mut mgr = SimpleEventManager::new(stats);

Component: Scheduler


During the fuzz loop, our fuzzer will need to acquire new testcases from the input corpus. The Scheduler component defines the strategy used to supply a Fuzzer’s request to the Corpus for a new testcase. For our fuzzer, we’re using the IndexesLenTimeMinimizerCorpusScheduler. The name is kinda scary, but it boils down to a minimization policy backed by a queue that is used to get test cases from the corpus. It will prioritize quick/small testcases that exercise all of the entries registered in the coverage map’s metadata.

The QueueCorpusScheduler is used as the IndexesLenTimeMinimizerCorpusScheduler’s backing queue.

let scheduler = IndexesLenTimeMinimizerCorpusScheduler::new(QueueCorpusScheduler::new());
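A rough sketch of the minimization idea follows, assuming (as the scheduler's name suggests) that a testcase's cost is its length multiplied by its execution time. The Entry type and favored function are made up for illustration and say nothing about how LibAFL stores this metadata.

```rust
use std::collections::HashMap;

/// Facts about one corpus entry that the minimizer cares about.
struct Entry {
    id: usize,
    len: u64,
    exec_ms: u64,
    covered: Vec<usize>, // indexes of coverage map entries this input hits
}

/// For every covered index, pick the entry with the smallest len * time.
/// Returns a map from coverage index to the id of its favored entry.
fn favored(entries: &[Entry]) -> HashMap<usize, usize> {
    let mut best: HashMap<usize, (u64, usize)> = HashMap::new();
    for e in entries {
        let cost = e.len * e.exec_ms;
        for &idx in &e.covered {
            let slot = best.entry(idx).or_insert((cost, e.id));
            if cost < slot.0 {
                *slot = (cost, e.id);
            }
        }
    }
    best.into_iter().map(|(idx, (_, id))| (idx, id)).collect()
}
```

The scheduler can then prefer entries that are favored for at least one index, which keeps the working set small and fast without giving up any coverage.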

Component: Fuzzer


The Fuzzer component contains our feedback, objectives, and a corpus scheduler. It’ll be the primary driver of our program, running the target program with the generated Input while triggering Observers and Feedbacks.

let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);

Component: Executor


The Executor component is one of the few remaining. We’ll use a TimeoutForkserverExecutor. The TimeoutForkserverExecutor wraps the standard ForkserverExecutor and sets a timeout before each run. This gives us an executor that implements an AFL-like mechanism that will spawn child processes to fuzz.

Part of creating our Executor is telling it what we want it to execute. In our case, we want to run the following.

./path/to/pdftotext INPUT_FILE

We’ll pass the path to pdftotext to the ForkserverExecutor constructor, along with the args it needs to run. We’ll use the double-@ symbol to tell the ForkserverExecutor that we want the BytesInput (covered earlier) generated from each testcase to be written to a file, and that the same file’s path should overwrite the @@ in the final command. Because we’re passing our generated Input via an on-disk file, we’ll set the use_shmem_testcase to false. Finally, we’ll pass our Observers into the ForkserverExecutor’s constructor, rounding out the call.

let fork_server = ForkserverExecutor::new(
    "./xpdf/install/bin/pdftotext".to_string(),
    &[String::from("@@")],
    false,  // use_shmem_testcase
    tuple_list!(edges_observer, time_observer),
).unwrap();
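The @@ convention itself is simple to picture. Below is a small sketch of the substitution step; expand_args is a hypothetical helper and the file path in the usage note is invented, so treat this as an illustration of the convention rather than the executor's actual code.

```rust
/// Replace every `@@` argument with the path of the current input file.
/// (`expand_args` is a hypothetical helper, not a LibAFL function.)
fn expand_args(args: &[&str], input_path: &str) -> Vec<String> {
    args.iter()
        .map(|arg| {
            if *arg == "@@" {
                input_path.to_string()
            } else {
                arg.to_string()
            }
        })
        .collect()
}
```

For example, expand_args(&["@@"], "/tmp/cur_input") yields a single argument holding that path, which is what the target ends up opening.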

With that out of the way, we simply need to choose the length of our timeout, and pass the resulting Duration and ForkserverExecutor to the TimeoutForkserverExecutor’s constructor.

let timeout = Duration::from_millis(5000);

// ./pdftotext @@
let mut executor = TimeoutForkserverExecutor::new(fork_server, timeout).unwrap();
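To show what "timing out" a child process involves, here is a standard-library-only sketch that polls a child until a deadline and kills it if it is still running. This is a deliberately naive approach chosen for readability; it is not how the TimeoutForkserverExecutor is implemented, and the function name is made up.

```rust
use std::process::Child;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Wait for `child` up to `timeout`; kill it and return true if it hangs.
fn timed_out(mut child: Child, timeout: Duration) -> std::io::Result<bool> {
    let deadline = Instant::now() + timeout;
    loop {
        if child.try_wait()?.is_some() {
            return Ok(false); // exited on its own
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed process
            return Ok(true);
        }
        sleep(Duration::from_millis(5));
    }
}
```

For a bug like infinite recursion, the timeout needs to be long enough that legitimately slow PDFs finish, yet short enough that hangs don't eat the fuzzer's throughput.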

Components: Mutator + Stage


The final pieces of the puzzle are the Mutator and Stage components. We’ll register our StdScheduledMutator as a Stage using the StdMutationalStage component. A mutational stage is the stage in a fuzzing run that mutates the Input. Mutational stages will usually have a range of mutations that are being applied to the input one by one, between executions. In our case, the range of mutations we’ve chosen are the ever popular Havoc mutations. Each mutation is scheduled by the StdScheduledMutator as part of the mutational stage.

let mutator = StdScheduledMutator::new(havoc_mutations());
let mut stages = tuple_list!(StdMutationalStage::new(mutator));
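For a feel of what havoc-style mutation does to a BytesInput, here is a toy version with three byte-level mutations and a tiny xorshift generator so that no external crates are needed. The real havoc_mutations set is much larger; every name below is made up for illustration.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Apply `rounds` randomly chosen byte-level mutations, one after another.
fn toy_havoc(input: &mut Vec<u8>, rng: &mut XorShift, rounds: usize) {
    for _ in 0..rounds {
        if input.is_empty() {
            return;
        }
        let pos = (rng.next_u64() % input.len() as u64) as usize;
        match rng.next_u64() % 3 {
            0 => input[pos] ^= 1 << (rng.next_u64() % 8), // flip one bit
            1 => input[pos] = rng.next_u64() as u8,       // overwrite a byte
            _ => input[pos] = input[pos].wrapping_add(1), // small arithmetic
        }
    }
}
```

Stacking several small mutations per execution is what lets a fuzzer wander away from its seed files quickly while still keeping most of the file structure intact.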

Running the Fuzzer

That’s it for the individual components. All that’s left is for us to run the fuzzer. Recall that the Fuzzer takes ownership of a bunch of different components and essentially makes everything run. Knowing that, we pass in our stages, executor, the state, and the event manager. The code to make that happen is shown below.

fuzzer
    .fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr)
    .expect("Error in the fuzzing loop");

Build the Fuzzer

Ok, it’s been a long time coming, but the moment has arrived! With build.rs set up to take care of building Xpdf, we can build our fuzzer and the fuzz target with a single command.

cd fuzzing-101-solutions
cargo build --release

Commence Fuzzing!

After the build completes, we can kick off our artisanal, hand-crafted fuzzer.

cd exercise-1

On my machine, it took roughly 10 minutes to get a timeout. The objectives: 1 shows that we have 1 testcase that met the fuzzer’s objective (a timeout that also produced new coverage).

[Stats #0] clients: 1, corpus: 640, objectives: 1, executions: 568810, exec/sec: 1744


We can confirm that we’ve found a bug by using the PDF inside the timeouts folder (our output corpus).

./xpdf/install/bin/pdftotext ./timeouts/7e3a6553de5cce87
Error: PDF file is damaged - attempting to reconstruct xref table...
Error (677): Illegal character <2c> in hex string
Error (678): Illegal character <2c> in hex string
Error (679): Illegal character <2c> in hex string
Error (3245): Dictionary key must be a name object
Error (3248): Dictionary key must be a name object
Error (3255): Dictionary key must be a name object
Segmentation fault


\o/ Huzzah! We’ve created our own fuzzer using LibAFL that was specifically tuned to find a recursion bug in real-world software; pretty neat eh? The companion code for this exercise can be found at my fuzzing-101-solutions repository.

In the next post, we’ll tackle the second exercise. See you then!

Additional Resources

  1. Fuzzing101
  2. AFL++
  3. LibAFL
  4. LibAFL API Documentation
  5. LibAFL Book
  6. forkserver_simple example fuzzer
  7. fuzzing-101-solutions repository
