Software engineering is expensive, summing to over one half of one percent of the US GDP annually. Software maintenance accounts for over two-thirds of that life cycle cost, and a key aspect of maintenance is fixing bugs in existing programs. Unfortunately, the number of reported bugs far outstrips available development resources. It is common for a popular project to have hundreds of new bug reports filed every day.
Software maintenance is expensive. GenProg reduces software maintenance costs by automatically producing patches (repairs) for program defects.
Many bugs can be fixed with just a few changes to a program's source code. Human repairs often involve inserting new code and deleting or moving existing code. GenProg uses those same building blocks to search for repairs automatically.
GenProg uses genetic programming to search for repairs. Our evolutionary computation represents candidate repairs as sequences of edits to software source code. Each candidate in a large population is applied to the original program to produce a new program, which is evaluated using test suites. Those candidates that pass more tests are said to have a higher fitness and are iteratively subjected to computational analogs of the biological processes of mutation and crossover. This process terminates when a candidate repair is found that retains all required functionality and fixes the bug. GenProg does not require special code annotations or formal specifications, and applies to unmodified legacy software.
GenProg has repaired many kinds of defects, including infinite loops, segmentation violations, heap buffer overflows, non-overflow denial-of-service vulnerabilities, stack buffer overflows, format string vulnerabilities, integer overflows — as well as the ubiquitous "my program produces the incorrect output" bug. While we evaluate primarily on C programs, GenProg can also repair programs in other languages, such as x86 or ARM assembly files or ELF binaries.
GenProg repaired 55 out of 105 bugs for $8 each in a recent systematic study that included multiple programs, totaling over 5 million lines of code, and supported by over 10,000 test cases.
GenProg has won multiple academic best paper awards (ICSE 2009, GECCO 2009, SBST 2009). In addition, this work was awarded the IFIP TC2 Manfred Paul Award "for excellence in software: theory and practice." Finally, GenProg won gold (in 2009) and bronze (in 2012) honors at the "Humies" Awards for human-competitive results produced by genetic and evolutionary computation.
This Summer 2012 teaser video highlights GenProg repairing 55 out of 105 bugs for $8 each:
This Summer 2012 teaser video highlights a human study of GenProg patch maintainability:
This December 2011 video demonstrates GenProg running on a smartphone embedded devices and discusses the associated challenges:
We recommend that you start with this paper:
These papers provide broad overviews or high-level arguments:
This paper describes the relationship between human developers and GenProg-generated patches:
This paper investigates properties of software that may relate to GenProg's success:
These papers highlight the application of GenProg to new subject areas and domains:
These papers describe optimizations of or parameter choices for GenProg:
Finally, these papers detail older versions of, or historical antecedents to, the GenProg system:
We are committed to reproducible research and to making our tools freely available. Since our tools, techniques and benchmark suites evolve and improve over time, there is no single all-inclusive package. Instead, for major papers, we try to make available a snapshot of the programs and benchmarks used in that paper, so that others can reproduce or extend those experiments. All archives contain READMEs with instructions on how to use the included tools or benchmarks.
Source Code and instructions for compiling and running our prototype tool are available. READMEs contain instructions for fulfilling dependencies, building, and running the tools; benchmark-specific instructions are archived separately.
Virtual Machine Images are available. Follow the instructions carefully; you will likely have to replace the default outdated version of GenProg that appears in these images with a more recent one.
Experimental Data Sets are available. These benchmarks allow for the replication of our research or the comparative evaluation of repair techniques.
Experimental Results are available. These are archives of the raw output of our tool (e.g., debug logs, repairs found, etc.) for various experiments.
Our human study data sets are available here:
In addition, we are indebted to a number of graduate and undergraduate researchers, without whom this project would not be possible: