cHash: Detection of Redundant Compilations via AST Hashing

Abstract

Software projects that use a compiled language are built hundreds of thousands of times during their lifespan. Hence, the compiler is invoked over and over again on an incrementally changing source base. As previous work has shown, up to 97 percent of these invocations are redundant and do not lead to an altered compilation result. In order to avoid such redundant builds, many developers use caching tools that are based on textual hashing of the source files. However, these tools fail in the presence of modifications that leave the compilation result unchanged. Especially for C projects, where module-interface definitions are imported textually with the C preprocessor, modifications to header files lead to many redundant compilations.

In this paper, we present the cHash approach and compiler extension to quickly detect modifications on the language level that will not lead to a changed compilation result. By calculating a hash over the abstract syntax tree, we achieve a high precision at comparatively low costs. While cHash is light-weight and build system agnostic, it can cancel 80 percent of all compiler invocations early and reduce the build-time of incremental builds by up to 51 percent. In comparison to the state-of-the-art ccache tool, cHash is at least 30 percent more precise in detecting redundant compilations.

Publication

USENIX Conference Best Paper Award
cHash: Detection of Redundant Compilations via AST Hashing.
Christian Dietrich, Valentin Rothberg, Ludwig Füracker, Andreas Ziegler, Daniel Lohmann; In: Proceedings of the 2017 USENIX Annual Technical Conference; USENIX Association, 2017. Best Paper Award.
[PDF] [Slides] [Raw Data] [BibTex]

The raw data and the source code used in this publication are available at gitlab.cs.fau.de.

Bibtex

@inproceedings{dietrich:17:usenix,
 address = {Berkeley, CA, USA},
 author = {Christian Dietrich and Valentin Rothberg and Ludwig Füracker and Andreas Ziegler and Daniel Lohmann},
 booktitle = {Proceedings of the 2017 USENIX Annual Technical Conference},
 entrysubtype = {Conference},
 link = {https://www.usenix.org/conference/atc17/technical-sessions/presentation/dietrich},
 location = {Santa Clara, CA, USA},
 month = {jul},
 publisher = {USENIX Association},
 title = {{cHash:} Detection of Redundant Compilations via {AST} Hashing},
 userd = {USENIX '17},
 venue = {Santa Clara, CA, USA},
 year = {2017}
}