MIT taps LLVM's parallel processing power for faster code

Researchers at MIT claim their modifications to the LLVM compiler framework can help more kinds of code take advantage of parallel processing, with little or no developer effort

If you want to go fast in a multicore, multiprocessor world, you gotta go parallel. Splitting workloads across CPUs and cores is looking like the last, best hope for speed boosts as the limits of Moore's Law loom.

Small wonder there's growing emphasis on giving developers cutting-edge software tools to exploit parallelism. This work encompasses everything from language design to the compiler toolchain.

Researchers from MIT's Computer Science and Artificial Intelligence Laboratory have developed techniques to automatically add parallel computing optimizations to existing code whenever possible.

These speedups come from modifications to LLVM, a popular and well-understood open source compiler framework used by everyone from Apple to Microsoft.

Going faster, side by side

MIT's findings are described in a paper, "Tapir: Embedding Fork-Join Parallelism into LLVM's Intermediate Representation," to be presented next week at the Association for Computing Machinery's Symposium on Principles and Practice of Parallel Programming.

LLVM uses an intermediate representation (IR) as part of its compilation process. Source code is first translated into the IR, and the IR is then turned into machine code, give or take a few steps. Because every optimization pass works on this one common form, LLVM can reason more effectively about the code and perform optimizations that would otherwise be hard to implement. (Rust's compiler also uses an IR for similar reasons.)
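To make that pipeline concrete, here's a minimal sketch (the file and function names are made up): compiling a trivial C function with Clang and asking for the IR produces the intermediate form that LLVM's analyses and optimization passes, Tapir's included, work on before machine code is generated.

    /* square.c -- a trivial, hypothetical example */
    int square(int x) {
        return x * x;
    }

    /* Emit textual LLVM IR instead of machine code:
     *   clang -S -emit-llvm -O2 square.c -o square.ll
     * The resulting .ll file holds the IR that LLVM's
     * optimization passes transform before code generation. */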

Tapir, the MIT team's name for its custom IR changes to LLVM, "allows the compiler to optimize across parallel control constructs with only minor changes to its existing analyses and code transformations," according to the paper. In other words, the optimizations LLVM already applies to ordinary code can now work across parallel constructs as well. The changes needed to make this happen amount to only "about 6,000 lines of LLVM's 4-million-line codebase."

Tapir provides a set of IR instructions for expressing fork-join parallelism (FJP), a model supported by many existing compilers and runtimes. With FJP, "subroutines can be spawned in parallel and iterations of a parallel loop can execute concurrently on modern multicore machines." However, these constructs only benefit code that explicitly requests parallelism; ordinary, "serial" code isn't optimized this way.
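As a rough illustration, not drawn from the paper itself, here is what fork-join code looks like in C using Cilk-style keywords, a common way to express this kind of parallelism; the exact spelling depends on which compiler and runtime you use.

    #include <cilk/cilk.h>  /* Cilk keywords: cilk_spawn, cilk_sync, cilk_for */

    /* Recursive Fibonacci: the spawned call may run in parallel
     * with the second call; cilk_sync waits for both (the join). */
    long fib(long n) {
        if (n < 2) return n;
        long a = cilk_spawn fib(n - 1);  /* fork */
        long b = fib(n - 2);
        cilk_sync;                       /* join */
        return a + b;
    }

    /* A parallel loop: iterations may execute concurrently. */
    void scale(double *v, long n, double s) {
        cilk_for (long i = 0; i < n; ++i)
            v[i] *= s;
    }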

Let the machines do the heavy lifting

How optimizations like this pay off in the real world is always an open question. MIT's team ran a series of benchmarks designed to test parallel performance and found that Tapir's optimizations were generally as good as or better than manual optimizations to the source code. (One benchmark came out worse with Tapir than with the hand-tuned version, but only by about 2 percent.)

MIT has worked on parallelism before. Last year, it announced Milk, a set of extensions to C/C++ that alleviate memory bottlenecks in big data applications. That project extended OpenMP, an existing system for parallelizing applications that the Tapir paper also cites as a standard way for developers to express parallelism.
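For comparison, a minimal OpenMP sketch (the function and variable names are invented) shows how a developer explicitly asks for a parallel loop today; it builds with a flag such as -fopenmp in GCC or Clang.

    #include <stddef.h>

    /* OpenMP parallel loop: the pragma asks the compiler and
     * runtime to split the iterations across threads. */
    void saxpy(size_t n, float a, const float *x, float *y) {
        #pragma omp parallel for
        for (size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }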

Developers will probably always have to do a certain amount of manual work to take advantage of parallelism, such as splitting a given program into two or more high-level components that run as discrete processes. Still, experiments like Tapir show there's room for many more kinds of automatic optimizations.

Copyright © 2017 IDG Communications, Inc.