clang (pronounced like klăng) is a compiler front end for the C, C++, and Objective-C programming languages. It uses the Low Level Virtual Machine (LLVM) as its back end. It is still under development; when finished, it is intended to offer a replacement for the GNU Compiler Collection (GCC). Development is sponsored by Apple, and clang is released under a BSD-like open source license.


In early 2005, Apple hired Chris Lattner and formed a team to work on the LLVM system for various uses within Apple's development systems. LLVM can replace most of the "lower levels" of the GCC toolchain, offering more aggressive optimization of GCC's three-address-code intermediate form (IF). LLVM allows code to be compiled statically, as it is under the traditional GCC system, or left in the IF and compiled to machine code later by a just-in-time compiler, in a fashion similar to Java.

Apple has made use of the LLVM system in a number of commercial systems. One of the most visible uses to date has been an OpenGL code compiler for Mac OS X that converts OpenGL calls into more fundamental calls for graphics processing units that do not support certain features. This was instrumental in allowing Apple to support the entire OpenGL API on computers using "integrated graphics" based on the Intel GMA chipsets, greatly increasing performance on those machines. For powerful chipsets the code is compiled to take full advantage of the underlying hardware, but on GMA machines, LLVM compiles the same OpenGL code into subroutines to ensure it continues to work properly. More recently, LLVM's ability to support a wide variety of underlying hardware was key in Apple's ability to offer high-performance applications on the iPhone, which is based on the ARM processor, an architecture that had limited support in the existing GCC toolchain. LLVM is part of the iPhone development kit and of Xcode 3.1.

While LLVM was initially intended to work as part of the GCC toolchain, more recently there has been an interest in replacing other portions of the GCC system as well. GCC is a large system that is somewhat cumbersome to develop. As one long-time GCC developer put it, "Trying to make the hippo dance is not really a lot of fun." Apple has an intense interest in improving performance in Objective-C (ObjC), but this sees little development under the normal GCC development effort. Apple's choices for improving ObjC performance were to continue making changes to the "Apple branch" of GCC, which limited what it could do, or to strike out on its own with a fresh approach.

Licensing was also a factor; LLVM was initially developed at the University of Illinois at Urbana-Champaign and released under a BSD-like license that makes it easy to use in commercial programs, whereas GCC is GPL-licensed and has redistribution requirements. As the LLVM system can be "embedded" within commercial systems (like the OpenGL example above), having an entire toolchain based on a similar license would make the legalities much simpler.


clang is a new compiler for C and C-like languages, intended specifically to work on top of LLVM. The combination of clang and LLVM provides the majority of a toolchain, allowing the replacement of the whole GCC stack.

One of clang's primary goals is to better support incremental compilation, allowing the compiler to be more tightly tied to the IDE GUI. GCC is designed to work in a "classic" compile-link-debug cycle, and although it provides useful ways to support incremental and interrupted compiling on the fly, integrating them with other tools is not always easy. For instance, GCC uses a step called "fold" that is key to the overall compile process, which has the side effect of translating the code tree into a form that does not look very much like the original source code. If an error is found during or after the fold step, it can be difficult to translate that back into a single location in the original source. Additionally, vendors using the GCC stack within IDEs used separate tools to index the code to provide features like code coloring and autocomplete, which took up time that would be better spent compiling.
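The effect of a folding step on source locations can be illustrated with a toy model. The sketch below is not GCC's implementation; the `Loc`, `Num`, `BinOp` and `fold` names are hypothetical and exist only for this illustration. It assumes a tiny expression tree in which every node records where it came from, and a fold pass that collapses constant subtrees into fresh nodes without copying those positions.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Loc:
    """A source position (1-based line and column)."""
    line: int
    col: int


@dataclass(frozen=True)
class Num:
    value: int
    loc: Optional[Loc] = None


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*" in this toy model
    left: "Expr"
    right: "Expr"
    loc: Optional[Loc] = None


Expr = Union[Num, BinOp]


def fold(node: Expr) -> Expr:
    """Collapse constant subtrees; the folded nodes carry no source location."""
    if isinstance(node, Num):
        return node
    left, right = fold(node.left), fold(node.right)
    if isinstance(left, Num) and isinstance(right, Num):
        if node.op == "+":
            value = left.value + right.value
        else:
            value = left.value * right.value
        return Num(value)  # the original positions are dropped here
    return BinOp(node.op, left, right, node.loc)


# Tree for the source text "2 * 3 + 4" on line 1.
tree = BinOp(
    "+",
    BinOp("*", Num(2, Loc(1, 1)), Num(3, Loc(1, 5)), Loc(1, 3)),
    Num(4, Loc(1, 9)),
    Loc(1, 7),
)
folded = fold(tree)
```

After folding, the whole expression is a single constant node with no location, so a later pass that finds a problem with that value has nothing to point at in the original text.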

clang was designed from the start to address these issues. The compiled code retains much more information than under GCC, and preserves the overall form of the original code. This makes it much easier to map errors back into the original source and point directly to problems, rather than "somewhere close". The error reports it offers are also much more detailed and specific. Additionally, the intermediate form is both human- and machine-readable, so IDEs can index the output of the compiler as it becomes available. This ensures that the compiler and IDE are looking at the "same thing". Since the compiler is always running, it can offer source code indexing, syntax checking, and other features normally associated with rapid application development systems. The parse tree is also much more suitable for supporting automated code refactoring, and, as it remains in a parsable text form at all times, changes to the compiler can be checked by diffing the IF.
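As an illustration of what pointing "directly to problems" can look like once an exact line and column are retained, the following sketch formats an error report with a caret under the offending token. It is a minimal sketch, not clang's code: `render_diagnostic` is a hypothetical helper, and the report layout and message text are assumptions chosen for the example.

```python
def render_diagnostic(source: str, line: int, col: int, message: str) -> str:
    """Format an error with the offending source line and a caret under the column.

    `line` and `col` are 1-based, as a front end that keeps exact
    source positions could report them.
    """
    text = source.splitlines()[line - 1]
    caret = " " * (col - 1) + "^"
    return f"{line}:{col}: error: {message}\n{text}\n{caret}"


# A small C program that uses an undeclared variable on line 2, column 10.
source = "int main(void) {\n  return x + 1;\n}\n"
report = render_diagnostic(source, 2, 10, "use of undeclared identifier 'x'")
print(report)
```

The caret can only be placed this precisely because the position survived every stage between parsing and reporting, which is the property the paragraph above describes.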

Likewise, many portions of GCC are simply "old". The system was written in an era when single-processor systems were almost universal, and thus it does not support threading. Threading would have to be retrofitted in order to take full advantage of the now almost universal multi-processor hardware used during development. clang was designed from the start to be threaded, and it is faster and has a much smaller memory footprint. As of October 2007, clang compiled the Carbon libraries well over twice as fast as GCC, while using about one-fifth of the memory and disk space.

Although development on GCC may be difficult, the reasons for this have been well explored by its developers. This allowed the clang team to avoid these problems and make a more flexible system. clang is highly modularized, based almost entirely on replaceable link-time libraries as opposed to source code modules that are combined at compile time, and it is well documented. This makes it much easier for new developers to get up to speed in clang and add to the project. In some cases the libraries are provided in several versions that can be swapped out at runtime; for instance, the parser comes with a version that offers performance measurement of the compile process.
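The idea of swapping in an instrumented version of a component can be sketched as follows. This is a toy illustration only and does not reflect clang's library interfaces: `plain_parser`, `with_timing` and `compile_unit` are hypothetical names, and the "parser" is assumed to be any function from source text to a list of tokens.

```python
import time
from typing import Callable, List

# In this toy model a "parser" is any callable from source text to a token list.
Parser = Callable[[str], List[str]]


def plain_parser(source: str) -> List[str]:
    """Toy stand-in for a parser: split the source on whitespace."""
    return source.split()


def with_timing(parser: Parser, timings: List[float]) -> Parser:
    """Return a drop-in replacement that also records elapsed seconds per call."""
    def timed(source: str) -> List[str]:
        start = time.perf_counter()
        try:
            return parser(source)
        finally:
            timings.append(time.perf_counter() - start)
    return timed


def compile_unit(source: str, parser: Parser) -> int:
    """Toy driver: depends only on the parser interface, not on which one it gets."""
    return len(parser(source))


timings: List[float] = []
count_plain = compile_unit("int x = 1 ;", plain_parser)
count_timed = compile_unit("int x = 1 ;", with_timing(plain_parser, timings))
```

Because the driver sees only the shared interface, the measuring version can be substituted without changing any other part of the pipeline.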

clang, as the name implies, is a compiler only for C and C-like languages. It does not offer compiler front ends for languages other than C, C++ and Objective-C. For other languages, including Java, Fortran and Ada, LLVM remains dependent on GCC. clang can be used or swapped out for GCC as needed, with no other effects on the toolchain as a whole.

Current Status

The project is under rapid development. Currently (December 2007), code generation for C and Objective-C is partially complete. Support for C++ and Objective-C++ is still quite incomplete; the project team "[doesn't] expect to have respectable C++ support for another 2 years or so."

