AnySL: Efficient and Portable Multi-Language Shading

Scenes rendered with AnySL and the ray tracers RTfact, Manta, and PBRT. The surface shaders are written in RenderMan or scalar C++ and are all compiled to a platform-independent intermediate representation. The AnySL system loads, vectorizes, and optimizes the shaders and seamlessly integrates them into the renderer at runtime, achieving close to native shading performance.


In cooperation with the Computer Graphics Group we develop a unified shading system that is independent of source language, target architecture and rendering engine without sacrificing runtime performance.

Our goal is to eventually provide a shading system that uses a portable shader format to allow integration into any kind of rendering engine (e.g. ray tracing, rasterization, global illumination). Additionally, integrating an existing shading language should require only minimal effort, while the compiler technology of AnySL still enables maximum performance.

Shaders are program fragments that extend the functionality of a rendering system for specific tasks such as computing emission, light-material interaction, or geometry processing, similar to plug-ins used elsewhere. The key difference from function-call and library-based plug-ins is that shading code usually needs to be transformed to meet the target application's requirements regarding performance or program structure, while remaining convenient for the programmer. To support a given shading language, however, the renderer has to provide a compiler framework for it. Hence, the renderer's implementor ends up investing a large part of their time in building compilers, something they did not want to do in the first place.

AnySL is a novel approach to ease the integration of a shading language into a renderer. We compile shaders into a program representation that is independent of the shading language, the renderer, and the target hardware platform. The renderer has to provide the implementation of the basic constructs of the shading language. By augmenting the renderer with a just-in-time compiler library, the shaders are loaded and "glued" to the renderer's interface at runtime. Afterwards, the shader is mapped to the underlying hardware platform. With this approach, all performance obstacles incurred by common programming abstraction mechanisms are optimized away, resulting in high performance while keeping the maximum flexibility.
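The split between renderer-independent shader code and renderer-provided basic constructs can be illustrated with a minimal C++ sketch. It is only a compile-time analogy: AnySL performs this "gluing" on the intermediate representation at runtime, whereas the sketch uses a template parameter. The names ToyRenderer and lambert are hypothetical and not part of AnySL.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<float, 3>;

// Hypothetical renderer: supplies the basic constructs the shader relies on.
struct ToyRenderer {
    static float dot(const Vec3& a, const Vec3& b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }
    static Vec3 normalize(const Vec3& v) {
        const float len = std::sqrt(dot(v, v));
        assert(len > 0.0f);  // a zero-length vector cannot be normalized
        return Vec3{{v[0] / len, v[1] / len, v[2] / len}};
    }
};

// Renderer-independent shader: a clamped Lambertian cosine term.
// It only refers to constructs that some renderer has to implement.
template <typename Renderer>
float lambert(const Vec3& normal, const Vec3& toLight) {
    const float c = Renderer::dot(Renderer::normalize(normal),
                                  Renderer::normalize(toLight));
    return c > 0.0f ? c : 0.0f;
}
```

Instantiating lambert<ToyRenderer> binds the shader to this particular renderer, and an optimizing compiler can inline the renderer's dot and normalize into the shader body. This is the kind of abstraction overhead the paragraph above describes as being optimized away.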

The AnySL Shading System uses an embedded just-in-time compiler (the "Low-Level Virtual Machine" (LLVM)) to load, specialize and optimize shaders at runtime. This allows us to recompile on the fly, e.g. after modifications of shader parameters, without sacrificing performance.
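The effect of specializing a shader for its current parameters can be mimicked in plain C++, as in the following rough sketch. It does not use LLVM: a closure stands in for the recompiled shader, and the names PhongParams, Shader, and specialize are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical shader parameters that an artist might edit at runtime.
struct PhongParams {
    float kd;         // diffuse coefficient
    float ks;         // specular coefficient
    float shininess;  // specular exponent
};

using Shader = std::function<float(float nDotL, float rDotV)>;

// Builds a shader variant for fixed parameters. Decisions that depend only
// on the parameters are taken once here instead of once per shaded point.
Shader specialize(const PhongParams& p) {
    assert(p.shininess >= 0.0f);
    if (p.ks == 0.0f) {
        // No specular term: the specialized shader never evaluates pow().
        const float kd = p.kd;
        return [kd](float nDotL, float /*rDotV*/) {
            return kd * std::max(nDotL, 0.0f);
        };
    }
    return [p](float nDotL, float rDotV) {
        return p.kd * std::max(nDotL, 0.0f) +
               p.ks * std::pow(std::max(rDotV, 0.0f), p.shininess);
    };
}
```

Whenever a parameter changes, the renderer would call specialize again and swap in the new variant. A JIT compiler can go further than this sketch, for example by propagating the constant parameter values through the entire shader body.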

  • AnySL in RTfact


  • AnySL in Manta


  • AnySL in PBRT


Whole-Function Vectorization

For ray tracing engines that employ packet tracing, the scalar shader code is automatically transformed into packet code that operates on packets of data, sized according to the target architecture's SIMD width. This makes it possible to exploit the SIMD instruction sets of CPUs (e.g. SSE, AltiVec) without burdening the shader programmer with writing such complex and error-prone code. The only alternative is to shade all rays of a packet sequentially, which incurs substantial overhead if the ray tracer operates on SIMD data types, because packets have to be split before execution and the results merged again afterwards.
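The difference between sequential shading and packet code can be sketched by hand in portable C++. The example below is not output of the AnySL vectorizer: it uses plain arrays instead of SIMD registers, assumes a packet width of 4, and all function names are hypothetical. It shows the central idea of turning a branch in scalar code into a per-lane mask followed by a select.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWidth = 4;  // assumed SIMD width (e.g. 4 floats with SSE)
using Packet = std::array<float, kWidth>;
using Mask = std::array<bool, kWidth>;

// Scalar shader as the shader programmer would write it.
float shadeScalar(float nDotL, float kd) {
    if (nDotL > 0.0f) {
        return kd * nDotL;
    }
    return 0.0f;
}

// Sequential fallback: split the packet, shade each ray, merge the results.
Packet shadeSequential(const Packet& nDotL, float kd) {
    Packet out{};
    for (std::size_t i = 0; i < kWidth; ++i) {
        out[i] = shadeScalar(nDotL[i], kd);
    }
    return out;
}

// Packet version: the branch condition becomes a mask, both sides are
// computed for all lanes, and a select picks the result per lane.
Packet shadePacket(const Packet& nDotL, float kd) {
    Mask lit{};
    Packet thenValue{};
    Packet elseValue{};  // zero-initialized: the value of the "else" path
    for (std::size_t i = 0; i < kWidth; ++i) lit[i] = nDotL[i] > 0.0f;
    for (std::size_t i = 0; i < kWidth; ++i) thenValue[i] = kd * nDotL[i];
    Packet out{};
    for (std::size_t i = 0; i < kWidth; ++i) {
        out[i] = lit[i] ? thenValue[i] : elseValue[i];
    }
    return out;
}

// Self-check: both strategies must agree on every lane.
void checkPacketMatchesScalar(const Packet& nDotL, float kd) {
    const Packet a = shadePacket(nDotL, kd);
    const Packet b = shadeSequential(nDotL, kd);
    for (std::size_t i = 0; i < kWidth; ++i) {
        assert(a[i] == b[i]);
    }
}
```

Each loop in shadePacket corresponds to one SIMD instruction on real hardware (compare, multiply, blend), so the packet never has to be split into individual rays.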

Compared to sequential shading, we obtain an average speedup by a factor of 3.9 for the entire rendering process in RTfact. At the same time, we reach over 90% of the performance of hand-written, native shaders.

See the project page for more details: Whole-Function Vectorization.

LLVM PTX Backend

As part of the AnySL system we implemented an LLVM backend for NVIDIA's "Parallel Thread Execution" (PTX) assembly language. PTX is the low-level representation fed to NVIDIA GPGPU graphics drivers and is usually generated by compilers for the "Compute Unified Device Architecture" (CUDA).

The backend is similar to LLVM's C-backend and generates .ptx files directly from LLVM's intermediate representation (IR).

The backend already supports most PTX features, with the limitations described below.

There are no intrinsics for PTX-specific functionality such as texture fetches; these are currently accessible only via external functions. Atomic and synchronization instructions are not yet implemented, but should work the same way.

Performance has not yet been optimized extensively. Optimizations that lower register pressure are needed to generate faster code.

The backend was written as part of the bachelor's thesis of Helge Rhodin. The source code is released under the University of Illinois/NCSA Open Source License (BSD-style) and is hosted at SourceForge.

Code contributions to the backend are very welcome! :)

Download LLVM PTX Backend.



MSc Thesis

BSc Thesis