Deep Learning Frameworks and Compilers


In previous blogs, we have covered the topic of scheduling tensor programs. That task can be viewed as one step inside a machine learning compiler, which I will introduce in this blog.

Compilers are inherently designed to bridge the gap between software and hardware. In the same sense, machine learning compilers are primarily developed to bridge machine learning programs, e.g., deep neural networks, with the hardware that runs them.

AI accelerators

Due to the success of deep neural networks and other machine learning applications, specialized hardware devices have been developed for these programs. Such hardware is often collectively referred to as AI accelerators, as its major aim is to accelerate tensor program computations.

Some well-known chips used for AI computation include NVIDIA GPUs, Google's TPUs, and a growing family of dedicated NPUs.

Tensor program software frameworks (neural network frameworks)

Dedicated hardware alone is insufficient for accelerating tensor program computations; software support is also needed. Over the past few years, many neural network frameworks have been developed, to name a few: TensorFlow, PyTorch, JAX, and Huawei's MindSpore.

These frameworks all follow the software architecture stack above, and the core question they address is how to translate high-level, hardware-agnostic code into low-level, efficient, hardware-aware code. This core task is essentially the same no matter how different the frameworks may look on the surface. Thus, deep learning compilers have good reason to exist as standalone projects; some of them are listed below.
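
To make this translation concrete, here is a minimal sketch using JAX (assuming `jax` is installed): `jax.jit` traces the high-level, hardware-agnostic function and hands it to XLA, which compiles it into hardware-aware code, while `jax.make_jaxpr` exposes the intermediate representation that gets lowered.

```python
import jax
import jax.numpy as jnp

# High-level, hardware-agnostic tensor code.
def predict(w, x):
    return jnp.tanh(x @ w)

# jax.jit traces the function and hands it to XLA, which compiles
# it into efficient code for the available CPU, GPU, or TPU.
fast_predict = jax.jit(predict)

w = jnp.ones((4, 3))
x = jnp.ones((2, 4))
print(fast_predict(w, x))  # runs the XLA-compiled version

# The intermediate representation (a jaxpr) that XLA lowers:
print(jax.make_jaxpr(predict)(w, x))
```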

Machine learning compilers

This specifically refers to the “intermediate” part of the software stack above.

  1. XLA

    XLA, developed by Google, was originally built for TensorFlow, but it now also supports PyTorch and JAX.

  2. GLOW

    GLOW, developed by Facebook.

  3. Tiramisu

    Tiramisu, developed by MIT.

  4. TensorComprehensions

    TensorComprehensions, developed by Facebook.

  5. AutoTVM

    AutoTVM, as the name indicates, is developed within TVM, but it also supports many other front ends, e.g., TensorFlow and PyTorch. A minimal sketch of TVM’s schedule API appears after this list.

  6. Halide

    Halide is a language embedded in C++ as well as a compiler. It was first created to accelerate image processing pipelines, but it can also be used for deep neural network models, given that the model specification has been converted into Halide’s C++-embedded language.
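
To illustrate what these schedule-based compilers do, below is a minimal TVM sketch (assuming a TVM version that still ships the classic `te` schedule API, e.g., the 0.8–0.12 series): the computation is declared once in hardware-agnostic form, and a schedule then decides how its loops are split and parallelized before lowering to a concrete backend.

```python
import tvm
from tvm import te

# Declare the computation once, hardware-agnostically: C = A + B.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")

# The schedule decides *how* the computation runs on the hardware:
# split the loop into tiles of 32 and parallelize the outer loop.
s = te.create_schedule(C.op)
outer, inner = s[C].split(C.op.axis[0], factor=32)
s[C].parallel(outer)

# Lower to TVM's hardware-aware IR and build for a CPU backend.
print(tvm.lower(s, [A, B, C], simple_mode=True))
fadd = tvm.build(s, [A, B, C], target="llvm")
```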

Difference Summary

In summary, the differences among these tools are mainly reflected in the following aspects:

  • frontend language expressiveness. For example, Halide does not support cyclic computation graphs, e.g., LSTMs.

  • search space modelling. For example, Tiramisu models the scheduling space using the polyhedral model, while Halide models the search space as a decision tree.

  • algorithms for scheduling. For example, Tiramisu uses an ILP solver, AutoTVM uses evolutionary algorithms, and Halide uses tree search; a toy sketch of such search appears after this list.

  • backend support, i.e., which hardware targets each compiler can emit code for.
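
To give an intuition for what “searching a schedule space” means, here is a deliberately toy sketch (not any tool’s actual algorithm): we tune a single knob, the tile size of a blocked summation, by random search, the simplest relative of AutoTVM’s evolutionary search.

```python
import random
import time

def blocked_sum(xs, tile):
    # Sum in blocks of `tile` elements; the tile size is our one
    # schedule knob, analogous to a loop-split factor.
    total = 0.0
    for start in range(0, len(xs), tile):
        total += sum(xs[start:start + tile])
    return total

xs = [1.0] * 100_000
best_tile, best_time = None, float("inf")
for _ in range(10):
    # Sample a candidate schedule and measure it on real hardware,
    # keeping the fastest one seen so far.
    tile = random.choice([16, 32, 64, 128, 256, 512])
    t0 = time.perf_counter()
    blocked_sum(xs, tile)
    elapsed = time.perf_counter() - t0
    if elapsed < best_time:
        best_tile, best_time = tile, elapsed

print(f"best tile: {best_tile} ({best_time:.5f}s)")
```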

MLIR

Besides the algorithmic differences, all the tools above rely on some form of intermediate representation, usually developed for that specific tool. It would be ideal if this part of the design could be standardized, so that different compilers could focus on developing intelligent compilation algorithms rather than engineering seemingly different but theoretically equivalent designs. MLIR represents an effort toward unifying these intermediate representations.
