MLIR allows you to define your own intermediate representation (IR) while benefiting from the infrastructure it provides. However, getting started with creating your own IR, called a dialect in the MLIR universe, can be tricky. This tutorial addresses some of the challenges that arise with the CMake configuration and explores projects like the standalone MLIR dialect example in more detail. Furthermore, we look at how the TableGen definition of a single dialect can be split across multiple files.
Even with existing material such as the MLIR tutorial presented at the LLVM Developers' Meeting in 2020 [1], the "Creating a Dialect" guide [2], and the Toy tutorial [3] in the MLIR docs, building an MLIR dialect can feel difficult. This is especially true of the CMake configuration. Good starting points are the standalone MLIR dialect example [4] and, in particular, the tutorial [5] given by S. Neuendorffer at the 2021 LLVM Developers' Meeting.
Building on the existing tutorials, we look into the details of the CMake configuration of an out-of-tree dialect like [4]. We then dive into more complex CMake configurations for projects like MLIR-EmitC [6], showing how to build a project either standalone or embedded in another project. Beyond the CMake configuration, we briefly cover how to structure TableGen files. In particular, we show how to use multiple TableGen files to define a single dialect, for example with separate files for the base dialect, operations, attributes, and types.
[1] M. Amini & R. Riddle, "MLIR Tutorial", 2020 LLVM Developers' Meeting, https://youtu.be/Y4SvqTtOIDk
[2] https://mlir.llvm.org/docs/Tutorials/CreatingADialect/
[3] https://mlir.llvm.org/docs/Tutorials/Toy/
[4] https://github.com/llvm/llvm-project/tree/main/mlir/examples/standalone
[5] S. Neuendorffer, "Architecting out-of-tree LLVM projects using cmake", 2021 LLVM Developers' Meeting, https://youtu.be/7wOU7csj1ME
[6] https://github.com/iml130/mlir-emitc