In this talk I’ll briefly compare how Puppet and CFEngine approach External Node Classification (ENC), and then describe a very simple yet powerful approach that has so far allowed Opera to cope with a sudden increase in the number of managed nodes.
Classification in Configuration Management is the process of defining common traits of the managed systems (the classes), and tying classes and systems together. Classes collect nodes together in broadly defined sets: in each class most of the nodes share exactly the same traits, and different settings are applied to different classes of nodes.
A node can be managed effectively only if it can be tied to the classes that describe its features, which makes classification a critical process. Like all critical processes, classification either scales with the infrastructure or becomes a hindrance to its growth.
Settings are supposed to be homogeneous across all systems in a certain class. However, some nodes are part of a class and yet need special settings that differ from all (or most) other nodes in the same class. One can well say that exceptions are actually the rule. If classification is critical, being able to cope with exceptions efficiently in the classification process is crucial.
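To make the idea of class-wide settings plus per-node exceptions concrete, here is a minimal sketch in Python. All host names, class names, and settings are hypothetical and chosen only for illustration; this is not the approach presented in the talk, just one way to picture the problem.

```python
from typing import Any, Dict, List

# Hypothetical data: settings shared by every node in a class.
CLASS_SETTINGS: Dict[str, Dict[str, Any]] = {
    "webserver": {"ntp_server": "ntp1.example.com", "max_clients": 200},
}

# Hypothetical data: which classes each node belongs to.
NODE_CLASSES: Dict[str, List[str]] = {
    "web01.example.com": ["webserver"],
    "web02.example.com": ["webserver"],
}

# Hypothetical data: per-node exceptions that override class settings.
NODE_EXCEPTIONS: Dict[str, Dict[str, Any]] = {
    "web02.example.com": {"max_clients": 500},
}


def settings_for(node: str) -> Dict[str, Any]:
    """Merge class settings in order, then apply the node's exceptions."""
    merged: Dict[str, Any] = {}
    for cls in NODE_CLASSES.get(node, []):
        merged.update(CLASS_SETTINGS.get(cls, {}))
    # Exceptions are applied last, so they win over class-wide values.
    merged.update(NODE_EXCEPTIONS.get(node, {}))
    return merged
```

In this sketch `web01.example.com` gets the class defaults unchanged, while `web02.example.com` keeps the shared `ntp_server` but overrides `max_clients`. The point is only that exceptions need a well-defined place in the lookup order.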
It’s easy to understand that a simple “internal” node classification done in the tools’ configuration files does not scale (for many reasons), and that other ways must be found to connect a scalable and manageable “classification database” with the Configuration Management tool of choice: you need “external” node classification to scale.
Configuration Management tools acknowledge this need. For example, Puppet provides “plugs” in its configuration to get classification information from an executable program, and it has done so for ages. That gave rise to a large ecosystem of ENC tools, where Hiera is a popular choice. What about CFEngine?
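For readers unfamiliar with the Puppet side, the sketch below shows roughly what such an executable does: Puppet (configured with `node_terminus = exec` and `external_nodes` pointing at the program) calls it with the node name as its argument and reads a YAML document with keys such as `classes` and `environment` from its output. The node database, host name, and class names here are hypothetical, and the hand-rolled YAML rendering is only adequate for this tiny structure.

```python
import sys
from typing import Any, Dict, List

# Hypothetical classification database; a real one would live outside the script.
NODE_DB: Dict[str, Dict[str, Any]] = {
    "web01.example.com": {
        "classes": ["base", "webserver"],
        "environment": "production",
    },
}


def render_enc_yaml(entry: Dict[str, Any]) -> str:
    """Render a minimal ENC answer: a list of classes and an environment."""
    lines: List[str] = ["---", "classes:"]
    lines += [f"  - {name}" for name in entry["classes"]]
    lines.append(f"environment: {entry['environment']}")
    return "\n".join(lines) + "\n"


def main(argv: List[str]) -> int:
    """Classify the node named in argv[1]; return the process exit code."""
    if len(argv) != 2 or argv[1] not in NODE_DB:
        # Assumption of this sketch: unknown nodes are treated as a failure.
        return 1
    sys.stdout.write(render_enc_yaml(NODE_DB[argv[1]]))
    return 0
```

A real script would end by calling `sys.exit(main(sys.argv))` and would query an actual data source instead of the in-memory `NODE_DB`.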