| Configuration Language Design |
| ============================= |
| |
| In this section we will cover some conventions for HCL-based configuration |
| languages that can help make them feel consistent with other HCL-based |
| languages, and make the best use of HCL's building blocks. |
| |
| HCL's native and JSON syntaxes both define a mapping from input bytes to a |
| higher-level information model. In designing a configuration language based on |
| HCL, your building blocks are the components in that information model: |
| blocks, arguments, and expressions. |
| |
| Each calling application of HCL, then, effectively defines its own language. |
| Just as Atom and RSS are higher-level languages built on XML, HashiCorp |
| Terraform has a higher-level language built on HCL, while HashiCorp Nomad has |
| its own distinct language that is *also* built on HCL. |
| |
| From an end-user perspective, these are distinct languages but have a common |
| underlying texture. Users of both are therefore likely to bring some |
| expectations from one to the other, and so this section is an attempt to |
| codify some of these shared expectations to reduce user surprise. |
| |
| These are subjective guidelines however, and so applications may choose to |
| ignore them entirely or ignore them in certain specialized cases. An |
| application providing a configuration language for a pre-existing system, for |
| example, may choose to eschew the identifier naming conventions in this section |
| in order to exactly match the existing names in that underlying system. |
| |
| Language Keywords and Identifiers |
| --------------------------------- |
| |
| Much of the work in defining an HCL-based language is in selecting good names |
| for arguments, block types, variables, and functions. |
| |
| The standard for naming in HCL is to use all-lowercase identifiers with |
| underscores separating words, like ``service`` or ``io_mode``. HCL identifiers |
| do allow uppercase letters and dashes, but this primarily for natural |
| interfacing with external systems that may have other identifier conventions, |
| and so these should generally be avoided for the identifiers native to your |
| own language. |
| |
| The distinction between "keywords" and other identifiers is really just a |
| convention. In your own language documentation, you may use the word "keyword" |
| to refer to names that are presented as an intrinsic part of your language, |
| such as important top-level block type names. |
| |
| Block type names are usually singular, since each block defines a single |
| object. Use a plural block name only if the block is serving only as a |
| namespacing container for a number of other objects. A block with a plural |
| type name will generally contain only nested blocks, and no arguments of its |
| own. |
| |
| Argument names are also singular unless they expect a collection value, in |
| which case they should be plural. For example, ``name = "foo"`` but |
| ``subnet_ids = ["abc", "123"]``. |
| |
| Function names will generally *not* use underscores and will instead just run |
| words together, as is common in the C standard library. This is a result of |
| the fact that several of the standard library functions offered in ``cty`` |
| (covered in a later section) have names that follow C library function names |
| like ``substr``. This is not a strong rule, and applications that use longer |
| names may choose to use underscores for them to improve readability. |
| |
| Blocks vs. Object Values |
| ------------------------ |
| |
| HCL blocks and argument values of object type have quite a similar appearance |
| in the native syntax, and are identical in JSON syntax: |
| |
| .. code-block:: hcl |
| |
| block { |
| foo = bar |
| } |
| |
| # argument with object constructor expression |
| argument = { |
| foo = bar |
| } |
| |
| In spite of this superficial similarity, there are some important differences |
| between these two forms. |
| |
| The most significant difference is that a child block can contain nested blocks |
| of its own, while an object constructor expression can define only attributes |
| of the object it is creating. |
| |
| The user-facing model for blocks is that they generally form the more "rigid" |
| structure of the language itself, while argument values can be more free-form. |
| An application will generally define in its schema and documentation all of |
| the arguments that are valid for a particular block type, while arguments |
| accepting object constructors are more appropriate for situations where the |
| arguments themselves are freely selected by the user, such as when the |
| expression will be converted by the application to a map type. |
| |
| As a less contrived example, consider the ``resource`` block type in Terraform |
| and its use with a particular resource type ``aws_instance``: |
| |
| .. code-block:: hcl |
| |
| resource "aws_instance" "example" { |
| ami = "ami-abc123" |
| instance_type = "t2.micro" |
| |
| tags = { |
| Name = "example instance" |
| } |
| |
| ebs_block_device { |
| device_name = "hda1" |
| volume_size = 8 |
| volume_type = "standard" |
| } |
| } |
| |
| The top-level block type ``resource`` is fundamental to Terraform itself and |
| so an obvious candidate for block syntax: it maps directly onto an object in |
| Terraform's own domain model. |
| |
| Within this block we see a mixture of arguments and nested blocks, all defined |
| as part of the schema of the ``aws_instance`` resource type. The ``tags`` |
| map here is specified as an argument because its keys are free-form, chosen |
| by the user and mapped directly onto a map in the underlying system. |
| ``ebs_block_device`` is specified as a nested block, because it is a separate |
| domain object within the remote system and has a rigid schema of its own. |
| |
| As a special case, block syntax may sometimes be used with free-form keys if |
| those keys each serve as a separate declaration of some first-class object |
| in the language. For example, Terraform has a top-level block type ``locals`` |
| which behaves in this way: |
| |
| .. code-block:: hcl |
| |
| locals { |
| instance_type = "t2.micro" |
| instance_id = aws_instance.example.id |
| } |
| |
| Although the argument names in this block are arbitrarily selected by the |
| user, each one defines a distinct top-level object. In other words, this |
| approach is used to create a more ergonomic syntax for defining these simple |
| single-expression objects, as a pragmatic alternative to more verbose and |
| redundant declarations using blocks: |
| |
| .. code-block:: hcl |
| |
| local "instance_type" { |
| value = "t2.micro" |
| } |
| local "instance_id" { |
| value = aws_instance.example.id |
| } |
| |
| The distinction between domain objects, language constructs and user data will |
| always be subjective, so the final decision is up to you as the language |
| designer. |
| |
| Standard Functions |
| ------------------ |
| |
| HCL itself does not define a common set of functions available in all HCL-based |
| languages; the built-in language operators give a baseline of functionality |
| that is always available, but applications are free to define functions as they |
| see fit. |
| |
| With that said, there's a number of generally-useful functions that don't |
| belong to the domain of any one application: string manipulation, sequence |
| manipulation, date formatting, JSON serialization and parsing, etc. |
| |
| Given the general need such functions serve, it's helpful if a similar set of |
| functions is available with compatible behavior across multiple HCL-based |
| languages, assuming the language is for an application where function calls |
| make sense at all. |
| |
| The Go implementation of HCL is built on an underlying type and function system |
| :go:pkg:`cty`, whose usage was introduced in :ref:`go-expression-funcs`. That |
| library also has a package of "standard library" functions which we encourage |
| applications to offer with consistent names and compatible behavior, either by |
| using the standard implementations directly or offering compatible |
| implementations under the same name. |
| |
| The "standard" functions that new configuration formats should consider |
| offering are: |
| |
| * ``abs(number)`` - returns the absolute (positive) value of the given number. |
| * ``coalesce(vals...)`` - returns the value of the first argument that isn't null. Useful only in formats where null values may appear. |
| * ``compact(vals...)`` - returns a new tuple with the non-null values given as arguments, preserving order. |
| * ``concat(seqs...)`` - builds a tuple value by concatenating together all of the given sequence (list or tuple) arguments. |
| * ``format(fmt, args...)`` - performs simple string formatting similar to the C library function ``printf``. |
| * ``hasindex(coll, idx)`` - returns true if the given collection has the given index. ``coll`` may be of list, tuple, map, or object type. |
| * ``int(number)`` - returns the integer component of the given number, rounding towards zero. |
| * ``jsondecode(str)`` - interprets the given string as JSON format and return the corresponding decoded value. |
| * ``jsonencode(val)`` - encodes the given value as a JSON string. |
| * ``length(coll)`` - returns the length of the given collection. |
| * ``lower(str)`` - converts the letters in the given string to lowercase, using Unicode case folding rules. |
| * ``max(numbers...)`` - returns the highest of the given number values. |
| * ``min(numbers...)`` - returns the lowest of the given number values. |
| * ``sethas(set, val)`` - returns true only if the given set has the given value as an element. |
| * ``setintersection(sets...)`` - returns the intersection of the given sets |
| * ``setsubtract(set1, set2)`` - returns a set with the elements from ``set1`` that are not also in ``set2``. |
| * ``setsymdiff(sets...)`` - returns the symmetric difference of the given sets. |
| * ``setunion(sets...)`` - returns the union of the given sets. |
| * ``strlen(str)`` - returns the length of the given string in Unicode grapheme clusters. |
| * ``substr(str, offset, length)`` - returns a substring from the given string by splitting it between Unicode grapheme clusters. |
| * ``timeadd(time, duration)`` - takes a timestamp in RFC3339 format and a possibly-negative duration given as a string like ``"1h"`` (for "one hour") and returns a new RFC3339 timestamp after adding the duration to the given timestamp. |
| * ``upper(str)`` - converts the letters in the given string to uppercase, using Unicode case folding rules. |
| |
| Not all of these functions will make sense in all applications. For example, an |
| application that doesn't use set types at all would have no reason to provide |
| the set-manipulation functions here. |
| |
| Some languages will not provide functions at all, since they are primarily for |
| assigning values to arguments and thus do not need nor want any custom |
| computations of those values. |
| |
| Block Results as Expression Variables |
| ------------------------------------- |
| |
| In some applications, top-level blocks serve also as declarations of variables |
| (or of attributes of object variables) available during expression evaluation, |
| as discussed in :ref:`go-interdep-blocks`. |
| |
| In this case, it's most intuitive for the variables map in the evaluation |
| context to contain an value named after each valid top-level block |
| type and for these values to be object-typed or map-typed and reflect the |
| structure implied by block type labels. |
| |
| For example, an application may have a top-level ``service`` block type |
| used like this: |
| |
| .. code-block:: hcl |
| |
| service "http" "web_proxy" { |
| listen_addr = "127.0.0.1:8080" |
| |
| process "main" { |
| command = ["/usr/local/bin/awesome-app", "server"] |
| } |
| |
| process "mgmt" { |
| command = ["/usr/local/bin/awesome-app", "mgmt"] |
| } |
| } |
| |
| If the result of decoding this block were available for use in expressions |
| elsewhere in configuration, the above convention would call for it to be |
| available to expressions as an object at ``service.http.web_proxy``. |
| |
| If it the contents of the block itself that are offered to evaluation -- or |
| a superset object *derived* from the block contents -- then the block arguments |
| can map directly to object attributes, but it is up to the application to |
| decide which value type is most appropriate for each block type, since this |
| depends on how multiple blocks of the same type relate to one another, or if |
| multiple blocks of that type are even allowed. |
| |
| In the above example, an application would probably expose the ``listen_addr`` |
| argument value as ``service.http.web_proxy.listen_addr``, and may choose to |
| expose the ``process`` blocks as a map of objects using the labels as keys, |
| which would allow an expression like |
| ``service.http.web_proxy.service["main"].command``. |
| |
| If multiple blocks of a given type do not have a significant order relative to |
| one another, as seems to be the case with these ``process`` blocks, |
| representation as a map is often the most intuitive. If the ordering of the |
| blocks *is* significant then a list may be more appropriate, allowing the use |
| of HCL's "splat operators" for convenient access to child arguments. However, |
| there is no one-size-fits-all solution here and language designers must |
| instead consider the likely usage patterns of each value and select the |
| value representation that best accommodates those patterns. |
| |
| Some applications may choose to offer variables with slightly different names |
| than the top-level blocks in order to allow for more concise references, such |
| as abbreviating ``service`` to ``svc`` in the above examples. This should be |
| done with care since it may make the relationship between the two less obvious, |
| but this may be a good tradeoff for names that are accessed frequently that |
| might otherwise hurt the readability of expressions they are embedded in. |
| Familiarity permits brevity. |
| |
| Many applications will not make blocks results available for use in other |
| expressions at all, in which case they are free to select whichever variable |
| names make sense for what is being exposed. For example, a format may make |
| environment variable values available for use in expressions, and may do so |
| either as top-level variables (if no other variables are needed) or as an |
| object named ``env``, which can be used as in ``env.HOME``. |
| |
| Text Editor and IDE Integrations |
| -------------------------------- |
| |
| Since HCL defines only low-level syntax, a text editor or IDE integration for |
| HCL itself can only really provide basic syntax highlighting. |
| |
| For non-trivial HCL-based languages, a more specialized editor integration may |
| be warranted. For example, users writing configuration for HashiCorp Terraform |
| must recall the argument names for numerous different provider plugins, and so |
| auto-completion and documentation hovertips can be a great help, and |
| configurations are commonly spread over multiple files making "Go to Definition" |
| functionality useful. None of this functionality can be implemented generically |
| for all HCL-based languages since it relies on knowledge of the structure of |
| Terraform's own language. |
| |
| Writing such text editor integrations is out of the scope of this guide. The |
| Go implementation of HCL does have some building blocks to help with this, but |
| it will always be an application-specific effort. |
| |
| However, in order to *enable* such integrations, it is best to establish a |
| conventional file extension *other than* `.hcl` for each non-trivial HCL-based |
| language, thus allowing text editors to recognize it and enable the suitable |
| integration. For example, Terraform requires ``.tf`` and ``.tf.json`` filenames |
| for its main configuration, and the ``hcldec`` utility in the HCL repository |
| accepts spec files that should conventionally be named with an ``.hcldec`` |
| extension. |
| |
| For simple languages that are unlikely to benefit from specific editor |
| integrations, using the ``.hcl`` extension is fine and may cause an editor to |
| enable basic syntax highlighting, absent any other deeper features. An editor |
| extension for a specific HCL-based language should *not* match generically the |
| ``.hcl`` extension, since this can cause confusing results for users |
| attempting to write configuration files targeting other applications. |