Understanding YAML

YAML Ain't Markup Language, or YAML for short, is a data serialization language commonly used in configuration files for tools such as Kubernetes, Docker, Ansible, and many more. Its popularity has grown over the years, making it something of a competitor to JSON.

Ok, if YAML Ain’t Markup Language, What Is It?

As mentioned, YAML is a data serialization language designed to improve human readability through indentation and native data structures. Think of it as a superset of JSON, or a cross between JSON and XML. It can express everything JSON can and adds features of its own, such as comments and multi-line strings.

The purpose of this tutorial is to introduce you to YAML, walk you through the syntax of the language, show you a few quick tools for working with YAML, and teach you how to use it for configuration files and more.

How to Write YAML

Writing YAML is intuitive (I guess that's the point) because it uses key-value pair syntax, much like a dictionary in Python. However, unlike Python, YAML does not allow tab characters for indentation; it uses spaces only.
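Since tabs in indentation are a common source of parse errors, a quick check can help before you hand a file to a parser. The following is a minimal Python sketch using only the standard library; the function name lines_with_tab_indent and the sample text are made up for illustration.

```python
def lines_with_tab_indent(text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The indentation is whatever precedes the first non-whitespace character.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


# Hypothetical sample: the third line is indented with a tab.
sample = "server:\n  name: web\n\tport: 80\n"
tab_lines = lines_with_tab_indent(sample)
print(tab_lines)
```

The check only looks at leading whitespace, so a tab inside a quoted value is not reported.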

The general syntax is:

key: value

To begin a new YAML document, we start with three dashes (---), which mark the beginning of a document.

This feature allows you to have multiple documents separated by the dashes in a single file.

Create a file with a .yaml extension (for example, sample.yaml) and add the following contents.

---
language: Python
author: Guido van Rossum
country: Netherlands

---
language: JavaScript
author: Brendan Eich
country: United States

---
language: Ruby
author: Yukihiro Matsumoto
country: Japan

As you can see from the above file, each document in YAML starts with three dashes, followed by the data stored in key-value pairs.
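To make the role of the separator concrete, here is a rough Python sketch that splits a multi-document stream on the --- lines and reads the flat key-value pairs into dictionaries. It uses only the standard library and is not a YAML parser: it assumes flat, unquoted key: value lines, and the helper names split_documents and parse_flat are hypothetical. The sketch embeds its own copy of the three-document file.

```python
SAMPLE = """\
---
language: Python
author: Guido van Rossum
country: Netherlands

---
language: JavaScript
author: Brendan Eich
country: United States

---
language: Ruby
author: Yukihiro Matsumoto
country: Japan
"""


def split_documents(text):
    """Split a stream into lists of lines, one list per '---' document."""
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":
            if current:
                docs.append(current)
            current = []
        elif line.strip():  # ignore blank lines
            current.append(line)
    if current:
        docs.append(current)
    return docs


def parse_flat(lines):
    """Parse flat 'key: value' lines into a dict (no nesting, no quoting)."""
    result = {}
    for line in lines:
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


documents = [parse_flat(doc) for doc in split_documents(SAMPLE)]
for doc in documents:
    print(doc["language"], "-", doc["author"])
```

For real files, a proper YAML library is the better choice; the sketch only shows that each --- starts a fresh, independent set of keys.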

Install a YAML linter

Before proceeding further, let us confirm that what we have is a valid YAML file. To do this, we need to install a YAML linter.

A linter is a tool that checks code and notifies the developer of problems such as syntax errors and invalid constructs. Here, it lets us check that a file contains valid YAML syntax.

In our example, we shall use yamllint.

To install it on a Debian-based system, use apt:

sudo apt-get update

sudo apt-get install yamllint -y

Once installed, we can run the linter against the file using the command:

yamllint sample.yaml

If the file contains valid YAML syntax, the linter prints nothing.

Now, try adding stray spaces inside the YAML file, or add a single dash at the bottom:

---
language: Python
author: Guido van Rossum
country: Netherlands

---
language: JavaScript
author: Brendan Eich
country: United States

---
language: Ruby
author: Yukihiro Matsumoto
country: Japan
-

If we run the linter against this file, the errors show up, as shown below:

sample.yaml
  15:1      error    syntax error: expected <block end>, but found '-' (syntax)

NOTE: As with dictionaries in Python and similar data structures in other programming languages, the keys within a YAML mapping must be unique.
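Some parsers keep the last value for a repeated key instead of reporting an error, so it can be worth checking for duplicates yourself. The sketch below is a simplified illustration in Python: it assumes a single flat document with one key: value pair per line, and the function name find_duplicate_keys is hypothetical.

```python
from collections import Counter


def find_duplicate_keys(text):
    """Return the keys that appear more than once in a flat YAML document."""
    keys = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and document separators.
        if not stripped or stripped.startswith("#") or stripped == "---":
            continue
        key, separator, _ = stripped.partition(":")
        if separator:
            keys.append(key.strip())
    return sorted(key for key, count in Counter(keys).items() if count > 1)


# Hypothetical document in which 'language' is defined twice.
doc = """\
language: Python
author: Guido van Rossum
language: Ruby
"""
duplicates = find_duplicate_keys(doc)
print(duplicates)
```

Because the sketch ignores nesting, it would also flag the same key name used in two different nested mappings, which is perfectly legal YAML.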

YAML Data types

YAML supports various ways to represent data. They include:

#: Scalar types

Scalars are the most common data type in YAML. A scalar is a single value, such as the value in each of the key-value pairs in the example above.

The value in a pair can be of any scalar type, such as a string or a number (integer, floating-point, hexadecimal, and so on).

#: Strings

YAML also supports strings enclosed in single or double quotes. Quotes are usually not required because the YAML parser infers the type, but they are helpful for strings that contain special characters. Escape sequences such as \n are interpreted only inside double quotes.

The following are examples of valid strings in YAML.

---
string: This is a string
string2: "This is also a string"
string3: 'so is this one'

NOTE: Make sure to close any double or single quotes you open. The following will result in an error.

---
invalid: 'this is incorrect

To add a paragraph in a YAML file, use the greater-than sign (>). Remember to indent each line of the paragraph with spaces. For example:

---
para: >
  creating a paragraph
  that spans for more than one
  line.
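When a parser reads a paragraph written with the greater-than sign, it folds the line breaks into spaces and, by default, keeps a single trailing newline. The Python sketch below approximates that behavior for the simple case of evenly indented text with no blank lines; the function name fold is hypothetical, and real folding has further rules for blank and more-indented lines.

```python
def fold(lines):
    """Approximate YAML's '>' folding for simple, evenly indented text."""
    return " ".join(line.strip() for line in lines) + "\n"


block = [
    "  creating a paragraph",
    "  that spans for more than one",
    "  line.",
]
folded = fold(block)
print(repr(folded))
# 'creating a paragraph that spans for more than one line.\n'
```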

#: Numeric Types

YAML also supports numeric types, including integers, floating-point numbers, and hexadecimal and octal values.

The following YAML syntax represents numeric types.

---
int: 100
hex: 0x7f000001
octal: 0177
float: 127.0
expo: 6.022e+23
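To see which values these literals stand for, the following Python sketch converts each one by hand. It is an illustration rather than a description of how any particular parser works, and the function name to_number is hypothetical. Note that the leading-zero octal form follows the older YAML 1.1 convention; YAML 1.2 writes octal as 0o177, so parsers may differ on how they read 0177.

```python
def to_number(text):
    """Convert a YAML-style numeric literal to a Python int or float."""
    if text.lower().startswith("0x"):
        return int(text, 16)  # hexadecimal
    if len(text) > 1 and text.startswith("0") and text.isdigit():
        return int(text, 8)  # leading-zero octal (YAML 1.1 style)
    try:
        return int(text)  # plain decimal integer
    except ValueError:
        return float(text)  # decimals and exponent notation


literals = {
    "int": "100",
    "hex": "0x7f000001",
    "octal": "0177",
    "float": "127.0",
    "expo": "6.022e+23",
}
values = {name: to_number(text) for name, text in literals.items()}
for name, value in values.items():
    print(f"{name}: {value!r}")
```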

#: Lists

Lists in YAML are specified using a single dash followed by a space:

---
- list
- another
- and another

#: Sequences

A sequence can also be nested under a key to hold multiple values for that key. For example:

---
server:
  - apache
  - 2.07
  - LAMPP

#: Mappings

A mapping is similar to a sequence, but it consists of key-value pairs grouped under a single parent key.

Here is an example:

---
Servers:
  - apache:
      name: server1
      os: Debian 10
      version: 2.4.46
  - IIS:
      name: iis-v01
      os: Windows Datacenter 2019
      version: 10.0.17763
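If you think in Python terms, the example corresponds to a dictionary that holds a list of dictionaries. The sketch below writes that structure out by hand, assuming that name, os, and version are nested under each server entry; it does not parse any YAML.

```python
# Hand-written Python equivalent of the Servers example (an assumption about
# how the nesting is meant to be read, not the output of a parser).
servers = {
    "Servers": [
        {"apache": {"name": "server1", "os": "Debian 10", "version": "2.4.46"}},
        {"IIS": {"name": "iis-v01", "os": "Windows Datacenter 2019", "version": "10.0.17763"}},
    ]
}

# Each list item is a one-key mapping from a label to that server's details.
for entry in servers["Servers"]:
    for label, details in entry.items():
        print(label, "->", details["name"], "on", details["os"])
```

The version values are strings rather than numbers because they contain more than one dot.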

#: Null

We set a null value in YAML using a tilde (~) or the word null, as shown in the example below:

---
tilde: ~
var: null

#: Arrays

Arrays in YAML can also be written on a single line using square brackets, with the items separated by commas. The following example shows the definition of arrays in YAML.

---
numbers: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
strings: ["Hello", "World", "From", "LinuxHint"]
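Single-line arrays like these are a good place to see the overlap between YAML and JSON: as long as the items are numbers or double-quoted strings, the bracketed part is valid JSON too. The Python sketch below relies on that overlap and reads the two arrays with the standard json module. This works only for the JSON-compatible subset; an array of unquoted strings such as [a, b] is valid YAML but not valid JSON.

```python
import json

# The bracketed values from the arrays example, written as JSON-compatible text.
numbers = json.loads("[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]")
strings = json.loads('["Hello", "World", "From", "LinuxHint"]')

print(sum(numbers))
print(" ".join(strings))
```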

YAML Comments

YAML also supports comments, which allow you to add extra information to the YAML data. The parser ignores comments.

YAML comments begin with an octothorpe (#).

---

# This is a comment in YAML

Process YAML to JSON

In some instances, we may need to convert YAML to JSON. Since the two are closely related, it makes sense to need one from the other.

For such scenarios, we can use a tool such as yq, which is a jq wrapper for YAML and XML documents.

To install it, use pip with the command as shown:

pip3 install yq

NOTE: Ensure you have jq installed as it is a required dependency for yq.

Suppose we have a sample Kubernetes pod creation file (kubernetes.yaml) with the contents as shown:

---
apiVersion: v1
kind: Pod
metadata:
  name: store-site
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: master
          mountPath: /var/www/html/nginx
  dnsPolicy: Default
  volumes:
  - name: home_directory
    emptyDir: {}

NOTE: The above file is for illustration purposes only and may produce errors if used in a real Kubernetes instance.

To convert the YAML file to JSON with the pip-installed yq, pass it the jq identity filter (.) and the file name:

yq . kubernetes.yaml

The command prints the contents of the file as JSON, as shown below:

{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "store-site",
    "labels": {
      "app": "web"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "nginx",
        "image": "nginx",
        "ports": [
          {
            "containerPort": 8080
          }
        ],
        "volumeMounts": [
          {
            "name": "master",
            "mountPath": "/var/www/html/nginx"
          }
        ]
      }
    ],
    "dnsPolicy": "Default",
    "volumes": [
      {
        "name": "home_directory",
        "emptyDir": {}
      }
    ]
  }
}

That makes work easier when switching between YAML and JSON.
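The same mapping between the two formats can be reproduced in a script. The Python sketch below assumes the pod definition has already been loaded into a dictionary (typed in by hand here rather than parsed from the file) and writes it out with the standard json module.

```python
import json

# Hand-written Python equivalent of the illustrative pod definition.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "store-site", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "nginx",
                "image": "nginx",
                "ports": [{"containerPort": 8080}],
                "volumeMounts": [
                    {"name": "master", "mountPath": "/var/www/html/nginx"}
                ],
            }
        ],
        "dnsPolicy": "Default",
        "volumes": [{"name": "home_directory", "emptyDir": {}}],
    },
}

# indent=2 pretty-prints the result; key order follows the dictionary.
as_json = json.dumps(pod, indent=2)
print(as_json)
```

YAML mappings become JSON objects, sequences become arrays, and scalars keep their types, which is why the port number stays unquoted in the output.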

Conclusion

YAML is a powerful tool that allows you to build highly readable and portable configuration files for the services you support. Using the concepts in this tutorial, you are in a position to build complex YAML documents for your own applications or for other applications that support YAML.

Thank you & Happy Coding!

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.