YAML: A Simple Introduction

9/2/2020

YAML (rhymes with camel) is a popular data-serialization language that’s a superset of JSON and easier to read.

Recently I had to work with YAML whilst creating configuration files for ROS, so I thought I’d share what I’ve learned so far.

In this post we’re going to be using the pyyaml package to see how our YAML files get converted into Python 3 data structures.

What is YAML?

YAML (or YAML Ain’t Markup Language) is a human-readable data-serialization language. You’ll commonly find YAML being used to define configuration files or for storing or transmitting data.

In fact, YAML 1.2 is a superset of JSON, another popular data-serialization language that’s used extensively in applications such as REST APIs and GraphQL for data transmission and MongoDB for data storage.
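To make the superset claim concrete, here is a minimal sketch that feeds the same JSON text to Python's json module and to pyyaml. The JSON document and its field names are made up for illustration. Note that pyyaml targets the older YAML 1.1 spec, so a few unusual JSON inputs may behave differently.

```python
import json
import yaml  # provided by the pyyaml package

# A small JSON document (field names are invented for this example).
json_text = '{"name": "demo", "ports": [80, 443], "debug": false, "owner": null}'

from_json = json.loads(json_text)      # parsed as JSON
from_yaml = yaml.safe_load(json_text)  # the same text parsed as YAML

print(from_yaml)
print(from_json == from_yaml)  # True
```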

Comments

Unlike JSON, YAML lets you add comments to your files. They’re denoted by hashes (#), just like in Python. Note that a # inside a string enclosed in single (' ') or double (" ") quotes is treated as a literal hash. In an unquoted string, a # that follows whitespace starts a comment.
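Here is a small sketch of how comments behave once the file is loaded with pyyaml. The keys and values are invented for illustration.

```python
import yaml

doc = """
# A full-line comment
colour: blue  # a trailing comment
tag: '#yaml'
channel: "#general"
"""

data = yaml.safe_load(doc)

# The comments are discarded; the quoted hashes survive as ordinary text.
print(data)  # {'colour': 'blue', 'tag': '#yaml', 'channel': '#general'}
```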

Basic Data Types in YAML

We’ll cover 6 of the basic data types from the YAML specification:

  • Booleans
  • Integers
  • Floats
  • Strings
  • Lists
  • Associative Arrays (dictionaries)

Lists and associative arrays can contain any of the above data types (even themselves!) as we’ll soon see.

Booleans

Use true or false to denote boolean values in YAML.
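As a quick sketch (the key names are made up), this is how pyyaml maps those values to Python. Quoting the value keeps it as a string instead.

```python
import yaml

doc = """
public: true
archived: false
quoted: "true"
"""

flags = yaml.safe_load(doc)

print(flags)  # {'public': True, 'archived': False, 'quoted': 'true'}
```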

Integers and Floats

Integers and floats are self-explanatory. We’ll use them extensively throughout the rest of this post so you’ll get a chance to see how they work.
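For a first look, here is a minimal sketch with a few illustrative keys that shows which Python type pyyaml picks for each number.

```python
import yaml

doc = """
pages: 412
rating: 4.5
offset: -7
"""

numbers = yaml.safe_load(doc)

# Print each value alongside the Python type it was converted to.
for key, value in numbers.items():
    print(key, value, type(value).__name__)
```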

Strings

Strings in YAML are ordinarily unquoted. However, you can enclose them in single (' ') or double (" ") quotes as well. Double quotes also allow you to escape special characters using the backslash character (\).
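The difference between the three styles is easiest to see side by side. In this sketch (keys and values are illustrative) the Python string is raw so that the backslashes reach the YAML parser untouched.

```python
import yaml

# r"""...""" keeps the backslashes as-is so YAML, not Python, interprets them.
doc = r"""
plain: Apples are better
single: 'C:\new folder'
double: "first line\nsecond line"
"""

strings = yaml.safe_load(doc)

print(repr(strings["plain"]))
print(repr(strings["single"]))  # backslash kept literally
print(repr(strings["double"]))  # \n turned into a real newline
```

Single quotes leave the backslash alone, while double quotes turn \n into a newline.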

Lists

There are two main ways that we can define lists in YAML. The first way uses hyphens (-) and indentation to designate list members. It is important to get the indentation correct, otherwise you may not end up with what you intended.

# A list of different data types
- Pancakes
- 2
- 3.0
- - 'Headphones'
  - 1
  - 1.27
- "Apples are \"better\""

This converts to the following list in Python 3:

['Pancakes', 2, 3.0, ['Headphones', 1, 1.27], 'Apples are "better"']

The second way to define lists is using square brackets ([]) and commas, just like arrays in JSON. Unlike JSON, YAML still lets you leave strings unquoted or use single quotes inside the brackets:

# A list of different data types
[Pancakes, 2, 3.0, ['Headphones', 1, 1.27], "Apples are \"better\""]
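If you'd like to check that the two styles really are equivalent, here is a minimal sketch that loads both versions of the list with pyyaml and compares them.

```python
import yaml

# Raw strings so the \" escapes are handled by YAML rather than Python.
block_style = r'''
- Pancakes
- 2
- 3.0
- - 'Headphones'
  - 1
  - 1.27
- "Apples are \"better\""
'''

flow_style = r'''[Pancakes, 2, 3.0, ['Headphones', 1, 1.27], "Apples are \"better\""]'''

block_list = yaml.safe_load(block_style)
flow_list = yaml.safe_load(flow_style)

print(block_list)
print(block_list == flow_list)  # True
```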

Associative Arrays

There are two main ways of defining associative arrays in YAML. The first way has each key-value pair on its own line with each pair using a colon and a space to separate the key and value:

# An associative array
title: Dune
author: Frank Herbert
pages: 412
published: 1/08/1965
media:
  - hardcover
  - paperback
  - audio
publisher:
  name: Chilton books
  founded: 1904

This associative array in YAML gets turned into a Python 3 dictionary:

{'title': 'Dune',
 'author': 'Frank Herbert',
 'pages': 412,
 'published': '1/08/1965',
 'media': ['hardcover', 'paperback', 'audio'],
 'publisher': {'name': 'Chilton books', 'founded': 1904}}

The second way of defining associative arrays in YAML is to enclose the array in curly brackets ({}) and separate the key-value pairs with commas (,). As before, each pair uses a colon and a space to separate the key and value. The space can only be dropped when the key is quoted, as it is in JSON:

{
 title: Dune,
 author: Frank Herbert,
 pages: 412,
 published: 1/08/1965,
 media: [hardcover, paperback, audio],
 publisher: {name: Chilton books, founded: 1904}
}
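Here is a short sketch that loads a trimmed-down version of the book in both styles and confirms that pyyaml produces the same dictionary for each.

```python
import yaml

block_style = """
title: Dune
pages: 412
media:
  - hardcover
  - paperback
publisher:
  name: Chilton books
  founded: 1904
"""

flow_style = (
    "{title: Dune, pages: 412, media: [hardcover, paperback], "
    "publisher: {name: Chilton books, founded: 1904}}"
)

block_dict = yaml.safe_load(block_style)
flow_dict = yaml.safe_load(flow_style)

print(block_dict)
print(block_dict == flow_dict)  # True
```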

YAML Example

A contrived example that covers everything in this post:

name: Tesla Inc
executives:
  ceo: Elon Musk
  cto: Drew Baglino
  cfo: Zach Kirkhorn
  chairwoman: Robyn Denholm
founded: 2003
public: true
employees: 48016
product_areas: [Electric Vehicles, Tesla Batteries, Solar Panels]
products:
  cars:
    - {name: Model S, codename: WhiteStar, style: sedan, introduced: 2012, electric: true}
    - {name: Model 3, codename: BlueStar, style: sedan, introduced: 2017, electric: true}
    - {name: Model X, codename: , style: suv, introduced: 2015, electric: true}
    - {name: Model Y, codename: , style: suv, introduced: 2020, electric: true}
subsidiaries:
  - SolarCity
  - Tesla Grohmann Automation
  - Maxwell Technologies
  - DeepScale
  - Hibar Systems
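Once a document like this is loaded, you navigate it with ordinary dictionary and list indexing. The sketch below uses a shortened copy of the example to keep things brief. Note how an empty value such as the missing codename comes back as None.

```python
import yaml

# A shortened copy of the example above.
doc = """
name: Tesla Inc
executives:
  ceo: Elon Musk
public: true
products:
  cars:
    - {name: Model S, codename: WhiteStar, introduced: 2012}
    - {name: Model X, codename: , introduced: 2015}
"""

company = yaml.safe_load(doc)

print(company["executives"]["ceo"])
for car in company["products"]["cars"]:
    # codename is None when the YAML value was left empty
    print(car["name"], car["codename"], car["introduced"])
```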

Conclusion

In this post we learned how to use 6 basic types in YAML to define different data structures and saw how they map to Python 3. We also saw how to use comments to make our YAML files even more human-readable. Of course, there are many more features in YAML that we haven’t covered.

If you want to play around with YAML try using pyyaml. If you run the following command, it’ll load your YAML file and parse it into a Python 3 data structure (remember to import yaml first!):

yaml.load(open('<path to YAML file>', 'r'), Loader=yaml.FullLoader)
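If you'd rather have something you can paste and run as-is, here is a self-contained sketch. It writes a tiny YAML file to a temporary directory (the file name and contents are made up) and reads it back with yaml.safe_load, which only builds plain Python types and is a reasonable default when you don't fully trust the file.

```python
import os
import tempfile

import yaml

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this example.
    path = os.path.join(tmp_dir, "example.yaml")

    with open(path, "w") as f:
        f.write("title: Dune\npages: 412\n")

    # The with-block closes the file once loading is done.
    with open(path, "r") as f:
        data = yaml.safe_load(f)

print(data)  # {'title': 'Dune', 'pages': 412}
```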

Have fun!