User:Baxter.brad/Drafts/YAML Tutorial and Style Guide

YAML Tutorial and Style Guide
This draft document refers to YAML version 1.1.

Calling this a tutorial and style guide is perhaps not entirely accurate, unless you view it as teaching by examples. But "Tutorial and Style Guide" is the motivation, and perhaps as sections are refined, that title will better apply.

In particular, this is not intended to be a rewrite of the specifications. Rather this is the author's attempt to explain YAML in his own words as a learning exercise. Where this document conflicts with the specifications, the specifications of course take precedence.

All of the examples were tested for correct parsing using the amazing online Haskell parsing tool, YPaste.

Comments, questions, corrections, suggestions are all welcome.

YAML Ain't Markup Language
http://yaml.org/ says, "YAML is a human friendly data serialization standard for all programming languages."

Data Serialization in Other Languages
Data serialization in Perl is most often done using Data::Dumper. JavaScript has JSON, which is used in other languages, too. YAML is designed to provide a general purpose standard for all languages.

Data::Dumper Pros, Cons, and Similarities
A pro for using Data::Dumper is simplicity: the value returned when you "eval" the serialized data is a native Perl data structure. Cons include security concerns. You are, after all, "eval"ing program code, which could contain any valid code, e.g., a command that deletes files. If the person performing the "eval" has permission to do that, then you might have a problem. Another con is that Data::Dumper serialized data is recognized only by Perl and not typically usable in other languages.

When a scalar, array, or hash is serialized using Data::Dumper, the result is similar to YAML's quoted scalar, flow sequence, and flow mapping representations.

    use Data::Dumper;

    print Dumper( [
        'fish',
        'fish',
        {
            'red'  => 'fish',
            'blue' => 'fish',
        }
    ] );

    __END__

    $VAR1 = [
              'fish',
              'fish',
              {
                'blue' => 'fish',
                'red' => 'fish'
              }
            ];

The example above shows a Perl program that uses Data::Dumper to serialize an array containing scalars and a hash. This illustrates that the serialized representation is in fact program code just like the code that created the structures natively.

With some careful configuration, Data::Dumper can emit valid YAML (not valid Perl).

    use Data::Dumper;
    $Data::Dumper::Terse  = 1;
    $Data::Dumper::Indent = 1;
    $Data::Dumper::Pair   = ' : ';

    print Dumper( [
        'fish',
        'fish',
        {
            'red'  => 'fish',
            'blue' => 'fish',
        }
    ] );

    __END__

    [
      'fish',
      'fish',
      {
        'blue' : 'fish',
        'red' : 'fish'
      }
    ]

This technique will not work for all scalar values.

JSON Pros, Cons, and Similarities
JSON uses a data parser to turn the serialized data into a native data structure. It is safer because the parser is not executing program code. It is usable by any programming language for which a parser is available. Arguably, you could say the same for Data::Dumper, but it appears that more languages have parsers for JSON than for corresponding Data::Dumper serialized data (i.e., data limited to scalars, arrays, and hashes). A con of JSON is that it lacks support for specialized data structures, like objects. (Typically, these are supported in Data::Dumper using "bless".)

Like the Data::Dumper example above, JSON is similar to YAML's quoted scalar, flow sequence, and flow mapping representations. So similar, in fact, that YAML is very nearly a superset of JSON: version 1.2 of the specification makes this official, while under version 1.1 a few edge cases differ. So for practical purposes, JSON is YAML.
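The safety argument above can be illustrated with a minimal sketch. It uses Python's standard json module (not Perl, and not a YAML library) purely to show the idea of a data parser: serialized data is turned into a native structure, and text that is really program code is rejected rather than executed. The variable names are made up for this illustration.

```python
import json

# Serialized data: the same fish structure used in the examples above.
serialized = '["fish", "fish", {"red": "fish", "blue": "fish"}]'

# A parser builds a native data structure; nothing is executed.
data = json.loads(serialized)
print(data[2]["red"])

# Text that is really program code is not valid data, so the parser
# refuses it instead of running it (unlike an eval).
rejected = False
try:
    json.loads('__import__("os").getcwd()')
except json.JSONDecodeError:
    rejected = True
print(rejected)
```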

YAML Pros, Cons, Similarities
Enter YAML. Like JSON, it requires a parser; it is safer by not executing program code; it is usable by many programming languages (any for which a parser and dumper have been written--including Perl). Further pros:
 * 1) JSON is YAML, so existing JSON data can be parsed as YAML;
 * 2) YAML supports specialized data structures using Tags;
 * 3) YAML is designed from the ground up to be very human readable--that's its primary goal.

Some cons of YAML are related to its goal to be very human readable. The specifications that accomplish this are complex. For documents that are edited by hand, people must learn and follow a number of rules to produce valid YAML data, though arguably, this is universally true of any data serialization language.

The following diagram shows a typical representation in these three data serialization languages.

    Data::Dumper (Perl code)     JSON                          YAML

    [                            [                             - fish
        'fish',                      "fish",                   - fish
        'fish',                      "fish",                   - red: fish
        {                            {                           blue: fish
            'red' => 'fish',             "red" : "fish",
            'blue' => 'fish',            "blue" : "fish"
        }                            }
    ]                            ]

YAML Documents and Streams
YAML serialized data is stored in YAML documents. A YAML document begins with "---" and ends either with another "---" (starting a new document) or with "...", e.g.,

    ---
    contents of YAML document 1
    ---
    contents of YAML document 2
    ...

(If there is only one serialized data structure present, the document begin marker, "---", and end marker, "...", are optional. If there are multiple documents, the end marker and the first begin marker are optional.)

A series of YAML documents, like the one above, is referred to as a YAML stream. (Technically even one document--or zero documents--can be considered a stream.)
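As a rough illustration of how the markers divide a stream, here is a minimal Python sketch. The function name split_stream is hypothetical, and this is not how a real YAML parser works: it only looks for "---" and "..." at the start of a line and ignores directives, comments, and everything else in the specifications.

```python
def split_stream(stream):
    """Naively split a YAML stream into the text of its documents."""
    documents = []
    current = None  # lines of the document being collected, if any
    for line in stream.splitlines():
        if line == "---" or line.startswith("--- "):
            # A begin marker also ends the previous document.
            if current is not None:
                documents.append("\n".join(current))
            rest = line[3:].strip()
            current = [rest] if rest else []
        elif line == "...":
            # An end marker closes the current document.
            if current is not None:
                documents.append("\n".join(current))
            current = None
        else:
            if current is None:
                if not line.strip():
                    continue  # blank line between documents
                current = []  # document without an explicit "---"
            current.append(line)
    if current is not None:
        documents.append("\n".join(current))
    return documents


stream = "---\ncontents of YAML document 1\n---\ncontents of YAML document 2\n...\n"
print(split_stream(stream))
```

Each string returned would then be handed to a parser, which expects to find exactly one "thing" in it.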

A YAML document contains only one "thing": a scalar value, a sequence (i.e., an array), or a mapping (i.e., a hash). This "thing" may be very large and complex, but the bottom line is: it's only one thing. (An analogous example is a Perl subroutine that returns a single reference and not a list of references.)

For example, the following is not a valid YAML document:

    # invalid document
    ---
    "thing 1"
    "thing 2"
    ...

The reason it isn't valid is that it contains two quoted scalars that are not part of a larger data structure. (Note that a YAML stream is not itself a serialized data structure; rather it is how YAML stores, separates, and presents individual serialized data structures.)

If you intended to present a list of two things, then you could make this document into a sequence, e.g.,

    ---
    - thing 1
    - thing 2
    ...

Note that the quotes have been removed (because they can be here) for improved readability. Note also that this is now one thing--a sequence--that contains two elements (scalars).

On the other hand, if you intended to present two separate things (two separate data structures), then you could turn that original (invalid) document into two documents by inserting a document marker, "---":

    ---
    thing 1
    ---
    thing 2
    ...

Note again that the quotes have been removed, because they can be here.

Quotes were necessary to produce the invalid document because the following document is valid:

    # not really two things
    ---
    thing 1
    thing 2
    ...

However, it does not contain two things; it contains one scalar whose value is "thing 1 thing 2" (because of line folding).

Scalar Values
Arguably, the bottom line of any data structure is a scalar value, e.g., a number: 1, 2, 3; a string: "hello, world"; a boolean value: true, false; a null value, etc. More complex structures are typically collections (sequences and mappings) of scalar values. Or they are collections of collections--but the bottom line is still the scalar at the end of a given path.

In YAML, scalar values can be presented a number of ways:

Flow Scalars
 * the Double-Quoted Style, e.g.,

    ---
    "Hello, World.\n"
    ...

 * the Single-Quoted Style, e.g.,

    ---
    'the quick brown fox jumped over the lazy dog'
    ...

 * the Plain Style (i.e., no quotes), e.g.,

    ---
    PI is 3.141592653589
    ...

Block Scalars
 * the Literal Style (also no quotes, but using the "block scalar header": "|"), e.g.,

    --- |
        $Data::Dumper::Pair = " : ";       # specify hash key/value separator
        print Dumper($boo);
    ...

 * and the Folded Style (also no quotes, but using the block scalar header: ">"), e.g.,

    --- >
        Four score and seven years ago our fathers brought forth on this
        continent a new nation, conceived in liberty and dedicated to the
        proposition that all men are created equal.

        Now we are engaged in a great civil war, testing whether that nation
        or any nation so conceived and so dedicated can long endure.
    ...

Line Folding
It is important to note that even though the previous two block style scalars consist of multiple lines, they each represent just one scalar value. In the literal style, the line breaks between lines are included in the scalar value. In the folded style, all of the line breaks--with a couple of exceptions--are line folded, i.e., each line break and any spaces immediately following (i.e., at the beginning of the next line) are "folded" into just one space.

The line breaks that are exceptions: blank lines between text and the final line break. These are treated slightly differently in line folding, and the result is that a line break will be included in the scalar value at these points. In addition, the final line break may be excluded from the scalar by adding the block chomping indicator "-" right after the block scalar header, i.e., "|-" or ">-".

The three flow styles (double quoted, single quoted, and plain) may also be written across multiple lines, i.e., they may flow from line to line. The line breaks here are treated like those in the block folded style: they are "folded" into a single space. So the examples above could be written this way:

    ---
    "Hello,
    World.\n"
    ---
    'the quick
           brown fox
      jumped over
                the lazy dog'
    ---
    PI is
    3.141592653589
    ...

These are all equivalent to the previous examples, even though the second one above contains a lot of extra spaces. Line folding turns a line break and any spaces immediately following a line break into a single space.
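The folding rule described above can be sketched in a few lines of Python. The function name fold_flow_lines is made up for this illustration, and the sketch treats its input as the raw text of a plain multi-line scalar; it does not handle quoting or escape sequences.

```python
def fold_flow_lines(text):
    """Fold the lines of a multi-line flow scalar into one value.

    A line break, together with the white space around it, becomes a
    single space. A run of blank lines becomes that many line breaks.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines:
        return ""
    value = lines[0]
    blank_run = 0
    for line in lines[1:]:
        if line == "":
            blank_run += 1
            continue
        value += "\n" * blank_run if blank_run else " "
        value += line
        blank_run = 0
    return value


folded = fold_flow_lines(
    "the quick\n       brown fox\n  jumped over\n            the lazy dog"
)
print(folded)
```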

Unlike with the block style scalars, the final line break at the end of a plain scalar will not be included in the scalar value.

Indentation
A key concept to keep in mind is this: YAML pays very close attention to indentation.

In the literal and folded scalar styles above, the indentation of the first line of text is a declaration to YAML that those indentation spaces--and any similar indentations of subsequent lines--should not be included in the scalar value.

For example, the following three documents are equivalent:

    --- |
    This is a test
    This is a test
    --- |
            This is a test
            This is a test
    --- "This is a test\nThis is a test\n"
    ...

The indentation in the second document is not considered to be part of the scalar data.

Similarly, the following documents describe the same scalar value:

    --- >
    Now is the time for all good men
    to come to the aid of their party.
    --- >
                  Now is the time for all good men
                  to come to the aid of their party.
    --- "Now is the time for all good men to come to the aid of their party.\n"
    ...

As stated above, YAML uses the indentation of the first line to determine the indentation level to look for in following lines.

Under-indented following lines are an error, e.g.,

    # invalid document
    --- >
          clowns to the left of me
       jokers to the right
    ...

Over-indented following lines cause extra spaces to be included in the scalar. For example, the following documents result in the same scalar value:

    --- >
      here I am
          stuck in the middle with you
    --- "here I am\n    stuck in the middle with you\n"
    ...

Indentation is still playing a large part in this example. The spaces in front of "stuck" in the second document do include the four extra spaces to the left of "stuck" in the first one, but do not include the indentation spaces to the left of that. These indentation spaces match the ones YAML recognized in the first line. Note also that the line break following "here I am" is not folded into a space: a line that is indented more than the first line keeps the line break in front of it.

Indentation is also key in the block collection styles (i.e., for sequences and mappings) discussed below.
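The way the first line sets the indentation level for a block scalar can be sketched as follows. This is only an illustration for the literal style: the function name literal_block_value is hypothetical, the input is assumed to be the lines that follow the "|" header, and chomping indicators and explicit indentation indicators are ignored.

```python
def literal_block_value(block):
    """Return the value of a literal block scalar, given its body lines."""
    lines = block.splitlines()
    # Drop trailing blank lines; only one final line break is kept.
    while lines and not lines[-1].strip():
        lines.pop()
    if not lines:
        return ""
    # The first non-blank line declares the indentation level.
    first = next(line for line in lines if line.strip())
    indent = len(first) - len(first.lstrip(" "))
    values = []
    for line in lines:
        if not line.strip():
            values.append("")
            continue
        if len(line) - len(line.lstrip(" ")) < indent:
            raise ValueError("under-indented line: %r" % line)
        # Indentation spaces are removed; any extra spaces are data.
        values.append(line[indent:])
    return "\n".join(values) + "\n"


print(repr(literal_block_value("        This is a test\n        This is a test\n")))
```

An over-indented line keeps its extra spaces, and an under-indented line raises an error, mirroring the behavior described above.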

Using Quotes
There are two scalar styles that use quotes, the double quoted style and the single quoted style. None of the other scalar styles, plain, literal, or folded, allow you to put quotes around the data.

Double Quotes
You can always use the double quoted style. It might prove awkward for some values, but the bottom line is: you can always use the double quoted style.

If you do use the double quoted style, you have to remember three rules:


 * 1) A double quote that is part of the data must be escaped as \" (a backslash followed by a double quote).
 * 2) A backslash that is part of the data must be escaped as \\ (two backslashes).
 * 3) Unprintable unicode characters must be escaped using an appropriate style, e.g., "\x1F", "\udfff", or "\U0000ffff".

For example:

    ---
    "He says, \"The rules are:
      1) a double quote must be escaped as \\\", and
      2) a backslash must be escaped as \\\\.
      3) an unprintable character must be escaped as \\x1F, \\udfff, or \\U0000ffff\""
    ...

This adds a lot of punctuation and decreases readability, and that's why YAML provides so many other ways to represent scalars.
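The three rules can be sketched as a small Python function that writes any string as a double quoted scalar. The name double_quote is made up for this illustration, and the choice to also write line breaks and tabs as "\n" and "\t" is an assumption made for readability; a real YAML dumper makes more careful decisions.

```python
def double_quote(value):
    """Write a string as a YAML double quoted scalar."""
    pieces = []
    for ch in value:
        code = ord(ch)
        if ch == "\\":
            pieces.append("\\\\")  # rule 2: escape the backslash
        elif ch == '"':
            pieces.append('\\"')   # rule 1: escape the double quote
        elif ch == "\n":
            pieces.append("\\n")
        elif ch == "\t":
            pieces.append("\\t")
        elif ch.isprintable():
            pieces.append(ch)
        elif code <= 0xFF:
            pieces.append("\\x%02X" % code)  # rule 3: unprintable
        elif code <= 0xFFFF:
            pieces.append("\\u%04X" % code)
        else:
            pieces.append("\\U%08X" % code)
    return '"' + "".join(pieces) + '"'


print(double_quote("Hello, World.\n"))
```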

You must use a double quoted scalar whenever you want to, or have to, use escaped characters, like "\n", "\x1F", "\udfff", or "\U0000ffff".

For example, you might want to use "\n" to make a YAML document that contains no line breaks, and you might have to use other escaped characters if your data contains control or other characters that are unprintable.

Single Quotes
Anywhere that you can use a plain style scalar, you can also put it in single quotes. And you can use single quotes for occasions when you can't use a plain scalar and you don't need double quotes, e.g., if your data has leading or trailing spaces.

If you use the single quoted style, you have to remember one rule:


 * 1) A single quote that is part of the data must be escaped as "''" (two single quotes).

For example:

    ---
    'He says, "The rule is:
      1) a single quote must be escaped as ''''."'
    ...

Another thing to remember: you cannot use any of the escaped characters that are available only in the double quoted style. If you happen to put "\n" in a single quoted scalar, it would simply represent two characters, "\" and "n"; it would not represent the one line break character.
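The single quoted rule is simple enough to sketch in a few lines of Python. The function name single_quote is hypothetical; the sketch refuses values with line breaks or other unprintable characters, since those call for the double quoted style.

```python
def single_quote(value):
    """Write a string as a YAML single quoted scalar."""
    if any(not ch.isprintable() for ch in value):
        raise ValueError("use the double quoted style for %r" % value)
    # The only rule: a single quote in the data is doubled.
    return "'" + value.replace("'", "''") + "'"


print(single_quote("Hanna Reitsch: Hitler's Female Test Pilot"))
# A backslash is just a backslash here; it does not start an escape.
print(single_quote("\\n"))
```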

Quotes or No Quotes
For readability, unquoted scalars excel. But as noted above, there are times when quotes must be used. This section attempts to outline some rules of thumb for deciding when to use quotes.


 * double quoted
    * leading or trailing blanks
    * optional escape characters, like line breaks
    * mandatory escape characters (unprintable)
    * data contains special characters not allowed in plain scalars
 * single quoted
    * leading or trailing blanks
    * data does not contain line breaks
    * data contains special characters not allowed in plain scalars
 * plain (no quotes)
    * short lines with no special characters and no line breaks
 * literal (no quotes)
    * multiple lines that need to be left as they are, including line breaks
 * folded (no quotes)
    * long lines that need to be formatted for readability, possibly including line breaks

Following are examples showing which special characters are not allowed in plain scalars and might warrant using quotes.

 * Data begins with an indicator character, e.g., "[", "{", ",", or "#". These would be interpreted by YAML as something other than a scalar.

    ---
    - '[ed. Hartford]'
    - ''
    - ',,,,,,,,,'
    - '#CCC'
    ...

 * Data contains ": ", which would make it look like a key/value pair.

    ---
    - '1998 : New York, N.Y.'
    - "Hanna Reitsch: Hitler's Female Test Pilot"
    ...
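These rules of thumb can be collected into a rough Python check. The function name needs_quotes is made up for this sketch, and it is a heuristic only: it looks at the indicator characters listed in the YAML specification, white space, and line breaks, and it does not consider values such as numbers or booleans that a parser might resolve to something other than a string.

```python
# Characters that the YAML specification lists as indicators.
INDICATORS = set("-?:,[]{}#&*!|>'\"%@`")


def needs_quotes(value):
    """Guess whether a string is unsafe to write as a plain scalar."""
    if value == "" or value != value.strip():
        return True  # empty, or leading/trailing blanks
    if "\n" in value:
        return True  # line breaks call for another style
    first = value[0]
    if first in INDICATORS:
        # "-", "?" and ":" are harmless when a non-space follows them.
        harmless = first in "-?:" and len(value) > 1 and value[1] != " "
        if not harmless:
            return True
    # ": " looks like a key/value pair and " #" starts a comment.
    return ": " in value or " #" in value or value.endswith(":")


for sample in ["apple", "-1", "#CCC", "[ed. Hartford]", "1998 : New York, N.Y."]:
    print(sample, needs_quotes(sample))
```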

Sequences
A sequence (i.e., "list", "array") is an ordered collection of elements. Each element in a sequence may be a scalar, another sequence, or a mapping. When a sequence is parsed into a native data structure, each element is numbered, usually starting at zero, i.e., 0, 1, 2, 3, etc.

In YAML, sequences may be represented two ways:

 * as Flow Sequences, e.g.,

    ---
    [ apple, orange, peach, tomato ]
    ...

 * or as Block Sequences

    ---
    - apple
    - orange
    - peach
    - tomato
    ...

In the examples above (which are equivalent), the sequences contain four scalar elements. Those scalars are represented using the plain scalar style.

Flow Sequences
A flow sequence begins with "[" and ends with "]". The elements in a flow sequence are separated with ",". An ending "," is allowed, so flow sequences like the following are easier to maintain.

    ---
    [
      apple,
      orange,
      peach,
      tomato,
    ]
    ...

Note that this trailing comma expressly does not indicate an extra null element at the end--this sequence is equivalent to the previous two.

Note also that a flow sequence may "flow" from one line to the next; line breaks and spaces between the structure indicators are not included in the scalar values.

Block Sequences
A block sequence begins at the first occurrence of "- ", i.e., a hyphen followed by a space. The space is significant; a hyphen that is not followed by a space is considered to be (most likely) part of a plain scalar, e.g.,

    ---
    -1
    ...

The above document does not contain a sequence with one element: "1". Rather it contains one scalar value: "-1".

A block sequence continues with each occurrence of "- " that is at the same indentation level as the first one. The above block sequence could have been written as

    ---
      - apple
      - orange
      - peach
      - tomato
    ...

If you decrease the indentation level of an element, it is an error, e.g.,

    # invalid document
    ---
      - apple
      - orange
    - peach
    - tomato
    ...

If you increase the indentation level of an element, you may find that the result is very different than you intended. Above we saw that a plain scalar may be broken across lines for readability. If we were to increase indentation levels in the above document like this:

    ---
      - apple
      - orange
        - peach
        - tomato
    ...

it would not cause "peach" and "tomato" to somehow become a child sequence under "orange". Instead, the over-indented lines are considered continuation lines of the plain scalar that started with the word "orange", so the document would become equivalent to:

    ---
      - apple
      - "orange - peach - tomato"
    ...

Even though the above document is valid YAML, over-indentations like this are clearly not representing the intended data, so should just be considered errors. Rule of thumb: be careful to line up all the hyphens in a block sequence.

Mappings
A mapping (i.e., "hash", "hash table", "dictionary", "associative array") is a named collection of elements, or a collection of key/value pairs, where the keys are unique within the mapping. Each key and each value in a mapping may be a scalar, a sequence, or another mapping.

In YAML, mappings may be represented in two ways:

 * as Flow Mappings, e.g.,

    ---
    { apple: fruit, orange: fruit, peach: fruit, tomato: vegetable }
    ...

 * or as Block Mappings

    ---
    apple: fruit
    orange: fruit
    peach: fruit
    tomato: vegetable
    ...

In the examples above (which are equivalent), the mappings contain four key/value pairs whose keys and values are scalars. Those scalars are represented using the plain scalar style.

Flow Mappings
A flow mapping begins with "{" and ends with "}". In each key/value pair, the key is separated from the value by ": ", i.e., a colon followed by a space. The space after the colon is normally required. Spaces between the key and the colon are allowed. But note that if the key is in quotes (like JSON), the space after the colon is optional, e.g.,

    ---
    { 'apple' :fruit, 'orange' :fruit, 'peach' :fruit, 'tomato' :vegetable }
    ...

The key/value pairs in a flow mapping are separated with ",". An ending "," is allowed, so flow mappings like the following are easier to maintain:

    ---
    {
      apple:  fruit,
      orange: fruit,
      peach:  fruit,
      tomato: vegetable,
    }
    ...

Note that this trailing comma expressly does not indicate an extra null key/value pair at the end--this mapping is equivalent to the previous ones.

Block Mappings
A block mapping begins typically at the first occurrence of a scalar followed by ": ", i.e., a colon followed by a space. The space after the colon is required. Spaces between the key and the colon are allowed, e.g.,

    ---
    apple  : fruit
    orange : fruit
    peach  : fruit
    tomato : vegetable
    ...

Note that unlike in a flow mapping, the space after the colon is always required in a block mapping, even if the key is in quotes.

A block mapping continues with each key/value pair that is at the same indentation level as the first pair. The above block mapping could have been written as

    ---
      apple: fruit
      orange: fruit
      peach: fruit
      tomato: vegetable
    ...

If you decrease the indentation level of a key/value pair, it is an error, e.g.,

    # invalid document
    ---
      apple: fruit
      orange: fruit
    peach: fruit
    tomato: vegetable
    ...

If you increase the indentation level of a key/value pair, it is also likely an error, e.g.,

    # invalid document
    ---
      apple: fruit
      orange: fruit
        peach: fruit
        tomato: vegetable
    ...

Key/Value Indicators
Note that technically speaking, the ": " between a key and its corresponding value is really an indicator that precedes a value in a key/value pair. There is a similar indicator for keys: "? ", a question mark followed by a space.

Because YAML typically figures out where a key is from the location of the value indicator ": ", you typically will not see key indicators in a document. But it is perfectly okay to include them, and in some cases, e.g., a long or complicated key, you must include this indicator to help YAML recognize the key.

For example, the above flow representations could be written as

    ---
    { ? apple: fruit, ? orange: fruit, ? peach: fruit, ? tomato: vegetable }
    ---
    {
      ? apple  : fruit,
      ? orange : fruit,
      ? peach  : fruit,
      ? tomato : vegetable,
    }
    ...

In a block mapping, if you use the key indicators, the value indicators must be put on a new line (and they must line up with the key indicators), e.g.,

    ---
    ? apple
    : fruit
    ? orange
    : fruit
    ? peach
    : fruit
    ? tomato
    : vegetable
    ...

Since spaces are allowed between the indicators and the scalar values (or possibly collections instead of just scalars), you might use extra spaces to help the values stand out from the keys, e.g.,

    ---
    ? apple
    :     fruit
    ? orange
    :     fruit
    ? peach
    :     fruit
    ? tomato
    :     vegetable
    ...

Comments
In YAML, comments begin with the "#" comment indicator and end at the end of the line they start on. Comments must be separated from other content by white space, which means they typically start on a line of their own, or there is a space in front of the "#" comment indicator.

Otherwise, comments may appear almost anywhere. Since comments are not associated with a particular node, they typically don't have to adhere to the usual rules about indentation. The following examples show where comments may appear in the data structures covered so far. Clearly, these examples couldn't be considered style guides. Good style probably suggests keeping comments to a minimum, or at least going easy with them.

Scalars
    # quoted scalars
    --- # double quoted
    "Hello, World.\n" # comment okay after quoted scalar
    # comment
    # comment
    --- # single quoted
    'the quick brown fox jumped over the lazy dog' # again, comment okay
    # comment
    # comment
    --- # plain scalars
    PI is 3.141592653589 # comment okay after plain scalar
    # comment
    # comment
    --- # block literal style scalar
    # comment
    | # comment
        # NOT a comment (expressly by the specs)
        $Data::Dumper::Pair = " : ";   # NOT a comment
        print Dumper($boo);
        # NOT a comment (indented like above)
      # comment (indented less than above)
    # comment (indented less than above)
    --- # block folded style scalar
    # comment
    > # comment
    Now is the time for all good men
    to come to the aid of their party.
    # NOT a comment
    # NOT a comment (same indentation level as scalar above)
    ...

There are several places above where values that contain "#", and might at first glance appear to be comments, are not comments. These examples illustrate that putting comments around block literal and block folded scalars can be a problem if you don't get the indentations right. A rule of thumb: it is safer to put comments above these scalars than below (comments may never appear to the right of block style scalars).

Sequences

    # comments around sequences
    --- # single-line flow sequence
    [ apple, orange, peach, tomato ] #comment
    # comment
    # comment
    --- # multiline flow sequence
    [ # comment
    # comment
    # comment
      apple,  # comment
      orange, # comment
      peach,  # comment
      tomato, # comment
    # comment
    ] # comment
    # comment
    # comment
    --- # block sequence
    # comment
    - apple    # comment
    - 'orange' # comment
    - peach    # comment
    - tomato
    # comment
    ...

Mappings
    # comments around mappings
    --- # single line flow mapping
    { apple: fruit, orange: fruit, peach: fruit, tomato: vegetable } # comment
    # comment
    # comment
    --- # multiline flow mapping
    { # comment
    # comment
    # comment
      apple: fruit,      # comment
      orange: fruit,     # comment
      peach: fruit,      # comment
      tomato: vegetable, # comment
    } # comment
    # comment
    # comment
    --- # block mapping
    # comment
    apple: fruit    # comment
    orange: 'fruit' # comment
    peach: fruit    # comment
    tomato: vegetable
    # comment
    ...

Mappings of Mappings
Below is a mapping of mappings. The top-level keys, Apple, Orange, Peach, Tomato, are lined up with each other. The indentation of Kingdom, Division, etc. below each top-level key serves to group the key/value pairs in each child mapping. While it is arguably good style to keep indentation consistent for all the mappings, varying this from mapping to mapping does not invalidate the data (note that the mapping below Tomato is indented slightly less). As long as the keys of each separate mapping are lined up in that mapping, YAML will group them correctly.

It may not be readily visible, but this example illustrates an acceptable use of tabs as white space. Tabs may not be used to indent content (for the purpose of establishing structure), but may be used in other places where whitespace is allowed. Below, there is a tab between each child mapping key and its value, e.g., between "Kingdom:" and "Plantae".

As a matter of style, using tabs can be problematic, since their alignment can depend on how the data is being viewed. For example, the child mapping under Tomato is indented less than the other child mappings because the longer key, Subkingdom, caused the value, Tracheobionta, not to align nicely (at least in a typical browser). In practice, using spaces instead of tabs is usually preferable.

    ---
    Apple:
        Kingdom: 	Plantae
        Division: 	Magnoliophyta
        Class: 	Magnoliopsida
        Order: 	Rosales
        Family: 	Rosaceae
        Subfamily: 	Maloideae
        Genus: 	Malus
        Species: 	M. domestica
    Orange:
        Kingdom: 	Plantae
        Division: 	Magnoliophyta
        Class: 	Magnoliopsida
        Subclass: 	Rosidae
        Order: 	Sapindales
        Family: 	Rutaceae
        Genus: 	Citrus
        Species: 	C. sinensis
    Peach:
        Kingdom: 	Plantae
        Division: 	Magnoliophyta
        Class: 	Magnoliopsida
        Order: 	Rosales
        Family: 	Rosaceae
        Genus: 	Prunus
        Subgenus: 	Amygdalus
        Species: 	P. persica
    Tomato:
      Kingdom: 	Plantae
      Subkingdom: 	Tracheobionta
      Division: 	Magnoliophyta
      Class: 	Magnoliopsida
      Subclass: 	Asteridae
      Order: 	Solanales
      Family: 	Solanaceae
      Genus: 	Solanum
      Species: 	S. lycopersicum
    ...

Mappings of Sequences
The following example reorganizes the data above into a slightly different configuration. You might do this to keep the child fields in order (and reduce file size).

Note that the keys in the mapping are lined up, as are the hyphens that mark each element in the sequences.

Note also the "bare" hyphens with no values. These connote empty fields, and are needed as placeholders for when the Subkingdom field doesn't apply.

    ---
    Legend:
        - Kingdom
        - Subkingdom
        - Division
        - Class
        - Order
        - Family
        - Subfamily
        - Genus
        - Species
    Apple:
        - Plantae
        -
        - Magnoliophyta
        - Magnoliopsida
        - Rosales
        - Rosaceae
        - Maloideae
        - Malus
        - M. domestica
    Orange:
        - Plantae
        -
        - Magnoliophyta
        - Magnoliopsida
        - Rosidae
        - Sapindales
        - Rutaceae
        - Citrus
        - C. sinensis
    Peach:
        - Plantae
        -
        - Magnoliophyta
        - Magnoliopsida
        - Rosales
        - Rosaceae
        - Prunus
        - Amygdalus
        - P. persica
    Tomato:
        - Plantae
        - Tracheobionta
        - Magnoliophyta
        - Magnoliopsida
        - Asteridae
        - Solanales
        - Solanaceae
        - Solanum
        - S. lycopersicum
    ...

Sequences of Mappings
The following example is a sequence of mappings. In fact, it is a sequence of mappings of sequences of mappings. Described that way, it sounds complicated. But presented as YAML data, with everything in nice vertical alignment, it is quite easy to follow.

    # some olympic athletes
    ---
      - Name:   Alajos Szokolyi
        Record:
          - Medal:    Bronze
            Olympics: 1896 Athens
            Event:    100 metres

      - Name:   Arthur Blake
        Record:
          - Medal:    Silver
            Olympics: 1896 Athens
            Event:    1500 metres

      - Name:   Tom Burke
        Record:
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    100 metres
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    400 metres

      - Name:   Ellery Clark
        Record:
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    High jump
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    Long jump

      - Name:   James Connolly
        Record:
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    Triple jump
          - Medal:    Silver
            Olympics: 1896 Athens
            Event:    High jump
          - Medal:    Bronze
            Olympics: 1896 Athens
            Event:    Long jump
          - Medal:    Silver
            Olympics: 1900 Paris
            Event:    Triple jump

      - Name:   Thomas Curtis
        Record:
          - Medal:    Gold
            Olympics: 1896 Athens
            Event:    110 metre hurdles

      - Name:   Kharilaos Vasilakos
        Record:
          - Medal:    Silver
            Olympics: 1896 Athens
            Event:    Marathon

      - Name:   Sotirios Versis
        Record:
          - Medal:    Bronze
            Olympics: 1896 Athens
            Event:    Discus throw
          - Medal:    Bronze
            Olympics: 1896 Athens
            Event:    Two hand lift
    ...

Sequences of Sequences
The following example shows another possible way to list our fruit: as a sequence of sequences. The first element lists the field names that apply to the data in the remaining elements, not unlike a CSV file. Note that the hyphens for the top-level sequence are lined up, as are those for the child sequences. Extra spacing illustrates that YAML allows this as needed to improve readability.

Note also the "bare" hyphens with no values. As they did above, these connote empty fields, and are needed as placeholders for when the Subkingdom field doesn't apply.

    ---
    -  - Name
       - Kingdom
       - Subkingdom
       - Division
       - Class
       - Order
       - Family
       - Subfamily
       - Genus
       - Species

    -  - Apple
       - Plantae
       -
       - Magnoliophyta
       - Magnoliopsida
       - Rosales
       - Rosaceae
       - Maloideae
       - Malus
       - M. domestica

    -  - Orange
       - Plantae
       -
       - Magnoliophyta
       - Magnoliopsida
       - Rosidae
       - Sapindales
       - Rutaceae
       - Citrus
       - C. sinensis

    -  - Peach
       - Plantae
       -
       - Magnoliophyta
       - Magnoliopsida
       - Rosales
       - Rosaceae
       - Prunus
       - Amygdalus
       - P. persica

    -  - Tomato
       - Plantae
       - Tracheobionta
       - Magnoliophyta
       - Magnoliopsida
       - Asteridae
       - Solanales
       - Solanaceae
       - Solanum
       - S. lycopersicum
    ...

A more compact representation might use flow sequences:

    ---
    - [ Name,   Kingdom, Subkingdom,    Division,      Class,         Order,     Family,     Subfamily,  Genus,     Species         ]
    - [ Apple,  Plantae, !!null "",     Magnoliophyta, Magnoliopsida, Rosales,   Rosaceae,   Maloideae,  Malus,     M. domestica    ]
    - [ Orange, Plantae, ~,             Magnoliophyta, Magnoliopsida, Rosidae,   Sapindales, Rutaceae,   Citrus,    C. sinensis     ]
    - [ Peach,  Plantae, ~,             Magnoliophyta, Magnoliopsida, Rosales,   Rosaceae,   Prunus,     Amygdalus, P. persica      ]
    - [ Tomato, Plantae, Tracheobionta, Magnoliophyta, Magnoliopsida, Asteridae, Solanales,  Solanaceae, Solanum,   S. lycopersicum ]
    ...

An advantage of this approach is being able to line the data up into columns. A disadvantage is needing to use "!!null """ to explicitly denote an empty field. (Most YAML implementations allow writing an unquoted "~" instead.)
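Once a sequence of sequences like this has been parsed, the header row can be paired with each data row, much as one would with a CSV file. The following Python sketch assumes the parsed result is a list of lists (abbreviated here to five of the columns, with None standing in for the empty field); the function name rows_to_records is made up for the illustration.

```python
# What a parser might hand back for an abbreviated version of the
# document above: a list of lists, with None for the empty field.
rows = [
    ["Name", "Kingdom", "Subkingdom", "Genus", "Species"],
    ["Apple", "Plantae", None, "Malus", "M. domestica"],
    ["Tomato", "Plantae", "Tracheobionta", "Solanum", "S. lycopersicum"],
]


def rows_to_records(rows):
    """Pair the field names in the first row with every later row."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


records = rows_to_records(rows)
print(records[0]["Genus"])
print(records[1]["Subkingdom"])
```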

Other Combinations
Below is a single element in a sequence from a hypothetical album catalog.

    ---
    - Title: Come Upstairs, Studio album by Carly Simon
      Artist:
        Name: Carly Simon
        Chronology:
          - Spy (1979)
          - Come Upstairs (1980)
          - Torch (1981)
      Released:  1980-06-16
      Recorded:  at Power Station Studios, New York City
      Genre:     Rock
      Length:    39:23
      Label:     Warner Bros.
      Producer:  Mike Mainieri
      Professional reviews: >
        Come Upstairs is singer-songwriter Carly Simon's 10th album, and
        ninth studio album, released in 1980. It was the first of her three
        albums for Warner Bros. Records and it featured a harder,
        rock-oriented sound than her previous albums. Whereas those earlier
        records were prime examples of the singer-songwriter genre, with
        soft-rocking arrangements primarily built around piano and/or
        acoustic guitar accompaniment, Come Upstairs featured electric
        guitars and synthesizers prominently. In the vernacular of
        the time, Come Upstairs was a "new wave" album, and it followed the
        lead of new bands such as The Knack and Talking Heads. The album was
        generally well received, and many critics thought that Simon proved
        herself more than capable of this artistic leap.

        The first single released from the album was "Jesse", an acoustic
        ballad that was more a throwback to Simon's earlier work rather than
        an example of her new abilities. "Jesse" proved to be a major hit,
        staying on the Billboard singles charts for 6 months and achieving
        gold status (sales of more than 1,000,000 copies), as well as
        reaching #5 in
        Australia and being her biggest hit there since "You're So Vain".
        Fans apparently chose to buy the single rather than the album, which
        did not "go gold" and quickly went out of print. It was one of the
        last of Simon's early albums to be released on CD.
      Track listing:
        - 1. "Come Upstairs" (C. Simon/M. Mainieri) — 4:18
        - 2. "Stardust" (C. Simon/M. Mainieri) — 4:13
        - 3. "Them" (C. Simon/M. Mainieri) — 3:44
        - 4. "Jesse" (C. Simon/M. Mainieri) — 4:15
        - 5. "James" (C. Simon/M. Mainieri) — 2:28
        - 6. "In Pain" (C. Simon/D. Grolnick/M. Mainieri) — 6:10
        - 7. "The Three Of Us In The Dark" (C. Simon/M. Mainieri) — 4:14
        - 8. "Take Me As I Am" (C. Simon/M. Mainieri/S. McGinnis) — 4:50
        - 9. "The Desert" (C. Simon/M. Mainieri) — 4:44
      Cover:
        MIME: image/jpeg
        DATA: !!binary |
          /9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKCgkJChQODwwQFxQYGBcU
          FhYaHSUfGhsjHBYWICwgIyYnKSopGR8tMC0oMCUoKSj/2wBDAQcHBwoIChMKChMoGhYaKCgoKCgo
          KCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCj/wAARCADFAMgDASIA
          AhEBAxEB/8QAHAAAAQUBAQEAAAAAAAAAAAAABAACAwUGBwEI/8QAPRAAAgECBQIEAwYDCAMAAwAA
          AQIDBBEABRIhMRNBBiJRYRRxgSMykaGxwQcVQiQzUmJy0eHwFkOCktLx/8QAGQEAAwEBAQAAAAAA
          ...
          k5RV/Rt8sqHqKJACUJLLcfTti4oY41y1GeNZFbyurC4Ye+FhYPEk+zcmoujP1vgShqZ5KzK6qpyu
          WTdkgIMZb10n9iMc7/mWcrnoyp8yBVTvKsOliNvfY4WFhrit/wCDynJJbLnMMjiqoZmjkMTQB/MR
          qZ/Jfc39b74wtRTiKcxFixi8uri5He30wsLFXitvHs5/nJLJojDF1Vtr7Lh7uxcOrFWCgDTtYemF
          hYzN+x7xm1j0MLOU6Zkfp33W+xw2SEBmBYkjvhYWPRSG8m7s80AG/cfviWODUR5u3phYWNpAxbLK
          soekkUKuuiysbLuxIvc79uAMRRq7KkUjiSO+yONh6cHnCwsbwjxug5SaboGni0sbEetrYWFhYOIl
          n//Z
    ...