What is Archie Markup Language?

ArchieML (or "AML") was created at The New York Times to make it easier to write and edit structured text on deadline that could be rendered in web pages, or more specifically, rendered in interactive graphics.

One of the main goals was to make it easy to tag text as data without having to type a lot of special characters. Another goal was to allow the document to contain lots of notes and draft text that would not be read into the data. And finally, we make extensive use of Google Docs' concurrent-editing features: while working on a graphic, we can have several reporters, editors and developers all pouring information into a single document. So we wanted a format that could survive being edited by users who may never have seen ArchieML or any other markup language before.


Why not YAML? Or JSON?

ArchieML differs from other popular formats like YAML and JSON in several areas that we've found are key to making it easy to use:


How Does It Work in Practice?

For a very simple example, here's a screenshot of the Google Doc that powers a recent graphic about the trick plays used by the New England Patriots and Seattle Seahawks:

To generate the graphic, we load the ArchieML data from the document using the archieml-js npm module, then pass it to an underscore template to render the final markup server-side. This lets the journalists who are focusing on the text and content concentrate on getting the copy in shape independently of the developers working on the graphic.

This is a very simple example, with only a few bits of text and data and one comment at the end that is ignored. When we're covering a breaking news story, though, we can have a half-dozen people contributing to a Google Doc at the same time as we gather the information we need for a graphic and turn it into the final copy blocks that make their way into the finished piece.

Resources

Parsers and tools in (hopefully) your language of choice.

Integrating with Google Documents

At The New York Times, we normally write ArchieML in Google Documents. Both parsers include quick-start examples for how to download text from Google Docs and run it through the parser. They also include some formatting steps we take, such as converting links to HTML tags.
Examples:

For more fully-fledged integrations with Google Docs, use one of the plugins above.


Introductory Demo

Click on any ArchieML textarea to try it out yourself, and see how changes affect the output.

Or try out ArchieML in the Sandbox.

Keys and values

Strings can be stored as part of key/value pairs, defined whenever a line in ArchieML begins with a token followed by a colon. Keys can contain any Unicode character, with the exception of whitespace and invisible characters, and a handful of characters that are used within ArchieML ({ } [ ] : . +). The rest of the line is the value.

    key: This is a value
    ☃: Unicode Snowman for you and you and you!

Whitespace surrounding keys and values is ignored. Indent as you like. Keys are case sensitive.

    1: value
    2:value
    3 : value
    4: value
    5: value

    a: lowercase a
    A: uppercase A

Lines that don't look like keys or other special commands are ignored:

    This is a key:

    key: value

    It's a nice key!
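The key/value rules above can be sketched with a single regular expression. This is a minimal illustration in Python, not the code of either official parser: the pattern `KEY_LINE` and the function `parse_key_values` are hypothetical names, and the character class is our reading of the prose above (dots are kept as part of the key here).

```python
import re

# Assumption: a key is any run of characters other than whitespace and
# { } [ ] : +, followed by optional spaces and a colon.
KEY_LINE = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")


def parse_key_values(text):
    """Collect key/value pairs; lines that do not look like keys are skipped."""
    data = {}
    for line in text.splitlines():
        match = KEY_LINE.match(line)
        if match:
            key, value = match.groups()
            data[key] = value  # surrounding whitespace is already trimmed
    return data


sample = """
This is a key:
  key: value
It's a nice key!
a: lowercase a
A: uppercase A
3 : value
"""
print(parse_key_values(sample))
```

"This is a key:" is skipped because the text before the colon contains spaces, so it is not a single token.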

Nested key structure

Use dot-notation to create nested objects.

    colors.red: #f00
    colors.green: #0f0
    colors.blue: #00f
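One way to picture dot-notation is as a path into nested dictionaries. The sketch below is illustrative Python; `set_path` is a hypothetical helper, and replacing a non-object value that sits in the middle of a path is an assumption of this sketch rather than documented behavior.

```python
def set_path(data, dotted_key, value):
    """Store value under a dotted key, creating intermediate dicts as needed."""
    *parents, last = dotted_key.split(".")
    node = data
    for name in parents:
        # Assumption: a non-dict value already stored at this name is replaced.
        if not isinstance(node.get(name), dict):
            node[name] = {}
        node = node[name]
    node[last] = value
    return data


colors = {}
for line in ["colors.red: #f00", "colors.green: #0f0", "colors.blue: #00f"]:
    key, value = line.split(":", 1)
    set_path(colors, key.strip(), value.strip())
print(colors)
```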

You can also use "object" blocks to namespace a group of keys.

    {colors}
    red: #f00
    green: #0f0
    blue: #00f
    {}

    {numbers}
    one: 1
    ten: 10
    one-hundred: 100
    {}

    key: value

Dot notation can be used in object blocks as well:

    {colors.reds}
    crimson: #dc143c
    darkred: #8b0000
    {}

    {colors.blues}
    cornflowerblue: #6495ed
    darkblue: #00008b
    {}

You can close an object by beginning a line with {}. ArchieML is parsed one line at a time, so you can also close an object by opening a new one.

    {colors}
    red: #f00
    green: #0f0
    blue: #00f

    {numbers}
    one: 1
    ten: 10
    one-hundred: 100
    {}

    key: value

Nested object blocks

Object blocks whose names are prefixed with a period nest inside the currently open object instead of ending it. Beginning a line with {} closes a nested object and returns to the parent.

    {colors}
    red: #f00
    green: #0f0
    blue: #00f

    {.numbers}
    one: 1
    ten: 10
    one-hundred: 100
    {}

    nestedKey: nestedValue

    {months}
    january: 0
    february: 1
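The object-block rules above (open with {name}, nest with {.name}, close with {}) can be modeled with a stack of open scopes. This is a rough Python sketch under our own assumptions, not the official parser: `parse_objects`, `descend`, `SCOPE` and `KEY` are hypothetical names, and arrays, multi-line values and comments are left out.

```python
import re

SCOPE = re.compile(r"^\s*\{\s*(\.?[^\s{}\[\]:+]*)\s*\}")
KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")


def descend(data, path):
    """Walk (and create) nested dicts along a list of names."""
    node = data
    for name in path:
        node = node.setdefault(name, {})
    return node


def parse_objects(text):
    data = {}
    stack = []  # one path (list of names) per open object block
    for line in text.splitlines():
        scope = SCOPE.match(line)
        if scope:
            name = scope.group(1)
            if name == "":
                if stack:
                    stack.pop()  # {} closes the innermost block
            elif name.startswith("."):
                parent = stack[-1] if stack else []
                stack.append(parent + name[1:].split("."))  # nest in the parent
            else:
                stack = [name.split(".")]  # a new block replaces any open ones
            continue
        pair = KEY.match(line)
        if pair:
            key, value = pair.groups()
            *parents, last = key.split(".")
            descend(data, (stack[-1] if stack else []) + parents)[last] = value
    return data


doc = """
{colors}
red: #f00
{.numbers}
one: 1
{}
nestedKey: nestedValue
{months}
january: 0
"""
print(parse_objects(doc))
```

Keeping a stack rather than a single "current scope" is what lets {} return to the parent object instead of jumping back to the top level.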

Arrays of objects

Groups of keys can be placed inside an array by giving the array a name within brackets. The name of the array can be any valid key, and can use dot-notation. You can optionally end an array with an empty set of brackets, or by beginning a new array.

    [scope.array]
    []

All keys inside the array are inserted into a single object within the array. The parser remembers the first key it found, and whenever it encounters it again, a new object is started.

    [arrayName]

    Jeremy spoke with her on Friday, follow-up scheduled for next week
    name: Amanda
    age: 26

    # Contact: 434-555-1234
    name: Tessa
    age: 30

    []
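The "first key starts a new object" rule can be sketched as follows. This is illustrative Python only: `group_into_objects` is a hypothetical helper that receives the lines between the opening and closing brackets, and it handles nothing but plain key/value lines.

```python
import re

KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")


def group_into_objects(lines):
    """Group key/value lines into objects, splitting on the first key seen."""
    items = []
    first_key = None
    for line in lines:
        pair = KEY.match(line)
        if not pair:
            continue  # notes between items are ignored
        key, value = pair.groups()
        if first_key is None:
            first_key = key  # remember the key that delimits objects
        if key == first_key:
            items.append({})  # seeing it again starts a new object
        items[-1][key] = value
    return items


people = group_into_objects([
    "Jeremy spoke with her on Friday, follow-up scheduled for next week",
    "name: Amanda",
    "age: 26",
    "# Contact: 434-555-1234",
    "name: Tessa",
    "age: 30",
])
print(people)
```

The "# Contact" line is dropped simply because "# Contact" is not a single token before the colon; there is no special comment syntax involved.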

Arrays of strings

You can also create "simple" or "flat" arrays of strings. If an asterisk is encountered first within an array, that array becomes a simple array, and key/value pairs within it are ignored. If a key/value pair is encountered first, asterisk lines are ignored instead.

    [days]
    * Sunday
    note: holiday!
    * Monday
    * Tuesday

    Whitespace is still fine around the '*'
    * Wednesday

    * Thursday

    Friday!
    * Friday
    * Saturday
    []
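The rule that the first item decides the array type can be sketched like this. It is a simplified Python illustration, with `parse_array_body`, `BULLET` and `KEY` as hypothetical names; the function takes the lines between the brackets and ignores nesting and multi-line values.

```python
import re

BULLET = re.compile(r"^\s*\*\s*(.*?)\s*$")
KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")


def parse_array_body(lines):
    """Return a list of strings or a list of objects, decided by the first item."""
    mode = None  # becomes "simple" or "objects" at the first bullet or key
    items = []
    first_key = None
    for line in lines:
        bullet = BULLET.match(line)
        pair = None if bullet else KEY.match(line)
        if mode is None:
            if bullet:
                mode = "simple"
            elif pair:
                mode = "objects"
        if mode == "simple" and bullet:
            items.append(bullet.group(1))
        elif mode == "objects" and pair:
            key, value = pair.groups()
            if first_key is None:
                first_key = key
            if key == first_key:
                items.append({})
            items[-1][key] = value
        # anything else (plain notes, or the "wrong" kind of item) is ignored
    return items


days = parse_array_body([
    "* Sunday",
    "note: holiday!",
    "* Monday",
    "Whitespace is still fine around the '*'",
    "  *   Wednesday",
    "Friday!",
    "* Friday",
])
print(days)
```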

Nested arrays

Array elements can contain arrays of their own. To begin an array while inside an array element, prepend its name with a period.

    [array]
    [.subarray]
    [.subsubarray]
    key: value

Much like nested object blocks, nested arrays must be "closed" with empty brackets in order to move up to the parent level.

    [days]
    name: Monday
    [.tasks]
    * Clean dishes
    * Pick up room
    []

    name: Tuesday
    [.tasks]
    * Buy milk
    []

Freeform arrays

Freeforms are a third type of array that was created to have better control over presentation from within ArchieML.

Unlike regular arrays, which group lines into objects whose values have no order, freeforms preserve the order of their lines. Clients that use ArchieML's output can then use that order to render the values, allowing you to vary the presentation for each array item.

    [+books]
    kicker: Books you should read
    score: ★★★★★!!!
    title: Wuthering Heights
    author: Emily Brontë

    title: Middlemarch
    author: George Eliot
    score: ★★★★☆
    []

Each line becomes its own object, with a type and a value: the key is stored as the type and the rest of the line as the value. Keeping these as separate properties makes it easier to deal with different types of information; rendering logic can always be based on the content of the type attribute.

Freeforms also allow you to type unstructured lines of text, which are included as items in the array with a type of text. Note that this means that comments do not work within freeforms.

    [+text]
    I can type words here...

    And separate them into different paragraphs without tags.
    []

Having full control over order is useful when arrays need to be mixed with other types of data. For example, showing a list of events interspersed with general artwork.

    [+events]
    header: My Birthday
    date: August 20th, 1990
    {.image}
    src: https://example.com/photo.png
    alt: Family Photo
    {}
    header: High School Graduation
    date: June 4th, 2008
    []
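The shape of a freeform array's output can be sketched as an ordered list of type/value objects. This Python illustration is deliberately limited: `parse_freeform` is a hypothetical helper, nested blocks such as {.image} are not handled, and dropping blank lines is an assumption of the sketch.

```python
import re

KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")


def parse_freeform(lines):
    """Turn each line into a {type, value} object, preserving line order."""
    items = []
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines produce no item
        pair = KEY.match(line)
        if pair:
            key, value = pair.groups()
            items.append({"type": key, "value": value})
        else:
            # Unstructured lines are kept as items of type "text".
            items.append({"type": "text", "value": line.strip()})
    return items


books = parse_freeform([
    "kicker: Books you should read",
    "score: ★★★★★!!!",
    "title: Wuthering Heights",
    "author: Emily Brontë",
    "",
    "title: Middlemarch",
    "author: George Eliot",
    "score: ★★★★☆",
])
for item in books:
    print(item["type"], "->", item["value"])
```

A renderer can then loop over the list and switch on `type`, which is how the two books above can show their score in different positions.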

Multi-line values

Values automatically end when a newline is encountered. But all subsequent text is read into a buffer that can be added to that key. Anchor the end of a multi-line value by following the value with a line beginning with ":end". All whitespace within the block is preserved.

Try removing the last line to see how it changes the output:

    key: value
    More value

    Even more value
    :end

Multi-line values also work within arrays of objects and simple arrays:

    [arrays.complex]
    key: value
    more value
    :end

    [arrays.simple]
    * value
    more value
    :end
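The buffering behavior described above can be sketched as follows: text after a key is held in a buffer and only attached to the key when an ":end" line appears. This is a simplified Python illustration with hypothetical names (`parse_multiline`, `END`, `KEY`); it covers top-level keys only and leaves out escapes, arrays and comments.

```python
import re

KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")
END = re.compile(r"^\s*:end\b")  # \b keeps ":endskip" from matching


def parse_multiline(text):
    data = {}
    last_key = None
    buffer = []
    for line in text.splitlines():
        if END.match(line):
            if last_key is not None:
                # Commit the buffered lines to the most recent key.
                data[last_key] = "\n".join([data[last_key]] + buffer)
            last_key, buffer = None, []
            continue
        pair = KEY.match(line)
        if pair:
            last_key, value = pair.groups()
            data[last_key] = value
            buffer = []  # an unterminated buffer is thrown away
        elif last_key is not None:
            buffer.append(line)  # kept as typed, whitespace included
    return data


with_end = parse_multiline("key: value\n  More value\n\nEven more value\n:end")
without_end = parse_multiline("key: value\n  More value\n\nEven more value")
print(with_end)
print(without_end)
```

Without the ":end" line the extra text is never attached, so the key keeps only its first line.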

Escape characters

You can place any text inside of a multi-line value. If one of your lines would be interpreted by the parser as a key or some other special command, you may have to escape that line by adding a backslash to the beginning of it. The backslash won't be included in the value.

Try removing the backslashes from the following lines:

    key: value
    \:end
    :end

    key: value
    \more: value
    :end

    key: value
    [escaping * is not necessary if we're not inside an array, but will still be removed]
    \* value
    :end

    key: value
    \:ignore
    \:skip
    \:endskip
    :end
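Escaping can be sketched as one extra check while buffering a multi-line value: a line that starts with a backslash is always treated as text, and the backslash is dropped. This is illustrative Python under our own assumptions. `read_multiline_value` is a hypothetical helper that receives the first-line value and the lines that follow it, and it recognizes only ":end" and key lines as commands.

```python
import re

KEY = re.compile(r"^\s*([^\s{}\[\]:+]+)[ \t]*:[ \t]*(.*?)\s*$")
END = re.compile(r"^\s*:end\b")


def read_multiline_value(first_line_value, following_lines):
    """Return the full value if an :end line is found, else just the first line."""
    buffer = []
    for line in following_lines:
        if line.lstrip().startswith("\\"):
            # Escaped line: keep it as text, minus the one leading backslash.
            buffer.append(line.replace("\\", "", 1))
        elif END.match(line):
            return "\n".join([first_line_value] + buffer)
        elif KEY.match(line):
            break  # a new key starts, so the buffered text is discarded
        else:
            buffer.append(line)
    return first_line_value


print(read_multiline_value("value", ["\\:end", ":end"]))
print(read_multiline_value("value", ["\\more: value", ":end"]))
print(read_multiline_value("value", ["more: value", ":end"]))
```

The third call shows why the escape matters: without the backslash, "more: value" is read as a new key and the multi-line value is cut short.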

Block comments

Wrap text between lines that begin with ":skip" and ":endskip" to ignore blocks of text.

    :skip
    this: text
    will: be ignored
    :endskip

There is also a safety mechanism of sorts built in. When the parser encounters a line beginning with ":ignore" (even if it's within a :skip block), parsing immediately stops, and the rest of the document is ignored.

    key: value
    :ignore

    [array]
    * Blah
    []

    other-key: other value
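The two comment mechanisms can be pictured as a filter that runs before any other parsing. This is a minimal Python sketch; `visible_lines` is a hypothetical helper, and matching the commands with a plain prefix test is a simplification of whatever the real parsers do.

```python
def visible_lines(text):
    """Return the lines a parser would still look at after comment handling."""
    skipping = False
    kept = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(":ignore"):
            break  # stops everything, even inside a :skip block
        if stripped.startswith(":endskip"):
            skipping = False
            continue
        if stripped.startswith(":skip"):
            skipping = True
            continue
        if not skipping:
            kept.append(line)
    return kept


print(visible_lines(":skip\nthis: text\nwill: be ignored\n:endskip\nkey: value"))
print(visible_lines("key: value\n:ignore\n\n[array]\n* Blah\n[]\n\nother-key: other value"))
```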

Usage

If you use JavaScript or Ruby, we hope you'll try one of the existing ArchieML parsers.

If you want to make a parser yourself (or want the technical details on the format), the full specification is online here.

Questions or concerns? The GitHub repository for this site is available at newsdev/archieml.org, and you can use its Issues page to submit questions or bugs on the spec itself.


Created by Michael Strickland, Archie Tse, Matthew Ericson and Tom Giratikanon / The New York Times

Copyright (c) 2015 The New York Times Company