A tutorial on how to encode music in the Humdrum syntax for VHV.

This tutorial covers the basics of representing graphic music notation in the Humdrum syntax. Mostly this involves encoding music within the **kern data representation. For more advanced notational features, other data types are used, such as **dynam for dynamics and **text for lyrics. Also see other pages in the Humdrum encoding section of the navigation menu for special encoding topics (to the left or above).

Each musical example in the tutorial below is interactive, so try tweaking the examples to see what happens. A text box containing the Humdrum data used to produce the notation is given on the left side of each notation example. The text in these boxes is editable, and changing the text will update the notation as you type. [If the notation no longer updates when editing the Humdrum text, you will have to reload the webpage to restart verovio.]


Here is a short music example of notes, each containing a rhythm and then a pitch:

4 represents a quarter note, and c, d, e and f represent the first four pitches of the fourth octave (middle-C octave). The musical example is interactive and changes whenever you type text into the box to the left, so try adding pitches for G, A and B in the same octave to the notation like this:

Experiment with the numbers to generate other rhythms, such as half notes and eighth notes:

Note the order of rhythm and then pitch in the encoded notes. The order can be reversed, although the canonical order is rhythm first, then pitch. Try reversing the order of a note such as 4c to c4 and see what happens to the note in the graphical notation.


VHV will automatically choose the most suitable clef (either bass or treble) to fit the pitch range of the notes on a staff, but typically a clef is encoded explicitly for the music. Clefs are encoded in interpretation tokens that start with a single * followed by the string clef and then the shape and line position of the clef. For example, a treble clef is *clefG2, with G meaning a G-clef, and 2 meaning that the clef is centered on the second line up from the bottom of the staff. The bass clef is *clefF4 since it is an F-clef on the fourth line of the staff.

Try moving the clefs to different staff lines by changing the number after the clef shape in the above example textbox.

A vocal tenor clef is represented by *clefGv2, where the v means the music should be played an octave lower than the regular clef’s sounding pitches. Try creating a vocal tenor clef in the above interactive example. The v operator also works on the other clefs (but these sorts of clefs are very rare). Another rare clef is *clefG^2, which is the opposite of *clefGv2: the music is written an octave lower than the actual sounding pitch for the normal form of the clef. You can also try to create exotic two-octave clefs by doubling the markers to ^^ or vv.
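The clef syntax described above can be sketched as a small parser. This is only an illustration: the function name `parse_clef` and the returned dictionary layout are hypothetical, and the pattern assumes just the G, F and C clef shapes with a single-digit line number.

```python
import re

# Hypothetical helper (not part of VHV or verovio): split a clef
# interpretation such as *clefG2 or *clefGv2 into its components.
CLEF_RE = re.compile(r"^\*clef([GFC])([v^]*)(\d)$")

def parse_clef(token):
    m = CLEF_RE.match(token)
    if m is None:
        raise ValueError(f"not a recognized clef interpretation: {token!r}")
    shape, shift, line = m.groups()
    # Each v lowers the sounding pitch by one octave; each ^ raises it.
    octave_shift = shift.count("^") - shift.count("v")
    return {"shape": shape, "line": int(line), "octave_shift": octave_shift}
```

For example, `parse_clef("*clefGv2")` reports a G-clef on line 2 with an octave shift of -1, matching the vocal tenor clef.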


The octave position of a note is indicated in **kern data as various repetitions of the pitch name as an upper- or lower-case letter. The middle-C octave is indicated by single lower case letters, and each successively higher octave repeats an additional lower-case letter to indicate the octave:

Lower octaves are indicated by successive repetitions of upper-case letters:
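The letter-repetition rule can be expressed as a short calculation. The sketch below uses a hypothetical function name, `kern_octave`, and takes only the pitch-letter part of a note (no rhythm or accidentals). It treats the middle-C octave as octave 4.

```python
def kern_octave(letters):
    """Return the octave number for the pitch letters of a **kern note.

    Illustrative helper: 'c' -> 4, 'cc' -> 5, 'C' -> 3, 'CC' -> 2.
    """
    if (not letters or len(set(letters)) != 1
            or letters[0].lower() not in "abcdefg"):
        raise ValueError(f"not a repeated pitch letter: {letters!r}")
    count = len(letters)
    if letters.islower():
        # One lower-case letter is the middle-C octave; each repeat goes up.
        return 3 + count
    # One upper-case letter is the octave below middle C; each repeat goes down.
    return 4 - count
```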


Barlines are indicated by a token starting with an equals sign. An optional number can immediately follow the equals sign, representing a measure number. VHV will display the bar number at each system break, excluding a label for measure 1:

Following the measure number can be an optional styling for the barline. - means an invisible barline, || is a double thin barline, |!: is a left-repeat, :|!|: is a left-right repeat. A special style is ==, representing a final barline. This bar usually does not include a bar number, since it is the last one in the piece/movement.
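As a rough sketch, a barline token can be separated into its parts as follows. The function name `parse_barline` and the returned fields are hypothetical, and the style string is passed through without validation.

```python
import re

# Hypothetical helper: equals sign(s), optional measure number, optional style.
BAR_RE = re.compile(r"^(==?)(\d*)(.*)$")

def parse_barline(token):
    m = BAR_RE.match(token)
    if m is None:
        raise ValueError(f"not a barline token: {token!r}")
    equals, number, style = m.groups()
    return {
        "final": equals == "==",                    # == marks a final barline
        "number": int(number) if number else None,  # measure number, if given
        "style": style,                             # e.g. "-", "||", ":|!|:"
    }
```

For instance, `parse_barline("=12||")` gives measure number 12 with the double-thin style `||`, while `parse_barline("==")` is a final barline with no number.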

Humdrum can represent more barline styles, but only the above types are currently supported in verovio.

Pick-up measures

Typically the initial barline should be labeled in the Humdrum encoding even though it is not printed. Either =1 or =1- should be used at the start of the first measure (with - meaning an invisible barline), but the starting barline will be automatically suppressed if it is a regular barline. Labeling the first bar is necessary if you need to extract it by measure number with a tool such as myank.

Pickup beats do not have a barline at their start. If a measure does not match the duration of the time signature at the start of the music and the data does not start with a barline, it will be automatically interpreted as a pick-up measure and automatically assigned 0 as the measure number for use with the myank filter.

Here is example music with a pick-up beat, and a typical ending that drops the duration of the pickup measure:

Time signatures

Time signatures are interpretations in the form *MX/Y where X is the top number of the signature and Y is the bottom number of the signature:
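To illustrate how the two numbers relate to the length of a measure, here is a minimal sketch that converts a time signature interpretation into a duration in quarter notes. The helper name `measure_quarters` is hypothetical, and only simple X/Y signatures with a non-zero bottom number are handled.

```python
import re
from fractions import Fraction

def measure_quarters(token):
    """Return the measure duration, in quarter notes, for a *MX/Y token."""
    m = re.match(r"^\*M(\d+)/(\d+)$", token)
    if m is None:
        raise ValueError(f"not a simple time signature: {token!r}")
    top, bottom = int(m.group(1)), int(m.group(2))
    # One beat of value 1/bottom of a whole note equals 4/bottom quarter notes.
    return Fraction(4 * top, bottom)
```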

Meter symbols

Cut-time and Common-time symbols can be displayed by including an interpretation in the form *met(X), where X is c| for cut-time, or c for common time:

The time signature equivalent of the meter symbol should still be encoded, typically immediately before the meter symbol. Try deleting the meter symbol lines from the above example and see what happens.

Most mensural-music mensuration signs are also available with *met():

Notice that *met(c) is used for the modern common-time metric sign, and *met(C) is used for the mensural symbol.


Accidentals must always be indicated along with the diatonic pitch name unless the pitch is natural. Accidentals up to double-sharp/flat can be displayed in VHV (no triple sharps/flats or higher can be rendered with verovio, although they can be represented in **kern data):

VHV will automatically calculate which accidentals should be visible in the graphic notation. In the following example, the key signature contains an F-sharp, so in the first measure no sharps are displayed on the F’s. The second measure contains an F-natural since there is no chromatic alteration of the F, so a natural is shown to cancel out the key signature’s F-sharp. At the end of the second measure, the second F has a sharp alteration which must be shown; otherwise, it would appear to be an F-natural.

Try removing the key signature, and see what happens to the accidentals displayed in the example.

Cautionary/Forced accidentals

When accidentals are not required according to the algorithm for calculating visual accidentals, but the accidentals are desired in order to add clarity for a performer, the accidental of a note can be forced to be displayed (whether it would be required or not) by adding X after the accidental.

In the first measure, two F-naturals are given, with only the first F-natural displaying its natural alteration. In the second measure, the first F has a sharp, but this sharp would not normally be displayed since it is within the key signature, and the barline canceled out the F-natural of the previous measure. To avoid confusion, the first F-sharp in the second measure should have a cautionary sharp added to the note as is shown in the above example.

Natural signs do not require X, as they are always displayed when given, but when they are used to indicate a cautionary accidental, the X is necessary. The X can also be used in diplomatic encodings to indicate a visual accidental in the source manuscript. Conversely, to indicate that an accidental is implied rather than written in a diplomatic score, add y immediately after the accidental to force it to be hidden.

Editorial accidentals

Editorial accidentals can be indicated by adding an RDF reference record for a user-signifier (character) used to mark the editorial accidentals in the data. Typically the character i is used to mark editorial accidentals in Josquin Research Project Humdrum encodings, displaying the accidental above the note:

The RDF reference record can be placed on any line of the file but typically is placed at the end of the data. Notice that editorial accidentals may affect the visual display of accidentals after the current pitch. Try making the natural sign on the last note in the above example an editorial accidental as well.

Also note that the style of the editorial accidental can be controlled from the RDF entry. Add the string brack or bracket to the RDF description so that the editorial accidentals are displayed as regular accidentals enclosed in brackets. Also try adding paren or parentheses to the RDF line to display the editorial accidental in parentheses.

Key signatures

Key signatures are represented by interpretations in the form *k[X], where X is a list of the accidentals in the order in which they are displayed in the key signature. Here are scales in C-sharp major and C-flat major, showing all of the accidentals in their proper order:
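The following sketch lists the accidentals in a key signature interpretation. It assumes the standard **kern spelling of lower-case pitch letters followed by # for a sharp or - for a flat. The helper name `parse_key_signature` is hypothetical, and other forms, such as octave-specific entries, are rejected.

```python
import re

def parse_key_signature(token):
    """Return the accidentals of a *k[...] token in display order."""
    m = re.match(r"^\*k\[((?:[a-g][#-])*)\]$", token)
    if m is None:
        raise ValueError(f"not a standard key signature: {token!r}")
    # Each entry is one pitch letter plus one accidental character.
    return re.findall(r"[a-g][#-]", m.group(1))
```

With this, `*k[f#c#g#d#a#e#b#]` yields the seven sharps of C-sharp major, `*k[b-e-a-d-g-c-f-]` yields the seven flats of C-flat major, and `*k[]` yields an empty list.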

Verovio cannot handle changing the order of the flats, and key signatures such as *k[F#4F#5] (showing F-sharps on the lower and upper octave in treble clef) cannot be displayed yet.

Key designation

The key of the music is different from the key signature. Indicate the key of the music by adding an interpretation in this form *X:, where X is a pitch name plus possible accidental, with lower-case pitch names indicating minor keys, and upper-case pitch names indicating major keys. For example, *C: means C major, and *a: means A minor (note that both have the key signature *k[] where there are no sharps or flats).

The key designation typically follows the key signature, or can appear on its own if the key changes but the key signature does not.

Note that the music in the last measure does not have a key signature change even though the music modulates to C-sharp minor.

The following codes can be appended to the key designation to indicate a particular mode:

code  meaning     example
dor   dorian      *d:dor
phr   phrygian    *e:phr
lyd   lydian      *F:lyd
mix   mixolydian  *G:mix
aeo   aeolian     *a:aeo
ion   ionian      *C:ion
loc   locrian     *b:loc

If the mode is closest to a minor key, then a lower-case letter is used for the tonic note; modes closer to major use an upper-case letter for the tonic.
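The key designation rules above can be summarized in a small parser. This is a sketch under stated assumptions: the name `parse_key` is hypothetical, accidentals are assumed to be # (sharp) or - (flat) as in **kern pitches, and any three-letter mode code is accepted without checking it against the table.

```python
import re

# Hypothetical helper: tonic letter, optional accidental, colon, optional mode.
KEY_RE = re.compile(r"^\*([A-Ga-g])([#-]?):([a-z]{3})?$")

def parse_key(token):
    m = KEY_RE.match(token)
    if m is None:
        raise ValueError(f"not a key designation: {token!r}")
    letter, accidental, mode = m.groups()
    if mode is None:
        # Without a mode code, letter case selects minor or major.
        mode = "minor" if letter.islower() else "major"
    return {"tonic": letter.upper() + accidental, "mode": mode}
```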


Chords are created by adding multiple notes to a token, separated by a single space character. Rhythms and articulations of each note should be duplicated, but not slurs, fermatas or beams.


Rhythms in **kern data are given as the number of notes of that duration which fit into a whole note. Whole notes are 1 since there is one whole note in a whole note. Half notes are 2 since there are two in a whole note. Quarter notes are 4 since there are four in a whole note, etc.

An exception to the numeric system for rhythms is used for some notes longer than a whole note: breves are represented by 0, longs are represented by 00 and maximas by 000.

The shortest note value that verovio can display is a 256th note, although shorter values can be represented in **kern data.

Augmentation dots

Augmentation dots are represented by period characters (.). The first one adds 1/2 the duration of the plain note, the next adds an additional 1/4 of the original note, and so on.
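The rhythm rules so far (divisions of a whole note, the zero codes for long values, and augmentation dots) can be combined into one calculation. The sketch below uses a hypothetical function name, `recip_to_whole_notes`, and returns the duration as a fraction of a whole note.

```python
import re
from fractions import Fraction

def recip_to_whole_notes(recip):
    """Convert a rhythm such as '4', '8.' or '0' to whole-note units."""
    m = re.match(r"^(\d+)(\.*)$", recip)
    if m is None:
        raise ValueError(f"not a simple rhythm: {recip!r}")
    digits, dots = m.groups()
    if set(digits) == {"0"}:
        # 0 = breve (2 whole notes), 00 = long (4), 000 = maxima (8).
        base = Fraction(2 ** len(digits))
    else:
        base = Fraction(1, int(digits))
    # n dots add 1/2 + 1/4 + ... + 1/2**n of the plain note,
    # so the total is the plain note times (2 - 1/2**n).
    return base * (2 - Fraction(1, 2 ** len(dots)))
```

A dotted quarter note (`4.`) comes out as 3/8 of a whole note, and a double-dotted quarter (`4..`) as 7/16.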


Ties are indicated by attaching [ to the starting note of a tie, and ] on the ending note. For intermediate notes in a tied group, the underscore character _ indicates a previous tie ends on the note at the same time that a tie starts to the next note.


Beams work in a similar manner to ties. The L character indicates the start of a beam, and the J character indicates the end of a beam.

Try adding beams with irregular groupings in the first measure of the example.

Lazy beaming

VHV uses a lazy beaming system to beam notes together. Rather than specifying the beginning/ending of each sub-beam or beamlet direction, you instead indicate the beginning/ending of a beamed group of notes with the L and J characters (i.e., encode only the primary sub-beam):

Encoding a full description of sub-beams:

which is equivalent to this lazy-beam encoding of only the primary beam (the beam furthest from the noteheads):

autobeam filtering

The autobeam filter can be used to beam notes together automatically, based on the prevailing time signature:

Try copy-and-pasting the above content into VHV, and then press alt-c to compile the filter. This will run the Humdrum data through the filter and then display the results back in the text editor. Otherwise, the data is filtered before being converted into notation.


Tuplets are no different from regular rhythmic values since they describe how many notes of that duration sum together to create a whole-note duration. Triplet eighth notes are represented with the number 12 because twelve of them equal a whole-note duration. Quintuplet sixteenth notes are represented as 20 since 20 of them equal a whole-note duration.

Extended Rhythm representation

Note durations that do not divide the duration of a whole note into an integer number of equal pieces (even when accounting for augmentation dots) must be encoded in an extended **recip representation. This system is understood by VHV and Humdrum Extras, but not by the classical Humdrum Toolkit (see rscale for using extended rhythms with the Humdrum Toolkit).

Example extended rhythms include triplet whole notes. 3/2 of a triplet whole note fill the duration of a regular whole note, so it is represented by the string 3%2. Another way of conceptualizing this is to flip the numbers in the rhythm string, noting that a triplet whole note is 2/3rds of a whole note:

Notice the 1%2 rhythm in the last measure. This represents a double whole note, since 1/2 of the double whole note is equivalent to a whole note. 0 is a special code equivalent to 1%2, 00 is equivalent to 1%4 (a long) and 000 is equivalent to 1%8 (a maxima). For readability, it is probably best to use the zero-system for breves, longs and maximas, reserving %-based extended rhythms for more complicated rhythmic cases.
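The "flip the numbers" reading of extended rhythms can be checked with a few lines of arithmetic. This is a minimal sketch with a hypothetical function name, and it ignores augmentation dots.

```python
from fractions import Fraction

def extended_recip_to_whole_notes(recip):
    """Convert 'X%Y' (or plain 'X') to a duration in whole notes."""
    if "%" in recip:
        top, bottom = recip.split("%")
        # X%Y means X/Y of the note fills a whole note,
        # so one note lasts Y/X of a whole note.
        return Fraction(int(bottom), int(top))
    return Fraction(1, int(recip))
```

Here `3%2` gives 2/3 of a whole note (a triplet whole note), and `1%2`, `1%4` and `1%8` give 2, 4 and 8 whole notes, matching the breve, long and maxima.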


Rests are represented by the character r in **kern data. Whole-measure rests are given their actual duration, and if the duration of the rest matches the time signature, then the rest will be displayed as a centered whole-measure rest regardless of its actual duration (or as a breve-measure rest if the duration of the measure is a breve or longer).


Slurs are indicated by adding ( to a token for the slur start, and ) for a slur end.

Nested slurs

Slurs can be nested by opening a new slur while another one is active. The first slur closing is paired with the closest slur opening before it.

Notice the direction RDF character which can be used to force the direction of a slur.

Crossing slurs

Slurs can cross each other by prefixing an ampersand (&) in front of a slur marker which crosses another slur. For more than one crossing slur at a time, additional & characters can be added to the slur prefix.

Articulations and ornaments

Here is a sampler of note articulations:

Articulations can be placed on the opposite side of their automatic location by adding RDF entries to force the articulation above or below the notehead:

Here is a sampler of ornaments. Upper-case ornaments indicate whole-tone auxiliary notes, and lower-case ornaments indicate semi-tone auxiliary notes. If the auxiliary notes require accidentals in the graphical music notation, then they will be added automatically, as well as any cautionary accidentals that may be needed on primary notes which follow the auxiliary notes of the ornaments.


Lyrics are added to a staff by placing one or more **text spines after a **kern spine.


Dynamics are added to the staff as a separate spine to the right of the **kern spine with which the dynamics should be associated.

When lyrics are present, dynamics will automatically be placed above the staff:

Note that the order of the **dynam and **text spines does not matter.

Multiple voices on a staff

Spine split manipulators (*^) and spine merge manipulators (*v) are used to add or remove voices/layers on a staff. Note that the highest part on the staff will typically be left most (opposite of the staff ordering from lowest to highest).

Notes sounding together in the different voices are placed on the same line, and if the other voice is sustaining, a . character is used as a place holder for that voice (the dot is called a null token in Humdrum terminology).

Partial voices/layers

Voices/layers which do not continue throughout the measure can be encoded in two equivalent manners: (1) splitting and merging the spine as needed in the measure, or (2) adding invisible rests to fill in the voice/layer for the entire measure. For method #1, note that the spine merging cannot occur until the end of both sounding notes in each layer.

Multiple staves

Each **kern spine in the data will produce a staff in the graphical notation. The lowest staff is the left-most spine in the data, and the highest staff in the notation is the right-most spine.

Cross-staff notes

Notes can be stored in one **kern spine, but appear in another staff by applying an orientation RDF marker immediately after a note’s pitch.

Text directions

Text can be displayed in the graphical notation by adding a text layout direction to the encoding:

Text layout comments start with !LO:TX: and are placed immediately before the note to which they are attached. The parameter a means place the text above, and b means place the text below. The font is roman by default; adding an i parameter will make the text italic, and b to make it bold. The actual text is given by the t parameter, such as “cresc.” with the parameter t=cresc.. Note that parameters are separated by colons, and &colon; is used to insert a colon into the text string.
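The colon-separated parameter syntax can be illustrated with a short parser. This is a sketch only: the function name `parse_text_layout` is hypothetical, and it assumes that a literal colon inside the text is written as the entity `&colon;`.

```python
def parse_text_layout(comment):
    """Split a !LO:TX: layout comment into a parameter dictionary."""
    prefix = "!LO:TX:"
    if not comment.startswith(prefix):
        raise ValueError(f"not a text layout comment: {comment!r}")
    params = {}
    for field in comment[len(prefix):].split(":"):
        if not field:
            continue
        key, sep, value = field.partition("=")
        # Flags such as a, b or i carry no value; t=... carries the text.
        # Assumption: &colon; stands for a literal colon in the text.
        params[key] = value.replace("&colon;", ":") if sep else True
    return params
```

For example, `parse_text_layout("!LO:TX:a:i:t=cresc.")` returns the flags `a` and `i` together with the text `cresc.`.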

Transposing parts

Transposing scores are always encoded at sounding pitch in Humdrum data, and instructions for transposing to written pitch are encoded at the start of the part. In the following example, the second spine encodes a B-flat clarinet part, with the transposition interpretation *ITrd1c2 meaning that the written part is one diatonic step up, equivalent to 2 semitones up.
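The two numbers in the transposition interpretation can be extracted as shown in this sketch. The helper name `parse_transposition` is hypothetical, and the pattern assumes that negative values are written with a leading minus sign.

```python
import re

def parse_transposition(token):
    """Return the diatonic and chromatic steps of a *ITrdXcY token."""
    m = re.match(r"^\*ITrd(-?\d+)c(-?\d+)$", token)
    if m is None:
        raise ValueError(f"not a transposition interpretation: {token!r}")
    # d = diatonic steps, c = chromatic steps (semitones).
    return {"diatonic": int(m.group(1)), "chromatic": int(m.group(2))}
```

For the B-flat clarinet example, `parse_transposition("*ITrd1c2")` gives one diatonic step and two semitones.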

See the transposing parts section of the documentation for more information.