MMH - MIDI-MOD Hybrid music format
By David Piepgrass (for hire 7/99)
__________________________________
Please set your text-editor in word wrap mode.
Now, both MIDI and MOD have their strengths and weaknesses.
Probably MIDI's main strength is that it is very "raw"... primarily just a series of commands and time intervals. This makes it sound great if you have really high-end equipment to accept those commands, and if the MIDI you have was designed for that hardware. If not, well, you can settle for General MIDI. You might still get something that sounds good, but it will not have any "special effects" or exotic-sounding instruments. At least the file size tends to be very small. In any case, making good MIDIs is no walk in the park and so far I haven't seen a free editor (but then, I haven't looked either.)
As for MODs, S3Ms, XMs, and the like, they are a totally different concept. You store the instrument samples in the file along with a number of "patterns" which can be represented on the screen as a grid: the rows are the channels and the columns are the different spots in time where you can start a note (or vice versa). It's somewhat limiting, but it lures you with the miscellaneous special effects and what-not. I imagine it would seem pretty bizarre to someone who's done nothing but sheet music.
My entry to the fray will probably seem a bit premature, since I don't have the time or resources to put in all the features that you find in some formats, like IT. But it is simple to edit, while also useful, and the format isn't too complex. Plus, the player is free, as is the editor... for now, anyway. Perhaps the biggest draw for programmers and such is that the documentation on the format is exceptional--not to boast, or anything. I have always been frustrated at how badly other formats were documented, and that is part of the reason I made my own format.
As in a MOD file, the MMH file contains a number of "patterns". But MMH patterns are variable-length, and can be started at any position in the file. Furthermore, the tempo of a pattern can be (but doesn't have to be) set independently of other patterns in the file. There are no "channels". You can have as many notes play at the same time as you want (of course, in the interest of keeping processor usage down, you should restrain yourself.) Also, there are no "divisions" per se like in a MOD file, but there IS a minimum note length: the length and position of a note is specified in terms of 1/64 notes. A beat, which is displayed in the music editor as a round black circle with a line sticking up on the right side, is defined as a 16/64 note--in other words, a quarter note.
What about instruments? Like a MIDI, instrument samples are normally stored outside of the file (in a library). However, like a MOD, you may also put custom samples in the file, giving them an ID number of your choosing.
Effects? I came up with a few basic ones, such as frequency and volume sliding, and several amplitude modifiers (mainly for vibrato effects, but also fading out an instrument.)
There are a few things in the MMH format that actually do not affect how it is played: firstly, you can have lyrics. Secondly, each pattern may have a time signature (only the top number, since the bottom is fixed at four)--but the editor does not force you to keep notes within measure boundaries, so the measures are for cosmetic benefit only. Thirdly, each pattern may have a key signature, but it doesn't affect how the music is played--only how it is displayed. Individual notes are not stored in the file as a letter and a step value (sharp, flat or natural), nor are they stored as a position on a staff; rather, they are stored as a simple number representing the number of half-steps up from A in the lowest octave (27.5 Hz). For example, the number 2 represents A-sharp or B-flat--because they are really the same tone, no distinction is made between them in the file. The editor uses the key signature to help decide how a piece should be displayed--whether a note should be shown as sharp or flat, and where to use accidental symbols.
Here is a conceptual basic layout for a hypothetical MMH music file (probably a rather boring piece):
Patterns---Name-----------------Tempo*---Length-----------------------------
Pattern 0  Basic Percussion     144 bpm  12 beats
Pattern 1  Verse 1              144 bpm  72 beats
Pattern 2  Chorus               144 bpm  36 beats
Pattern 3  Verse 2              144 bpm  72 beats
Pattern 4  End-verse 1 fill     144 bpm  12 beats
Pattern 5  Chorus start drums   144 bpm  24 beats
Pattern 6  End-verse 2 fill     144 bpm  12 beats
Pattern 7  Lyrics               72 bpm   108 beats
* Actually, the tempo is stored on the timeline so that individual instances of the pattern can be played at different rates.
-------------------------------Timeline-------------------------------------
[......Pattern 1.......][Pattern 2.][......Pattern 3.......][Pattern 2.]
[P0][P0][P0][P0][P4][P5] [P0][P0][P0][P0][P0][P6][P5]
[..............................Pattern 7...............................]
0:00----0:10----0:20----0:30----0:40----0:50----1:00----1:10----1:20----1:30
Please note that the vertical arrangement of the patterns on the timeline is also cosmetic: it would sound the same if the percussion was above the tune.
And for good measure, here's what the hypothetical pattern 0 might look like:
-------------------------------Timeline-------------------------------------
[..132...] [..131...] [..132...] [..131..] [128] [
[128..] [128.][129.] [128..] [128.][129.] [131] [128]
[129..] [130.] [129..] [130.] [131] [128
0-----1-----2-----3-----4-----5-----6-----7-----8-----9-----10----11----12--
Where the numbers on the timeline represent percussion instruments stored elsewhere in the file. Of course there can be more information in a note than just an instrument number, but there wasn't enough room to fit all that on the time line. :)
So, are you interested in the specifics now?
The MMH file consists of five main parts:
1. The header
2. The pattern list
3. The timeline for the piece
4. The patterns, in order by index (most important part of the file)
5. The instrument information and instrument samples (either the largest or smallest part of the file, depending on whether you have custom instruments).
Now for the details. Before I go into the five main parts, I must describe the format of a "note". A lot of people will think that a note is a single tone with a pitch, instrument number, volume and length. In the MMH format, a note could be that simple, but it might be even simpler or much more complicated. It's obvious how it could get more complicated: by adding effects or panning. But how could it be simpler, you ask? After all, every note has a pitch, volume, instrument and length, and some have more properties that make them distinct, such as panning and special effects.
That's because, in addition to ordinary notes, you can also put in "null notes" which are not played but can set defaults--a default instrument and volume, for instance--which are used for all notes from that point forward in the pattern. Both kinds of notes are described below. To really take advantage of this feature, you generally need to run several patterns at once, so that you can tweak the sound of the sections of your piece independently--rather like sections of an orchestra, where each section goes into its own pattern.
In order to reduce CPU load, the MMH format has special support for samples of an instrument that contain chords. You can store chord samples of an instrument AND single pitch samples under the same instrument number. Then, you can specify that a certain note is a chord, and give all the pitches in that chord. The MMH player will count the semitones between each pitch in the chord, and look in the instrument table for an instrument with the same semitone spacing.
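To make the chord lookup a little more concrete, here is a minimal Python sketch of the spacing comparison. The function names and the plain list standing in for the instrument table are hypothetical, and returning None when nothing matches is an assumption, since the behaviour in that case is not spelled out above.

```python
def chord_spacing(halftones):
    """Semitone gaps between neighbouring pitches, lowest to highest."""
    ordered = sorted(halftones)
    return tuple(b - a for a, b in zip(ordered, ordered[1:]))


def find_chord_sample(note_halftones, sample_chords):
    """Return the index of the first sample whose spacing matches the note.

    sample_chords is a hypothetical list holding, for each sample of one
    instrument, the halftone numbers of the pitches it was recorded with.
    """
    wanted = chord_spacing(note_halftones)
    for index, sample_halftones in enumerate(sample_chords):
        if chord_spacing(sample_halftones) == wanted:
            return index
    return None  # assumption: no matching chord sample exists
```

A major triad recorded at one pitch (gaps of 4 and 3 semitones) would match any note asking for the same gaps, whatever its root.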
What if there is no null note in the pattern, yet an audible note is given that lacks vital information? There is a single null note stored in the header that gives global defaults for the entire file.
In order to increase music quality and realism, MMH also supports having multiple samples for each instrument: up to 6, in fact. There are three purposes for this. Firstly, you can store variations that are designed for higher or lower pitches. The player can automatically select the best variation for the piece. Secondly, in the case that multiple variations have the same original pitch, you can set notes to use a random variation. This could be useful in an acoustic guitar sample, for instance, where each strum sounds a bit different. Thirdly, some instruments make more than one distinct sound, or have different lengths of their sound. You can store each one as a variation, and pick the specific one you want within each note. By the way, you can also make variations of chord samples.
____________________________________________________________________________
The note
The definition of a note is quite broad in the context of MMH: it's any event that happens at a specific time in a pattern. There are audible notes, null notes, "lyric" notes, and "reserved" notes which are designed to allow future extension of the file format.
A note starts with a flag byte: (MSB)76543210(LSB)
Bits 1,0: specifies the type of note, as follows.
Value 00: Normal note (audible)
Value 01: Lyric.
Value 10: Reserved.
Value 11: Null note (inaudible, sets defaults)
Lyrics are the simplest type of note, so I will document them first.
Lyrics use the other bits in the first byte as follows:
Bits 3,2: Specifies a "line" to put the lyric on. Normally this will be zero, but it may sometimes be useful to have your lyrics vertically arranged on the screen in multiple lines. For example, if your piece had multiple parts to sing, you could have the main part on the first line (line 0) and the alternate part on the line below. Depending on where you put the lyrics, you might use the multiple lines to represent different verses of a song.
Or, you could use the bottom line to put explanatory text. The lyrics from all currently playing patterns are merged onto the same set of four lines.
Bit 4: Reserved. Should be zero.
Bit 5: Specifies that the lyric should be displayed in boldface.
Bit 6: Specifies that the lyric should be italicized.
Bit 7: Specifies that the lyric should be grayed.
Just so you understand, you typically have many, many lyric "notes" in a composition if you have any at all. You typically create a separate lyric note for each word that starts on a certain beat so that the words always line up with the audible notes.
The second byte, for both lyric notes and reserved notes, specifies the length of the data that follows. For a lyric note, this data is simply a null-terminated string. For example, "Hello", with its \0 at the end, would have a length of 6. Why would you need a terminator character when a length is already specified? Again, this allows for the possibility of extending the format: extra data could be added after the actual text.
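As a rough illustration of the lyric layout, here is a small Python sketch that reads one lyric note from a byte string. The function name and the dictionary keys are made up for this example, and decoding the text as Latin-1 is an assumption, since no character encoding is named here.

```python
def parse_lyric_note(data, pos=0):
    """Decode one lyric note at data[pos]; return (note, next_position)."""
    flags = data[pos]
    if flags & 0x03 != 0x01:            # bits 1,0 must be 01 for a lyric
        raise ValueError("not a lyric note")
    length = data[pos + 1]              # size of the data that follows
    payload = data[pos + 2:pos + 2 + length]
    text = payload.split(b"\0", 1)[0]   # bytes after the \0 are extension data
    note = {
        "line": (flags >> 2) & 0x03,    # bits 3,2
        "bold": bool(flags & 0x20),     # bit 5
        "italic": bool(flags & 0x40),   # bit 6
        "grayed": bool(flags & 0x80),   # bit 7
        "text": text.decode("latin-1"), # encoding is an assumption
    }
    return note, pos + 2 + length
```

Skipping ahead by the length byte rather than by the position of the terminator is what lets an older reader step over extra data it does not understand.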
Normal notes and null notes use the other bits in the first byte as follows:
Bit 2: If 1, this note specifies a pitch or a chord.
Bit 3: If 1, this note specifies a length.
Bit 4: If 1, this note specifies a volume (loudness).
Bit 5: If 1, this note specifies an instrument number.
Bit 6: If 1, there is a linked note stored after this one. Null notes cannot have linked notes, nor can they be linked notes.
Bit 7: Reserved. Should be 0.
The second byte in an audible or null note contains more flags:
Bit 0: If 1, this note specifies an amplitude vibrato/sliding effect.
Bit 1: If 1, this note specifies a panning number or sliding effect.
Bit 2: If 1, this note specifies boundary offsets for the note.
Bit 3: If 1, this note specifies a frequency slide.
Bits 4-6: These bits specify an instrument variation code.
Value 000: Specifies that the player should choose the variation with the closest original pitch to this note.
Value 001: Specifies that the player should choose any random variation.
Value 010 to 111: Plays a specific variation. Subtract 2 to get the index.
These bits are ignored in a null note.
Bit 7: Reserved. Should be 0.
What comes after this depends on the flags. If all the flags are set, then all of the following things will be found in the note:
1. Frequency/chord (2 to 16 bytes)
2. Length (1 byte)
3. Volume (1 byte)
4. Instrument (1 byte)
5. Amplitude vibrato/slide effect (4 bytes)
6. Panning or panning slide (1 byte)
7. Boundary offsets (1 byte)
8. Frequency slide (2 bytes)
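The two flag bytes can be turned into a list of the fields that follow with a simple table lookup. The Python sketch below is only an illustration; the table, the function name and the returned dictionary are hypothetical.

```python
# (which flag byte, bit mask, field name), in the order the fields are stored
FIELDS = [
    (0, 0x04, "pitch/chord"),
    (0, 0x08, "length"),
    (0, 0x10, "volume"),
    (0, 0x20, "instrument"),
    (1, 0x01, "amplitude effect"),
    (1, 0x02, "panning"),
    (1, 0x04, "boundary offsets"),
    (1, 0x08, "frequency slide"),
]

NOTE_TYPES = {0: "normal", 1: "lyric", 2: "reserved", 3: "null"}


def describe_note_flags(flag1, flag2):
    """Summarise the two flag bytes of a normal or null note."""
    note_type = NOTE_TYPES[flag1 & 0x03]
    if note_type not in ("normal", "null"):
        raise ValueError("lyric and reserved notes use a different layout")
    both = (flag1, flag2)
    return {
        "type": note_type,
        "linked": bool(flag1 & 0x40),       # first byte, bit 6
        "variation_code": (flag2 >> 4) & 0x07,  # second byte, bits 4-6
        "fields": [name for byte, mask, name in FIELDS if both[byte] & mask],
    }
```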
Frequency/chord format (2 to 16 bytes)
This specifies all the pitches in the note. Each set of two bytes is treated as an unsigned word with the format:
(MSB)5432109876543210(LSB)
Bits 0-3: specifies the tone finetuning in 1/16 halftone increments. i.e. if this was 3, it could represent 3/16 of the way up from C to C#. Normally these bits are zero.
Bits 4-10: specifies the number of halftones up from the lowest supported pitch, A-0 (27.5 Hz): 1 represents A-0, 3 represents B-0, 13 represents A-1 and so on. (See notes 1 and 2 below)
Bits 11-13: In the first note, this specifies the number of additional frequencies in the chord. i.e. 0 would mean it's a single pitch, while 1 would be a single extra pitch (more of a duet than a chord, but for the purposes of this document I classify it as a chord), and 3 would be a four-pitch chord. These bits are only used in the first pitch and ignored in the rest.
Bits 14-15: Reserved. Should be 0.
Note 1: If bits 4-10 are zero, then the MMH player plays the instrument at its original sampling rate. Typically, these bits are set to zero for percussion instruments.
Note 2: The MMH player arranges tones in a chord in order of lowest pitch to highest pitch.
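Here is a hedged Python sketch of unpacking the pitch words. All names are hypothetical. The conversion to Hertz assumes equal temperament with value 1 equal to A-0 at 27.5 Hz; the tuning system is not stated here, so treat that part as an assumption.

```python
import struct

A0_HZ = 27.5  # lowest supported pitch, stored as halftone value 1


def decode_pitch_word(word):
    """Split a 16-bit pitch word into (halftone, fine, extra_pitches)."""
    fine = word & 0x0F              # bits 0-3, in 1/16 halftone steps
    halftone = (word >> 4) & 0x7F   # bits 4-10
    extra = (word >> 11) & 0x07     # bits 11-13, meaningful in the first word
    return halftone, fine, extra


def pitch_to_hz(halftone, fine=0):
    """Frequency in Hz, or None when the sample plays at its original rate."""
    if halftone == 0:
        return None
    steps = (halftone - 1) + fine / 16
    return A0_HZ * 2 ** (steps / 12)  # equal temperament is an assumption


def read_chord(data, pos=0):
    """Read every pitch of a note; return ([(halftone, fine), ...], next_pos)."""
    (first,) = struct.unpack_from("<H", data, pos)
    count = decode_pitch_word(first)[2] + 1
    words = struct.unpack_from("<%dH" % count, data, pos)
    return [decode_pitch_word(w)[:2] for w in words], pos + 2 * count
```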
Length (1 byte)
The whole byte represents the length. If the length is specified as zero, then the length is the exact length of the instrument sample. In other words, the sample is played once through. Otherwise, the length is specified in 1/64 notes. Thus, the maximum length is just lower than four whole notes. Hopefully there ain't many people who want longer notes than that.
Volume (1 byte)
Specifies the volume to use for the note, where 255 is the full original volume of the sample and 0 is muted. This brings up an important point: you should usually sample your instruments with maximum possible loudness, because from there you can only make it quieter.
Instrument (1 byte)
The whole byte represents the instrument number. Numbers below 128 are reserved for instruments in the standard libraries, while you can have custom instruments at or above 128.
Volume slide/vibrato effect (4 bytes)
The first 16 bits specify the magnitude of vibrato to use at each "section" of the note: You can start with a low vibrato and move up, or go up at first and then down again. The effect is linearly interpolated.
Bits 0-3 of the first byte: Vibrato with which to start the note
Bits 4-7 of the first byte: Vibrato to use 1/3 of the way through
Bits 0-3 of the second byte: Vibrato to use 2/3 of the way through
Bits 4-7 of the second byte: Vibrato to end the note with.
The third byte specifies a wavelength, in 1/128 notes.
When doing a volume slide, the initial volume is specified using the volume byte above. The ending volume is specified here, in the fourth byte.
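The four vibrato values and the linear interpolation between them can be sketched in Python as follows. The function names are hypothetical, and the sketch only covers the vibrato magnitude; how the magnitude maps onto actual amplitude changes is not covered here.

```python
def decode_amplitude_effect(effect):
    """Unpack the 4-byte effect into its parts."""
    depths = (
        effect[0] & 0x0F,   # start of the note
        effect[0] >> 4,     # 1/3 of the way through
        effect[1] & 0x0F,   # 2/3 of the way through
        effect[1] >> 4,     # end of the note
    )
    return {
        "depths": depths,
        "wavelength_128ths": effect[2],  # in 1/128 notes
        "end_volume": effect[3],
    }


def vibrato_depth_at(depths, t):
    """Linearly interpolated vibrato magnitude at fraction t (0..1) of the note."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    x = t * 3                    # three equal sections between four points
    i = min(int(x), 2)           # section index; t == 1 stays in the last one
    frac = x - i
    return depths[i] + (depths[i + 1] - depths[i]) * frac
```

With the depths 0, 6, 6, 0 the vibrato swells during the first third of the note, holds, and dies away over the last third.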
Panning or panning slide (1 byte)
Bits 0-3 specify the initial note panning, while bits 4-7 specify the final panning. If you don't want the panning to slide, make these two values the same.
Boundary offsets (1 byte)
This modifies the position at which the note starts or ends in 1/128 note increments.
Bits 0-2: Signed 3-bit number, with negatives causing the beginning of the note to come earlier and positive numbers causing the beginning to come later.
Bits 3-5: Signed 3-bit number, with negatives causing the end of the note to come earlier and positive numbers causing the end to come later.
Bit 6: Specifies that the start boundary can be randomly increased or decreased by a 1/128 note length.
Bit 7: Reserved. Should be 0.
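A small Python sketch of unpacking the boundary offsets byte is shown below. The names are hypothetical, and reading the signed 3-bit numbers as two's complement (range -4 to +3) is an assumption, since the signed representation is not named here.

```python
def signed3(value):
    """Interpret the low 3 bits as two's complement (an assumption)."""
    value &= 0x07
    return value - 8 if value & 0x04 else value


def decode_boundary_offsets(byte):
    """Return start/end shifts in 1/128 notes plus the random-start flag."""
    return {
        "start": signed3(byte),           # bits 0-2
        "end": signed3(byte >> 3),        # bits 3-5
        "random_start": bool(byte & 0x40),  # bit 6
    }
```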
Frequency slide (2 bytes)
This is a difficult effect for the player to do, but I figured someone might need it. The slide occurs at the beginning of the note, and stops when it has reached the pitch specified in the frequency section.
(MSB)5432109876543210(LSB)
Bits 0-10 of the word specify the pitch with which to start the note.
Bits 11-15 specify the number of semitones per 1/4 note to slide, or, if you like, the number of 1/16 semitones per 1/64 note.
When playing a chord, all the pitches of the chord are slid. As if the effect wasn't tricky enough already. :)
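Because the rate works out to 1/16 semitones per 1/64 note, the time a slide takes falls out of a single division. The Python sketch below illustrates this with hypothetical names. It assumes bits 0-10 use the same layout as bits 0-10 of a pitch word, so the value can be read directly as a pitch in 1/16 semitone units. A rate of zero is rejected because its meaning is not defined here.

```python
def decode_frequency_slide(word):
    """Return (start_pitch, rate) from the 16-bit slide word.

    start_pitch is in 1/16 semitone units (assumed pitch-word layout);
    rate is in semitones per quarter note, which is the same number as
    1/16 semitones per 1/64 note.
    """
    return word & 0x07FF, word >> 11


def slide_length_64ths(start_pitch, target_pitch, rate):
    """How long the slide lasts, in 1/64 notes."""
    if rate == 0:
        raise ValueError("a rate of zero is not defined in this sketch")
    return abs(target_pitch - start_pitch) / rate
```

For example, sliding up three semitones (48 units) at a rate of 2 takes 24 sixty-fourth notes, or one and a half beats.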
Now we've got the basic format covered. If you'll remember, way, way back in this section there was a bit that talked about "linked notes". I'm going to define and describe them here.
A linked note allows you to play several samples, exactly one after another. This allows you to time things exactly. For example, you could have a basic guitar plucking sample, then follow it up with a sample of the note being stopped in an audible way (you know what I'm talking about, right?). Anyway, I think some of you creative types will be able to think of a use for this.
When a note is linked, it is stored right after the original note in the file. A linked note must be an audible note. Also, you can link another note at the end of the linked note. You can string as many samples as you want together like this.
Please note that the boundary offsets effect does not affect where linked notes start. If you have delayed the end of the original note, the linked note will still start at the same time as if you hadn't put in the delay. The result is that, for a brief moment, both the original and linked notes will be playing at the same time. Conversely, if you have used the effect to make the original note stop early, there will be a small gap between the end of the original note and beginning of the linked note.
____________________________________________________________________________
1. The header
First four bytes: "MMH\0", or the bytes 0x4D, 0x4D, 0x48, 0x00, or (read as a single 32-bit number) 0x00484D4D. This brings up an important point about the numbers in this format: all numbers use the PC format, which puts the least significant byte first. A Mac player, which we might never see :( would have to swap all the appropriate bytes while loading the file.
Next four bytes: specifies the offset in the file of the pattern list.
Next four bytes: specifies the offset in the file of the main timeline.
Next four bytes: specifies the offset in the file of the instruments.
Next comes the "default note". The two flag bytes are not included because they are (conceptually) fixed at the following: first=0x3F second=0x04. In other words, the following settings are found, in order:
1. Frequency/chord (2 bytes)
2. Length (1 byte)
3. Volume (1 byte)
4. Instrument (1 byte)
7. Boundary offsets (1 byte)
You may not use a chord for the default note.
Next, the default tempo is specified (2 bytes). This tempo is actually stored as a reciprocal (time per beat instead of beats per time), as described in the main timeline section of this document.
Next, the default number of beats in a measure (the time signature) is given (1 byte).
Finally, there are four null-terminated strings (maximum length: 256 characters). In order, these are:
1. The song name
2. The artist name
3. Copyright notice/terms of use summary
4. Comment (e.g. Date composed etc.)
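Putting the header fields together, a reader might look something like the Python sketch below. The names are hypothetical, the Latin-1 text decoding is an assumption, and no limit on string length is enforced. The fixed part comes to 25 bytes when the fields are packed with no padding, which is what the "<" prefix asks for.

```python
import struct

# magic, three offsets, default note (pitch word, length, volume,
# instrument, boundary offsets), default tempo, beats per measure
HEADER_FIXED = struct.Struct("<4s3IH4BHB")


def read_cstring(data, pos):
    """Read a null-terminated string; return (text, position after the \\0)."""
    end = data.index(b"\0", pos)
    return data[pos:end].decode("latin-1"), end + 1  # encoding is an assumption


def parse_header(data):
    fields = HEADER_FIXED.unpack_from(data, 0)
    if fields[0] != b"MMH\0":
        raise ValueError("not an MMH file")
    header = {
        "pattern_list_offset": fields[1],
        "timeline_offset": fields[2],
        "instruments_offset": fields[3],
        "default_note": {
            "pitch_word": fields[4],
            "length": fields[5],
            "volume": fields[6],
            "instrument": fields[7],
            "boundary_offsets": fields[8],
        },
        "default_tempo": fields[9],
        "beats_per_measure": fields[10],
    }
    pos = HEADER_FIXED.size
    for key in ("song", "artist", "copyright", "comment"):
        header[key], pos = read_cstring(data, pos)
    return header
```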
____________________________________________________________________________
2. The pattern list.
This list contains information about all the patterns in the file. The first 16-bit word specifies the number of patterns.
For each pattern, the following information is given.
1. Offset of the pattern data in the file (4 bytes)
2. Length of the pattern, in beats (2 bytes) (where a beat is a 16/64 note).
3. Key signature (two bytes). This is stored like so:
(MSB)54 32 10 98 76 54 32 10(LSB)
Each set of two bits (1 and 0, 3 and 2 etc.) represents a tone on the staff: Bits 1,0=A; 3,2=B; 5,4=C; 7,6=D; 9,8=E; 11,10=F; 13,12=G; bits 15,14 are unused and should be zero.
Each set of two bits specifies a default type for that position on the staff:
Value 00: Natural (neither sharp nor flat)
Value 01: Sharp
Value 10: Flat
Value 11: Invalid.
Remember, as described in the introduction, the key signature is cosmetic only; it will not affect how the file is played.
4. Number of beats in a measure (i.e. the time signature.) (1 Byte.) If this is zero, then the default, stored in the header, is used.
5. Name of the pattern (fixed size of 33 bytes, NULL-terminated)
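Each pattern list entry has a fixed size of 42 bytes (4 + 2 + 2 + 1 + 33), so the list can be read with a single struct layout. The Python sketch below uses hypothetical names and assumes Latin-1 for the pattern name.

```python
import struct

# offset, length in beats, key signature, beats per measure, name
PATTERN_ENTRY = struct.Struct("<IHHB33s")
ACCIDENTALS = {0: "natural", 1: "sharp", 2: "flat"}


def decode_key_signature(word):
    """Map each staff letter A-G to its default accidental."""
    key = {}
    for i, letter in enumerate("ABCDEFG"):
        code = (word >> (2 * i)) & 0x03   # A is bits 1,0; B is bits 3,2; ...
        if code == 3:
            raise ValueError("invalid accidental code for " + letter)
        key[letter] = ACCIDENTALS[code]
    return key


def parse_pattern_list(data, pos=0):
    """Read the pattern count and every entry that follows it."""
    (count,) = struct.unpack_from("<H", data, pos)
    pos += 2
    patterns = []
    for _ in range(count):
        offset, beats, keysig, measure, raw = PATTERN_ENTRY.unpack_from(data, pos)
        pos += PATTERN_ENTRY.size
        patterns.append({
            "offset": offset,
            "beats": beats,
            "key": decode_key_signature(keysig),
            "beats_per_measure": measure,   # 0 means "use the header default"
            "name": raw.split(b"\0", 1)[0].decode("latin-1"),
        })
    return patterns
```

A key signature word of 0x0400 marks only F as sharp, which is how a piece in G major would be stored.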
____________________________________________________________________________
3. The main timeline.
This specifies when to play which patterns. The first thing in this section is the number of entries in the timeline (2 bytes). The MMH player can calculate the length of the song based simply on when all the patterns on this list will have finished playing. A "pattern library", by the way, is an MMH file that has patterns but no entries on the timeline.
The entries on the timeline are stored one after another, and do not have to be in chronological order. Here is their format:
1. Which pattern number to play (two bytes)
2. The time to start the pattern (four bytes). This is given in 1/64 notes, based on the length of a 1/64 note as stored in the file header. For example, if the header specified 20ms per 1/64 note, and the number here was 1000, then the pattern would start 20 seconds into the piece.
3. The reciprocal of the tempo, given in 1/100 milliseconds per 1/64 note (two bytes). The number must be at least 400, which is 4ms per 1/64 note, or 256ms per whole note, which is very fast and therefore very processor-intensive. On the other hand, you can play the file as slow as you want (up to the biggest number that fits in a word, which works out to about 10.5 seconds per beat--I'll be damned if anyone wants to play a song that slow.) In order to use the default tempo from the header, simply set this to zero.
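Since a beat is sixteen 1/64 notes and the tempo word counts hundredths of a millisecond, one beat lasts 0.16 x word milliseconds, so beats per minute = 60000 / (0.16 x word) = 375000 / word. The Python sketch below works through that arithmetic and reads the timeline entries; all names are hypothetical.

```python
import struct

TIMELINE_ENTRY = struct.Struct("<HIH")  # pattern number, start time, tempo word


def parse_timeline(data, pos=0):
    """Return a list of (pattern, start_64ths, tempo_word) tuples."""
    (count,) = struct.unpack_from("<H", data, pos)
    pos += 2
    return [TIMELINE_ENTRY.unpack_from(data, pos + i * TIMELINE_ENTRY.size)
            for i in range(count)]


def tempo_word_to_bpm(word):
    if word == 0:
        raise ValueError("0 means: use the default tempo from the header")
    return 375000.0 / word


def bpm_to_tempo_word(bpm):
    word = round(375000.0 / bpm)
    if not 400 <= word <= 0xFFFF:
        raise ValueError("tempo does not fit the allowed range")
    return word


def start_time_seconds(start_64ths, header_tempo_word):
    """Start time of an entry, using the 1/64 note length from the header."""
    return start_64ths * header_tempo_word / 100000.0
```

By this arithmetic, 144 bpm is stored as 2604, and the fastest allowed value of 400 corresponds to 937.5 bpm.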
____________________________________________________________________________
4. The patterns
Each pattern has a number of notes in it. So, the first thing that goes in each pattern is a 2-byte note count (this includes all the types of notes.) The next thing is two bytes that are reserved for future use, and should be zero.
After that, the pattern simply consists of a list of notes. Before each note is two bytes specifying the delay (in 1/64 notes, of course) between the beginning of the last note and the beginning of the current one. For example, if there was a quarter note followed immediately by another note, the time stored here would be 16. Even the very first note has a delay before it, so you can have some empty space at the beginning of the pattern.
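Because each delay is measured from the start of the previous note, the absolute position of every note is just a running total. A tiny Python sketch, with a hypothetical function name:

```python
from itertools import accumulate


def note_start_times(delays):
    """Turn per-note delays (in 1/64 notes) into positions from the pattern start."""
    return list(accumulate(delays))


def to_beats(position_64ths):
    """Convert a position in 1/64 notes to beats (one beat is 16/64)."""
    return position_64ths / 16
```

A delay of zero therefore means the note starts together with the one before it, which is one way to get several notes sounding at once.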
____________________________________________________________________________
5. The instruments.
This section begins with a variable-size table describing each of the instruments. The first byte contains the number of instruments that are in the file. Then, the instruments are listed in order from lowest ID number to highest ID number. Each instrument contains this information:
1. Instrument number (1 byte)
2. Flags (1 byte): (MSB)76543210(LSB)
Bit 0: If 1, this instrument number is an alias for another instrument.
Bit 1: If 1, this instrument may only be played at its original sample rate. In other words, notes cannot change the pitch of the sound, or play a chord using this instrument. This flag is typically used for percussion.
Bits 2-7: Reserved; should be 0.
3. Instrument name (null terminated; maximum length is 256.)
4. Comment (null terminated; maximum length is 256.)
5. Normally, this will be a count of the total number of samples recorded for this instrument (1 byte). However, if the instrument is specified to be an alias, then this byte will instead say what instrument to substitute.
For instruments that are not aliases, a set of information about each sample is given. The info is stored like this:
1. The number of values in the sample (4 bytes).
2. The loop start position (4 bytes)
3. The loop length (4 bytes)
4. The original pitch or chord of the sample. (2 to 16 bytes). This is stored in exactly the same format as a pitch/chord in a note. The MMH player arranges tones in a chord in order of lowest pitch to highest pitch. The original pitch may be zero, signifying that the sound has no particular pitch, if (and only if) bit 1 of the instrument flags is set.
5. The original sampling rate of the sample, in Hertz (2 bytes).
6. Flags (1 byte):
Bit 0: If 1, this sample is stereo. The size of the sample will be double the number given above. Panning is not allowed on a stereo sample.
Bit 1: If 1, this sample is 8-bit (signed) instead of 16-bit (signed). The MMH player upscales the sample to 16 bits when loading.
Bit 2: When a note specifies it wants to autodetect the variation, putting a 1 in for this bit will prevent the player from choosing this variation. (If ALL the variations have a 1 here, then the player will be forced to use the first variation.)
Bits 3-7: Reserved for compression information; should be 0 for PCM. Currently, compression is not supported.
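Here is one way a player might pick a variation from a note's variation code, taking bit 2 of the sample flags into account. This Python sketch is an illustration only: the names are hypothetical, "closest original pitch" is measured as a plain difference in halftone numbers, and ties go to the earlier variation. Both of those choices are assumptions.

```python
import random


def choose_variation(code, note_halftone, sample_halftones, no_auto=None, rng=random):
    """Return the index of the variation to play.

    code             -- the 3-bit variation code from the note
    sample_halftones -- original pitch of each variation, in halftones
    no_auto          -- per-variation bit 2 of the sample flags
    """
    count = len(sample_halftones)
    if no_auto is None:
        no_auto = [False] * count
    if code == 0:   # closest original pitch, skipping excluded variations
        candidates = [i for i in range(count) if not no_auto[i]]
        if not candidates:
            return 0    # every variation is excluded: forced to use the first
        return min(candidates, key=lambda i: abs(sample_halftones[i] - note_halftone))
    if code == 1:   # any random variation
        return rng.randrange(count)
    index = code - 2    # codes 2..7 name variations 0..5 directly
    if index >= count:
        raise ValueError("note asks for a variation that does not exist")
    return index
```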
After all the samples are listed, and after all the instruments are listed, the raw sample data is listed, in order from the first sample in the first instrument to the last sample in the last instrument. Each sample is stored like this:
1. Size of sample, in bytes (4 bytes), NOT including these four bytes. Since the size can easily be calculated using the information in the sample table, this field works like a checksum: If the calculated size doesn't match the actual size listed here, the file is corrupt. Of course, if compression is ever added, then this size might indeed be smaller than expected.
2. The sample data (variable size)
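The size check can be sketched in a few lines of Python. The names are hypothetical, and the sketch assumes that 8-bit samples are stored as one byte per value in the file (they are only widened to 16 bits after loading) and that the reserved compression bits are zero.

```python
def expected_sample_bytes(num_values, sample_flags):
    """Size of uncompressed PCM data implied by the sample table."""
    channels = 2 if sample_flags & 0x01 else 1         # bit 0: stereo
    bytes_per_value = 1 if sample_flags & 0x02 else 2  # bit 1: 8-bit vs 16-bit
    return num_values * channels * bytes_per_value


def sample_size_ok(num_values, sample_flags, stored_size):
    """True when the stored size field agrees with the sample table."""
    if sample_flags & 0xF8:
        raise ValueError("compression bits set; size cannot be predicted")
    return stored_size == expected_sample_bytes(num_values, sample_flags)
```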
____________________________________________________________________________
Finally, the part everyone has been waiting for: EOF (End Of File)
And that just about wraps up our little document.
Standard file extensions:
.mmh - A song with a timeline, set of patterns, and optionally, instruments.
.mmhpl - A pattern library with patterns but with empty timeline and instrument sections
.mmhsl - A sample library with instruments but with empty timeline and pattern sections
____________________________________________________________________________
By the way, I was not a music composer or sound programmer before I came up with this format, although I love music. Do you think there was something I should have done differently? No? You want to offer me a job? Well anyway, my e-mail address is qwertie256@gmail.com.
If you want to extend the file format in a way that can fit into the "room for expansion" this format already provides, please consult me so we can discuss the best way to do it and so that I can make the change official. Plus, we might get to "do lunch".
Copyright 1999 by David Piepgrass. This document may not be modified except by the author, and this notice may not be removed. This document may be distributed freely.