Beethoven by Deep Neural Network

Our DNN (a Keras 2-based WaveNet) is now training on Ludwig van Beethoven, specifically the piano sonatas. The complete works are available on Amazon (no affiliate link :-). The pianist is Yukio Yokoyama.

We have some mixed results, but mainly lots of learnings…

Learnings

  1. First, understand how long training on such a complicated body of work can take: the complete collection requires about 6.5 days per epoch at 22 kHz. That meant we had to cut it down significantly to reduce training time.
  2. If you cut it down, choose wisely: we simply selected the first 9 pieces (listed below). But they are so different from each other that our neural network has real difficulty learning the music.
  3. Even cut down to nine pieces, training at 22 kHz can still take a hell of a long time: 6.5 hours per epoch using Horovod/Open MPI on two GPUs (GeForce GTX 1080 Ti).
  4. Different initializers in the DNN deliver only slightly different results (no big deal here).
  5. Experiment with the sampling temperature: this made a major difference in the output.
  6. Be patient… The first result that doesn’t sound completely random came after 9 epochs.
  7. If you want to generate more than 10 seconds of music, make sure that your software is careful with RAM, or install lots of it (>128 GB). We fixed lots of issues in the software. Before, it would use around 250 GB for 1 minute of music. Now it uses just 1-2 GB, regardless of how long the generated music is.
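
To make learnings 5 and 7 a bit more concrete, here is a minimal sketch of temperature sampling combined with a generation loop whose memory use does not grow with the output length. It is not our actual software: `predict_logits` is a hypothetical stand-in for the trained model, and the 256 output classes, the 22,050 Hz rate and the function names are assumptions made for illustration. The idea is simply to keep only the most recent context in memory and to stream every generated sample straight to disk.

```python
import math
import random
import wave
from collections import deque


def sample_with_temperature(logits, temperature, rng=random):
    """Draw a class index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate_to_wav(predict_logits, path, n_samples, receptive_field,
                    temperature=1.0, sample_rate=22050, n_classes=256,
                    rng=random):
    """Generate audio sample by sample with a bounded context window.

    predict_logits is a hypothetical callable standing in for the model:
    it receives the recent context (a list of class indices) and returns
    n_classes unnormalised scores for the next sample.
    """
    # Only the last `receptive_field` samples are ever held in memory.
    context = deque([n_classes // 2] * receptive_field,
                    maxlen=receptive_field)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(1)  # 8-bit PCM: one byte per class index (assumed)
        out.setframerate(sample_rate)
        for _ in range(n_samples):
            scores = predict_logits(list(context))
            nxt = sample_with_temperature(scores, temperature, rng)
            context.append(nxt)  # the oldest sample drops out automatically
            out.writeframesraw(bytes([nxt]))  # stream to disk right away
    return len(context)
```

A low temperature (for example 0.5) sharpens the distribution and makes the output more conservative, while a high one (for example 1.5) flattens it and makes the output more random. Writing the class index directly as 8-bit PCM is a simplification; a real pipeline would first decode the quantised value back to an amplitude.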

The works we are training on are the first nine pieces from the first album in that collection:

  1. Piano Sonata No. 1 In F Minor, Op. 2,1 – I. Allegro (3:51)
  2. Piano Sonata No. 1 In F Minor, Op. 2,1 – II. Adagio (4:23)
  3. Piano Sonata No. 1 In F Minor, Op. 2,1 – III. Menuetto: Allegretto (2:54)
  4. Piano Sonata No. 1 In F Minor, Op. 2,1 – IV. Prestissimo (4:54)
  5. Seven Bagatelles For Piano, Op. 33 – 1. Andante Grazioso Quasi Allegretto (3:10)
  6. Seven Bagatelles For Piano, Op. 33 – 2. Scherzo: Allegro (2:26)
  7. Seven Bagatelles For Piano, Op. 33 – 3. Allegretto (2:14)
  8. Seven Bagatelles For Piano, Op. 33 – 4. Andante (2:31)
  9. Seven Bagatelles For Piano, Op. 33 – 5. Allegro Ma Non Troppo (2:38)

The total training data amounts to around 27 minutes.
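
As a rough back-of-the-envelope check, these numbers can be combined with the epoch times from the learnings above. The sketch below assumes that "22 kHz" means 22,050 Hz and that training time scales linearly with the amount of audio on the same hardware; neither is something we measured, so treat the results as estimates only.

```python
SAMPLE_RATE_HZ = 22_050       # assumption: "22 kHz" means 22,050 Hz
SUBSET_AUDIO_MIN = 27.0       # nine-piece subset, as quoted above
SUBSET_EPOCH_HOURS = 6.5      # per epoch on the subset
FULL_EPOCH_HOURS = 6.5 * 24   # 6.5 days per epoch on the full collection


def samples_per_epoch(audio_minutes, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of audio samples the network sees in one epoch."""
    return int(audio_minutes * 60 * sample_rate_hz)


def training_hours_per_audio_hour(epoch_hours, audio_minutes):
    """Wall-clock training hours needed per hour of audio, per epoch."""
    return epoch_hours / (audio_minutes / 60)


def implied_audio_hours(epoch_hours, hours_per_audio_hour):
    """Audio length implied by an epoch time, assuming linear scaling."""
    return epoch_hours / hours_per_audio_hour


rate = training_hours_per_audio_hour(SUBSET_EPOCH_HOURS, SUBSET_AUDIO_MIN)
print(samples_per_epoch(SUBSET_AUDIO_MIN))   # samples per epoch in the subset
print(round(rate, 1))                         # training hours per audio hour
print(round(implied_audio_hours(FULL_EPOCH_HOURS, rate), 1))
```

Under these assumptions the subset is roughly 36 million samples per epoch, and the 6.5-day epoch would correspond to somewhere around 11 hours of audio in the full collection.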

Here are the results. When you see “_006_m4a_init.wav” at the end of a filename, it means that the sixth piece above was used as the initial seed. Otherwise we used a random seed.

The filenames show the epoch, the sample temperature, and the seed.
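
If you want to sort the samples by seed automatically, the suffix convention is easy to check in code. This is only a small sketch: the helper name `seed_piece` and the example filenames are made up for illustration, and only the “_006_m4a_init.wav” suffix pattern is taken from our actual files.

```python
import re

# Matches a trailing "_NNN_m4a_init.wav", where NNN is the piece number.
SEED_SUFFIX = re.compile(r"_(\d{3})_m4a_init\.wav$")


def seed_piece(filename):
    """Return the number of the piece used as seed, or None for a random seed."""
    match = SEED_SUFFIX.search(filename)
    return int(match.group(1)) if match else None


# Hypothetical filenames, for illustration only.
print(seed_piece("example_006_m4a_init.wav"))
print(seed_piece("example.wav"))
```

The first call returns the piece number from the list above, while the second returns `None`, meaning a random seed was used.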

The result is mostly gibberish (random-sounding stuff). But epoch 9 already shows some interesting initial characteristics.

We will continue training for (probably) another week and keep publishing samples.

Data

On learning #2 above: as you can see, the music we have chosen is really complicated for the DNN to learn. It covers many tempos and styles: Allegro, Adagio, Allegretto, Scherzo, Quasi Allegretto, Andante and Allegro Ma Non Troppo. The DNN has real difficulty learning all of these. That’s why the initial results here are not as “musical-sounding” as in the Chopin examples. This means we have to train harder … and longer…

In the next experiments, we will try to use music where all the training data has the same tempo and style…

For now, have fun.