Our DNN (a Keras 2-based WaveNet) is now training on Ludwig van Beethoven, specifically the piano sonatas. You can find the complete works on Amazon (no affiliate link :-). The pianist is Yukio Yokoyama (also an Amazon link).
We have some mixed results, but mainly lots of learnings…
- First, understand how long training on such a complex body of work can take: the complete collection needs about 6.5 days per epoch at 22 kHz. That meant we had to cut it down significantly to reduce training time.
- If you cut it down, choose wisely: we simply selected the first nine pieces (listed below). But they are so different from each other that our neural network has real difficulty learning the music.
- Even when cut down to nine pieces, training at 22 kHz still takes a very long time: 6.5 hours per epoch using Horovod/Open MPI on two GPUs (GeForce GTX 1080 Ti).
- Different initializers in the DNN deliver only slightly different results (no big deal here).
- Experiment with the sampling temperature: this made a major difference in the output.
- Be patient… The first result that doesn’t sound completely random came after 9 epochs.
- If you want to generate more than 10 seconds of music, make sure that your software is careful with RAM, or install lots of it (>128 GB). We fixed many issues in the software. Before, it would use around 250 GB for 1 minute of music. Now it uses just 1-2 GB, regardless of how long the generated music is.
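To make the temperature point a bit more concrete, here is a minimal sketch of how temperature is commonly applied when sampling from a model's output distribution. It is not the code of the WaveNet implementation used here; the function names `apply_temperature` and `sample_with_temperature` are made up for illustration, and the three-value distribution stands in for the model's real per-sample output.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Return a copy of `probs` sharpened (T < 1) or flattened (T > 1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide the log-probabilities by T; the epsilon avoids log(0).
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_with_temperature(probs, temperature, rng=random):
    """Draw one index from `probs` after temperature scaling."""
    scaled = apply_temperature(probs, temperature)
    return rng.choices(range(len(scaled)), weights=scaled, k=1)[0]


if __name__ == "__main__":
    # Hypothetical distribution over three output values.
    example = [0.1, 0.2, 0.7]
    for t in (0.1, 1.0, 10.0):
        print(t, [round(p, 3) for p in apply_temperature(example, t)])
```

A low temperature makes the model pick its most likely value almost every time, which tends to sound more stable but repetitive. A high temperature pushes the distribution towards uniform, which sounds closer to noise.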
The works we are training on are the first nine pieces from the first album in that collection:
- Piano Sonata No. 1 In F Minor, Op. 2,1 – I. Allegro (3:51)
- Piano Sonata No. 1 In F Minor, Op. 2,1 – II. Adagio (4:23)
- Piano Sonata No. 1 In F Minor, Op. 2,1 – III. Menuetto: Allegretto (2:54)
- Piano Sonata No. 1 In F Minor, Op. 2,1 – IV. Prestissimo (4:54)
- Seven Bagatelles For Piano, Op. 33 – 1. Andante Grazioso Quasi Allegretto (3:10)
- Seven Bagatelles For Piano, Op. 33 – 2. Scherzo: Allegro (2:26)
- Seven Bagatelles For Piano, Op. 33 – 3. Allegretto (2:14)
- Seven Bagatelles For Piano, Op. 33 – 4. Andante (2:31)
- Seven Bagatelles For Piano, Op. 33 – 5. Allegro Ma Non Troppo (2:38)
The total training data amounts to around 27 minutes.
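For a feeling of the scale involved, here is a rough back-of-the-envelope calculation. It assumes that "22 kHz" means 22,050 Hz mono audio, and it takes the roughly 27 minutes and the epoch times quoted in this post at face value. The estimate for the full collection additionally assumes the same hardware and a training time that scales linearly with the amount of audio, neither of which is confirmed here.

```python
SAMPLE_RATE_HZ = 22_050    # assumption: "22 kHz" means 22.05 kHz, mono
TRAINING_MINUTES = 27      # from the post: nine-piece subset
EPOCH_HOURS_SUBSET = 6.5   # from the post: one epoch on the subset
EPOCH_DAYS_FULL = 6.5      # from the post: one epoch on the full collection


def audio_samples(minutes, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of audio samples in `minutes` of mono audio."""
    return int(minutes * 60 * sample_rate_hz)


# Audio samples the network sees in one epoch on the subset.
subset_samples = audio_samples(TRAINING_MINUTES)

# Average audio samples processed per second of training.
samples_per_second = subset_samples / (EPOCH_HOURS_SUBSET * 3600)

# How much longer one epoch on the full collection takes.
slowdown = (EPOCH_DAYS_FULL * 24) / EPOCH_HOURS_SUBSET

# Audio length implied by that ratio, IF time scales linearly with data.
implied_full_hours = slowdown * TRAINING_MINUTES / 60

print(f"samples per epoch (subset): {subset_samples:,}")
print(f"samples per second:         {samples_per_second:,.0f}")
print(f"full vs. subset epoch time: {slowdown:.0f}x")
print(f"implied full audio length:  {implied_full_hours:.1f} h")
```

So even the short subset means tens of millions of audio samples per epoch, which is why raw-audio models like this are so slow to train.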
Here are the results. When a filename ends in “_006_m4a_init.wav”, it means that the sixth piece above was used as the initial seed. Otherwise we used a random seed.
The filenames encode the epoch, the sampling temperature, and the seed.
The result is mostly gibberish (random-sounding stuff), but epoch 9 already shows some interesting characteristics.
We will continue training for (probably) another week and keep publishing samples.
On the second learning above: as you can see, the music we have chosen is really complicated for the DNN to learn. We have so many styles: Allegro, Adagio, Allegretto, Scherzo, Quasi Allegretto, Andante and Allegro Ma Non Troppo. The DNN has real difficulty learning all of these. That’s why the initial results here are not as “musical-sounding” as in the Chopin examples. This means we have to train harder … and longer…
In the next experiments, we will try to use music where all the training data has the same tempo and style…
For now, have fun.