Chopin by a Deep Neural Network

For the last few days, we have been training our Keras 2-based WaveNet on samples from Chopin. The training data consists of just two pieces by Frédéric Chopin, recorded long enough ago to be out of copyright.

We still have a long way to go (and we also need to figure out why we hit a deadlock during multi-GPU training). Because of that deadlock, we are currently training on a single GPU while we analyze the problem.

Regardless, we have generated a few samples.

The first sample is 3 seconds of audio after 9 epochs. As you can hear, it does sound like something, but we are not there yet.

The second sample is 30 seconds of audio after 11 epochs. It is getting a little better, but there are too many pauses; this is also due to a bug.

The third sample is 60 seconds long, after 12 epochs. This one already sounds more musical, but we are still not there (by the way: keep listening until the end).

The fourth sample is also 60 seconds long, after 13 epochs. Hmm, it is getting worse. Well, let’s keep training.

Actually, ‘max_epoch’ is set to 1,000, but we’ll never get there: at the current rate, that would take 6,000 hours = 250 days!
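For the curious, the 250-day figure is just simple arithmetic on our own numbers; the ~6 hours per epoch is inferred from them (1,000 epochs at 6,000 hours total), not measured separately:

```python
# Back-of-envelope ETA for the full training run, using the numbers
# from this post: 1,000 epochs would take about 6,000 hours overall,
# which implies roughly 6 hours per epoch at the current rate.
max_epoch = 1_000
hours_per_epoch = 6_000 / max_epoch  # ~6 h/epoch (inferred, not measured)

total_hours = max_epoch * hours_per_epoch
total_days = total_hours / 24

print(f"{total_hours:.0f} hours ~ {total_days:.0f} days")
```

This is why speeding up training matters far more right now than tuning the model itself.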

Over the next few weeks, we’ll analyze the deadlock problem and see whether we can increase training speed by distributing across four GPUs on two servers…

In any case, it is already promising. Stay tuned…

(By the way: http://data.m-ailabs.bayern/ is where we will publish training data, samples, and more…)