Latent Inspector: an interactive demo of a Drum VAE


"Thio Vibert (張欣嘉)"

Jan 8, 2019, 11:32:04 PM
to Magenta Discuss, Yi-Hsuan Yang
Hi everyone and Magenta,


Latent Inspector is an interactive web demo based on a VAE model of drum patterns. It visualizes the relation between latent vectors and drum patterns in the VAE. The upper half shows the drum patterns, built from 9 drum samples (kick/snare/open hi-hat/closed hi-hat/low tom/mid tom/hi tom/ride/cymbal). The middle is a pseudo encoder/decoder visualization, and the lower part is an adjustable 32-dimensional latent vector visualized as a circular diagram.


The player can modify any dimension of the latent vector to see how it affects the drum pattern through the decoder. The latent vector also changes responsively when drum notes are added or removed.
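To make the knob metaphor concrete, here is a minimal numpy sketch of the decode step. The random linear map below is only a toy stand-in for the trained decoder, and the names and shapes (9 instruments × 16 steps, threshold 0.5) are illustrative, not taken from the actual demo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained decoder: a fixed random linear map from the
# 32-dim latent vector to a 9-instrument x 16-step drum grid.
W = rng.normal(size=(9 * 16, 32))

def decode(z):
    """Map a latent vector to a binary drum pattern via a hard threshold."""
    logits = W @ z
    return (logits > 0.5).astype(int).reshape(9, 16)

z = rng.normal(size=32)          # a point in the latent space
pattern = decode(z)

# "Turning a knob": perturb one latent dimension and decode again.
z_tweaked = z.copy()
z_tweaked[7] += 2.0
changed = int((pattern != decode(z_tweaked)).sum())
print(f"{changed} grid cells changed after tweaking dimension 7")
```

Each of the 32 dimensions acts like one knob: nudging it moves the point in latent space, and the decoder turns the new point back into a (possibly different) drum grid.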

Besides the visualization idea, I also developed this based on another thought. What musicians need may not only be a particular feature of musical machine learning (like interpolation), or a well-organized function, but also some unknown possibilities. As David Bowie said, it is much more interesting when the musician has no idea what the knobs on a synthesizer are designed for. Each dimension of the latent vector is similar to a knob on a synthesizer, and it can produce melodic/drum variations that might be interesting.

The model is not built with Magenta.js; it uses an algorithm I developed myself. However, I think this might still be an interesting discussion for this mailing list. Of course, any feedback is very much appreciated and helpful!


Best,
Vibert Thio

===============
Vibert Thio (張欣嘉)
Programmer | Artist | Researcher

WEBSITE: vibertthio.com
EMAIL: viber...@gmail.com
FB/IG: Vibert Thio
===============


Jesse Engel

Jan 9, 2019, 10:18:03 AM
to Thio Vibert (張欣嘉), Magenta Discuss, Yi-Hsuan Yang
Awesome work Vibert. I really like using the circle as a way of visualizing many knobs at once in a compact manner. The bug (or feature?) with the metaphor, of course, is that the dimensions are not all independent from each other, so tweaking a knob doesn't always have the same meaning, but locally it can make sense and it's fun for exploration. Great work!

--
Magenta project: magenta.tensorflow.org

Thio Vibert

Jan 9, 2019, 11:49:36 AM
to Jesse Engel, Magenta Discuss, Yi-Hsuan Yang
Hi Jesse,

Yes, it is definitely a problem in the system. However, it also helps people understand that VAEs do not learn the encoder/decoder magically. Those variations reflect how the model interprets the structure of drum patterns.

On the other hand, our lab (Music and AI Lab, Academia Sinica, Taiwan) is also working on conditional training to add interpretable features to the drum pattern generation algorithm. Since this topic is still being actively investigated, especially in musical machine learning, it is not yet clear what the best practice is, nor the corresponding demonstration strategy.
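A rough numpy sketch of what we mean by conditional training: the interpretable condition (here a hypothetical 4-genre one-hot; both the label set and the toy linear decoder are made up for illustration) is concatenated to the latent vector, so one input is explicitly tied to a known attribute rather than left entangled in the latent dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

Z_DIM, COND_DIM = 32, 4          # latent size and a hypothetical genre one-hot
W = rng.normal(size=(9 * 16, Z_DIM + COND_DIM))  # toy linear "decoder"

def conditional_decode(z, cond):
    """Decode the latent vector together with an interpretable condition."""
    x = np.concatenate([z, cond])
    return (W @ x > 0.5).astype(int).reshape(9, 16)

z = rng.normal(size=Z_DIM)
techno = np.eye(COND_DIM)[0]     # one-hot for a made-up "techno" label
funk = np.eye(COND_DIM)[1]

# Same latent vector, different condition -> (usually) a different pattern.
a = conditional_decode(z, techno)
b = conditional_decode(z, funk)
```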

Best,
Thio


Adam Roberts

Jan 9, 2019, 1:04:06 PM
to Thio Vibert, Jesse Engel, Magenta Discuss, Yi-Hsuan Yang
Hi Thio,

This is great! I'm curious why you chose to use a different model than the ones in Magenta.js that you used in the sornting app.

-Adam

Douglas Eck

Jan 9, 2019, 1:36:55 PM
to Adam Roberts, Thio Vibert, Jesse Engel, Magenta Discuss, Yi-Hsuan Yang
Hi Thio,

Did you publish your model somewhere? It seems to be a different variational model than MusicVAE. What dataset did you use for training? Did you find it useful to overfit somewhat? Did the variational loss help? Looking forward to an arXiv paper about your newer work on interpretable features. That's such an important direction of research. Great work!

"Thio Vibert (張欣嘉)"

Jan 12, 2019, 2:23:26 AM
to Adam Roberts, Jesse Engel, Magenta Discuss, Yi-Hsuan Yang

Hi Adam,

Sorry for the late reply.

I chose not to use Magenta.js because I implemented a brand new model in PyTorch and wanted to see how it acts as an encoder/decoder. At the Music and AI Lab, most of my friends working on music generation use PyTorch, so it was easier for me to build the model system with PyTorch.

I am aware that the API of the MusicVAE model includes encode/decode functions. It might be interesting to see how these different models work together. I assume it would take just a few hours to plug MusicVAE into this work; I just haven't tried it yet.

Best,
Thio


"Thio Vibert (張欣嘉)"

Jan 12, 2019, 2:42:10 AM
to Douglas Eck, Adam Roberts, Jesse Engel, Magenta Discuss, Yi-Hsuan Yang
Hi Douglas,

Thanks for the feedback. It means a lot to me. And sorry for the late reply.
I am working in the Music and AI Lab in Taiwan, directed by Yi-Hsuan (Eric) Yang. We have submitted a paper about the demonstrative framework to MILC 2019, where Magenta presented the MusicVAE and NSynth demonstrations last year. However, we haven't published anything specific about the variational model yet. We used the Lakh Pianoroll Dataset (LPD), which is derived from the LMD and was developed by one of the members of our lab.

The variational loss definitely helps to a certain extent. But generally speaking, we don't think it has essentially learned the structure of drum patterns yet. We find it hard to reduce both the reconstruction loss and the KL term of the ELBO at the same time (I read that your team also faced a similar problem in the MILC 2018 paper). Since we want to generate music in genres where there is not much symbolic data (like techno), we are also working on drum transcription to generate training data.
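For anyone following along, these are the two competing terms, sketched in numpy for a diagonal Gaussian posterior and a standard normal prior. The function names and the beta weight are illustrative, not from our actual training code:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # Closed form KL( N(mu, sigma^2) || N(0, I) )
    # = 0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Reconstruction term (binary cross-entropy for a binary drum grid)
    plus the beta-weighted KL term; lowering one tends to raise the other."""
    eps = 1e-7
    bce = -np.sum(x * np.log(x_recon + eps) + (1 - x) * np.log(1 - x_recon + eps))
    return bce + beta * kl_to_standard_normal(mu, log_var)

# A posterior equal to the prior has zero KL, leaving only reconstruction.
print(kl_to_standard_normal(np.zeros(32), np.zeros(32)))  # 0.0
```

The tension is visible here: driving the KL term to zero collapses the posterior onto the prior, which throws away the information the decoder needs for a low reconstruction loss.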

Are you guys still working on drum pattern generation?

Best,
Thio

