pycomposer.gancomposable package

Submodules

pycomposer.gancomposable.conditional_gan_composer module

class pycomposer.gancomposable.conditional_gan_composer.ConditionalGANComposer(midi_path_list, batch_size=20, seq_len=8, time_fraction=1.0, learning_rate=1e-10, hidden_dim=15200, generative_model=None, discriminative_model=None, gans_value_function=None)[source]

Bases: pycomposer.gan_composable.GANComposable

Algorithmic Composer based on Conditional Generative Adversarial Networks (Conditional GANs).

This composer learns from observed data points drawn from the conditional true distribution of the input MIDI files. It generates feature points from samples of a fake distribution, such as a Uniform or Normal distribution, so that they imitate the true MIDI data.

The components included in this class are functionally differentiated into three models.

  1. TrueSampler.
  2. Generator.
  3. Discriminator.

The function of TrueSampler is to draw samples from the true distribution of the input MIDI files. Generator has NoiseSamplers, which can be regarded as Conditioners like those in MidiNet (Yang, L. C., et al., 2017), and uses them to draw fake samples from a Uniform or Normal distribution. Discriminator observes those input samples and tries to discriminate between true and fake data.

While Discriminator observes Generator’s observations to discriminate its output from true samples, Generator observes Discriminator’s observations to confuse Discriminator’s judgments. In the GANs framework, the mini-max game can be configured by these observations of observations.
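
For reference, the mini-max game is commonly written with the value function from Goodfellow et al. (2014), shown below. Here x is a true sample, z is a noise sample, D is the Discriminator and G is the Generator. Whether the default gans_value_function of this class uses exactly this form is not stated in this section, so treat it as the standard formulation rather than a description of the implementation.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```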

After this game, the Generator will have grown into a functional equivalent that can imitate the TrueSampler, which makes it possible to compose similar but slightly different music.

In this class, Convolutional Neural Networks (CNNs) and Deconvolution Networks are implemented as Generator and Discriminator. Deconvolutions, also called transposed convolutions, “work by swapping the forward and backward passes of a convolution” (Dumoulin, V., & Visin, F., 2016, p. 20).
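
The quoted idea can be illustrated with a minimal pure-Python sketch. It assumes a 1-D convolution with stride 1 and no padding, written as a matrix product. The helper names (conv_matrix, matvec, transpose) are hypothetical and are not part of this library.

```python
def conv_matrix(kernel, input_len):
    """Matrix C such that C applied to x is a 1-D convolution (stride 1, no padding).

    As in most deep learning frameworks, the kernel is not flipped
    (strictly speaking this is a cross-correlation).
    """
    k = len(kernel)
    output_len = input_len - k + 1
    return [
        [kernel[col - row] if 0 <= col - row < k else 0.0 for col in range(input_len)]
        for row in range(output_len)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


kernel = [1.0, 2.0, 3.0]
x = [1.0, 0.0, -1.0, 2.0, 1.0]

C = conv_matrix(kernel, len(x))       # shape 3 x 5
y = matvec(C, x)                      # convolution: length 5 -> length 3
up = matvec(transpose(C), y)          # transposed convolution: length 3 -> length 5
```

The transposed matrix is the one that appears in the backward pass of the convolution, which is why applying it in the forward direction maps a short feature vector back to the longer input size.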

Following MidiNet and MuseGAN (Dong, H. W., et al., 2018), this class considers bars as the basic compositional unit, because harmonic changes usually occur at the boundaries of bars and because human beings often use bars as the building blocks when composing songs. The feature engineering in this class is also inspired by the multi-track piano-roll representations in MuseGAN. However, their activation-function strategies are not adopted in this library, since those methods can cause information loss. Those models simply binarize the Generator’s output, which uses tanh as the activation function in the output layer, either by a threshold at zero or by deterministic or stochastic binary neurons (Bengio, Y., et al., 2013; Chung, J., et al., 2016), and they ignore the distinction between consonance and dissonance.
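
The information loss mentioned above can be seen in a small sketch of the two binarization strategies. This is an illustration only: the function names are hypothetical, and mapping a tanh output v to a firing probability (v + 1) / 2 is an assumption made here for simplicity, not a detail taken from the cited papers.

```python
import random


def binarize_threshold(tanh_outputs):
    """Deterministic binarization: a note is on when the tanh output is above zero."""
    return [1 if v > 0.0 else 0 for v in tanh_outputs]


def binarize_stochastic(tanh_outputs, rng):
    """Stochastic binarization: fire with probability (v + 1) / 2 (assumed mapping)."""
    return [1 if rng.random() < (v + 1.0) / 2.0 else 0 for v in tanh_outputs]


# Hypothetical tanh outputs for three pitches in one time step.
outputs = [-0.9, 0.05, 0.95]

hard = binarize_threshold(outputs)
soft = binarize_stochastic(outputs, random.Random(0))
```

After thresholding, the weakly active pitch (0.05) and the strongly active pitch (0.95) become indistinguishable, so any graded preference between them is discarded.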

This library simply uses the softmax strategy. This class stochastically selects a combination of pitches in each bar drawn from the true MIDI data, based on the difference between consonance and dissonance intended by the composer of the MIDI files.
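
A minimal sketch of softmax-based pitch selection follows. It assumes the Generator yields one real-valued score per pitch and that pitches are drawn without replacement. The function names and the example scores are hypothetical, and the library's actual selection logic may differ.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_pitches(scores, n_pitches, rng):
    """Draw n_pitches distinct pitch indices, weighted by softmax probability."""
    probs = softmax(scores)
    candidates = list(range(len(scores)))
    chosen = []
    for _ in range(n_pitches):
        weights = [probs[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return sorted(chosen)


# Hypothetical scores for the 12 pitch classes in one bar.
scores = [2.0, -1.0, 0.5, -2.0, 1.5, 0.0, -1.5, 1.8, -0.5, 0.3, -1.0, 0.1]
probs = softmax(scores)
chord = sample_pitches(scores, n_pitches=3, rng=random.Random(0))
```

Unlike hard binarization, the relative sizes of the scores are preserved as probabilities, so pitches that the model rates higher are chosen more often without excluding the others.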

References

  • Bengio, Y., Léonard, N., & Courville, A. (2013). Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432.
  • Chung, J., Ahn, S., & Bengio, Y. (2016). Hierarchical multiscale recurrent neural networks. arXiv preprint arXiv:1609.01704.
  • Dong, H. W., Hsiao, W. Y., Yang, L. C., & Yang, Y. H. (2018, April). MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Thirty-Second AAAI Conference on Artificial Intelligence.
  • Dumoulin, V., & Visin, F. (2016). A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285.
  • Fang, W., Zhang, F., Sheng, V. S., & Ding, Y. (2018). A method for improving CNN-based image recognition using DCGAN. Comput. Mater. Contin, 57, 167-178.
  • Gauthier, J. (2014). Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester, 2014(5), 2.
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … & Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
  • Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).
  • Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., & Frey, B. (2015). Adversarial autoencoders. arXiv preprint arXiv:1511.05644.
  • Yang, L. C., Chou, S. Y., & Yang, Y. H. (2017). MidiNet: A convolutional generative adversarial network for symbolic-domain music generation. arXiv preprint arXiv:1703.10847.
bar_gram

getter

compose(file_path, velocity_mean=None, velocity_std=None)[source]

Compose using the learned model.

Parameters:
  • file_path – Path to the generated MIDI file.
  • velocity_mean – Mean of velocity. This class samples the velocity from a Gaussian distribution defined by velocity_mean and velocity_std. If None, the average velocity in the MIDI files is used.
  • velocity_std – Standard deviation (SD) of velocity. This class samples the velocity from a Gaussian distribution defined by velocity_mean and velocity_std. If None, the SD of velocity in the MIDI files is used.
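
The velocity sampling described by these parameters can be sketched as follows. The helper name is hypothetical, and rounding and clipping to the MIDI range 0 to 127 are assumptions added here, since this section does not say how out-of-range draws are handled.

```python
import random


def sample_velocity(velocity_mean, velocity_std, rng):
    """Draw one note velocity from a Gaussian and clip it to the MIDI range."""
    raw = rng.gauss(velocity_mean, velocity_std)
    return max(0, min(127, int(round(raw))))  # clipping is an assumption


rng = random.Random(0)
velocities = [sample_velocity(64.0, 10.0, rng) for _ in range(8)]
```

A larger velocity_std yields more dynamic variation between notes, while a value near zero makes every note sound at roughly velocity_mean.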
extract_logs()[source]

Extract update logs data.

Returns:
  • List of mean probabilities inferred by the discriminator during the discriminator’s update turn.
  • List of mean probabilities inferred by the discriminator during the generator’s update turn.
Return type:The shape is
generative_model

getter

get_bar_gram()[source]

getter

get_generative_model()[source]

getter

get_true_sampler()[source]

getter

learn(iter_n=500, k_step=10)[source]

Learning.

Parameters:
  • iter_n – The number of training iterations.
  • k_step – The number of discriminator learning steps per training iteration.
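
How iter_n and k_step interact can be sketched with the alternating schedule from Goodfellow et al. (2014), in which the discriminator is updated k_step times before each generator update. The function names are hypothetical, and it is an assumption that this class alternates the updates in exactly this order.

```python
def train(iter_n, k_step, update_discriminator, update_generator):
    """Alternate k_step discriminator updates with one generator update."""
    for _ in range(iter_n):
        for _ in range(k_step):
            update_discriminator()
        update_generator()


# Stand-in update functions that only count how often they are called.
counts = {"discriminator": 0, "generator": 0}


def fake_discriminator_update():
    counts["discriminator"] += 1


def fake_generator_update():
    counts["generator"] += 1


train(
    iter_n=5,
    k_step=2,
    update_discriminator=fake_discriminator_update,
    update_generator=fake_generator_update,
)
```

Under this schedule the defaults iter_n=500 and k_step=10 would mean 5000 discriminator updates and 500 generator updates.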
set_readonly(value)[source]

setter

true_sampler

getter

pycomposer.gancomposable.gan_composer module

class pycomposer.gancomposable.gan_composer.GANComposer(midi_path_list, target_program=0, batch_size=20, seq_len=8, time_fraction=1.0, learning_rate=1e-10, generative_model=None, discriminative_model=None, gans_value_function=None)[source]

Bases: pycomposer.gan_composable.GANComposable

Algorithmic Composer based on Generative Adversarial Networks (GANs).

This composer learns from observed data points drawn from the true distribution of the input MIDI files. It generates feature points from samples of a fake distribution, such as a Uniform or Normal distribution, so that they imitate the true MIDI data.

The components included in this class are functionally differentiated into three models.

  1. TrueSampler.
  2. Generator.
  3. Discriminator.

The function of TrueSampler is to draw samples from the true distribution of the input MIDI files. Generator has NoiseSamplers and uses them to draw fake samples from a Uniform or Normal distribution. Discriminator observes those input samples and tries to discriminate between true and fake data.
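
The role of a noise sampler can be sketched in a few lines. The function name, the argument names, and the distribution parameters (range -1 to 1, mean 0, standard deviation 1) are assumptions for illustration and do not describe this library's NoiseSampler classes.

```python
import random


def draw_noise(batch_size, dim, distribution, rng):
    """Return a batch of noise vectors from a Uniform or Normal distribution."""
    if distribution == "uniform":
        return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]
    if distribution == "normal":
        return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]
    raise ValueError("distribution must be 'uniform' or 'normal'")


rng = random.Random(0)
uniform_batch = draw_noise(batch_size=20, dim=4, distribution="uniform", rng=rng)
normal_batch = draw_noise(batch_size=20, dim=4, distribution="normal", rng=rng)
```

The Generator would transform each such noise vector into a fake sample, which the Discriminator then compares against samples from the TrueSampler.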

While Discriminator observes Generator’s observations to discriminate its output from true samples, Generator observes Discriminator’s observations to confuse Discriminator’s judgments. In the GANs framework, the mini-max game can be configured by these observations of observations.

After this game, the Generator will have grown into a functional equivalent that can imitate the TrueSampler, which makes it possible to compose similar but slightly different music.

In this class, Long Short-Term Memory (LSTM) networks are implemented as Generator and Discriminator. LSTM networks, a special RNN structure, have proven stable and powerful for modeling long-range dependencies.

The key point of this structural expansion is the memory cell, which essentially acts as an accumulator of state information. Every time observed data points are given as new information and input to the LSTM’s input gate, that information is accumulated in the cell if the input gate is activated. The past cell state can be forgotten in this process if the LSTM’s forget gate is on. Whether the latest cell output is propagated to the final state is further controlled by the output gate.
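
The gate mechanics described above can be sketched for a single scalar unit. This uses the standard LSTM equations without peephole connections. The weight names are hypothetical, and the sketch is not taken from this library's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM unit; returns the new (h, c)."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate input
    c = f * c_prev + i * g   # keep part of the old cell state, accumulate new input
    h = o * math.tanh(c)     # the output gate controls what reaches the hidden state
    return h, c


# With all weights at zero every gate is 0.5 and the candidate input is 0,
# so the cell simply keeps half of its previous state.
names = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")
weights = {name: 0.0 for name in names}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=weights)
```

Raising the forget-gate bias bf pushes f toward 1, so the cell retains more of its past state across bars, which is what lets the network model long-range dependencies.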

Following MidiNet and MuseGAN (Dong, H. W., et al., 2018), this class considers bars as the basic compositional unit, because harmonic changes usually occur at the boundaries of bars and because human beings often use bars as the building blocks when composing songs. The feature engineering in this class is also inspired by the multi-track piano-roll representations in MuseGAN. However, their activation-function strategies are not adopted in this library, since those methods can cause information loss. Those models simply binarize the Generator’s output, which uses tanh as the activation function in the output layer, either by a threshold at zero or by deterministic or stochastic binary neurons (Bengio, Y., et al., 2013; Chung, J., et al., 2016), and they ignore the distinction between consonance and dissonance.

This library simply uses the softmax strategy. This class stochastically selects a combination of pitches in each bar drawn from the true MIDI data, based on the difference between consonance and dissonance intended by the composer of the MIDI files.

References

  • Bengio, Y., Léonard, N., & Courville, A. (2013). Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432.
  • Chung, J., Ahn, S., & Bengio, Y. (2016). Hierarchical multiscale recurrent neural networks. arXiv preprint arXiv:1609.01704.
  • Dong, H. W., Hsiao, W. Y., Yang, L. C., & Yang, Y. H. (2018, April). MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Thirty-Second AAAI Conference on Artificial Intelligence.
  • Fang, W., Zhang, F., Sheng, V. S., & Ding, Y. (2018). A method for improving CNN-based image recognition using DCGAN. Comput. Mater. Contin, 57, 167-178.
  • Gauthier, J. (2014). Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester, 2014(5), 2.
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … & Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
  • Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).
  • Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., & Frey, B. (2015). Adversarial autoencoders. arXiv preprint arXiv:1511.05644.
  • Malhotra, P., Ramakrishnan, A., Anand, G., Vig, L., Agarwal, P., & Shroff, G. (2016). LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148.
  • Zaremba, W., Sutskever, I., & Vinyals, O. (2014). Recurrent neural network regularization. arXiv preprint arXiv:1409.2329.
compose(file_path, velocity_mean=None, velocity_std=None)[source]

Compose using the learned model.

Parameters:
  • file_path – Path to the generated MIDI file.
  • velocity_mean – Mean of velocity. This class samples the velocity from a Gaussian distribution defined by velocity_mean and velocity_std. If None, the average velocity in the MIDI files is used.
  • velocity_std – Standard deviation (SD) of velocity. This class samples the velocity from a Gaussian distribution defined by velocity_mean and velocity_std. If None, the SD of velocity in the MIDI files is used.
extract_logs()[source]

Extract update logs data.

Returns:
  • List of mean probabilities inferred by the discriminator during the discriminator’s update turn.
  • List of mean probabilities inferred by the discriminator during the generator’s update turn.
Return type:The shape is
generative_model

getter

get_generative_model()[source]

getter

get_true_sampler()[source]

getter

learn(iter_n=500, k_step=10)[source]

Learning.

Parameters:
  • iter_n – The number of training iterations.
  • k_step – The number of discriminator learning steps per training iteration.
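
Putting the documented methods together, a typical session might look like the sketch below. It uses only the constructor and method signatures listed in this section. The wrapper function name and its arguments are hypothetical, and the sketch has not been verified against a specific release of the library.

```python
def compose_with_gan(midi_path_list, output_path):
    """Train a GANComposer on the given MIDI files and write one composition."""
    # Imported lazily so the sketch can be read without the library installed.
    from pycomposer.gancomposable.gan_composer import GANComposer

    composer = GANComposer(
        midi_path_list=midi_path_list,
        batch_size=20,
        seq_len=8,
        time_fraction=1.0,
    )
    # Train with the documented defaults.
    composer.learn(iter_n=500, k_step=10)
    # Passing None lets the class derive velocity statistics from the MIDI files.
    composer.compose(file_path=output_path, velocity_mean=None, velocity_std=None)
    # Discriminator probabilities logged during training.
    return composer.extract_logs()
```

Calling it with a list of paths to existing MIDI files and an output path would train the model and write the generated MIDI file. The same flow should apply to ConditionalGANComposer, which shares the learn, compose and extract_logs signatures.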
set_readonly(value)[source]

setter

true_sampler

getter

Module contents