pyqlearning.functionapproximator package

Submodules

pyqlearning.functionapproximator.cnn_fa module

class pyqlearning.functionapproximator.cnn_fa.CNNFA(batch_size, layerable_cnn_list, cnn_output_graph, learning_rate=1e-05, computable_loss=None, opt_params=None, verificatable_result=None, pre_learned_path_list=None, pre_learned_output_path=None, cnn=None, verbose_mode=False)[source]

Bases: pyqlearning.function_approximator.FunctionApproximator

Convolutional Neural Networks (CNNs) as a Function Approximator.

CNNs are hierarchical models whose convolutional layers alternate with subsampling layers, reminiscent of simple and complex cells in the primary visual cortex.

This class demonstrates that CNNs can solve generalisation problems, learning successful control policies from observed data points in complex Reinforcement Learning environments. The network is trained with a variant of the Q-learning algorithm, using stochastic gradient descent to update the weights.
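
The training rule described above can be illustrated with a minimal NumPy sketch. It assumes a linear Q-function instead of a CNN, a squared temporal-difference loss, and a discount factor `gamma`; the function name `q_learning_sgd_step` and all of its parameters are hypothetical and are not part of pyqlearning.

```python
import numpy as np


def q_learning_sgd_step(weights, state_action_arr, reward, next_state_action_arr,
                        learning_rate=1e-05, gamma=0.9):
    """One SGD step on the squared TD error of a linear Q-function (illustrative only)."""
    # Predicted Q-Value of the state-action pair that was actually taken.
    predicted_q = state_action_arr @ weights
    # Q-learning target: reward plus the discounted best Q-Value among next candidates.
    next_q_arr = next_state_action_arr @ weights
    real_q = reward + gamma * next_q_arr.max()
    # Gradient of 0.5 * (predicted_q - real_q) ** 2 w.r.t. weights,
    # treating the target as a constant (the usual semi-gradient form).
    td_error = predicted_q - real_q
    grad = td_error * state_action_arr
    new_weights = weights - learning_rate * grad
    loss = 0.5 * td_error ** 2
    return new_weights, loss


# Toy data: one observed state-action feature vector and two next candidates.
weights = np.zeros(2)
state_action_arr = np.array([1.0, 0.0])
next_state_action_arr = np.array([[0.0, 1.0], [0.0, 2.0]])

loss_list = []
for _ in range(50):
    weights, loss = q_learning_sgd_step(
        weights, state_action_arr, 1.0, next_state_action_arr, learning_rate=0.1
    )
    loss_list.append(loss)

print(loss_list[0], loss_list[-1])
```

In this toy setting the target stays fixed, so the recorded loss shrinks at every step. With a neural network in place of the linear model, the same target and loss would be back-propagated through the layers instead.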

Deconvolutions, also called transposed convolutions, “work by swapping the forward and backward passes of a convolution.” (Dumoulin, V., & Visin, F. 2016, p. 20.)
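
The quoted statement can be checked on a small 1-D case. The sketch below is an illustration in plain NumPy, not code from pyqlearning or pydbm: it assumes unit stride and no padding, and the helper `conv_matrix` is a hypothetical name. The convolution is written as a matrix `C`; its backward pass multiplies by `C.T`, and the transposed convolution uses that same `C.T` as its forward pass.

```python
import numpy as np


def conv_matrix(kernel, input_len):
    """Matrix C such that C @ x is the 'valid', unit-stride sliding-window product of x and kernel."""
    kernel_len = len(kernel)
    output_len = input_len - kernel_len + 1
    C = np.zeros((output_len, input_len))
    for i in range(output_len):
        # Each output element sees one window of the input.
        C[i, i:i + kernel_len] = kernel
    return C


kernel = np.array([1.0, 2.0, 3.0])
x = np.array([1.0, 0.0, -1.0, 2.0, 1.0])
C = conv_matrix(kernel, len(x))

# Forward pass of the convolution: length 5 -> length 3.
y = C @ x

# The backward pass of the convolution maps a length-3 gradient to length 5 via C.T.
# A transposed convolution uses that same map as its forward pass: length 3 -> length 5.
z = C.T @ y

print(y.shape, z.shape)
```

Note that `z` has the shape of `x` but is not equal to it: the transposed convolution restores the spatial size, not the original values.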

References

  • Dumoulin, V., & Visin, F. (2016). A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285.
  • Masci, J., Meier, U., Cireşan, D., & Schmidhuber, J. (2011, June). Stacked convolutional auto-encoders for hierarchical feature extraction. In International Conference on Artificial Neural Networks (pp. 52-59). Springer, Berlin, Heidelberg.
  • Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
get_loss_list()[source]

Getter for the list of losses.

get_model()[source]

The model object used as a function approximator. It has a cnn attribute whose type is pydbm.cnn.convolutional_neural_network.ConvolutionalNeuralNetwork.

inference_q(next_action_arr)[source]

Infer Q-Values.

Parameters:
  • next_action_arr: np.ndarray of action.
Returns: np.ndarray of Q-Values.
learn_q(predicted_q_arr, real_q_arr)[source]

Learn Q-Values.

Parameters:
  • predicted_q_arr: np.ndarray of predicted Q-Values.
  • real_q_arr: np.ndarray of real Q-Values.
loss_list

Getter for the list of losses.

model

The model object used as a function approximator. It has a cnn attribute whose type is pydbm.cnn.convolutional_neural_network.ConvolutionalNeuralNetwork.

set_loss_list(value)[source]

Setter for the list of losses.

set_model(value)[source]

Setter for the model object used as a function approximator.

pyqlearning.functionapproximator.lstm_fa module

class pyqlearning.functionapproximator.lstm_fa.LSTMFA(batch_size, lstm_model, seq_len=10, learning_rate=1e-05, computable_loss=None, opt_params=None, verificatable_result=None, verbose_mode=False)[source]

Bases: pyqlearning.function_approximator.FunctionApproximator

LSTM Networks as a Function Approximator.

Long Short-Term Memory (LSTM) networks, a special RNN structure, have proven stable and powerful for modeling long-range dependencies.

The key point of this structural expansion is the memory cell, which essentially acts as an accumulator of state information. Every time observed data points are given as new input, their information is accumulated in the cell if the LSTM’s input gate is activated. The past cell state can be forgotten in this process if the forget gate is on. Whether the latest cell output is propagated to the final state is further controlled by the output gate.
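
The gating behaviour described above can be sketched as a single LSTM time step in NumPy. This is an illustrative sketch under assumptions: it uses the common formulation without peephole connections, stacks the four gate blocks into one weight matrix, and is not the implementation in pydbm.rnn.lstm_model.LSTMModel. The names `lstm_step`, `W`, `U` and `b` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step.

    Shapes: W is (4 * hidden, input_dim), U is (4 * hidden, hidden), b is (4 * hidden,).
    The four stacked blocks are, in order: input gate, forget gate, output gate, candidate.
    """
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    input_gate = sigmoid(z[0:hidden])
    forget_gate = sigmoid(z[hidden:2 * hidden])
    output_gate = sigmoid(z[2 * hidden:3 * hidden])
    candidate = np.tanh(z[3 * hidden:])
    # The cell accumulates new information when the input gate is open
    # and keeps or discards its past state according to the forget gate.
    c = forget_gate * c_prev + input_gate * candidate
    # The output gate controls how much of the cell reaches the hidden state.
    h = output_gate * np.tanh(c)
    return h, c


rng = np.random.default_rng(0)
input_dim, hidden = 3, 2
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(10, input_dim)):  # a sequence of 10 observations
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```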

References

  • Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
  • Malhotra, P., Ramakrishnan, A., Anand, G., Vig, L., Agarwal, P., & Shroff, G. (2016). LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148.
  • Zaremba, W., Sutskever, I., & Vinyals, O. (2014). Recurrent neural network regularization. arXiv preprint arXiv:1409.2329.
get_loss_list()[source]

Getter for the list of losses.

get_model()[source]

The model object used as a function approximator. It has an lstm_model attribute whose type is pydbm.rnn.lstm_model.LSTMModel.

inference_q(next_action_arr)[source]

Infer Q-Values.

Parameters:
  • next_action_arr: np.ndarray of action.
Returns: np.ndarray of Q-Values.
learn_q(predicted_q_arr, real_q_arr)[source]

Learn Q-Values.

Parameters:
  • predicted_q_arr: np.ndarray of predicted Q-Values.
  • real_q_arr: np.ndarray of real Q-Values.
loss_list

Getter for the list of losses.

model

The model object used as a function approximator. It has an lstm_model attribute whose type is pydbm.rnn.lstm_model.LSTMModel.

set_loss_list(value)[source]

Setter for the list of losses.

set_model(value)[source]

Setter for the model object used as a function approximator.

Module contents