Source code for pyqlearning.function_approximator

# -*- coding: utf-8 -*-
from abc import ABCMeta, abstractmethod, abstractproperty


class FunctionApproximator(metaclass=ABCMeta):
    '''
    The interface of Function Approximators.

    Typically, Deep Q-Learning methods such as the Deep Q-Network use
    Convolutional Neural Networks (CNN) as a function approximator to cope
    with the so-called combinatorial explosion. But using a CNN as the
    function approximator is not mandatory. In the above problem setting of
    generalisation and combinatorial explosion, for instance, Long Short-Term
    Memory (LSTM) networks, which are a special Recurrent Neural Network (RNN)
    structure, and CNNs are functionally equivalent as function approximators.
    In the same problem setting, functional equivalents can be substituted
    for one another. When the feature space of the rewards has a time-series
    nature, LSTM will be more useful.

    This interface defines the methods that control functional equivalents
    of CNN. `DeepQLearning` can delegate to any object that implements this
    interface.

    In more detail, this interface defines a family of Deep Learning
    algorithms, such as LSTM, Convolutional LSTM (Xingjian, S. H. I. et al.,
    2015), and the CLDNN architecture (Sainath, T. N. et al., 2015),
    encapsulates each one, and makes them interchangeable. The Strategy
    pattern lets the function approximation algorithm vary independently
    from the clients that use it. Capture the abstraction in an interface,
    bury implementation details in derived classes.

    References:
        - https://code.accel-brain.com/Deep-Learning-by-means-of-Design-Pattern/README.html
        - https://code.accel-brain.com/Reinforcement-Learning/README.html#deep-q-network
        - [Egorov, M. (2016). Multi-agent deep reinforcement learning.](https://pdfs.semanticscholar.org/dd98/9d94613f439c05725bad958929357e365084.pdf)
        - Gupta, J. K., Egorov, M., & Kochenderfer, M. (2017, May). Cooperative multi-agent control using deep reinforcement learning. In International Conference on Autonomous Agents and Multiagent Systems (pp. 66-83). Springer, Cham.
        - Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
        - Sainath, T. N., Vinyals, O., Senior, A., & Sak, H. (2015, April). Convolutional, long short-term memory, fully connected deep neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on (pp. 4580-4584). IEEE.
        - Xingjian, S. H. I., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., & Woo, W. C. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Advances in neural information processing systems (pp. 802-810).
    '''

    @abstractproperty
    def model(self):
        ''' `object` of model as a function approximator. '''
        raise NotImplementedError("This property must be implemented.")

    @abstractmethod
    def learn_q(self, predicted_q_arr, real_q_arr):
        '''
        Learn Q-Values.

        Args:
            predicted_q_arr:    `np.ndarray` of predicted Q-Values.
            real_q_arr:         `np.ndarray` of real Q-Values.
        '''
        raise NotImplementedError("This method must be implemented.")

    @abstractmethod
    def inference_q(self, next_action_arr):
        '''
        Infer Q-Values.

        Args:
            next_action_arr:    `np.ndarray` of action.

        Returns:
            `np.ndarray` of Q-Values.
        '''
        raise NotImplementedError("This method must be implemented.")
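
The Strategy arrangement described in the class docstring can be illustrated with a small concrete approximator. The following is a minimal sketch, not part of `pyqlearning`: `LinearApproximator` and `train` are hypothetical names, the interface is repeated as a stand-in so the snippet runs on its own, and plain Python lists stand in for `np.ndarray` to keep it dependency-free. It also assumes that `learn_q` may reuse the inputs seen by the most recent `inference_q` call, which the interface itself does not specify.

```python
from abc import ABCMeta, abstractmethod


class FunctionApproximator(metaclass=ABCMeta):
    """Stand-in for the interface above, repeated so the sketch runs alone."""

    @property
    @abstractmethod
    def model(self):
        raise NotImplementedError("This property must be implemented.")

    @abstractmethod
    def learn_q(self, predicted_q_arr, real_q_arr):
        raise NotImplementedError("This method must be implemented.")

    @abstractmethod
    def inference_q(self, next_action_arr):
        raise NotImplementedError("This method must be implemented.")


class LinearApproximator(FunctionApproximator):
    """Hypothetical strategy: Q(x) = w . x, updated by one SGD step per sample."""

    def __init__(self, dim, learning_rate=0.1):
        self._weights = [0.0] * dim
        self._learning_rate = learning_rate
        self._last_inputs = []

    @property
    def model(self):
        # Here the "model" is simply the weight vector.
        return self._weights

    def inference_q(self, next_action_arr):
        # Assumption: remember the inputs so learn_q can form the gradient later.
        self._last_inputs = [list(x) for x in next_action_arr]
        return [
            sum(w * v for w, v in zip(self._weights, x))
            for x in self._last_inputs
        ]

    def learn_q(self, predicted_q_arr, real_q_arr):
        # Gradient of 0.5 * (predicted - real)^2 w.r.t. w is (predicted - real) * x.
        for x, predicted, real in zip(self._last_inputs, predicted_q_arr, real_q_arr):
            error = predicted - real
            for i, v in enumerate(x):
                self._weights[i] -= self._learning_rate * error * v


def train(approximator, inputs, targets, epochs=200):
    """Hypothetical client: relies only on the interface, never on the concrete class."""
    for _ in range(epochs):
        predicted = approximator.inference_q(inputs)
        approximator.learn_q(predicted, targets)
    return approximator.inference_q(inputs)
```

Because `train` only calls `inference_q` and `learn_q`, any other implementation of the interface (for example one wrapping an LSTM) could be passed in without changing the client, which is the interchangeability the docstring describes.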