Source code for pyqlearning.function_approximator

# -*- coding: utf-8 -*-
from abc import ABCMeta, abstractmethod, abstractproperty


class FunctionApproximator(metaclass=ABCMeta):
    '''
    The interface of Function Approximators.
    
    Deep Q-Learning methods such as the Deep Q-Network typically use a
    Convolutional Neural Network (CNN) as a function approximator
    to cope with the problem setting of so-called combinatorial explosion.
    
    However, a CNN is not the only possible choice of
    function approximator. In the above problem setting of
    generalisation and combinatorial explosion, for instance, a
    Long Short-Term Memory (LSTM) network, which is a special
    Recurrent Neural Network (RNN) structure, and a CNN are
    functionally equivalent as function approximators. Within the
    same problem setting, functional equivalents can be substituted
    for one another. When the feature space of the rewards has a
    time-series nature, an LSTM is likely to be more useful.
    
    This interface defines the methods for controlling functional equivalents
    of CNN. An object that implements this interface can be delegated to `DeepQLearning`.
    In more detail, this interface follows the Strategy pattern: it defines a family of
    Deep Learning algorithms, such as LSTM, Convolutional LSTM (Xingjian, S. H. I. et al., 2015),
    and the CLDNN architecture (Sainath, T. N. et al., 2015), encapsulates each one,
    and makes them interchangeable. Strategy lets the function approximation
    algorithm vary independently from the clients that use it.
    Capture the abstraction in an interface, and bury implementation details in derived classes.

    References:
        - https://code.accel-brain.com/Deep-Learning-by-means-of-Design-Pattern/README.html
        - https://code.accel-brain.com/Reinforcement-Learning/README.html#deep-q-network
        - [Egorov, M. (2016). Multi-agent deep reinforcement learning.](https://pdfs.semanticscholar.org/dd98/9d94613f439c05725bad958929357e365084.pdf)
        - Gupta, J. K., Egorov, M., & Kochenderfer, M. (2017, May). Cooperative multi-agent control using deep reinforcement learning. In International Conference on Autonomous Agents and Multiagent Systems (pp. 66-83). Springer, Cham.
        - Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
        - Sainath, T. N., Vinyals, O., Senior, A., & Sak, H. (2015, April). Convolutional, long short-term memory, fully connected deep neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on (pp. 4580-4584). IEEE.
        - Xingjian, S. H. I., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., & Woo, W. C. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Advances in neural information processing systems (pp. 802-810).

    '''

    @property
    @abstractmethod
    def model(self):
        '''
        The model `object` used as the function approximator.
        '''
        raise NotImplementedError("This property must be implemented.")

    @abstractmethod
    def learn_q(self, predicted_q_arr, real_q_arr):
        '''
        Learn Q-Values by fitting the predicted Q-Values to the real Q-Values.
        
        Args:
            predicted_q_arr:    `np.ndarray` of predicted Q-Values.
            real_q_arr:         `np.ndarray` of real Q-Values.
        '''
        raise NotImplementedError("This method must be implemented.")

    @abstractmethod
    def inference_q(self, next_action_arr):
        '''
        Infer Q-Values.
        
        Args:
            next_action_arr:     `np.ndarray` of next actions.
        
        Returns:
            `np.ndarray` of Q-Values.
        '''
        raise NotImplementedError("This method must be implemented.")
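
The Strategy arrangement described in the class docstring can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not part of `pyqlearning`: `LinearApproximator` and `train` are hypothetical names, the interface is repeated in compact form only so that the snippet runs on its own, and plain Python lists stand in for `np.ndarray` to keep the sketch dependency-free. The concrete strategy is a simple linear model, not the LSTM or CNN the docstring discusses.

```python
from abc import ABCMeta, abstractmethod


class FunctionApproximator(metaclass=ABCMeta):
    '''Compact stand-in for the interface, repeated so this sketch is self-contained.'''

    @property
    @abstractmethod
    def model(self):
        raise NotImplementedError("This property must be implemented.")

    @abstractmethod
    def learn_q(self, predicted_q_arr, real_q_arr):
        raise NotImplementedError("This method must be implemented.")

    @abstractmethod
    def inference_q(self, next_action_arr):
        raise NotImplementedError("This method must be implemented.")


class LinearApproximator(FunctionApproximator):
    '''Hypothetical strategy: Q(a) = w . a + b, fitted by gradient descent.'''

    def __init__(self, dim, learning_rate=0.1):
        self._weights = [0.0] * dim
        self._bias = 0.0
        self._learning_rate = learning_rate
        self._last_inputs = []

    @property
    def model(self):
        return {"weights": self._weights, "bias": self._bias}

    def inference_q(self, next_action_arr):
        # Cache the inputs so that learn_q can compute gradients for them.
        self._last_inputs = [list(row) for row in next_action_arr]
        return [
            sum(w * x for w, x in zip(self._weights, row)) + self._bias
            for row in self._last_inputs
        ]

    def learn_q(self, predicted_q_arr, real_q_arr):
        # One batch gradient step on the mean squared error between
        # the predicted and the real Q-Values of the cached inputs.
        n = len(self._last_inputs)
        for row, predicted, real in zip(self._last_inputs, predicted_q_arr, real_q_arr):
            grad = 2.0 * (predicted - real) / n
            for i, x in enumerate(row):
                self._weights[i] -= self._learning_rate * grad * x
            self._bias -= self._learning_rate * grad


def train(approximator, action_arr, real_q_arr, epochs=200):
    '''Hypothetical client: it depends only on the FunctionApproximator interface.'''
    for _ in range(epochs):
        predicted_q_arr = approximator.inference_q(action_arr)
        approximator.learn_q(predicted_q_arr, real_q_arr)
    return approximator.inference_q(action_arr)
```

Calling `train(LinearApproximator(dim=2), [[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0])` fits two one-hot actions to their target Q-Values. Because `train` touches only `inference_q` and `learn_q`, any other implementation of the interface could be passed in without changing the client.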