pyqlearning.deepqlearning package

Submodules

pyqlearning.deepqlearning.deep_q_network module

class pyqlearning.deepqlearning.deep_q_network.DeepQNetwork(function_approximator)[source]

Bases: pyqlearning.deep_q_learning.DeepQLearning

Abstract base class to implement the Deep Q-Network (DQN).

The structure of Q-Learning is based on the Epsilon Greedy Q-Learning algorithm, which is a typical off-policy algorithm. In this paradigm, stochastic and deterministic searching coexist, balanced by the hyperparameter epsilon_greedy_rate, which is the probability that the agent searches greedily. Greedy searching is deterministic in the sense that the agent's policy follows the selection that maximizes the Q-Value.
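The split between greedy and random search described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name choose_index, the rng argument, and the sample Q-Values are assumptions introduced here. Only the meaning of epsilon_greedy_rate (the probability of a greedy choice) is taken from the description above.

```python
import numpy as np


def choose_index(q_values, epsilon_greedy_rate, rng):
    """Hypothetical helper: pick an index into q_values.

    With probability epsilon_greedy_rate the choice is greedy (argmax);
    otherwise an index is drawn uniformly at random.
    """
    if rng.random() < epsilon_greedy_rate:
        # Deterministic branch: follow the maximum Q-Value.
        return int(np.argmax(q_values))
    # Stochastic branch: explore uniformly over all candidates.
    return int(rng.integers(len(q_values)))


rng = np.random.default_rng(0)
q_values = np.array([0.1, 0.9, 0.3])  # illustrative values only

# Sample many choices to see how often the best index (1) is picked.
picks = np.array([choose_index(q_values, 0.75, rng) for _ in range(10_000)])
greedy_fraction = float(np.mean(picks == 1))
```

With three candidates and epsilon_greedy_rate = 0.75, the best index is expected to be picked in roughly 0.75 + 0.25 / 3 of the draws, because the random branch also lands on it one time in three.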

epsilon_greedy_rate

Getter for the epsilon greedy rate.

get_epsilon_greedy_rate()[source]

Getter for epsilon_greedy_rate.

select_action(next_action_arr, next_q_arr)[source]

Select action by Q(state, action).

Parameters:
  • next_action_arr: np.ndarray of actions.
  • next_q_arr: np.ndarray of Q-Values.
Returns:
Tuple(np.ndarray of action, Q-Value)
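
To make the documented inputs and return value concrete, here is a hedged sketch of a standalone function with the same parameter names. It is not the body of DeepQNetwork.select_action: the name select_action_sketch, the explicit epsilon_greedy_rate and rng arguments, and the assumed array shapes (one candidate action per row, one Q-Value per candidate) are all assumptions made for illustration.

```python
import numpy as np


def select_action_sketch(next_action_arr, next_q_arr, epsilon_greedy_rate, rng):
    """Hypothetical stand-in returning (action array, Q-Value).

    Assumes next_action_arr has one candidate action per row and
    next_q_arr holds one Q-Value per candidate.
    """
    q_flat = np.asarray(next_q_arr).reshape(-1)
    if rng.random() < epsilon_greedy_rate:
        key = int(np.argmax(q_flat))              # greedy choice
    else:
        key = int(rng.integers(q_flat.shape[0]))  # random choice
    return next_action_arr[key], q_flat[key]


rng = np.random.default_rng(42)
next_action_arr = np.eye(3)                   # three one-hot candidate actions
next_q_arr = np.array([[0.2], [0.7], [0.1]])  # one Q-Value per candidate

# A rate of 1.0 always takes the greedy branch in this sketch.
action_arr, q_value = select_action_sketch(next_action_arr, next_q_arr, 1.0, rng)
```

In this sketch the chosen row index plays the role of a key: it selects both the action and its Q-Value, so the two returned items always refer to the same candidate.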
select_action_key(next_action_arr, next_q_arr)[source]

Select action by Q(state, action).

Parameters:
  • next_action_arr: np.ndarray of actions.
  • next_q_arr: np.ndarray of Q-Values.
Returns:
np.ndarray of keys.
set_epsilon_greedy_rate(value)[source]

Setter for epsilon_greedy_rate.

Module contents