accelbrainbase.samplabledata.policysampler._mxnet package¶
Submodules¶
accelbrainbase.samplabledata.policysampler._mxnet.labeled_similar_image_policy module¶
- class accelbrainbase.samplabledata.policysampler._mxnet.labeled_similar_image_policy.LabeledSimilarImagePolicy¶
  Bases: accelbrainbase.samplabledata.policy_sampler.PolicySampler

  Policy sampler for Deep Q-learning that evaluates the value of the “action” of selecting the image with the highest similarity, based on the “state” of observing an image. The state-action value is proportional to the similarity between the previously observed image and the currently selected image. This class calculates the image similarity from the cross entropy of the labels (metadata).
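A rough sketch of how a label-based reward of this kind could be computed, assuming one-hot (or t-hot) label vectors and plain NumPy; the helper name and the conversion from cross entropy to a reward are illustrative assumptions, not the library's implementation:

    import numpy as np

    def label_cross_entropy_similarity(prev_label_arr, selected_label_arr, eps=1e-08):
        # Hypothetical helper: a lower cross entropy between the label
        # distributions of the previously observed image and the currently
        # selected image means a higher similarity.
        prev_prob = prev_label_arr / (prev_label_arr.sum() + eps)
        selected_prob = selected_label_arr / (selected_label_arr.sum() + eps)
        cross_entropy = -np.sum(prev_prob * np.log(selected_prob + eps))
        # Convert the distance-like value into a similarity-like reward.
        return 1.0 / (1.0 + cross_entropy)

    # Example: identical one-hot labels yield the highest reward.
    reward_value = label_cross_entropy_similarity(
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 1.0, 0.0])
    )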
- check_the_end_flag(state_arr, meta_data_arr=None)¶
  Check the end flag. If this return value is True, the learning ends. As a rule, the learning cannot be stopped; this method should be overridden for concrete use cases.
  Parameters:
  - state_arr – state at self.t.
  - meta_data_arr – meta data of the state.
  Returns: bool
- get_labeled_image_iterator()¶
  getter for LabeledImageIterator.
- labeled_image_iterator¶
  getter for LabeledImageIterator.
- observe_reward_value(state_arr, action_arr, meta_data_arr=None)¶
  Compute the reward value.
  Parameters:
  - state_arr – Tensor of state.
  - action_arr – Tensor of action.
  - meta_data_arr – Meta data of actions.
  Returns: Reward value.
- observe_state(state_arr, meta_data_arr)¶
  Observe the states of agents in the last epoch.
  Parameters:
  - state_arr – Tensor of state.
  - meta_data_arr – meta data of the state.
- set_labeled_image_iterator(value)¶
  setter for LabeledImageIterator.
- update_state(action_arr, meta_data_arr=None)¶
  Update the state. This method can be overridden for concrete use cases.
  Parameters:
  - action_arr – action at self.t.
  - meta_data_arr – meta data of the action.
  Returns: Tuple of
  - state at self.t+1.
  - meta data of the state.
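A rough sketch of how the methods above could fit together in a Deep Q-learning interaction loop; `policy`, `select_action`, and the initial state are assumptions standing in for an already configured policy sampler, an agent's action-selection function, and the first observation, none of which are defined by this module:

    def run_episode(policy, select_action, state_arr, meta_data_arr=None):
        # Hypothetical driver loop built only from the methods documented above.
        while policy.check_the_end_flag(state_arr, meta_data_arr) is False:
            policy.observe_state(state_arr, meta_data_arr)
            # `select_action` stands in for a Q-learning agent's choice of action
            # and is assumed to return the action and its meta data.
            action_arr, action_meta_data_arr = select_action(state_arr, meta_data_arr)
            # The reward would feed the agent's Q-value update (not shown here).
            reward_value = policy.observe_reward_value(state_arr, action_arr, action_meta_data_arr)
            state_arr, meta_data_arr = policy.update_state(action_arr, action_meta_data_arr)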
accelbrainbase.samplabledata.policysampler._mxnet.labeled_summarization_policy module¶
- class accelbrainbase.samplabledata.policysampler._mxnet.labeled_summarization_policy.LabeledSummarizationPolicy(txt_path_list, abstract_pos='top', s_a_dist_weight=0.3)¶
  Bases: accelbrainbase.samplabledata.policy_sampler.PolicySampler

  Policy sampler for Deep Q-learning that evaluates the value of the “action” of selecting the text with the highest similarity, based on the “state” of observing a text, for automatic summarization. The state-action value is proportional to the similarity between the previously observed text and the currently selected text. This class calculates the similarity by the mean squared error of the vectorized (t-hot) texts.
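A minimal instantiation sketch based only on the constructor signature shown above; the file paths are placeholders, and the keyword values simply restate the documented defaults:

    from accelbrainbase.samplabledata.policysampler._mxnet.labeled_summarization_policy import LabeledSummarizationPolicy

    # Placeholder paths to the documents to be summarized.
    txt_path_list = ["/path/to/doc_1.txt", "/path/to/doc_2.txt"]

    policy_sampler = LabeledSummarizationPolicy(
        txt_path_list=txt_path_list,
        abstract_pos="top",       # documented default
        s_a_dist_weight=0.3       # documented default
    )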
- check_the_end_flag(state_arr, meta_data_arr=None)¶
  Check the end flag. If this return value is True, the learning ends. As a rule, the learning cannot be stopped; this method should be overridden for concrete use cases.
  Parameters:
  - state_arr – state at self.t.
  - meta_data_arr – meta data of the state.
  Returns: bool
- get_unlabeled_t_hot_txt_iterator()¶
  getter for UnlabeledTHotTXTIterator.
- observe_reward_value(state_arr, action_arr, meta_data_arr=None)¶
  Compute the reward value.
  Parameters:
  - state_arr – Tensor of state.
  - action_arr – Tensor of action.
  - meta_data_arr – Meta data of actions.
  Returns: Reward value.
- observe_state(state_arr, meta_data_arr)¶
  Observe the states of agents in the last epoch.
  Parameters:
  - state_arr – Tensor of state.
  - meta_data_arr – meta data of the state.
- set_unlabeled_t_hot_txt_iterator(value)¶
  setter for UnlabeledTHotTXTIterator.
- unlabeled_t_hot_txt_iterator¶
  getter for UnlabeledTHotTXTIterator.
- update_state(action_arr, meta_data_arr=None)¶
  Update the state. This method can be overridden for concrete use cases.
  Parameters:
  - action_arr – action at self.t.
  - meta_data_arr – meta data of the action.
  Returns: Tuple of
  - state at self.t+1.
  - meta data of the state.
accelbrainbase.samplabledata.policysampler._mxnet.unlabeled_similar_image_policy module¶
- class accelbrainbase.samplabledata.policysampler._mxnet.unlabeled_similar_image_policy.UnlabeledSimilarImagePolicy¶
  Bases: accelbrainbase.samplabledata.policy_sampler.PolicySampler

  Policy sampler for Deep Q-learning that evaluates the value of the “action” of selecting the image with the highest similarity, based on the “state” of observing an image. The state-action value is proportional to the similarity between the previously observed image and the currently selected image. This class calculates the image similarity by the mean squared error of the images.
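A short sketch of an MSE-based image similarity of this kind in plain NumPy; the helper name and the conversion from error to reward are illustrative assumptions, not the library's implementation:

    import numpy as np

    def mse_image_similarity(prev_img_arr, selected_img_arr):
        # Hypothetical helper: a lower mean squared error between the
        # previously observed image and the currently selected image
        # means a higher similarity, and therefore a higher reward.
        mse = np.mean((prev_img_arr - selected_img_arr) ** 2)
        return 1.0 / (1.0 + mse)

    # Example with two random 28x28 grayscale images.
    prev_img_arr = np.random.uniform(size=(28, 28))
    selected_img_arr = np.random.uniform(size=(28, 28))
    reward_value = mse_image_similarity(prev_img_arr, selected_img_arr)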
- check_the_end_flag(state_arr, meta_data_arr=None)¶
  Check the end flag. If this return value is True, the learning ends. As a rule, the learning cannot be stopped; this method should be overridden for concrete use cases.
  Parameters:
  - state_arr – state at self.t.
  - meta_data_arr – meta data of the state.
  Returns: bool
- get_unlabeled_image_iterator()¶
  getter for UnlabeledImageIterator.
- observe_reward_value(state_arr, action_arr, meta_data_arr=None)¶
  Compute the reward value.
  Parameters:
  - state_arr – Tensor of state.
  - action_arr – Tensor of action.
  - meta_data_arr – Meta data of actions.
  Returns: Reward value.
- observe_state(state_arr, meta_data_arr)¶
  Observe the states of agents in the last epoch.
  Parameters:
  - state_arr – Tensor of state.
  - meta_data_arr – meta data of the state.
- set_unlabeled_image_iterator(value)¶
  setter for UnlabeledImageIterator.
- unlabeled_image_iterator¶
  getter for UnlabeledImageIterator.
- update_state(action_arr, meta_data_arr=None)¶
  Update the state. This method can be overridden for concrete use cases.
  Parameters:
  - action_arr – action at self.t.
  - meta_data_arr – meta data of the action.
  Returns: Tuple of
  - state at self.t+1.
  - meta data of the state.
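A small sketch of wiring an image iterator into this policy through the documented setter; the no-argument construction follows the class signature above, and the iterator argument is assumed to be an already built UnlabeledImageIterator whose own setup is out of scope here:

    from accelbrainbase.samplabledata.policysampler._mxnet.unlabeled_similar_image_policy import UnlabeledSimilarImagePolicy

    def build_policy_sampler(unlabeled_image_iterator):
        # `unlabeled_image_iterator` stands in for an already constructed
        # UnlabeledImageIterator instance (its construction is not shown here).
        policy_sampler = UnlabeledSimilarImagePolicy()
        policy_sampler.set_unlabeled_image_iterator(unlabeled_image_iterator)
        return policy_sampler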