Junior
The Junior sub agent is the first deep learning method in the CurriculumAgent pipeline. Its goal is to mimic the actions of the greedy Tutor agent. To do so, it fits a sequential neural network, i.e., the weights of the network, on the input data of the Grid2Op environment. After successful training, the weights are passed to the Senior in order to warm-start the Deep Reinforcement Learning approach. Accordingly, the Junior sub agent plays a vital role in the curriculum approach.
Usage
Overall, the agent is trained on the experience of the Tutor agent. For this reason, the experience output of
collect_tutor_experience is first split into a training, a validation and a test set via
load_dataset(). Thereafter, the Junior class
can be used for training and evaluating the deep learning model.
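The three-way split described above can be illustrated with a minimal sketch in plain Python. This is not the implementation of load_dataset(). The function name split_experience, the split fractions, the seed and the (observation, action index) record layout are all assumptions made for illustration.

```python
import random


def split_experience(records, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle records and split them into train, validation and test lists.

    Hypothetical helper: the fractions and the seed are illustrative defaults,
    not values taken from CurriculumAgent.
    """
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")

    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_val - n_test

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Toy stand-in for Tutor experience: (observation, action index) pairs.
toy_records = [([float(i)], i % 5) for i in range(100)]
train_set, val_set, test_set = split_experience(toy_records)
print(len(train_set), len(val_set), len(test_set))
```

Keeping the test set separate from the validation set means the final evaluation of the Junior uses data that influenced neither the weights nor the choice of hyperparameters.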
Because the parameters of these steps are in many cases similar, one can alternatively just run
the train() method, which combines the collection and the training.
Note that if the general_tutor was used with multiple action sets, the same sets
have to be provided to the Junior as well.
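The text does not say why the action sets must match. One plausible reading, assumed here, is that the action indices recorded by the Tutor refer to the combined list of all action sets, so the Junior needs the same list to interpret its labels. The sketch below uses placeholder strings instead of real Grid2Op action vectors, and combine_action_sets is a hypothetical helper, not part of the package.

```python
def combine_action_sets(action_sets):
    """Concatenate several action sets into one list, preserving their order."""
    combined = []
    for action_set in action_sets:
        combined.extend(action_set)
    return combined


# Placeholder actions; real entries would be Grid2Op action vectors.
set_a = ["a0", "a1", "a2"]
set_b = ["b0", "b1"]
all_actions = combine_action_sets([set_a, set_b])

# Under the assumption above, a Tutor label indexes the combined list.
tutor_label = 3
print(all_actions[tutor_label])

# A Junior that only knew set_a could not represent this label.
print(tutor_label < len(set_a))
```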
Structure of the Junior Model
The Junior sub agent is based on a TensorFlow (Keras) sequential model and has the following structure:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense (Dense)               (None, 1000)              1222000
 dense_1 (Dense)             (None, 1000)              1001000
 dense_2 (Dense)             (None, 1000)              1001000
 dropout (Dropout)           (None, 1000)              0
 dense_3 (Dense)             (None, 1000)              1001000
 dropout_1 (Dropout)         (None, 1000)              0
 dense_4 (Dense)             (None, 806)               806806
=================================================================
Total params: 5,031,806
Trainable params: 5,031,806
Non-trainable params: 0
_________________________________________________________________
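The parameter counts in the summary can be checked by hand. A dense layer with a bias term has inputs × units + units parameters, and dropout layers have none. The short sketch below redoes this arithmetic. The input size of 1221 features is inferred from the first layer's count (1222000 = (1221 + 1) × 1000) and is not stated elsewhere in this section. The helper dense_params is illustrative only.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer with bias: weights plus biases."""
    return n_inputs * n_units + n_units


# Inferred from the summary: 1222000 = (n_features + 1) * 1000.
n_features = 1222000 // 1000 - 1

# Units of the five Dense layers in order; Dropout layers add no parameters.
layer_units = [1000, 1000, 1000, 1000, 806]

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_params(n_in, units))
    n_in = units  # the output of this layer feeds the next one

total = sum(counts)
print(n_features, counts, total)
```

The final layer has 806 units, which corresponds to one output per action in the example shown here. With a different action space or observation size, the first and last counts change accordingly.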