LunarLander
Bases: EnvType
This class represents the Lunar Lander environment.
__init__(algo=Algo.DQN, algo_param={'batch_size': 128, 'buffer_size': 50000, 'exploration_final_eps': 0.1, 'exploration_fraction': 0.12, 'gamma': 0.99, 'gradient_steps': -1, 'learning_rate': 0.00063, 'learning_starts': 0, 'policy': 'MlpPolicy', 'policy_kwargs': {'net_arch': [256, 256]}, 'target_update_interval': 250, 'train_freq': 4, 'tensorboard_log': 'data/model/LunarLanderDQN/'}, prompt={'Goal': "Land safely on the ground, but don't move if you touch the ground", 'Observation Space': 'Box([ -2.5 -2.5 -10. -10. -6.2831855 -10. -0. -0. ], [ 2.5 2.5 10. 10. 6.2831855 10. 1. 1. ], (8,), float32)\n The state is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.\n '})
Constructor for the Lunar Lander environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`algo` | `Algo` | The algorithm to be used for training. | `Algo.DQN` |
`algo_param` | `dict` | The parameters for the algorithm. | `{'batch_size': 128, 'buffer_size': 50000, 'exploration_final_eps': 0.1, 'exploration_fraction': 0.12, 'gamma': 0.99, 'gradient_steps': -1, 'learning_rate': 0.00063, 'learning_starts': 0, 'policy': 'MlpPolicy', 'policy_kwargs': {'net_arch': [256, 256]}, 'target_update_interval': 250, 'train_freq': 4, 'tensorboard_log': 'data/model/LunarLanderDQN/'}` |
`prompt` | `dict \| str` | The prompt for the environment. | `{'Goal': "Land safely on the ground, but don't move if you touch the ground", 'Observation Space': 'Box([ -2.5 -2.5 -10. -10. -6.2831855 -10. -0. -0. ], [ 2.5 2.5 10. 10. 6.2831855 10. 1. 1. ], (8,), float32)\n The state is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.\n '}` |
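Since `algo_param` is a plain dict, changing one hyperparameter may require passing a complete dict; this page does not say whether the constructor merges a partial dict with the defaults. The following is a minimal sketch of building such a dict from the documented defaults. The names `DEFAULT_ALGO_PARAM` and `with_overrides` are hypothetical helpers for illustration, not part of the documented API.

```python
import copy

# Copy of the default `algo_param` documented above.
DEFAULT_ALGO_PARAM = {
    "batch_size": 128,
    "buffer_size": 50000,
    "exploration_final_eps": 0.1,
    "exploration_fraction": 0.12,
    "gamma": 0.99,
    "gradient_steps": -1,
    "learning_rate": 0.00063,
    "learning_starts": 0,
    "policy": "MlpPolicy",
    "policy_kwargs": {"net_arch": [256, 256]},
    "target_update_interval": 250,
    "train_freq": 4,
    "tensorboard_log": "data/model/LunarLanderDQN/",
}


def with_overrides(defaults, **overrides):
    """Return a deep copy of `defaults` with top-level keys replaced (hypothetical helper)."""
    params = copy.deepcopy(defaults)  # deep copy so nested dicts are not shared
    params.update(overrides)
    return params


custom_param = with_overrides(DEFAULT_ALGO_PARAM, learning_rate=1e-4, batch_size=64)
```

The resulting dict could then be passed as `LunarLander(algo_param=custom_param)`; the import path for `LunarLander` is not shown on this page.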
__repr__()
This function returns the name of the environment.
Returns:
Type | Description |
---|---|
`str` | The name of the environment. |
objective_metric(states)
This function calculates the objective metric for the Lunar Lander environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`states` | `list` | The states of the environment. | required |
Returns:
Type | Description |
---|---|
`list[dict[str, float]]` | The objective metric for the environment. |
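The return type suggests one dictionary of named float values per state. The sketch below only illustrates that shape, using the 8-dimensional observation layout given in the default prompt (x, y, x velocity, y velocity, angle, angular velocity, two leg-contact flags). The function name `example_metric` and the chosen metrics (`distance_to_origin`, `speed`) are assumptions, not the library's actual objective metric.

```python
import math


def example_metric(states):
    """Hypothetical metric: one dict of floats per 8-dimensional state."""
    metrics = []
    for state in states:
        # The first four entries are position (x, y) and linear velocity (vx, vy).
        x, y, vx, vy = (float(v) for v in state[:4])
        metrics.append(
            {
                "distance_to_origin": math.hypot(x, y),
                "speed": math.hypot(vx, vy),
            }
        )
    return metrics


states = [
    [0.0, 1.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0],  # airborne, descending
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],  # at rest, both legs in contact
]
result = example_metric(states)
```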
success_func(env, info)
This function checks if the Lunar Lander has landed successfully or failed.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`env` | `Env` | The environment. | required |
`info` | `dict` | The information about the environment. | required |
Returns:
Type | Description |
---|---|
`tuple[bool, bool]` | A tuple of two booleans. The first indicates whether the lander has landed successfully; the second indicates whether the lander has failed. |
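To make the two-boolean return value concrete, the sketch below derives a (success, failure) pair from a single observation, following the stated goal of landing and then not moving. It is not the library's `success_func`, which takes `env` and `info`. The function name `example_success`, the speed tolerance, and the use of the x bound as a failure condition are all assumptions.

```python
def example_success(obs, speed_tol=0.05, x_limit=2.5):
    """Hypothetical check returning (landed_successfully, failed)."""
    x, vx, vy = obs[0], obs[2], obs[3]
    # Entries 6 and 7 are the leg-contact flags (0.0 or 1.0).
    both_legs_down = obs[6] >= 0.5 and obs[7] >= 0.5
    at_rest = abs(vx) <= speed_tol and abs(vy) <= speed_tol
    # Assumed failure condition: the lander left the observed x range.
    failed = abs(x) > x_limit
    success = (not failed) and both_legs_down and at_rest
    return bool(success), bool(failed)


landed = example_success([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0])
flying = example_success([0.0, 0.5, 0.0, -0.4, 0.0, 0.0, 0.0, 0.0])
out_of_range = example_success([3.0, 1.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Keeping success and failure as separate flags allows a third outcome, `(False, False)`, for an episode that is still in progress.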