LunarLander

Bases: EnvType

This class represents the Lunar Lander environment.

__init__(algo=Algo.DQN, algo_param={'batch_size': 128, 'buffer_size': 50000, 'exploration_final_eps': 0.1, 'exploration_fraction': 0.12, 'gamma': 0.99, 'gradient_steps': -1, 'learning_rate': 0.00063, 'learning_starts': 0, 'policy': 'MlpPolicy', 'policy_kwargs': {'net_arch': [256, 256]}, 'target_update_interval': 250, 'train_freq': 4, 'tensorboard_log': 'data/model/LunarLanderDQN/'}, prompt={'Goal': "Land safely on the ground, but don't move if you touch the ground", 'Observation Space': 'Box([ -2.5 -2.5 -10. -10. -6.2831855 -10. -0. -0. ], [ 2.5 2.5 10. 10. 6.2831855 10. 1. 1. ], (8,), float32)\n The state is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.\n '})

Constructor for the Lunar Lander environment.

Parameters:

algo (Algo)
    The algorithm to be used for training.
    Default: Algo.DQN

algo_param (dict)
    The parameters for the algorithm.
    Default: {'batch_size': 128, 'buffer_size': 50000, 'exploration_final_eps': 0.1, 'exploration_fraction': 0.12, 'gamma': 0.99, 'gradient_steps': -1, 'learning_rate': 0.00063, 'learning_starts': 0, 'policy': 'MlpPolicy', 'policy_kwargs': {'net_arch': [256, 256]}, 'target_update_interval': 250, 'train_freq': 4, 'tensorboard_log': 'data/model/LunarLanderDQN/'}

prompt (dict | str)
    The prompt for the environment.
    Default: {'Goal': "Land safely on the ground, but don't move if you touch the ground", 'Observation Space': 'Box([ -2.5 -2.5 -10. -10. -6.2831855 -10. -0. -0. ], [ 2.5 2.5 10. 10. 6.2831855 10. 1. 1. ], (8,), float32)\n The state is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.\n '}
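
A caller will often want to change only one or two hyperparameters and keep the remaining defaults. The following is a minimal sketch of that pattern in plain Python. The helper name `merged_algo_param` is hypothetical and not part of this library. The key names resemble keyword arguments of Stable-Baselines3's `DQN`, but that mapping is an assumption.

```python
from copy import deepcopy

# Defaults copied from the documented signature of LunarLander.__init__.
DEFAULT_ALGO_PARAM = {
    "batch_size": 128,
    "buffer_size": 50000,
    "exploration_final_eps": 0.1,
    "exploration_fraction": 0.12,
    "gamma": 0.99,
    "gradient_steps": -1,
    "learning_rate": 0.00063,
    "learning_starts": 0,
    "policy": "MlpPolicy",
    "policy_kwargs": {"net_arch": [256, 256]},
    "target_update_interval": 250,
    "train_freq": 4,
    "tensorboard_log": "data/model/LunarLanderDQN/",
}


def merged_algo_param(overrides):
    """Hypothetical helper: return a copy of the defaults with overrides applied."""
    # deepcopy so the nested policy_kwargs dict is not shared with the defaults.
    params = deepcopy(DEFAULT_ALGO_PARAM)
    params.update(overrides)
    return params


# Lower the learning rate and batch size; every other key keeps its default.
custom = merged_algo_param({"learning_rate": 1e-4, "batch_size": 64})
```

The resulting dict could then be passed to the constructor as `algo_param=custom`.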

__repr__()

This function returns the name of the environment.

Returns:

str
    The name of the environment.

objective_metric(states)

This function calculates the objective metric for the Lunar Lander environment.

Parameters:

states (list)
    The states of the environment.
    Required.

Returns:

list[dict[str, float]]
    The objective metric for the environment.
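
To illustrate the `list[dict[str, float]]` return shape, here is a sketch that assumes each entry of `states` is the 8-dimensional vector described in the default prompt (x, y, x-velocity, y-velocity, angle, angular velocity, two leg-contact flags). The names `LanderState` and `sketch_objective_metric`, and the metric keys `speed`, `abs_angle`, and `legs_in_contact`, are invented for illustration. The metrics the library actually computes are not documented here.

```python
from typing import NamedTuple


class LanderState(NamedTuple):
    """Named view of the 8-dimensional state vector (field names are illustrative)."""
    x: float
    y: float
    vx: float
    vy: float
    angle: float
    angular_velocity: float
    leg1_contact: float
    leg2_contact: float


def sketch_objective_metric(states):
    """Hypothetical metric: one dict of floats per state."""
    metrics = []
    for raw in states:
        s = LanderState(*(float(v) for v in raw))
        metrics.append({
            # Magnitude of the linear velocity.
            "speed": (s.vx ** 2 + s.vy ** 2) ** 0.5,
            # Tilt away from upright, regardless of direction.
            "abs_angle": abs(s.angle),
            # 0.0, 1.0, or 2.0 depending on how many legs touch the ground.
            "legs_in_contact": s.leg1_contact + s.leg2_contact,
        })
    return metrics


example_metrics = sketch_objective_metric([
    [0.0, 1.0, 3.0, 4.0, -0.5, 0.0, 1.0, 1.0],
])
```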

success_func(env, info)

This function checks if the Lunar Lander has landed successfully or failed.

Parameters:

env (Env)
    The environment.
    Required.

info (dict)
    The information about the environment.
    Required.

Returns:

tuple[bool, bool]
    A tuple of two booleans. The first is True if the lander has landed successfully; the second is True if the lander has failed.
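
The two booleans can be turned into a single episode label, as in the sketch below. The helper `episode_outcome` is hypothetical and not part of this library. It assumes that both flags being False means the episode is still running, and it reports a failure first if both flags happen to be True.

```python
def episode_outcome(result):
    """Hypothetical helper: map a (success, failure) tuple to a label."""
    success, failure = result
    if failure:
        # Assumption: a failure flag takes priority over a success flag.
        return "failure"
    if success:
        return "success"
    # Assumption: neither flag set means the lander has not finished yet.
    return "ongoing"


labels = [episode_outcome(r) for r in [(True, False), (False, True), (False, False)]]
```

In practice the tuple would come from a call such as `success_func(env, info)` made after each environment step.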