GenCode

Generates code from an LLM response, handles errors, and refines the code using new responses from the LLM.

__init__(env, llm)

Parameters:

Name | Type | Description | Default
env | Environments | The environment used to test the code. | required
llm | OllamaChat | The LLM that handles the response. | required

compile_reward_function()

Compile a reward function dynamically from a string response.

This method takes a code string representing a reward function and dynamically compiles it into an executable Python function for use in reinforcement learning environments. The code runs through exec() in a separate global namespace, which keeps its names apart from the caller's but does not sandbox untrusted code.

Key Features
  • Dynamically executes code in an isolated global namespace
  • Provides access to NumPy functions
  • Extracts the compiled function by its name
  • Robust error handling for syntax issues

Parameters:

Name | Type | Description | Default
response | str | A string containing a complete Python function definition for a reward function. | required

Returns:

Name | Type | Description
Callable | Callable | The compiled reward function that can be called with appropriate arguments in a gym environment.

Raises:

Type | Description
SyntaxError | If the provided code contains invalid Python syntax.
ValueError | If the function cannot be extracted from the compiled namespace.

Notes
  • Uses exec() for dynamic code compilation
  • Provides NumPy (np) in the execution namespace
  • Assumes the last function defined in the response is the reward function
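
The behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: the name `compile_function` and the `extras` parameter are hypothetical, and `math` stands in for NumPy so the example has no third-party dependency. The documented method would presumably place `np` in the namespace instead.

```python
import math
import types
from typing import Callable, Optional


def compile_function(source: str, extras: Optional[dict] = None) -> Callable:
    """Execute `source` and return the last function it defines (sketch)."""
    # A separate globals dict keeps names apart; it is not a security sandbox.
    namespace = dict(extras or {})
    preexisting = set(namespace)

    # exec() raises SyntaxError if `source` is not valid Python.
    exec(source, namespace)

    # Dicts keep insertion order, so newly defined functions appear in the
    # order their names were first bound; names passed in `extras` are skipped.
    functions = [
        value
        for name, value in namespace.items()
        if name not in preexisting and isinstance(value, types.FunctionType)
    ]
    if not functions:
        raise ValueError("no function definition found in the source")
    return functions[-1]


code = "def helper(x):\n    return x\n\ndef reward(x):\n    return -math.fabs(x)\n"
reward = compile_function(code, {"math": math})
```

Here `reward` is the second function in the string, matching the "last function defined" assumption in the notes.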

get(response)

Get the next state from the response code

Parameters:

Name | Type | Description | Default
response | str | Response code from the LLM. | required

Returns:

Name | Type | Description
State | State | Contains the callable reward function and its associated source string.

get_clean_response()

Clean and validate a code response by removing code block markers and ensuring a function definition.

This method is designed to process code responses, typically extracted from text or code blocks, by performing the following operations:

  1. Remove leading and trailing code block markers (```)
  2. Remove the 'python' language identifier
  3. Strip any additional whitespace
  4. Validate that the response contains a function definition

Parameters:

Name | Type | Description | Default
response | str | The raw code response to be cleaned and validated. | required

Returns:

Name | Type | Description
str | str | The cleaned code response containing a function definition.

Raises:

Type | Description
ValueError | If the response does not contain a valid function definition (i.e., if "def " is not present in the cleaned response).

Logging

Logs the cleaned code at DEBUG level for debugging purposes.
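
The four cleaning steps listed above can be sketched as follows. This is an illustration under assumptions, not the method's actual source: the function name `clean_response` and the logger are hypothetical, and `str.removeprefix` / `str.removesuffix` require Python 3.9 or later.

```python
import logging

logger = logging.getLogger(__name__)

# Built with multiplication so the marker never appears literally in this block.
FENCE = "`" * 3


def clean_response(response: str) -> str:
    """Strip code-block markers and check for a function definition (sketch)."""
    cleaned = response.strip()
    # Step 1: drop the leading and trailing code block markers.
    cleaned = cleaned.removeprefix(FENCE).removesuffix(FENCE).strip()
    # Steps 2 and 3: drop the language identifier, then leftover whitespace.
    cleaned = cleaned.removeprefix("python").strip()
    # Step 4: a plain substring check, as described in the Raises section.
    if "def " not in cleaned:
        raise ValueError("the response does not contain a function definition")
    logger.debug("Cleaned code:\n%s", cleaned)
    return cleaned


raw = FENCE + "python\ndef reward(obs):\n    return 1.0\n" + FENCE
cleaned = clean_response(raw)
```

The substring check is deliberately loose: it accepts any text containing "def ", so real syntax problems only surface later, at compile time.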

get_runnable_function(error=None)

Process and validate a reward function for a gym environment.

This method attempts to generate and validate a reward function by:

  1. Handling potential previous errors
  2. Creating a gym environment
  3. Cleaning and compiling the code
  4. Testing the reward function with a sample action
  5. Recursively handling various potential errors

Parameters:

Name | Type | Description | Default
response | str | The code response containing the reward function definition. | required
error | str | Previous error message to be added to LLM context. Defaults to None. | None

Returns:

Name | Type | Description
tuple | tuple[Callable, str] | A tuple containing the compiled and validated reward function (Callable) and the original response code (str).

Raises:

Type | Description
ValueError | Invalid function definition
SyntaxError | Syntax issues in the function
RuntimeError | Execution problems during function testing

Note
  • Uses recursion to handle potential errors
  • Relies on get_code, compile_reward_function, and test_reward_function methods
  • Provides a robust mechanism for generating valid reward functions
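
A minimal sketch of the recursive error-feedback loop follows. It makes several assumptions that are not in the documentation: `ask_llm` is a hypothetical callable standing in for the LLM client, `sample_args` replaces the sample action taken from a gym environment, and the `attempts_left` bound is added here so the recursion always terminates.

```python
import types
from typing import Callable, Optional, Tuple


def get_runnable_function(
    ask_llm: Callable[[Optional[str]], str],
    sample_args: tuple,
    error: Optional[str] = None,
    attempts_left: int = 3,
) -> Tuple[Callable, str]:
    """Ask for code, validate it, and retry with the error on failure (sketch)."""
    if attempts_left == 0:
        raise RuntimeError(f"no runnable function produced; last error: {error}")

    # The previous error message, if any, is passed back to the model.
    response = ask_llm(error)
    try:
        if "def " not in response:
            raise ValueError("response does not contain a function definition")
        namespace: dict = {}
        exec(response, namespace)  # may raise SyntaxError
        functions = [
            v for v in namespace.values() if isinstance(v, types.FunctionType)
        ]
        if not functions:
            raise ValueError("no function could be extracted from the response")
        reward_function = functions[-1]
        try:
            reward_function(*sample_args)  # smoke test with sample inputs
        except Exception as exc:
            raise RuntimeError(f"reward function failed: {exc}") from exc
    except (ValueError, SyntaxError, RuntimeError) as exc:
        # Recurse with the error text so the next response can correct it.
        return get_runnable_function(
            ask_llm, sample_args, str(exc), attempts_left - 1
        )
    return reward_function, response
```

An iterative loop would behave the same way; recursion is used only because the notes above describe the method as recursive.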

test_reward_function(reward_function, *args, **kwargs)

Test the compiled reward function with provided inputs to validate its execution.

This method serves as a crucial validation step in the reward function generation process. It attempts to execute the reward function with the given arguments and logs the output or raises an error if execution fails.

Purpose
  • Verify the reward function can be executed without errors
  • Log the reward function's output for debugging
  • Ensure the function returns a valid result in the context of a gym environment

Parameters:

Name | Type | Description | Default
reward_function | Callable | The compiled reward function to be tested. | required
*args | | Variable length argument list to pass to the reward function. Typically includes observations, actions, or environment states. | ()
**kwargs | | Arbitrary keyword arguments to pass to the reward function. May include additional context like 'terminated' or 'truncated' flags. | {}

Raises:

Type | Description
RuntimeError | If the reward function fails to execute successfully. This includes any exceptions that occur during function invocation.

Logging
  • Logs the reward function's output at DEBUG level when successful
  • Provides detailed error information if execution fails

Notes
  • Designed to be flexible with varying function signatures
  • Critical for validating dynamically generated reward functions
  • Part of the reward function generation quality control process
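
The validation step described in this section can be sketched as a small wrapper. The name `check_reward_function` is hypothetical (chosen so that test runners do not collect it as a test case), and the exact log and error messages are assumptions.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def check_reward_function(reward_function: Callable, *args, **kwargs) -> None:
    """Call the reward function once and report any failure (sketch)."""
    try:
        reward = reward_function(*args, **kwargs)
    except Exception as exc:
        # Any exception from the call is re-raised as RuntimeError,
        # with the original exception kept as the cause.
        raise RuntimeError(
            f"error during reward function execution: {exc}"
        ) from exc
    logger.debug("Reward function output: %s", reward)


# Passes silently: the function accepts both the positional and keyword input.
check_reward_function(lambda obs, terminated=False: 1.0, 0.5, terminated=True)
```

Wrapping every failure in a single exception type gives the caller one thing to catch when deciding whether to ask the LLM for a corrected function.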