GenCode
Generates code from an LLM response, handles errors, and refines the code using new responses from the LLM.
__init__(env, llm)
Initialize the code generator with the environment used to test the code and the LLM used to produce and refine responses.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
env | Environments | The environment used to test the code. | required |
llm | OllamaChat | The LLM that handles the response. | required |
compile_reward_function()
Compile a reward function dynamically from a string response.
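This method takes a code string representing a reward function and dynamically compiles it into an executable Python function. It provides a convenient way to generate reward functions for reinforcement learning environments. Because the code is run with exec(), the isolated namespace limits name collisions but is not a security sandbox.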
Key Features
- Dynamically executes code in an isolated global namespace
- Provides access to NumPy functions
- Extracts the compiled function by its name
- Robust error handling for syntax issues
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | str | A string containing a complete Python function definition for a reward function. | required |
Returns:
Name | Type | Description |
---|---|---|
Callable | Callable | The compiled reward function that can be called with appropriate arguments in a gym environment. |
Raises:
Type | Description |
---|---|
SyntaxError | If the provided code contains invalid Python syntax. |
ValueError | If the function cannot be extracted from the compiled namespace. |
Notes
- Uses exec() for dynamic code compilation
- Provides NumPy (np) in the execution namespace
- Assumes the last function defined in the response is the reward function
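The compile step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the name compile_function_sketch and the extra_globals parameter are hypothetical, and the standard-library math module stands in for NumPy so the sketch has no third-party dependency. It parses the response, picks the last top-level function, and executes the code in its own namespace.

```python
import ast
import math
from typing import Callable, Optional


def compile_function_sketch(source: str, extra_globals: Optional[dict] = None) -> Callable:
    """Hypothetical helper: compile `source` and return its last top-level function."""
    # ast.parse raises SyntaxError when the code is not valid Python.
    tree = ast.parse(source)

    # Collect the names of top-level functions in definition order.
    names = [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]
    if not names:
        raise ValueError("no function definition found in the response")

    # Execute in a fresh namespace so nothing leaks into the caller's globals.
    namespace = dict(extra_globals or {})
    exec(compile(tree, "<response>", "exec"), namespace)

    # Assumption taken from the notes above: the last function is the reward function.
    return namespace[names[-1]]


source = (
    "def helper(x):\n"
    "    return x * 2\n"
    "\n"
    "def reward(obs):\n"
    "    return -math.fabs(helper(obs))\n"
)
reward = compile_function_sketch(source, {"math": math})
```

A separate namespace keeps the generated names away from the caller's globals, but exec() still runs the code with full interpreter privileges, so the response should be treated as untrusted input.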
get(response)
Get the next state from the response code
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | str | Response code from the LLM. | required |
Returns:
Name | Type | Description |
---|---|---|
State | State | Contains the callable reward function and its associated source string. |
get_clean_response()
Clean and validate a code response by removing code block markers and ensuring a function definition.
This method is designed to process code responses, typically extracted from text or code blocks, by performing the following operations:
- Remove leading and trailing code block markers (```)
- Remove the 'python' language identifier
- Strip any additional whitespace
- Validate that the response contains a function definition
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | str | The raw code response to be cleaned and validated. | required |
Returns:
Name | Type | Description |
---|---|---|
str | None | The cleaned code response containing a function definition. |
Raises:
Type | Description |
---|---|
ValueError | If the response does not contain a valid function definition (i.e., if "def " is not present in the cleaned response). |
Logging
Logs the cleaned code at DEBUG level for debugging purposes.
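The cleaning steps listed for this method could look roughly like the sketch below. The function name clean_response_sketch and the logger name are hypothetical, and the real method may differ in detail (for example in how it treats fences in the middle of the text). The fence marker is built with string repetition only so that the example does not contain a literal fence.

```python
import logging

logger = logging.getLogger("gencode_sketch")  # hypothetical logger name

FENCE = "`" * 3  # the three-backtick code block marker


def clean_response_sketch(response: str) -> str:
    """Hypothetical helper: strip fence markers and check for a function definition."""
    cleaned = response.strip()

    # Remove a leading and a trailing code block marker, if present.
    if cleaned.startswith(FENCE):
        cleaned = cleaned[len(FENCE):]
    if cleaned.endswith(FENCE):
        cleaned = cleaned[:-len(FENCE)]
    cleaned = cleaned.strip()

    # Remove the 'python' language identifier that follows the opening marker.
    if cleaned.startswith("python"):
        cleaned = cleaned[len("python"):]
    cleaned = cleaned.strip()

    # Validate that something resembling a function definition remains.
    if "def " not in cleaned:
        raise ValueError("the response does not contain a function definition")

    logger.debug("Cleaned code:\n%s", cleaned)
    return cleaned


raw = FENCE + "python\ndef reward(obs):\n    return 1.0\n" + FENCE
cleaned = clean_response_sketch(raw)
```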
get_runnable_function(error=None)
Process and validate a reward function for a gym environment.
This method attempts to generate and validate a reward function by:
- Handling potential previous errors
- Creating a gym environment
- Cleaning and compiling the code
- Testing the reward function with a sample action
- Recursively handling various potential errors
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | str | The code response containing the reward function definition. | required |
error | str | Previous error message to be added to LLM context. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
tuple | Callable | A tuple containing the compiled and validated reward function (Callable) and the original response code (str). |
Raises:
Type | Description |
---|---|
ValueError | Invalid function definition. |
SyntaxError | Syntax issues in the function. |
RuntimeError | Execution problems during function testing. |
Note
- Uses recursion to handle potential errors
- Relies on get_code, compile_reward_function, and test_reward_function methods
- Provides a robust mechanism for generating valid reward functions
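The recursive retry pattern described in this section can be illustrated with the sketch below. Everything in it is an assumption made for illustration: ask_llm is a hypothetical callable standing in for the LLM client, compile_last_function is a simplified stand-in for the compile step, and attempts_left is an added bound on the recursion that the documentation above does not mention.

```python
import inspect
from typing import Callable, Optional, Tuple


def compile_last_function(source: str) -> Callable:
    """Simplified stand-in: exec the code and return the last function it defines."""
    namespace: dict = {}
    exec(source, namespace)  # SyntaxError propagates to the caller
    functions = [v for v in namespace.values() if inspect.isfunction(v)]
    if not functions:
        raise ValueError("no function found in the response")
    return functions[-1]


def get_runnable_sketch(
    ask_llm: Callable[[Optional[str]], str],
    sample_args: tuple,
    error: Optional[str] = None,
    attempts_left: int = 3,  # assumption: bound the recursion
) -> Tuple[Callable, str]:
    if attempts_left == 0:
        raise RuntimeError(f"no runnable function; last error: {error}")

    # The previous error (if any) is handed back to the LLM as extra context.
    response = ask_llm(error)
    try:
        if "def " not in response:
            raise ValueError("no function definition in the response")
        fn = compile_last_function(response)
        fn(*sample_args)  # smoke test with sample inputs
    except Exception as exc:
        # Recurse with the error message so the next response can fix it.
        message = f"{type(exc).__name__}: {exc}"
        return get_runnable_sketch(
            ask_llm, sample_args, error=message, attempts_left=attempts_left - 1
        )
    return fn, response


# A fake LLM that needs two corrections before producing working code.
replies = iter([
    "def reward(obs):\n    return obs[",
    "def reward(obs):\n    return undefined_name",
    "def reward(obs):\n    return float(sum(obs))",
])
seen_errors = []


def fake_llm(error: Optional[str]) -> str:
    seen_errors.append(error)
    return next(replies)


fn, code = get_runnable_sketch(fake_llm, ([1.0, 2.0],))
```

Bounding the number of attempts is a design choice of this sketch: without it, an LLM that keeps returning broken code would recurse until Python's recursion limit is reached.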
test_reward_function(reward_function, *args, **kwargs)
Test the compiled reward function with provided inputs to validate its execution.
This method serves as a crucial validation step in the reward function generation process. It attempts to execute the reward function with the given arguments and logs the output or raises an error if execution fails.
Purpose
- Verify the reward function can be executed without errors
- Log the reward function's output for debugging
- Ensure the function returns a valid result in the context of a gym environment
Parameters:
Name | Type | Description | Default |
---|---|---|---|
reward_function | Callable | The compiled reward function to be tested. | required |
*args | | Variable length argument list to pass to the reward function. Typically includes observations, actions, or environment states. | () |
**kwargs | | Arbitrary keyword arguments to pass to the reward function. May include additional context like 'terminated' or 'truncated' flags. | {} |
Raises:
Type | Description |
---|---|
RuntimeError
|
If the reward function fails to execute successfully. This includes any exceptions that occur during function invocation. |
Logging
- Logs the reward function's output at DEBUG level when successful
- Provides detailed error information if execution fails
Notes
- Designed to be flexible with varying function signatures
- Critical for validating dynamically generated reward functions
- Part of the reward function generation quality control process
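A minimal sketch of this validation step is shown below, assuming it amounts to calling the function, logging the result at DEBUG level, and wrapping any failure in a RuntimeError. The name check_reward_function and the logger name are hypothetical, and returning the reward is a convenience of the sketch rather than documented behavior.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("gencode_sketch")  # hypothetical logger name


def check_reward_function(reward_function: Callable, *args: Any, **kwargs: Any) -> Any:
    """Hypothetical helper: run the reward function once and report failures."""
    try:
        reward = reward_function(*args, **kwargs)
    except Exception as exc:
        # Any exception raised by the generated code is surfaced as RuntimeError,
        # with the original exception kept as the cause.
        raise RuntimeError(f"reward function failed to execute: {exc}") from exc
    logger.debug("Reward function output: %s", reward)
    return reward


def sample_reward(observation, terminated=False):
    # Toy reward: penalize termination, otherwise reward survival.
    return -1.0 if terminated else 1.0


result = check_reward_function(sample_reward, [0.0, 0.1], terminated=True)
```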