The main interaction with agents is through the ai-traineree-client Python module.
Currently supported agent models:
SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']
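A model name has to match one of these values before an agent can be created with it. The sketch below is a hypothetical client-side guard, not part of the package's API:

```python
# Hypothetical guard: validate a model name locally before asking
# the service to create an agent with it.
SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']

def validate_model(agent_model: str) -> str:
    """Return the normalized model name, or raise if unsupported."""
    model = agent_model.lower()
    if model not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model '{agent_model}'; expected one of {SUPPORTED_MODELS}"
        )
    return model
```

Failing early like this avoids a round trip to the service for a request that is guaranteed to be rejected.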
The main component that allows communication with the Agents Bar is the RemoteAgent class.
Its whole API is provided below. Note, however, that it is intentionally similar to the agents
in the AI Traineree (ai-traineree) package.
RemoteAgent(agent_name: str, description: str = '', **kwargs)
__init__(agent_name: str, description: str = '', **kwargs)
An instance of the agent in the Agents Bar.
- Parameters
agent_name (str) – Name of the agent.
description (str) – Optional. Description for the model, if creating a new one.
- Keyword Arguments
access_token (str) – Default None. Access token to use for authentication. If none is provided, one is obtained by logging in to the service using credentials.
username (str) – Default None. Overrides username from the env variables.
password (str) – Default None. Overrides password from the env variables.
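The precedence described above (explicit keyword arguments override environment variables) can be sketched as a small resolver. Both the function and the environment variable names are illustrative assumptions, not the client's documented internals:

```python
import os

def resolve_credentials(username=None, password=None, env=os.environ):
    """Illustrative precedence: explicit arguments win over env variables.

    The env variable names below are assumptions for the sketch,
    not the names the client actually reads.
    """
    return (
        username if username is not None else env.get("AGENTS_BAR_USER"),
        password if password is not None else env.get("AGENTS_BAR_PASS"),
    )
```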
get_access_token(username=None, password=None, access_token=None) → str
Retrieves access token.
create_agent(state_size: int, action_size: int, agent_model: str) → Dict
Creates a new agent in the service.
Uses provided information on RemoteAgent instantiation to create a new agent. Creating a new agent will fail if the owner already has one with the same name.
Note that it can take a few seconds to create a new agent. During that time, any calls to the agent might fail. To make sure that your program doesn't fail, either use
ai_traineree_client.wait_until_agent_exists() or manually sleep for a few seconds.
state_size (int) – Dimensionality of the state space.
action_size (int) – Dimensionality of the action space. In the case of a discrete space, this is a single dimension with action_size potential values. In the case of a continuous space, this is the number of dimensions in a uniform [0, 1] distribution.
agent_model (str) – Name of the model type. Check ai_traineree_client.SUPPORTED_MODELS for accepted values.
Details of created agent.
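Since creation is asynchronous, a polling loop is one way to wait for the agent to become available. The sketch below is a generic stand-in for ai_traineree_client.wait_until_agent_exists(); in real use, predicate would check the remote agent's existence:

```python
import time

def wait_until(predicate, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    A generic illustration of waiting for a newly created agent;
    the real helper is ai_traineree_client.wait_until_agent_exists().
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False
```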
remove(*, agent_name: str, quite: bool = True) → bool
Deletes the agent.
Note that this action is irreversible. All information about the agent will be lost.
agent_name (str) – You are required to pass the name of the agent as proof that you're an adult and you know what you're doing.
quite (bool) – Default True. Silently ignore the request if the provided agent_name doesn't match the agent's actual name.
Boolean indicating whether the delete was successful.
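The name-confirmation contract described above can be illustrated with a small local guard. This is a sketch of the behavior, not the client's implementation:

```python
def confirm_removal(actual_name: str, provided_name: str, quite: bool = True) -> bool:
    """Illustrative guard for the `remove` contract: deletion may proceed
    only when the provided name matches the agent's actual name.

    On a mismatch, returns False when `quite` is True; otherwise raises.
    """
    if provided_name == actual_name:
        return True
    if quite:
        return False
    raise ValueError(f"agent_name '{provided_name}' does not match '{actual_name}'")
```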
Whether the agent service exists and is accessible.
Dictionary of the agent's hyperparameters. Values are returned as numbers or strings, even when their native types differ.
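Because values may come back stringly-typed, a best-effort coercion helper can restore numeric types. This is an illustrative sketch, not part of the client:

```python
def coerce_hparam(value):
    """Best-effort conversion of a stringly-typed hyperparameter value
    back to a Python number; non-numeric strings pass through unchanged.
    Illustrative only."""
    if isinstance(value, (int, float)):
        return value
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value
```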
Gets the agent's metadata from the server.
sync() → None
Synchronizes local information with the one stored in Agents Bar.
get_state() → ai_traineree_client.types.EncodedAgentState
Gets the agent's state in an encoded snapshot form.
Note that this API has a heavy rate limit.
Snapshot with the config, buffer, and network states encoded.
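To make "encoded" concrete, the sketch below round-trips one snapshot component (config, buffer, or network state) through base64-encoded JSON. The wire format is an assumption for illustration; the actual EncodedAgentState encoding may differ:

```python
import base64
import json

def encode_component(component: dict) -> str:
    """Hypothetical encoding of one snapshot component as base64 JSON.

    Assumption for illustration only; the real encoding may differ.
    """
    return base64.b64encode(json.dumps(component).encode()).decode("ascii")

def decode_component(blob: str) -> dict:
    """Inverse of encode_component: recover the original component."""
    return json.loads(base64.b64decode(blob.encode("ascii")))
```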
upload_state(state: ai_traineree_client.types.EncodedAgentState) → bool
Updates remote agent with provided state.
state – Agent’s state with encoded values for buffer, config and network states.
Boolean confirming whether the update was successful.
act(state, noise: float = 0) → Union[int, List[Union[int, float]]]
Asks for an action based on the provided state.
state (List[float]) – Python list of floats representing the agent's state.
noise (float) – Default 0. Value for epsilon in the epsilon-greedy paradigm.
- Returns
Suggested action to take from this state. For discrete problems this is a single int value. Otherwise it is a list of either floats or ints.
- Return type
action (a number or a list of numbers)
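The epsilon-greedy role of noise can be illustrated locally: with probability noise a random action is chosen, otherwise the greedy one. The remote agent performs its own action selection server-side; this is only a sketch of the concept:

```python
import random

def epsilon_greedy(q_values, noise: float = 0.0, rng=random) -> int:
    """Illustration of epsilon-greedy selection: with probability `noise`
    pick a random action index; otherwise pick the argmax (greedy) index."""
    if rng.random() < noise:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```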
step(state: List[float], action: Union[int, List[Union[int, float]]], reward: float, next_state: List[float], done: bool) → bool
Provides information from taking a step in the environment.
Note that all values have to be plain Python values, like ints, floats, and lists. Unfortunately, numpy arrays, pandas objects, and tensors aren't currently supported.
state (StateType) – Current state.
action (ActionType) – Action taken from the current state.
reward (float) – A reward obtained from getting to the next state.
next_state (StateType) – The state that resulted from taking action at state.
done (bool) – A flag whether the next_state is a terminal state.
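Since step accepts only plain Python values, array-like objects need converting first. The helper below is a hypothetical, dependency-free sketch that duck-types on .tolist() (which numpy arrays and many tensor types expose):

```python
def to_plain(value):
    """Convert array-like values (anything exposing .tolist(), e.g. numpy
    arrays) into plain Python lists and numbers, as `step` requires.
    Duck-typed so the sketch stays dependency-free; illustrative only."""
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value
```

A typical call would then be step(to_plain(state), to_plain(action), float(reward), to_plain(next_state), bool(done)).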