Agents
The main interaction with agents is through the agents-bar-client-python Python module.
Currently supported agent models:
- agents_bar.SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']
The main component for communicating with the Agents Bar is agents_bar.RemoteAgent.
Its full API is listed below. Note, however, that it is intentionally similar to the API of the agents
in the AI Traineree (ai-traineree) package.
- class agents_bar.RemoteAgent(client: agents_bar.client.Client, agent_name: str, **kwargs)
- __init__(client: agents_bar.client.Client, agent_name: str, **kwargs)
An instance of the agent in the Agents Bar.
- Parameters
description (str) – Optional. Description for the model, if creating a new one.
- Keyword Arguments
access_token (str) – Default None. Access token to use for authentication. If none is provided, one is obtained by logging in to the service with credentials.
username (str) – Default None. Overrides username from the env variables.
password (str) – Default None. Overrides password from the env variables.
- create_agent(obs_size: int, action_size: int, agent_model: str, active: bool = True, description: Optional[str] = None) -> Dict
Creates a new agent in the service.
Uses the information provided at RemoteAgent instantiation to create a new agent. Creation fails if the owner already has an agent with the same name.
Note that it can take a few seconds to create a new agent, and calls to the agent made before it is ready might fail. To make sure that your program doesn’t fail, either use
agents_bar.wait_until_agent_exists()
or manually sleep for a few seconds.
- Parameters
obs_size (int) – Dimensionality of the observation space.
action_size (int) – Dimensionality of the action space. In the case of a discrete space, that’s a single dimension with potential values. In the case of a continuous space, that’s the number of dimensions in uniform [0, 1] distribution.
agent_model (str) – Name of the model type. Check
agents_bar.SUPPORTED_MODELS
for accepted values.
active (bool) – Whether to activate the agent.
- Returns
Details of created agent.
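The note above about waiting for a newly created agent can be sketched as a small polling helper. The signature of agents_bar.wait_until_agent_exists() is not shown in this reference, so this sketch instead polls the documented exists property. The helper name wait_until and the timeout and interval values are assumptions made for illustration, not part of the library.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Hypothetical helper, not part of agents_bar.
    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Possible usage with a RemoteAgent instance named `agent` (not run here):
#   agent.create_agent(obs_size=4, action_size=2, agent_model="dqn")
#   if not wait_until(lambda: agent.exists, timeout=30):
#       raise TimeoutError("agent did not become available in time")
```

A bounded wait like this reports a timeout to the caller, whereas a fixed sleep can be either too short or needlessly long.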
- remove(*, agent_name: str, quite: bool = True) -> bool
Deletes the agent.
Note that this action is irreversible. All information about the agent will be lost.
- Parameters
agent_name (str) – You are required to pass the name of the agent as a proof that you’re an adult and you know what you’re doing.
quite (bool) – If true, a mismatch between the provided agent_name and the actual name is silently ignored.
- Returns
Boolean indicating whether an agent was deleted. False can mean that the agent didn’t exist.
- property exists
Whether the agent service exists and is accessible.
- property hparams: Dict[str, Union[str, float, int]]
Agent’s hyperparameters.
- Returns
Dictionary of agent’s hyperparameters. Values are either numbers or strings, even if they could be different.
- get_state() -> agents_bar.types.EncodedAgentState
Gets the agent’s state in an encoded snapshot form.
Note that this API has a heavy rate limit.
- Returns
Snapshot with config, buffer and network states being encoded.
- upload_state(state: agents_bar.types.EncodedAgentState) -> bool
Updates remote agent with provided state.
- Parameters
state – Agent’s state with encoded values for buffer, config and network states.
- Returns
Boolean confirming whether the update was successful.
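Taken together, get_state and upload_state allow a snapshot taken from one agent to be pushed to another. The sketch below shows that flow using only those two documented methods. The helper name copy_state is hypothetical, and it assumes both agents use the same model type and the same observation and action sizes; this reference does not state whether the service accepts a state from a different agent.

```python
def copy_state(source_agent, target_agent):
    """Copy an encoded snapshot from one agent to another.

    Hypothetical helper, not part of agents_bar. Both arguments are
    expected to expose the documented `get_state()` and
    `upload_state(state)` methods.
    Returns the boolean reported by `upload_state`.
    """
    # get_state is documented as heavily rate limited, so call it once
    # and reuse the returned snapshot.
    state = source_agent.get_state()
    return target_agent.upload_state(state)
```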
- act(obs, noise: float = 0) -> Union[int, List[Union[int, float]]]
Asks for action based on provided observation.
- Parameters
obs (List[float]) – Python list of floats that represents the agent’s observation.
noise (float) – Default 0. Value for epsilon in epsilon-greedy paradigm.
- Returns
- Suggested action to take from this observation.
In case of discrete problems this is a single int value. Otherwise it is a list of either floats or ints.
- Return type
action (a number or list of numbers)
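For readers unfamiliar with the epsilon-greedy paradigm that the noise parameter refers to, here is a minimal local illustration. It is a sketch of the general technique, not the service's implementation: how the Agents Bar applies noise internally is not described in this reference. The function name epsilon_greedy and its arguments are assumptions.

```python
import random


def epsilon_greedy(action_values, epsilon=0.0, rng=random):
    """Pick an action index from a list of estimated action values.

    Illustration only, not part of agents_bar. With probability
    `epsilon` a uniformly random index is returned; otherwise the index
    of the largest value (the greedy choice) is returned.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda i: action_values[i])
```

With epsilon set to 0, as in the default noise value, the choice is always greedy. Larger values trade exploitation for exploration.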
- step(obs: List[float], action: Union[int, List[Union[int, float]]], reward: float, next_obs: List[float], done: bool) -> bool
Provides information from taking a step in the environment.
Note that all values have to be plain Python values, like ints, floats, lists… Unfortunately, numpy, pandas, tensors… aren’t currently supported.
- Parameters
obs (ObsType) – Current observation.
action (ActionType) – Action taken from the current observation.
reward (float) – A reward obtained from getting to the next observation.
next_obs (ObsType) – The observation that resulted from taking action at obs.
done (bool) – A flag whether the next_obs is a terminal state.
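Because step accepts only plain Python values, data coming from NumPy, pandas or tensor libraries has to be converted first. The sketch below shows one way to do that by relying on the tolist() method those objects expose. The helper names to_plain and plain_step are hypothetical and not part of agents_bar; only the step keyword names come from the signature above.

```python
def to_plain(value):
    """Convert array-like or scalar-like values into plain Python values.

    Hypothetical helper. Objects exposing `tolist()` (NumPy arrays and
    scalars, pandas Series, PyTorch tensors) are converted through it;
    lists and tuples are converted element by element into lists;
    everything else is returned unchanged.
    """
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


def plain_step(agent, obs, action, reward, next_obs, done):
    """Call the documented `step` method with plain Python arguments."""
    return agent.step(
        obs=to_plain(obs),
        action=to_plain(action),
        reward=float(to_plain(reward)),
        next_obs=to_plain(next_obs),
        done=bool(to_plain(done)),
    )
```

Converting at the call boundary keeps the rest of a training loop free to use whatever array types the environment returns.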