The main interaction with agents is through the agents-bar-client-python Python module.
Currently supported agent models:
- agents_bar.SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']
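Because agent_model has to be one of the supported names, it can help to check the value before sending a request. The following is a minimal sketch, not part of agents_bar: the list is copied locally so the snippet runs without the package installed, validate_model is a hypothetical helper, and normalizing to lower case is an assumption about what callers want.

```python
# Local copy of the documented value of agents_bar.SUPPORTED_MODELS.
SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']


def validate_model(agent_model: str) -> str:
    """Return the normalized model name or raise ValueError (hypothetical helper)."""
    # Assumption: callers may pass mixed case or stray whitespace.
    name = agent_model.strip().lower()
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported agent model {agent_model!r}; expected one of {SUPPORTED_MODELS}"
        )
    return name


model_name = validate_model(" DQN ")
```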
The main component that allows communication with the Agents Bar is the RemoteAgent class.
Its whole API is provided below. Note, however, that it is intentionally similar to the API of all agents
in the AI Traineree (ai-traineree) package.
- class agents_bar.RemoteAgent(client: agents_bar.client.Client, agent_name: str, **kwargs)
- __init__(client: agents_bar.client.Client, agent_name: str, **kwargs)
An instance of the agent in the Agents Bar.
- Parameters
description (str) – Optional. Description for the model, if creating a new one.
- Keyword Arguments
access_token (str) – Default None. Access token to use for authentication. If none is provided, one is obtained by logging in to the service with credentials.
username (str) – Default None. Overrides username from the env variables.
password (str) – Default None. Overrides password from the env variables.
- create_agent(obs_size: int, action_size: int, agent_model: str, active: bool = True, description: Optional[str] = None) -> Dict
Creates a new agent in the service.
Uses the information provided on RemoteAgent instantiation to create a new agent. Creating a new agent will fail if the owner already has one with the same name.
Note that it can take a few seconds to create a new agent. During that time, any calls to the agent might fail. To make sure that your program doesn’t fail, either use agents_bar.wait_until_agent_exists() or manually sleep for a few seconds.
- Parameters
obs_size (int) – Dimensionality of the observation space.
action_size (int) – Dimensionality of the action space. For a discrete space, the action is a single dimension and action_size is the number of possible values. For a continuous space, it is the number of dimensions, each taking values in [0, 1].
agent_model (str) – Name of the model type. Check agents_bar.SUPPORTED_MODELS for accepted values.
active (bool) – Whether to activate the agent.
- Returns
Details of the created agent.
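The "manually sleep" option mentioned above can be sketched as a small polling loop around the exists property. This is only an illustration: wait_until is a hypothetical helper (not agents_bar.wait_until_agent_exists(), whose signature is not shown here), and FakeAgent is a stand-in so that the snippet runs without network access.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeAgent:
    """Stand-in for RemoteAgent that becomes available after a few checks."""

    def __init__(self, ready_after: int) -> None:
        self._checks_left = ready_after

    @property
    def exists(self) -> bool:
        # Each access simulates one request to the service.
        self._checks_left -= 1
        return self._checks_left <= 0


agent = FakeAgent(ready_after=3)
ready = wait_until(lambda: agent.exists, timeout=2.0, interval=0.01)
```

With a real agent, the same predicate (lambda: agent.exists) would presumably be passed after calling create_agent, with a longer interval to avoid hammering the service.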
- remove(*, agent_name: str, quite: bool = True) -> bool
Deletes the agent.
Note that this action is irreversible. All information about the agent will be lost.
- Parameters
agent_name (str) – You are required to pass the name of the agent as proof that you’re an adult and you know what you’re doing.
quite (bool) – If True, silently ignores a mismatch between the provided agent_name and the actual name.
- Returns
Whether the agent was deleted. False can mean that the agent didn’t exist.
- property exists
Whether the agent service exists and is accessible.
- property hparams: Dict[str, Union[str, float, int]]
Dictionary of agent’s hyperparameters. Values are either numbers or strings, even if they could be different.
- info() -> Dict[str, Any]
Gets the agent’s metadata from the server.
- sync() -> None
Synchronizes local information with the one stored in Agents Bar.
- get_state() -> agents_bar.types.EncodedAgentState
Gets the agent’s state in an encoded snapshot form.
Note that this API is heavily rate limited.
- Returns
Snapshot with the config, buffer and network states encoded.
- upload_state(state: agents_bar.types.EncodedAgentState) -> bool
Updates the remote agent with the provided state.
- Parameters
state – Agent’s state with encoded values for buffer, config and network states.
- Returns
Bool confirming whether the update was successful.
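Since get_state() is rate limited, one way to copy a snapshot to several agents is to fetch it once and reuse it. The sketch below assumes only the two documented method names; SnapshotCache is a hypothetical helper, FakeAgent is a stand-in for RemoteAgent, and a plain dict stands in for EncodedAgentState.

```python
from typing import Any


class FakeAgent:
    """Hypothetical stand-in exposing get_state/upload_state like RemoteAgent."""

    def __init__(self, state: Any) -> None:
        self._state = state
        self.get_state_calls = 0

    def get_state(self) -> Any:
        self.get_state_calls += 1
        return self._state

    def upload_state(self, state: Any) -> bool:
        self._state = state
        return True


class SnapshotCache:
    """Fetches a snapshot once and reuses it to limit get_state() calls."""

    def __init__(self, agent: Any) -> None:
        self._agent = agent
        self._snapshot: Any = None
        self._loaded = False

    def get(self, refresh: bool = False) -> Any:
        # Only hit the (rate-limited) endpoint on first use or explicit refresh.
        if refresh or not self._loaded:
            self._snapshot = self._agent.get_state()
            self._loaded = True
        return self._snapshot


source = FakeAgent({"config": "c", "buffer": "b", "network": "n"})
targets = [FakeAgent(None), FakeAgent(None)]
cache = SnapshotCache(source)
results = [target.upload_state(cache.get()) for target in targets]
```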
- act(obs, noise: float = 0) -> Union[int, List[Union[int, float]]]
Asks for an action based on the provided observation.
- Parameters
obs (List[float]) – Python list of floats which represent the agent’s observation.
noise (float) – Default 0. Value for epsilon in the epsilon-greedy paradigm.
- Returns
Suggested action to take from this observation.
In case of discrete problems this is a single int value. Otherwise it is a list of either floats or ints.
- Return type
action (a number or list of numbers)
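For readers unfamiliar with the epsilon-greedy paradigm that noise refers to, here is a minimal local sketch of the standard technique. How the service applies noise internally is not shown in this document, so the function below is an illustration only; epsilon_greedy and its q_values argument are hypothetical names.

```python
import random
from typing import List, Optional


def epsilon_greedy(q_values: List[float], epsilon: float,
                   rng: Optional[random.Random] = None) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: any action index, chosen uniformly.
        return rng.randrange(len(q_values))
    # Exploit: index of the highest estimated value.
    return max(range(len(q_values)), key=q_values.__getitem__)


greedy_action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=0.0)
random_action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=1.0, rng=random.Random(0))
```

With epsilon set to 0 the choice is always greedy, which matches the documented default of noise = 0.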
- step(obs: List[float], action: Union[int, List[Union[int, float]]], reward: float, next_obs: List[float], done: bool) -> bool
Provides information from taking a step in the environment.
Note that all values have to be plain Python values, like ints, floats, lists… Unfortunately, numpy, pandas, tensors… aren’t currently supported.
- Parameters
obs (ObsType) – Current observation.
action (ActionType) – Action taken from the current observation.
reward (float) – A reward obtained from getting to the next observation.
next_obs (ObsType) – The observation that resulted from taking action at obs.
done (bool) – Flag indicating whether next_obs is a terminal state.
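The act and step calls are typically combined in an interaction loop, converting values to plain Python types first. The sketch below is an assumption-laden illustration: to_plain, RecordingAgent, CountdownEnv and run_episode are all hypothetical names, the agent is a local stand-in with the documented act/step signatures, and the environment is a toy rather than any particular library.

```python
from typing import Any, List, Tuple


def to_plain(value: Any) -> Any:
    """Convert array-like values to plain Python lists and numbers."""
    if hasattr(value, "tolist"):  # numpy arrays/scalars and tensors expose tolist()
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value


class RecordingAgent:
    """Stand-in with the documented act/step signatures; records transitions."""

    def __init__(self) -> None:
        self.transitions: List[Tuple] = []

    def act(self, obs, noise: float = 0):
        return 0

    def step(self, obs, action, reward, next_obs, done) -> bool:
        self.transitions.append((obs, action, reward, next_obs, done))
        return True


class CountdownEnv:
    """Toy environment: the observation counts down to zero, then the episode ends."""

    def __init__(self, start: int = 3) -> None:
        self.start = start
        self.value = start

    def reset(self):
        self.value = self.start
        return (float(self.value),)

    def step(self, action):
        self.value -= 1
        return (float(self.value),), 1.0, self.value <= 0


def run_episode(agent, env, noise: float = 0.0) -> float:
    obs = to_plain(env.reset())
    total, done = 0.0, False
    while not done:
        action = agent.act(obs, noise)
        next_obs, reward, done = env.step(action)
        # Make sure only plain Python values are sent to the agent.
        next_obs, reward, done = to_plain(next_obs), float(reward), bool(done)
        agent.step(obs, action, reward, next_obs, done)
        total += reward
        obs = next_obs
    return total


agent = RecordingAgent()
total_reward = run_episode(agent, CountdownEnv(start=3))
```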