Agents

The main interaction with agents is through the agents-bar-client-python Python module.

Currently supported agent models:

agents_bar.SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']
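A model name can be checked locally before it is sent to the service. The following is a minimal sketch: the list is copied from the value shown above so that the snippet runs without the package installed, and `validate_model` is a hypothetical helper, not part of agents_bar. Lowercasing the name is an assumption based on the documented values all being lowercase.

```python
# Copied from the documented value of agents_bar.SUPPORTED_MODELS so that this
# sketch runs on its own; in real code, read the value from the package.
SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']


def validate_model(agent_model: str) -> str:
    """Return the normalized model name, or raise ValueError (hypothetical helper)."""
    name = agent_model.strip().lower()
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model {agent_model!r}; expected one of {SUPPORTED_MODELS}"
        )
    return name
```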


The main component that allows communication with the Agents Bar is the agents_bar.RemoteAgent class. Its full API is provided below. Note, however, that it is intentionally similar to the API of the agents in the AI Traineree (ai-traineree) package.

class agents_bar.RemoteAgent(client: agents_bar.client.Client, agent_name: str, **kwargs)
__init__(client: agents_bar.client.Client, agent_name: str, **kwargs)

An instance of the agent in the Agents Bar.

Parameters

description (str) – Optional. Description for the model, if creating a new one.

Keyword Arguments
  • access_token (str) – Default None. Access token to use for authentication. If none is provided, one is obtained by logging in to the service using credentials.

  • username (str) – Default None. Overrides username from the env variables.

  • password (str) – Default None. Overrides password from the env variables.

create_agent(obs_size: int, action_size: int, agent_model: str, active: bool = True, description: Optional[str] = None) → Dict

Creates a new agent in the service.

Uses the information provided at RemoteAgent instantiation to create a new agent. Creating a new agent will fail if the owner already has one with the same name.

Note that it can take a few seconds to create a new agent. During that time, any calls to the agent might fail. To make sure that your program doesn’t fail, either use agents_bar.wait_until_agent_exists() or manually sleep for a few seconds.

Parameters
  • obs_size (int) – Dimensionality of the observation space.

  • action_size (int) – Dimensionality of the action space. For a discrete space, this is a single dimension whose size is the number of possible values. For a continuous space, this is the number of dimensions, each in the uniform [0, 1] range.

  • agent_model (str) – Name of the model type. Check agents_bar.SUPPORTED_MODELS for accepted values.

  • active (bool) – Whether to activate the agent.

Returns

Details of created agent.
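The waiting described above can also be done by hand with the documented `exists` property. The following is a sketch under stated assumptions: `wait_until_exists` is a hypothetical local helper (it is not agents_bar.wait_until_agent_exists(), whose signature is not shown here), and the default timeout and polling interval are arbitrary.

```python
import time


def wait_until_exists(agent, timeout: float = 30.0, poll_interval: float = 1.0) -> bool:
    """Poll `agent.exists` until it is truthy or `timeout` seconds have elapsed.

    `agent` is any object exposing the documented `exists` property.
    Returns True if the agent became available, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if agent.exists:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A typical use would be to call `create_agent(...)`, then `wait_until_exists(agent)`, and only proceed to `act` and `step` once it returns True.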

remove(*, agent_name: str, quite: bool = True) → bool

Deletes the agent.

Note that this action is irreversible. All information about the agent will be lost.

Parameters
  • agent_name (str) – You are required to pass the name of the agent as proof that you’re an adult and you know what you’re doing.

  • quite (bool) – Whether to silently ignore the case where the provided agent_name doesn’t match the actual name.

Returns

Boolean indicating whether the agent was deleted. False can mean that the agent didn’t exist.

property exists

Whether the agent service exists and is accessible.

property hparams: Dict[str, Union[str, float, int]]

Agent’s hyperparameters.

Returns

Dictionary of the agent’s hyperparameters. Values are returned as either numbers or strings, even if the underlying values could be of a different type.

info() → Dict[str, Any]

Gets the agent’s metadata from the server.

sync() → None

Synchronizes local information with the one stored in Agents Bar.

get_state() → agents_bar.types.EncodedAgentState

Gets the agent’s state in an encoded snapshot form.

Note that this API has a heavy rate limit.

Returns

Snapshot with config, buffer and network states being encoded.

upload_state(state: agents_bar.types.EncodedAgentState) → bool

Updates the remote agent with the provided state.

Parameters

state – Agent’s state with encoded values for buffer, config and network states.

Returns

Bool confirmation whether update was successful.
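Together, get_state() and upload_state() allow a simple checkpoint-and-rollback pattern. The following is a sketch, not a prescribed workflow: `run_with_rollback` and `experiment` are hypothetical names, and it assumes that a snapshot taken from an agent can later be uploaded back to the same agent. Because get_state() is heavily rate limited, the snapshot is taken only once.

```python
from typing import Any, Callable


def run_with_rollback(agent: Any, experiment: Callable[[Any], None]) -> bool:
    """Run `experiment(agent)`; restore the earlier snapshot if it raises.

    `agent` is any object with the documented `get_state()` and
    `upload_state(state)` methods. Returns True if the experiment
    completed, False if the agent was rolled back.
    """
    snapshot = agent.get_state()  # rate limited, so fetch a single snapshot
    try:
        experiment(agent)
    except Exception:
        # Something went wrong: put the agent back into its previous state.
        agent.upload_state(snapshot)
        return False
    return True
```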

act(obs, noise: float = 0) → Union[int, List[Union[int, float]]]

Asks for action based on provided observation.

Parameters
  • obs (List[float]) – Python list of floats which represents the agent’s observation.

  • noise (float) – Default 0. Value for epsilon in epsilon-greedy paradigm.

Returns

Suggested action to take from this observation.

In case of discrete problems this is a single int value. Otherwise it is a list of either floats or ints.

Return type

action (a number or list of numbers)
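The noise parameter is described as the epsilon of an epsilon-greedy scheme. How the service applies it internally is not documented here, so the following is only a generic local illustration of epsilon-greedy selection over a discrete action space. The names `epsilon_greedy` and `action_values` are hypothetical and not part of agents_bar.

```python
import random
from typing import List, Optional


def epsilon_greedy(
    action_values: List[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick a random action with probability `epsilon`, else the best-valued one."""
    rng = rng or random  # fall back to the module-level generator
    if rng.random() < epsilon:
        # Explore: uniformly random action index.
        return rng.randrange(len(action_values))
    # Exploit: index of the highest estimated value.
    return max(range(len(action_values)), key=action_values.__getitem__)
```

With `epsilon=0` (the default for noise) the choice is always greedy; larger values trade exploitation for exploration.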

step(obs: List[float], action: Union[int, List[Union[int, float]]], reward: float, next_obs: List[float], done: bool) → bool

Provides information from taking a step in the environment.

Note that all values have to be plain Python values, like ints, floats, lists… Unfortunately, numpy, pandas, tensors… aren’t currently supported.

Parameters
  • obs (ObsType) – Current observation.

  • action (ActionType) – Action taken from the current observation.

  • reward (float) – A reward obtained from getting to the next observation.

  • next_obs (ObsType) – The observation that resulted from taking action at obs.

  • done (bool) – A flag whether the next_obs is a terminal state.
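A typical interaction alternates act() and step(). The following is a minimal sketch of one episode, assuming a hypothetical environment object with `reset()` returning an observation and `step(action)` returning `(next_obs, reward, done)`; that interface is not part of agents_bar. The `to_plain` helper is also hypothetical: it relies on duck typing (NumPy arrays and PyTorch tensors both expose `tolist()`) to satisfy the plain-Python-values requirement without importing those libraries.

```python
from typing import Any


def to_plain(value: Any) -> Any:
    """Convert array-like values to plain Python lists and scalars."""
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value


def run_episode(agent: Any, env: Any, max_steps: int = 1000, noise: float = 0.0) -> float:
    """Run a single episode and return the total reward.

    `agent` needs the documented `act` and `step` methods; `env` is the
    hypothetical environment described above.
    """
    obs = to_plain(env.reset())
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(obs, noise)
        next_obs, reward, done = env.step(action)
        # Coerce everything to plain Python values before sending it on.
        next_obs = to_plain(next_obs)
        reward = float(reward)
        done = bool(done)
        agent.step(obs, action, reward, next_obs, done)
        total_reward += reward
        obs = next_obs
        if done:
            break
    return total_reward
```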