Agents

The main interaction with agents is through the ai-traineree-client Python module.

Currently supported agent models:

ai_traineree_client.SUPPORTED_MODELS = ['dqn', 'ppo', 'ddpg', 'rainbow']
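
For instance (a minimal sketch), you can validate a model name against this list before asking the service to create an agent with it:

    from ai_traineree_client import SUPPORTED_MODELS

    model = 'dqn'
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model '{model}'; choose one of {SUPPORTED_MODELS}")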

The main component that allows communication with the Agents Bar is the ai_traineree_client.RemoteAgent class. Its whole API is provided below. Note, however, that it is intentionally similar to the agents in the AI Traineree (ai-traineree) package.

class ai_traineree_client.RemoteAgent(agent_name: str, description: str = '', **kwargs)[source]
__init__(agent_name: str, description: str = '', **kwargs)[source]

An instance of the agent in the Agents Bar.

Parameters

  • agent_name (str) – Name of the agent in the Agents Bar.

  • description (str) – Optional. Description for the model, if creating a new one.

Keyword Arguments
  • access_token (str) – Default None. Access token to use for authentication. If none provided then one is obtained by logging to the service using credentials.

  • username (str) – Default None. Overrides username from the env variables.

  • password (str) – Default None. Overrides password from the env variables.
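
A minimal instantiation sketch; the agent name and credentials below are placeholders, and by default credentials are read from environment variables:

    from ai_traineree_client import RemoteAgent

    # Credentials are taken from env variables unless overridden via keyword arguments.
    agent = RemoteAgent(agent_name='CartPoleAgent', description='DQN agent for CartPole')

    # Explicit credentials (placeholder values) override the env variables.
    agent = RemoteAgent(agent_name='CartPoleAgent', username='my_user', password='my_pass')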

get_access_token(username=None, password=None, access_token=None) → str[source]

Retrieves an access token. If access_token is provided it is used directly; otherwise one is obtained by logging in to the service with the given credentials.
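
Continuing from the sketch above (credentials are placeholders), a token can be fetched once and reused for another RemoteAgent instance:

    token = agent.get_access_token(username='my_user', password='my_pass')

    # Reusing the token avoids logging in again for another agent.
    other_agent = RemoteAgent(agent_name='OtherAgent', access_token=token)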

create_agent(state_size: int, action_size: int, agent_model: str) → Dict[source]

Creates a new agent in the service.

Uses the information provided on RemoteAgent instantiation to create a new agent. Creating a new agent will fail if the owner already has one with the same name.

Note that it can take a few seconds to create a new agent. In such a case, any calls to the agent might fail. To make sure that your program doesn't fail, either use ai_traineree_client.wait_until_agent_exists() or manually sleep for a few seconds.

Parameters
  • state_size (int) – Dimensionality of the state space.

  • action_size (int) – Dimensionality of the action space. In the case of a discrete space, this is a single dimension with that many possible values. In the case of a continuous space, this is the number of dimensions, each in a uniform [0, 1] distribution.

  • agent_model (str) – Name of the model type. Check ai_traineree_client.SUPPORTED_MODELS for accepted values.

Returns

Details of created agent.
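
A sketch of creating an agent and waiting until it is reachable; the state_size and action_size values assume a CartPole-like environment and are purely illustrative:

    import time

    details = agent.create_agent(state_size=4, action_size=2, agent_model='dqn')

    # Creation may take a few seconds. Either use
    # ai_traineree_client.wait_until_agent_exists() or poll manually, e.g.:
    while not agent.exists:
        time.sleep(1)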

remove(*, agent_name: str, quite: bool = True) → bool[source]

Deletes the agent.

Note that this action is irreversible. All information about the agent will be lost.

Parameters
  • agent_name (str) – You are required to pass the name of the agent as proof that you're an adult and know what you're doing.

  • quite (bool) – Default True. If True, the request is silently ignored when the provided agent_name doesn't match the actual name.

Returns

Boolean whether delete was successful.
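
A sketch of deleting the agent; note that agent_name is keyword-only and has to be repeated on purpose (the name is a placeholder):

    removed = agent.remove(agent_name='CartPoleAgent')
    print('Removed' if removed else 'Agent was not removed')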

property exists

Whether the agent service exists and is accessible

property hparams

Agent's hyperparameters.

Returns

Dictionary of the agent's hyperparameters. Values are returned as either numbers or strings, even if their original types differ.
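
A quick way to inspect them (a sketch; the exact keys depend on the agent's model):

    for name, value in agent.hparams.items():
        print(f'{name}: {value}')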

info()[source]

Gets the agent's metadata from the server.

sync() → None[source]

Synchronizes local information with the one stored in Agents Bar.
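
A sketch combining both calls: refresh the local information first, then read the metadata (the exact fields returned depend on the service):

    agent.sync()         # pull the latest information from Agents Bar
    meta = agent.info()  # agent's metadata as stored on the server
    print(meta)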

get_state() → ai_traineree_client.types.EncodedAgentState[source]

Gets the agent's state as an encoded snapshot.

Note that this API has a heavy rate limit.

Returns

Snapshot with the config, buffer, and network states encoded.
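
A minimal snapshot sketch; keep the rate limit in mind and avoid calling this in a loop:

    # Encoded snapshot of the agent's config, buffer and network states.
    snapshot = agent.get_state()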

upload_state(state: ai_traineree_client.types.EncodedAgentState) → bool[source]

Updates the remote agent with the provided state.

Parameters

state – Agent’s state with encoded values for buffer, config and network states.

Returns

Boolean confirming whether the update was successful.
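
A sketch of restoring a previously taken snapshot onto another agent; other_agent is a placeholder for a second RemoteAgent instance:

    ok = other_agent.upload_state(snapshot)
    assert ok, 'Uploading the agent state failed'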

act(state, noise: float = 0) → Union[int, List[Union[int, float]]][source]

Asks for an action based on the provided state.

Parameters
  • state (List[float]) – Python list of floats representing the agent's state.

  • noise (float) – Default 0. Value for epsilon in epsilon-greedy paradigm.

Returns

Suggested action to take from this state.

In the case of discrete problems this is a single int value. Otherwise it is a list of either floats or ints.

Return type

action (a number or list of numbers)
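
A sketch of requesting an action; the 4-element state and the noise value are illustrative only:

    state = [0.01, -0.02, 0.03, 0.04]     # plain Python floats
    action = agent.act(state, noise=0.1)  # noise acts as epsilon in epsilon-greedy
    # Discrete problems return a single int; otherwise a list of numbers is returned.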

step(state: List[float], action: Union[int, List[Union[int, float]]], reward: float, next_state: List[float], done: bool) → bool[source]

Provides information from taking a step in the environment.

Note that all values have to be plain Python values, such as ints, floats, and lists. Unfortunately, numpy arrays, pandas objects, and tensors aren't currently supported.

Parameters
  • state (StateType) – Current state.

  • action (ActionType) – Action taken from the current state.

  • reward (float) – A reward obtained from getting to the next state.

  • next_state (StateType) – The state that resulted from taking action at state.

  • done (bool) – A flag whether the next_state is a terminal state.
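
Putting act() and step() together, a sketch of a full interaction loop; env is a hypothetical environment object whose reset() and step() return plain Python values:

    state = env.reset()
    done = False
    while not done:
        action = agent.act(state, noise=0.1)
        next_state, reward, done = env.step(action)  # hypothetical return signature
        agent.step(state, action, reward, next_state, done)
        state = next_state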