log_level
int | str, optional - The logging level to use. Defaults to logging.INFO.
display
int, optional - The display number to use for screen interactions. Defaults to 1.
reporters
list[Reporter] | None, optional - List of reporter instances for logging and reporting. If None, an empty list is used.
tools
AgentToolbox | None, optional - Custom toolbox instance. If None, a default one will be created with AskUiControllerClient.
model
ModelChoice | ModelComposition | str | None, optional - The default choice or name of the model(s) to be used for vision tasks. Can be overridden by the model parameter of the click(), get(), act(), etc. methods.
retry
Retry, optional - The retry instance to use for retrying failed actions. Defaults to ConfigurableRetry with exponential backoff. Currently only supported for the locate() method.
models
ModelRegistry | None, optional - A registry of models to make available to the VisionAgent so that they can be selected using the model parameter of VisionAgent or the model parameter of its click(), get(), act(), etc. methods. Entries in the registry override entries in the default model registry.
goal
str - A description of what the agent should achieve.
model
str | None, optional - The composition or name of the model(s) to be used for achieving the goal.
on_message
OnMessageCb | None, optional - Callback for new messages. If it returns None, execution stops and the message is not added.
command
str - The command to execute on the command line.
locator
str | Locator | None, optional - The identifier or description of the element to click. If None, clicks at the current position.
button
'left' | 'middle' | 'right', optional - Specifies which mouse button to click. Defaults to 'left'.
repeat
int, optional - The number of times to click. Must be greater than 0. Defaults to 1.
model
ModelComposition | str | None, optional - The composition or name of the model(s) to be used for locating the element to click on using the locator.
query
str - The query describing what information to retrieve.
image
Img | None, optional - The image to extract information from. Defaults to a screenshot of the current screen. Can be a path to an image file, a PIL Image object, or a data URL.
response_schema
Type[ResponseSchema] | None, optional - A Pydantic model class that defines the response schema. If not provided, returns a string.
model
str | None, optional - The composition or name of the model(s) to be used for retrieving information from the screen or image using the query. Note: response_schema is not supported by all models.
Returns str if no response_schema is provided.
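
The return-type rule above (a plain string without response_schema, a schema instance with it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: get_sketch, fake_backend, and UrlResponse are hypothetical names, and a standard-library dataclass stands in for the Pydantic model class the reference asks for.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Type, TypeVar, Union

T = TypeVar("T")


@dataclass
class UrlResponse:
    """Hypothetical schema; a dataclass stands in for a Pydantic model class."""

    url: str


def fake_backend(query: str) -> Dict[str, str]:
    """Hypothetical model backend that always returns one extracted field."""
    return {"url": "https://example.com"}


def get_sketch(
    query: str,
    backend: Callable[[str], Dict[str, str]],
    response_schema: Optional[Type[T]] = None,
) -> Union[T, str]:
    """Stand-in for the documented contract; not the real method."""
    fields = backend(query)
    if response_schema is None:
        # No schema given: fall back to a plain string, as documented.
        return ", ".join(str(value) for value in fields.values())
    # Schema given: build an instance of it from the extracted fields.
    return response_schema(**fields)


plain = get_sketch("What is the current URL?", fake_backend)
typed = get_sketch("What is the current URL?", fake_backend, UrlResponse)
print(type(plain).__name__, type(typed).__name__)
```

With the real agent, the schema class would be passed as response_schema in the same position, and the caller would read typed fields off the returned object instead of parsing a string.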
Limitations:
ASKUI_WORKSPACE_ID and ASKUI_TOKEN are set) at the moment
key
PcKey | ModifierKey - The key to be pressed.
key
PcKey | ModifierKey - The key to be released.
key
PcKey | ModifierKey - The main key to press. This can be a letter, number, special character, or function key.
modifier_keys
list[ModifierKey] | None, optional - List of modifier keys to press along with the main key. Common modifier keys include 'ctrl', 'alt', 'shift'.
repeat
int, optional - The number of times to press (and release) the key. Must be greater than 0. Defaults to 1.
locator
str | Locator - The identifier or description of the element to locate.
screenshot
Img | None, optional - The screenshot to use for locating the element. Can be a path to an image file, a PIL Image object, or a data URL. If None, takes a screenshot of the currently selected display.
model
ModelComposition | str | None, optional - The composition or name of the model(s) to be used for locating the element using the locator.
Point - The coordinates of the element as a tuple (x, y).
button
'left' | 'middle' | 'right', optional - The mouse button to be pressed. Defaults to 'left'.
locator
str | Locator - The identifier or description of the element to move to.
model
ModelComposition | str | None, optional - The composition or name of the model(s) to be used for locating the element to move the mouse to using the locator.
x
int - The horizontal scroll amount. Positive values typically scroll right, negative values scroll left.
y
int - The vertical scroll amount. Positive values typically scroll down, negative values scroll up.
A scroll amount of 10 might result in different distances depending on the application and system settings.
Example:
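
A minimal sketch of the sign convention described for x and y. The helper describe_scroll is hypothetical and only mirrors the "typically" wording of the reference; it does not call the agent, and the actual direction and distance still depend on the application and system settings.

```python
from typing import Tuple


def describe_scroll(x: int, y: int) -> Tuple[str, str]:
    """Map scroll amounts to the directions they typically produce."""
    # Positive x typically scrolls right, negative x scrolls left.
    horizontal = "right" if x > 0 else "left" if x < 0 else "none"
    # Positive y typically scrolls down, negative y scrolls up.
    vertical = "down" if y > 0 else "up" if y < 0 else "none"
    return horizontal, vertical


print(describe_scroll(0, 10))
print(describe_scroll(-5, -3))
```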
button
'left' | 'middle' | 'right', optional - The mouse button to be released. Defaults to 'left'.
text
str - The text to be typed. Must be at least 1 character long.
sec
float - The number of seconds to wait. Must be greater than 0.0.
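
The value constraints stated in this reference (repeat greater than 0, text at least 1 character long, sec greater than 0.0) can be checked before a call is made. The sketch below is one way to do that; the helper names are hypothetical and not part of the library, which may validate these arguments differently and raise a different exception type.

```python
def check_repeat(repeat: int = 1) -> int:
    """Reject repeat counts that are not greater than 0."""
    if repeat <= 0:
        raise ValueError("repeat must be greater than 0")
    return repeat


def check_text(text: str) -> str:
    """Reject text that is not at least 1 character long."""
    if len(text) < 1:
        raise ValueError("text must be at least 1 character long")
    return text


def check_sec(sec: float) -> float:
    """Reject wait times that are not greater than 0.0."""
    if sec <= 0.0:
        raise ValueError("sec must be greater than 0.0")
    return sec


# Valid values pass through unchanged.
print(check_repeat(2), check_text("hello"), check_sec(0.5))
```

Failing early with a clear message keeps an invalid value from reaching the screen-interaction layer, where the cause of the failure is harder to see.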