Types
askui.ActModel
Abstract base class for models that can execute autonomous actions.
Models implementing this interface can be used with the VisionAgent.act() method.
Example:
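A minimal sketch of a custom act model. The abstract method name and signature shown here, as well as registering the model via a models argument and selecting it via a model argument, are assumptions and may differ from the actual interface:

```python
from askui import ActModel, VisionAgent


class EchoActModel(ActModel):
    # The method name and parameters below are an assumption about the
    # abstract interface; check the ActModel source for the exact contract.
    def act(self, goal, model_choice):
        print(f"Would now work towards the goal: {goal!r}")


# Registering under a custom name via `models` and selecting it per call
# via `model` is also an assumption.
with VisionAgent(models={"echo-act": EchoActModel()}) as agent:
    agent.act("open the settings menu", model="echo-act")
```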
askui.exceptions.AiElementNotFound
Exception raised when an AI element is not found.
Arguments:
name
str - The name of the AI element that was not found.
locations
list[pathlib.Path] - The locations that were searched for the AI element.
askui.exceptions.AskUiApiError
Base exception for AskUI API errors.
This exception is raised when there is an error communicating with the AskUI API. It serves as a base class for more specific API-related exceptions.
Arguments:
message
str - The error message.
askui.exceptions.AskUiApiRequestFailedError
Exception raised when an API response is not as expected.
This exception is raised when the API returns a response that cannot be processed or indicates an error condition. It includes the HTTP status code and error message from the API response.
Arguments:
status_code
int - The HTTP status code from the API response.
message
str - The error message from the API response.
askui.exceptions.AutomationError
Exception raised when the automation step cannot complete.
Arguments:
message
str - The error message.
askui.exceptions.ElementNotFoundError
Exception raised when an element cannot be located.
Arguments:
message
str - The error message.
askui.exceptions.QueryNoResponseError
Exception raised when a query does not return a response.
Arguments:
message
str - The error message.
query
str - The query that was made.
askui.exceptions.QueryUnexpectedResponseError
Exception raised when a query returns an unexpected response.
Arguments:
message
str - The error message.
query
str - The query that was made.
response
Any - The response that was received.
askui.exceptions.ModelNotFoundError
Exception raised when an invalid model is used.
Arguments:
model
str | ModelComposition - The model that was used.
model_type
Literal["Act", "Grounding (locate)", "Query (get/extract)"] - The type of model that was used.
askui.exceptions.ModelTypeMismatchError
Exception raised when the model used does not match the expected model type for the task.
askui.GetModel
Abstract base class for models that can extract information from images.
Models implementing this interface can be used with the get() method of VisionAgent to extract information from screenshots or other images. These models analyze visual content and return structured or unstructured information based on queries.
Example:
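A minimal sketch of a custom get model. The abstract method name and parameters, as well as registration via a models argument and per-call selection via a model argument, are assumptions:

```python
from askui import GetModel, VisionAgent


class StaticGetModel(GetModel):
    # The method name and parameters are assumptions about the abstract
    # interface; the real signature may differ.
    def get(self, query, image, response_schema, model_choice):
        return "always the same answer"


# Registration via `models` and per-call selection via `model` are assumed.
with VisionAgent(models={"static-get": StaticGetModel()}) as agent:
    answer = agent.get("What is shown on the screen?", model="static-get")
    print(answer)
```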
askui.ImageSource
A Pydantic model that represents an image source and provides methods to convert it to different formats.
The model can be initialized with:
- A PIL Image object
- A file path (str or pathlib.Path)
- A data URL string
Attributes:
root
PILImage.Image - The underlying PIL Image object.
Arguments:
root
Img - The image source to load from.
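A short sketch of the three documented ways to initialize an ImageSource. Positional initialization with the root value is assumed, and the screenshot file name is a placeholder:

```python
import base64
import io
import pathlib

from PIL import Image as PILImage

from askui import ImageSource

pil_img = PILImage.new("RGB", (64, 64), color="white")

# From a PIL Image object
pil_source = ImageSource(pil_img)
print(pil_source.root.size)  # access the underlying PIL.Image.Image

# From a file path (str or pathlib.Path); the file name is a placeholder
pil_img.save("screenshot.png")
path_source = ImageSource(pathlib.Path("screenshot.png"))

# From a data URL
buffer = io.BytesIO()
pil_img.save(buffer, format="PNG")
data_url = "data:image/png;base64," + base64.b64encode(buffer.getvalue()).decode()
data_url_source = ImageSource(data_url)
```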
askui.Img
Type of the input images for askui.VisionAgent.get(), askui.VisionAgent.locate(), etc.
Accepts:
- PIL.Image.Image
- Relative or absolute file path (str or pathlib.Path)
- Data URL (e.g., "data:image/png;base64,...")
askui.LocateModel
Abstract base class for models that can locate UI elements in images.
Models implementing this interface can be used with the click(), locate(), and mouse_move() methods of VisionAgent to find UI elements on screen. These models analyze visual content to determine the coordinates of elements based on descriptions or locators.
Example:
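A minimal sketch of a custom locate model. The abstract method name and parameters are assumptions, as is the assumption that the image argument is an ImageSource and that registration happens via a models argument:

```python
from askui import LocateModel, VisionAgent


class CenterLocateModel(LocateModel):
    # The method name and parameters are assumptions about the abstract
    # interface; the real signature may differ.
    def locate(self, locator, image, model_choice):
        width, height = image.root.size  # image assumed to be an ImageSource
        return width // 2, height // 2   # always "locate" the screen center


# Registration via `models` and per-call selection via `model` are assumed.
with VisionAgent(models={"center": CenterLocateModel()}) as agent:
    agent.click("anything", model="center")
```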
askui.Model
Union type of all abstract model classes.
This type represents any model that can be used with VisionAgent, whether it’s an ActModel, GetModel, or LocateModel. It’s useful for type hints when you need to work with models in a generic way.
askui.ModelComposition
A composition of models (list of ModelDefinition) to be used for a task, e.g., locating an element on the screen to be able to click on it or extracting text from an image.
askui.ModelDefinition
A definition of a model.
Arguments:
task
str - The task the model is trained for, e.g., end-to-end OCR ("e2e_ocr") or object detection ("od")
architecture
str - The architecture of the model, e.g., "easy_ocr" or "yolo"
version
str - The version of the model
interface
str - The interface the model is trained for, e.g., "online_learning"
use_case
str, optional - The use case the model is trained for. In the case of workspace-specific AskUI models, this is often the workspace id but with "-" replaced by "_". Defaults to "00000000_0000_0000_0000_000000000000" (custom null value).
tags
list[str], optional - Tags for identifying the model that cannot be represented by other properties, e.g., ["trained", "word_level"]
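Putting the fields above together, a sketch of a single model definition and a composition built from it. The concrete values are illustrative only, and constructing ModelComposition directly from a list of definitions is assumed:

```python
from askui import ModelComposition, ModelDefinition

# A single model definition; the values are illustrative placeholders.
word_level_ocr = ModelDefinition(
    task="e2e_ocr",  # end-to-end OCR
    architecture="easy_ocr",
    version="1",
    interface="online_learning",
    tags=["trained", "word_level"],
)

# A composition is a list of model definitions used together for a task,
# e.g., locating an element or extracting text from an image.
composition = ModelComposition([word_level_ocr])
```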
askui.ModelRegistry
Type definition for model registry.
A dictionary mapping model names to either model instances or factory functions (for lazy initialization on first use) that create model instances. Used to register custom models with VisionAgent.
Example:
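A sketch of a registry that mixes an eagerly created instance with a lazy factory. The my_models module and its classes are hypothetical, and passing the registry to VisionAgent via a models argument is assumed:

```python
from askui import ModelRegistry, VisionAgent

from my_models import MyGetModel, MyLocateModel  # hypothetical custom models

registry: ModelRegistry = {
    "my-get-model": MyGetModel(),                 # model instance, created immediately
    "my-locate-model": lambda: MyLocateModel(),   # factory, created lazily on first use
}

# Passing the registry to VisionAgent via a `models` argument is assumed.
with VisionAgent(models=registry) as agent:
    agent.get("What is the page title?", model="my-get-model")
```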
askui.models.ModelName
Enumeration of all available model names in AskUI.
This enum provides type-safe access to model identifiers used throughout the library. Each model name corresponds to a specific AI model or model composition that can be used for different tasks like acting, getting information, or locating elements.
askui.models.OpenRouterGetModel
Get model for OpenRouter.
askui.models.OpenRouterSettings
Settings for OpenRouter.
askui.ModifierKey
Modifier keys for keyboard actions.
askui.PcKey
PC keys for keyboard actions.
askui.models.Point
A tuple of two integers representing the coordinates of a point on the screen.
askui.ResponseSchema
Type of the responses of data extraction, e.g., using askui.VisionAgent.get().
The following types are allowed:
- ResponseSchemaBase: Custom Pydantic models that extend ResponseSchemaBase
- str: String responses
- bool: Boolean responses
- int: Integer responses
- float: Floating point responses
Usually serialized as a JSON schema, e.g., str as {"type": "string"}, to be passed to model(s).
Also used for validating the responses of the model(s) used for data extraction.
askui.ResponseSchemaBase
Base class for response schemas used to define the response of data extraction, e.g., using askui.VisionAgent.get().
This class extends Pydantic’s BaseModel and adds constraints and configuration on top so that it can be used with models to define the schema (type) of the data to be extracted.
Example:
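A sketch of defining a schema and extracting structured data with it. Passing the schema to get() via a response_schema keyword is assumed:

```python
from askui import ResponseSchemaBase, VisionAgent


class UserProfile(ResponseSchemaBase):
    """Schema describing the data to extract from the screen."""

    user_name: str
    logged_in: bool


# Passing the schema via a `response_schema` keyword of get() is assumed.
with VisionAgent() as agent:
    profile = agent.get(
        "Extract the user name and whether the user is logged in",
        response_schema=UserProfile,
    )
    print(profile.user_name, profile.logged_in)
```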