Selecting Elements
On this page you will learn the different recommended practices
Selecting the right UI elements is crucial for creating reliable automation workflows. This guide provides strategies to improve accuracy and performance when interacting with UI elements.
Natural Language Element Selection
AskUI’s Vision Agent uses natural language prompts to identify and interact with UI elements on your screen. This approach offers several advantages:
- Intuitive Descriptions: Use everyday language to describe elements like “login button” or “search field”
- Contextual Understanding: The agent understands elements based on their visual appearance and surrounding context
- Flexibility: No need to learn complex selector syntax or DOM structures
How Natural Language Selection Works
When you provide a natural language prompt like “login button”, the Vision Agent:
- Captures the current screen state
- Analyzes the visual elements present
- Identifies the element that best matches your description
- Performs the requested action on that element
Example: Calculator Automation
AI Elements
AI Elements allow you to capture and use specific visual elements from your screen for reliable automation.
Enable the AskUI Development Environment as described in the installation guide and activate experimental commands by running AskUI-ImportExperimentalCommands
in your terminal.
This command captures elements from your screen for use in AskUI workflows.
It then can be used by setting the model config to askui-ai-element
, like:
Parameters:
- Name (Optional): The name for the screenshot file. If defined, indicates only one element is being captured. For multiple elements, you’ll provide names individually.
- WorkspaceId (Optional): Uses the Workspace ID from settings by default. Required if not in settings.
- AlwaysPreview (Optional): Automatically opens preview without prompting. Cannot be used with NoPreview.
- NoPreview (Optional): Skips the preview. Cannot be used with AlwaysPreview.
- OneShot (Optional): Ends snipping after first successful AI Element creation.
- Annotate (Optional): Takes a fullscreen capture for region annotation.
Example Output:
By implementing these techniques, you’ll build more reliable, flexible, and accurate automation workflows that can handle complex UI interactions with confidence.