Finding and Selecting UI Elements

AskUI provides multiple ways to find and select UI elements on your screen. This guide covers the different approaches, from simple text-based selection to advanced locator strategies.

Natural Language Selection

The simplest way to interact with UI elements is using natural language descriptions. AskUI’s Vision Agent understands everyday language to find elements on your screen.

from askui import VisionAgent

with VisionAgent() as agent:
    # Find and click a button
    agent.click("Login button")
    
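    # Click a text field to focus it, then type into it
    # (type() sends the text to whichever element has focus)
    agent.click("Email text field")
    agent.type("myuser@example.com")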
    
    # Find and hover over a menu item
    agent.mouse_move("Settings menu")

Benefits of Natural Language Selection

Intuitive: Use everyday language to describe elements
Context-Aware: Understands elements based on their visual appearance and surroundings
Flexible: Works across different applications and interfaces
Maintainable: No need to update selectors when UI changes slightly

Advanced Selection with Locators

For more precise element selection, use AskUI’s locator system to build sophisticated element selectors.

Basic Locators

from askui import VisionAgent
from askui import locators as loc

with VisionAgent() as agent:
    # Find by text
    agent.click(loc.Text("Submit"))
    
    # Find by element type
    agent.click(loc.Element("textfield"))  # Find a text field
    agent.click(loc.Element("text"))       # Find a text element
    
    # Find by image
    agent.click(loc.Image("logo.png"))

Relative Locators

Build complex selectors by describing element relationships:

from askui import VisionAgent
from askui import locators as loc

with VisionAgent() as agent:
    # Find a text field next to a label, then click it
    password_label = loc.Text("Password")
    password_field = loc.Element("textfield").right_of(password_label)
    agent.click(password_field)
    
    # Find text below a heading
    submit_text = loc.Element("text").below_of(loc.Text("Complete Registration"))
    agent.click(submit_text)
    
    # Find the element nearest to another element
    menu_item = loc.Text("Settings").nearest_to(loc.Image("user-icon.png"))
    agent.click(menu_item)

Available Locator Methods

Text(): Find elements by their text content
Element(): Find elements by their type:
- "text": Text elements
- "textfield": Input fields
Image(): Find elements by their visual appearance
AiElement(): Use AI-captured elements for reliable selection

Locator Relationships

right_of(): Element is to the right of another element
left_of(): Element is to the left of another element
above_of(): Element is above another element
below_of(): Element is below another element
nearest_to(): Element is the one nearest to another element
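
To make the geometry behind these relationships concrete, the following toy model applies right_of and nearest_to to plain bounding boxes. It is only an illustration: the Box type and both functions are hypothetical helpers written for this guide, not part of the askui package, and AskUI resolves locators on its own, so its internal matching rules may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in screen pixels (hypothetical toy type)."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    @property
    def center(self) -> tuple[float, float]:
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


def right_of(candidates: list[Box], anchor: Box) -> Box | None:
    """Closest candidate that starts at or past the anchor's right edge
    and overlaps the anchor vertically."""
    hits = [
        c for c in candidates
        if c.left >= anchor.right and c.top < anchor.bottom and c.bottom > anchor.top
    ]
    return min(hits, key=lambda c: c.left - anchor.right, default=None)


def nearest_to(candidates: list[Box], anchor: Box) -> Box | None:
    """Candidate whose center is closest to the anchor's center."""
    ax, ay = anchor.center
    return min(
        candidates,
        key=lambda c: (c.center[0] - ax) ** 2 + (c.center[1] - ay) ** 2,
        default=None,
    )


label = Box("Password label", 10, 100, 90, 120)
icon = Box("user icon", 400, 10, 420, 30)
fields = [
    Box("username field", 100, 60, 300, 80),
    Box("password field", 100, 100, 300, 120),
]
print(right_of(fields, label).name)   # password field
print(nearest_to(fields, icon).name)  # username field
```

In this model, right_of filters by direction before ranking by distance, while nearest_to ranks by distance alone. That difference is why a directional relationship is usually the safer choice when several similar elements sit close together.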

Multi-Monitor Support

When working with multiple monitors, specify which display to automate:

from askui import VisionAgent

with VisionAgent(display=1) as agent:  # Use primary display
    agent.click("Element on first monitor")

with VisionAgent(display=2) as agent:  # Use secondary display
    agent.click("Element on second monitor")
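
If it is unclear which display number maps to which monitor, one option is to probe each number in turn. The sketch below is a hypothetical helper, not an AskUI API: find_display, make_agent, and probe are names invented here, and it assumes that a failed element lookup raises an exception. The agent is passed in as a factory so the helper can be tried without AskUI installed.

```python
from contextlib import AbstractContextManager
from typing import Any, Callable, Iterable, Optional


def find_display(
    make_agent: Callable[[int], AbstractContextManager],
    probe: Callable[[Any], None],
    displays: Iterable[int] = (1, 2),
) -> Optional[int]:
    """Return the first display number on which `probe` succeeds, else None."""
    for display in displays:
        try:
            with make_agent(display) as agent:
                probe(agent)
        except Exception:
            # Assumption: a lookup that finds nothing raises. Narrow this to
            # the specific exception your AskUI version raises.
            continue
        return display
    return None


# With AskUI installed, usage might look like this:
#   from askui import VisionAgent
#   display = find_display(
#       lambda d: VisionAgent(display=d),
#       lambda agent: agent.mouse_move("Settings menu"),
#   )
```

Using mouse_move as the probe keeps the check free of side effects such as clicks.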

AI Elements

For complex or dynamic elements, use AI Elements to capture and reuse specific visual elements:

Steps:

Open AskUI Shell

askui-shell

Create a new AI Element

# Capture elements from your screen
AskUI-NewAiElement -Name "my-element-name"

Use captured AI Elements in your code:

from askui import locators as loc
...
with VisionAgent() as agent:
    agent.click(loc.AiElement("my-element-name"))

If the AskUI-NewAiElement command is unavailable, enable experimental commands by running AskUI-ImportExperimentalCommands in your terminal.

Best Practices

Start Simple
- Use natural language selection for basic cases
- Only use locators when needed for precision
Be Specific
- Use clear, descriptive text for natural language selection
- Combine multiple locators for unique identification
- Use the correct element types with Element() locator
Handle Dynamic Content
- Use relative locators for elements that move
- Consider AI Elements for complex visual patterns
Multi-Monitor Setup
- Test on each monitor to find the correct display number
- Use consistent display settings across your team
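
As a sketch of combining multiple locators for unique identification, the function below narrows a text field by two relationships at once. It assumes that relationship methods return the locator so that calls can be chained, and the "Login" heading is a made-up example. The locators module is passed in as the loc parameter only so the sketch can be tried without AskUI installed. In real code you would use from askui import locators as loc directly.

```python
def password_field_locator(loc):
    """Build a locator for the text field that is right of the "Password"
    label and below the "Login" heading (both texts are hypothetical)."""
    return (
        loc.Element("textfield")
        .right_of(loc.Text("Password"))
        .below_of(loc.Text("Login"))
    )


# With AskUI installed, usage might look like this:
#   from askui import VisionAgent
#   from askui import locators as loc
#   with VisionAgent() as agent:
#       agent.click(password_field_locator(loc))
```

Each added relationship narrows the set of candidates, which helps when a screen contains several text fields with similar appearance.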

By following these guidelines, you’ll create robust and maintainable element selection strategies for your automation workflows.
