Finding and Selecting UI Elements

AskUI provides multiple ways to find and select UI elements on your screen. This guide covers the different approaches, from simple text-based selection to advanced locator strategies.

Natural Language Selection

The simplest way to interact with UI elements is using natural language descriptions. AskUI’s Vision Agent understands everyday language to find elements on your screen.

from askui import VisionAgent

with VisionAgent() as agent:
    # Find and click a button
    agent.click("Login button")
    
    # Find and type in a text field
    agent.type("myuser@example.com")
    
    # Find and hover over a menu item
    agent.mouse_move("Settings menu")

Benefits of Natural Language Selection

  • Intuitive: Use everyday language to describe elements
  • Context-Aware: Understands elements based on their visual appearance and surroundings
  • Flexible: Works across different applications and interfaces
  • Maintainable: No need to update selectors when UI changes slightly

Advanced Selection with Locators

For more precise element selection, use AskUI’s locator system to build sophisticated element selectors.

Basic Locators

from askui import locators as loc

with VisionAgent() as agent:
    # Find by text
    agent.click(loc.Text("Submit"))
    
    # Find by element type
    agent.click(loc.Element("textfield"))  # Find a text field
    agent.click(loc.Element("text"))       # Find a text element
    
    # Find by image
    agent.click(loc.Image("logo.png"))

Relative Locators

Build complex selectors by describing element relationships:

from askui import locators as loc

with VisionAgent() as agent:
    # Find a text field next to a label
    password_label = loc.Text("Password")
    password_field = loc.Element("textfield").right_of(password_label)
    
    # Find text below a heading
    submit_text = loc.Element("text").below_of(loc.Text("Complete Registration"))
    
    # Find an element near another element
    menu_item = loc.Text("Settings").nearest_to(loc.Image("user-icon.png"))

Available Locator Methods

  • Text(): Find elements by their text content
  • Element(): Find elements by their type:
    • "text": Text elements
    • "textfield": Input fields
  • Image(): Find elements by their visual appearance
  • AiElement(): Use AI-captured elements for reliable selection

Locator Relationships

  • right_of(): Element is to the right of another element
  • left_of(): Element is to the left of another element
  • above(): Element is above another element
  • below(): Element is below another element
  • near(): Element is near another element

Multi-Monitor Support

When working with multiple monitors, specify which display to automate:

with VisionAgent(display=1) as agent:  # Use primary display
    agent.click("Element on first monitor")

with VisionAgent(display=2) as agent:  # Use secondary display
    agent.click("Element on second monitor")

AI Elements

For complex or dynamic elements, use AI Elements to capture and reuse specific visual elements:

# Capture elements from your screen
AskUI-NewAiElement -Name "my-element-name"

Use captured AI Elements in your code:

from askui import locators as loc

with VisionAgent() as agent:
    agent.click(loc.AiElement("my-element-name"))

If you cannot use the AskUI-NewAIElement command, activate experimental commands by running AskUI-ImportExperimentalCommands in your terminal.

Best Practices

  1. Start Simple
    • Use natural language selection for basic cases
    • Only use locators when needed for precision
  2. Be Specific
    • Use clear, descriptive text for natural language selection
    • Combine multiple locators for unique identification
    • Use the correct element types with Element() locator
  3. Handle Dynamic Content
    • Use relative locators for elements that move
    • Consider AI Elements for complex visual patterns
  4. Multi-Monitor Setup
    • Test on each monitor to find the correct display number
    • Use consistent display settings across your team

By following these guidelines, you’ll create robust and maintainable element selection strategies for your automation workflows.