Single-Step Commands

Single-step commands serve as an escape hatch when agentic instructions cannot handle specific automation cases reliably enough. While AI models are still involved in understanding visual elements and executing commands, this paradigm provides explicit control over the automation workflow.

Philosophy of Single-Step Commands

Single-step commands operate on the principle of explicit control, where every action is deliberately specified by the developer. This approach provides:

Fine-grained control: Each command specifies exactly what action to take
Higher reproducibility: More consistent behavior across runs, though not fully deterministic because AI models still interpret the screen
Debugging clarity: Easy to identify exactly where issues occur
Granular specification: Precise control over individual interactions

# Single-step approach: explicit control
from askui import VisionAgent

with VisionAgent() as agent:
    agent.click("username field")
    agent.type("username field", "john.doe")
    agent.click("password field")
    agent.type("password field", "secret123")
    agent.click("Submit button")

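The debugging benefit listed above can be illustrated without a real screen. The following is a minimal sketch, not AskUI code: `RecordingAgent` and `run_steps` are hypothetical names, and the stand-in agent only mimics the call shapes of the snippet above so the example runs on its own. It shows how a sequence of explicit steps lets a failure be attributed to one specific step.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingAgent:
    """Hypothetical stand-in for an agent: records calls instead of driving a UI."""

    known_targets: set[str]
    log: list[tuple[str, ...]] = field(default_factory=list)

    def click(self, target: str) -> None:
        self._require(target)
        self.log.append(("click", target))

    def type(self, target: str, text: str) -> None:
        self._require(target)
        self.log.append(("type", target, text))

    def _require(self, target: str) -> None:
        # Simulate "element not found" for targets absent from the fake screen.
        if target not in self.known_targets:
            raise LookupError(f"element not found: {target!r}")


def run_steps(agent, steps):
    """Run steps in order; return (index of failed step or step count, error or None)."""
    for index, (method, *args) in enumerate(steps):
        try:
            getattr(agent, method)(*args)
        except LookupError as exc:
            return index, exc
    return len(steps), None


steps = [
    ("click", "username field"),
    ("type", "username field", "john.doe"),
    ("click", "Login button"),  # deliberately missing from the simulated screen
    ("click", "Submit button"),
]
agent = RecordingAgent(known_targets={"username field", "password field", "Submit button"})
completed, error = run_steps(agent, steps)
print(f"stopped at step {completed}: {error}")
# prints: stopped at step 2: element not found: 'Login button'
```

Because each step is a separate, named action, the failing index maps directly to one line of the automation script, which is harder to achieve when a single goal-level instruction covers the whole login.
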
Core Components

Element Selection

Single-step commands rely on precise element selection methods:

Locator-Based Selection: Use structured locators to find elements by text, type, or image
Relative Locators: Find elements based on their spatial relationship to other elements - see Relative locators
AI Elements: Capture and reuse specific visual elements for repeated interactions - see AI Element locators

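To make the idea of a relative locator concrete, here is a conceptual sketch of resolving "the text field to the right of the Password label" over detected bounding boxes. It is an illustration under stated assumptions, not how AskUI implements relative locators: the `Box` type, the `right_of` helper, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical detected element: a label, a kind, and pixel bounds."""

    label: str
    kind: str
    left: int
    top: int
    right: int
    bottom: int

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2


def right_of(anchor: Box, candidates: list[Box], kind: str) -> Optional[Box]:
    """Return the nearest box of `kind` that starts right of `anchor` on the same row."""
    same_row = [
        box
        for box in candidates
        if box.kind == kind
        and box.left >= anchor.right
        and anchor.top <= box.center_y <= anchor.bottom
    ]
    # Nearest means the smallest horizontal gap to the anchor's right edge.
    return min(same_row, key=lambda box: box.left - anchor.right, default=None)


screen = [
    Box("Username", "text", 20, 100, 100, 120),
    Box("username input", "textfield", 120, 95, 320, 125),
    Box("Password", "text", 20, 150, 100, 170),
    Box("password input", "textfield", 120, 145, 320, 175),
]
password_label = screen[2]
target = right_of(password_label, screen, "textfield")
print(target.label)
# prints: password input
```

Anchoring on a stable label rather than on absolute coordinates is what makes this kind of selection robust when two visually identical fields appear on the same screen.
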
Interaction Methods

Single-step commands provide granular control over interactions. Available methods vary by platform - see Agent Types for platform-specific details:

Click Operations: Left click, right click, double click with precise targeting
Text Input: Type text into specific fields with options for clearing existing content
Keyboard Operations: Send individual key presses and key combinations (Ctrl+C, Alt+Tab, etc.)
Mouse Operations: Direct mouse movement, scrolling, and drag-and-drop actions

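The "clearing existing content" option mentioned for text input can be thought of as a composition of the other primitives: click the field, select everything, delete it, then type. The sketch below assumes a hypothetical `RecordingAgent` with `click`, `press`, and `type_text` methods and hypothetical key names; it records actions rather than performing them, and the real method names and options depend on the platform and agent type.

```python
class RecordingAgent:
    """Hypothetical stand-in that records primitive actions instead of executing them."""

    def __init__(self) -> None:
        self.actions: list[tuple[str, str]] = []

    def click(self, target: str) -> None:
        self.actions.append(("click", target))

    def press(self, *keys: str) -> None:
        # A key combination is recorded as one action, e.g. "ctrl+a".
        self.actions.append(("press", "+".join(keys)))

    def type_text(self, text: str) -> None:
        self.actions.append(("type", text))


def replace_field_text(agent: RecordingAgent, target: str, text: str, clear_first: bool = True) -> None:
    """Focus a field, optionally clear its current content, then type new text."""
    agent.click(target)
    if clear_first:
        agent.press("ctrl", "a")  # select existing content (assumed shortcut)
        agent.press("backspace")  # delete the selection
    agent.type_text(text)


agent = RecordingAgent()
replace_field_text(agent, "username field", "john.doe")
for action in agent.actions:
    print(action)
```

Spelling the sequence out this way keeps each interaction visible in logs, at the cost of more lines than a single goal-level instruction.
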
Tools and Utilities

Built-in tools extend single-step command capabilities. Available tools vary by platform - see Agent Types for platform-specific details:

Web Browser Tools: Open new browser windows, navigate to URLs, control browser tabs
Operating System Tools: Clipboard operations, file system access, multi-monitor support
Data Processing Tools: Text extraction, data validation, content manipulation
System Integration: Process management, network operations, configuration handling

Next Steps

Review Element Selection Best Practices for reliable selection
Learn about Interaction Best Practices for effective single-step automation
Explore Agentic Instructions for goal-oriented automation
