Understand AskUI’s single-step commands and agentic instruction approaches
AskUI operates through two distinct paradigms that reflect different approaches to automation complexity. Understanding these paradigms helps explain why AskUI can handle both fine-grained, tightly controlled tasks and complex, adaptive workflows.
Agentic instructions let you describe what you want to accomplish in natural language. The agent interprets your goals and determines the necessary steps independently.
Recommended for most use cases:
Single-step commands represent the traditional automation approach where each interaction is explicitly defined and controlled. While AI models are still involved in understanding visual elements and executing commands, this paradigm provides explicit control over the automation workflow.
Use this approach when:
It’s important to understand that both paradigms rely on AI models. The difference lies in how decisions are made, not in whether AI is involved:
Both single-step commands and agentic instructions use AI for:
AskUI leverages three distinct types of AI models that work together to enable both paradigms:
LocateModel: (natural-language description, screenshot) -> model -> point
GetModel: (natural-language query, image, response schema) -> model -> structured output
ActModel: (goal, tools) -> model -> sequence of actions
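The three input/output shapes above can be written down as typed callables. The following is a minimal sketch for illustration only: the `Point` class, the parameter names, and the `fake_*` stand-ins are assumptions made here, not AskUI's actual classes or API, and the stand-ins return toy values instead of calling a real model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Point:
    """A screen coordinate (hypothetical helper type)."""
    x: int
    y: int


class LocateModel(Protocol):
    """(natural-language description, screenshot) -> point"""
    def __call__(self, description: str, screenshot: bytes) -> Point: ...


class GetModel(Protocol):
    """(natural-language query, image, response schema) -> structured output"""
    def __call__(self, query: str, image: bytes, schema: type) -> object: ...


class ActModel(Protocol):
    """(goal, tools) -> sequence of actions"""
    def __call__(self, goal: str, tools: Sequence[str]) -> list[str]: ...


# Toy stand-ins that satisfy the shapes above. They do no real inference.
def fake_locate(description: str, screenshot: bytes) -> Point:
    return Point(x=len(description), y=len(screenshot))


def fake_get(query: str, image: bytes, schema: type) -> object:
    # Coerce a trivial "answer" (the image size) into the requested schema.
    return schema(len(image))


def fake_act(goal: str, tools: Sequence[str]) -> list[str]:
    # Emit one placeholder action per available tool.
    return [f"{tool}: {goal}" for tool in tools]
```

The point of the sketch is only the difference in return type: a locate-style model yields a position, a get-style model yields data in a caller-chosen shape, and an act-style model yields a plan.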
Agentic instructions: AI models handle both the decision-making (what actions to take) and the execution of those actions.
Single-step commands: The developer decides what actions to take and in what order. AI models handle the execution of those specific actions.
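The division of responsibility described above can be sketched in a few lines. This is an illustrative model, not AskUI code: `run_single_step`, `run_agentic`, and `toy_planner` are hypothetical names, and actions are plain strings. In both functions the same executor carries out the actions; what differs is who produces the list.

```python
from typing import Callable, List

Action = str
Executor = Callable[[Action], None]
Planner = Callable[[str], List[Action]]


def run_single_step(actions: List[Action], execute: Executor) -> None:
    """The developer supplies the actions and their order."""
    for action in actions:
        execute(action)


def run_agentic(goal: str, plan: Planner, execute: Executor) -> None:
    """A planner (stand-in for an act-style model) derives the actions from a goal."""
    for action in plan(goal):
        execute(action)


def toy_planner(goal: str) -> List[Action]:
    # Hypothetical planner: a real one would inspect the screen and choose tools.
    return [f"locate target for '{goal}'", f"click target for '{goal}'"]


log: List[Action] = []

run_single_step(["click 'Username'", "type 'alice'", "click 'Login'"], log.append)
single_step_log = list(log)

log.clear()
run_agentic("log in as alice", toy_planner, log.append)
agentic_log = list(log)
```

After running, `single_step_log` contains exactly the three actions the developer wrote, while `agentic_log` contains whatever the planner chose.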
The choice between single-step commands and agentic instructions reflects a fundamental tension in automation design between control and convenience.
Agentic instructions maximize convenience by allowing natural language descriptions of desired outcomes. This simplicity comes at the cost of reduced predictability and harder debugging.
Single-step commands maximize control by requiring explicit specification of each action. This precision comes at the cost of verbosity and maintenance overhead when interfaces change.
Agentic instructions sacrifice some reproducibility for adaptability. The agent must interpret the instruction, plan a sequence of actions, and adapt to unexpected interface states, which can lead to variation in execution paths.
Single-step commands offer higher reproducibility because the decision-making is controlled by the developer. While AI models still handle execution, the workflow is more consistent across runs.
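One way to make the reproducibility difference concrete is to compare the action traces of repeated runs. The sketch below is a toy model under stated assumptions: `scripted_run`, `planned_run`, and the cookie-banner scenario are invented for illustration and do not come from AskUI. The scripted workflow ignores interface state, while the planned workflow adapts to it.

```python
from typing import Callable, Iterable, Tuple

Trace = Tuple[str, ...]


def scripted_run(banner_visible: bool) -> Trace:
    """Single-step style: the same actions regardless of interface state."""
    return ("click Username", "type alice", "click Login")


def planned_run(banner_visible: bool) -> Trace:
    """Agentic style: the plan adapts when an unexpected element appears."""
    steps = ["click Username", "type alice", "click Login"]
    if banner_visible:
        steps.insert(0, "dismiss cookie banner")
    return tuple(steps)


def distinct_traces(run: Callable[[bool], Trace], states: Iterable[bool]) -> int:
    """Count how many different execution paths occur across runs."""
    return len({run(state) for state in states})


# Three runs, one of which encounters an unexpected banner.
observed_states = [False, True, False]
scripted_paths = distinct_traces(scripted_run, observed_states)
planned_paths = distinct_traces(planned_run, observed_states)
```

Here the scripted workflow yields one execution path and the planned workflow yields two. Note that the scripted trace being identical says nothing about whether the run with the banner would succeed, which is the adaptability side of the trade-off.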
Start with agentic instructions for your automation workflows. Use single-step commands only when agentic instructions cannot handle specific cases reliably enough for your requirements.
The paradigms also converge in error handling: both rely on AskUI’s visual understanding to detect when actions succeed or fail, regardless of whether the action was explicitly specified or autonomously planned.
Regardless of paradigm, AskUI’s execution follows a consistent philosophy: visual understanding over structural assumptions. Rather than relying on DOM selectors, accessibility labels, or API calls, both single-step commands and agentic instructions operate through visual analysis of the interface as it appears to users.
This approach means that both paradigms work equally well with legacy applications, modern web interfaces, and mobile apps – any interface that can be visually perceived can be automated.