Agentic Mode
Agentic Mode with AskUI
Agentic mode allows your automation to operate at a higher level of abstraction by giving the AI agent a goal to accomplish rather than specifying individual steps. This enables more flexible and powerful automations that can adapt to changing interfaces and solve complex tasks.
Agentic mode requires the Anthropic Claude 3.5 Sonnet Computer Use model. This feature is currently in beta and is not recommended for production use. Make sure you have set the ANTHROPIC_API_KEY environment variable as described in the Model Usage documentation.
Using the act() Command
The act() command is the gateway to agentic mode in AskUI. It takes a natural language description of your goal and lets the agent work autonomously to accomplish it.
Basic Syntax
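A minimal sketch of the call shape, assuming the AskUI Python package exposes a `VisionAgent` class that works as a context manager and has an `act()` method. The `GoalAgent` protocol and the `run_goal` helper are hypothetical names introduced here for illustration only.

```python
from typing import Protocol


class GoalAgent(Protocol):
    """Hypothetical interface: any object with act(goal) -> None."""

    def act(self, goal: str) -> None:
        ...


def run_goal(agent: GoalAgent, goal: str) -> None:
    """Pass one natural-language goal to the agent (hypothetical helper)."""
    if not goal.strip():
        raise ValueError("goal must be a non-empty description")
    agent.act(goal)


# Assumed usage with the AskUI Python package (an assumption, not run here;
# it also needs ANTHROPIC_API_KEY to be set):
#
#     from askui import VisionAgent
#
#     with VisionAgent() as agent:
#         run_goal(agent, "Open the settings page and enable dark mode")
```

The helper only guards against an empty goal; everything else is left to the agent.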
Example: Booking a Flight
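One way such a goal might be composed is sketched below. The wording of the goal, the `flight_goal` and `book_flight` helpers, and the sample cities and date are all illustrative assumptions; only the `act()` call comes from this page.

```python
def flight_goal(origin: str, destination: str, date: str) -> str:
    """Compose a goal description for a flight search (hypothetical helper)."""
    return (
        f"Search for a one-way flight from {origin} to {destination} on {date}, "
        "select the cheapest option, and stop before entering payment details."
    )


def book_flight(agent, origin: str, destination: str, date: str) -> None:
    """Send the composed goal to any agent exposing act(goal)."""
    agent.act(flight_goal(origin, destination, date))


# Assumed usage (requires the AskUI package and ANTHROPIC_API_KEY; not run here):
#
#     from askui import VisionAgent
#
#     with VisionAgent() as agent:
#         book_flight(agent, "Berlin", "Lisbon", "2025-06-01")
```

Stating an explicit stopping point in the goal keeps the agent from going further than intended, which matters while the feature is in beta.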
Getting Started with Agentic Mode
To use agentic mode, you’ll need to:
- Set up authentication with Anthropic (see Model Usage)
- Use the act() command with a clear goal description
- Let the agent work autonomously to accomplish the task
Understanding Agentic Mode
Unlike single-step commands that perform specific actions (like clicking a button or typing text), agentic mode lets you describe what you want to accomplish, and the AI agent figures out how to achieve it.
Key Benefits
- Goal-oriented automation: Focus on what you want to achieve, not how to achieve it
- Adaptability: Agents can navigate changing UIs and unexpected scenarios
- Reduced maintenance: Less need to update scripts when interfaces change
- Complex task handling: Accomplish multi-step workflows with a single instruction
Prompting Best Practices
1. Use “simulate” for Enhanced Stability
When using the act() command, the keyword “simulate” can improve stability and accuracy. It asks the agent to mimic user interactions more closely, so that actions like typing or clicking are performed as a real user would perform them.
Example:
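A minimal sketch of a prompt that uses the keyword. The `simulate_typing` helper and the exact phrasing are assumptions; the resulting string would be passed to `act()`.

```python
def simulate_typing(text: str, target: str) -> str:
    """Build a prompt that asks the agent to simulate typing (hypothetical helper)."""
    return f"Simulate typing '{text}' into the {target}, then simulate pressing Enter."


prompt = simulate_typing("AskUI", "search bar")

# Assumed usage (not run here): agent.act(prompt)
```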
By simulating actions rather than directly executing them, you reduce errors caused by UI inconsistencies or timing issues.
2. Add “if” Conditions for Context Awareness
To make execution more reliable, include conditions in your prompt, phrased as “if” statements. For example, proceed with an action only if specific fields (e.g., search bars) are empty. This approach prevents unintended overwrites or conflicts in native apps.
Example:
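A conditional prompt could look like the sketch below. The `conditional_goal` helper is hypothetical, and the trailing "Otherwise" clause is an added assumption that tells the agent what to do when the condition does not hold.

```python
def conditional_goal(condition: str, action: str) -> str:
    """Wrap an action in an explicit 'if' condition (hypothetical helper)."""
    return f"If {condition}, {action}. Otherwise, leave the screen unchanged."


prompt = conditional_goal("the search bar is empty", "type 'AskUI' into it")

# Assumed usage (not run here): agent.act(prompt)
```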
This technique ensures the agent adapts dynamically to the application’s state, improving robustness in automation workflows.
3. Break Actions Into Step-by-Step Instructions
Complex tasks can be simplified by breaking them into smaller steps. This makes debugging easier and enhances reliability by ensuring each step is executed sequentially.
Example:
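One way to express a task as ordered steps in a single prompt is sketched here. The `step_by_step_goal` helper, the header sentence, and the sample form steps are illustrative assumptions.

```python
from typing import Sequence


def step_by_step_goal(steps: Sequence[str]) -> str:
    """Number the steps and join them into one prompt (hypothetical helper)."""
    if not steps:
        raise ValueError("at least one step is required")
    numbered = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "Complete these steps in order:\n" + "\n".join(numbered)


prompt = step_by_step_goal(
    [
        "Click the 'Name' field",
        "Type 'Ada Lovelace'",
        "Click the 'Submit' button",
    ]
)

# Assumed usage (not run here): agent.act(prompt)
```

An alternative with the same intent is to call `act()` once per step, which makes it easier to see which step failed.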
This approach is particularly useful for multi-step processes like form filling or playing games (e.g., Blackjack), where precision is critical.
4. Add Verification Conditions for Stability
Verification conditions are crucial when dealing with dynamic or ambiguous application states. For example, you can instruct the agent to proceed only if certain fields (e.g., search bars) are empty or specific UI elements are visible.
Example:
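The sketch below puts several preconditions in front of an action. The `verified_goal` helper and its wording are assumptions, not a prescribed AskUI format.

```python
from typing import Sequence


def verified_goal(action: str, checks: Sequence[str]) -> str:
    """Prefix an action with a checklist the agent must confirm (hypothetical helper)."""
    if not checks:
        raise ValueError("at least one check is required")
    checklist = "\n".join(f"- {check}" for check in checks)
    return (
        "Before acting, verify that all of the following are true:\n"
        f"{checklist}\n"
        f"Only if every check passes, {action}. "
        "If any check fails, stop and do nothing."
    )


prompt = verified_goal(
    "type 'AskUI' into the search bar",
    ["the search bar is visible", "the search bar is empty"],
)

# Assumed usage (not run here): agent.act(prompt)
```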
This ensures that your automation script adapts dynamically to real-time application states, preventing errors like overwriting existing data.
5. Use Multiple Approaches for Actions
Flexibility is key when automating tasks, especially for repetitive actions like deleting text. The AskUI Vision Agent allows you to use multiple approaches for the same action, ensuring compatibility across different scenarios.
Example:
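A prompt that names a primary approach and a fallback might be built as in this sketch. The `goal_with_fallbacks` helper and the specific key combinations are illustrative assumptions.

```python
from typing import Sequence


def goal_with_fallbacks(objective: str, approaches: Sequence[str]) -> str:
    """State an objective, then list approaches in order of preference (hypothetical helper)."""
    if not approaches:
        raise ValueError("at least one approach is required")
    first, *rest = approaches
    parts = [f"{objective}. First, try to {first}."]
    parts += [f"If that does not work, try to {alt}." for alt in rest]
    return " ".join(parts)


prompt = goal_with_fallbacks(
    "Clear the search field",
    [
        "select all text with Ctrl+A and press Delete",
        "press Backspace until the field is empty",
    ],
)

# Assumed usage (not run here): agent.act(prompt)
```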
By combining multiple approaches, you increase flexibility and ensure that your automation script works across various environments and input methods.