Interactions

The following best practices help keep AskUI automation workflows robust and reliable when interacting with UI elements.

Action Selection

Choose the Right Action

Use the simplest action that accomplishes your goal:
- Prefer click() over complex mouse operations when possible
- Use type() for text input instead of individual keyboard events
- Choose native methods over tool-based approaches when available

Consider Element State

- Verify element visibility before interacting
- Check if elements are enabled before clicking
- Account for loading states and dynamic content
- Handle overlapping elements by being specific with locators
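
One way to act on these checks is to ask the agent about an element's state before clicking it. The sketch below assumes that `agent.get` accepts `response_schema=bool`, in the same way the extraction examples in this guide pass `str`. The helper name `click_when_enabled` and the wording of the question are hypothetical, not part of AskUI.

```python
def click_when_enabled(agent, element: str) -> bool:
    """Click `element` only if the agent reports it as visible and enabled.

    `agent` is expected to behave like a VisionAgent (get() and click()).
    Returns True if the click was performed, False otherwise.
    """
    # Ask a narrow yes/no question so the answer maps cleanly onto a bool.
    ready = agent.get(
        f"Is the '{element}' element visible and enabled?",
        response_schema=bool,
    )
    if not ready:
        return False
    agent.click(element)
    return True
```

Returning a flag instead of raising lets the caller decide whether to wait, retry, or fail the run.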

Keyboard Operations

Modifier Key Management

- Always release modifier keys after use to prevent stuck keys
- Use context managers when possible for automatic cleanup
- Combine modifiers properly for complex shortcuts

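from askui import VisionAgent

with VisionAgent() as agent:
    # Good: the modifier is pressed and released within a single call
    agent.keyboard("a", modifier_keys=["control"])

    # Also good: explicit key management, with a guaranteed release
    agent.key_down("control")
    try:
        agent.keyboard("a")
    finally:
        agent.key_up("control")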
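
When a modifier has to stay pressed across several actions, the press and release can be wrapped in a small context manager so the release cannot be forgotten. This is a minimal sketch that relies only on the `key_down` and `key_up` calls shown in this guide; `held_key` is a hypothetical helper, not an AskUI function.

```python
from contextlib import contextmanager


@contextmanager
def held_key(agent, key: str):
    """Hold `key` down for the duration of the with-block, then release it."""
    agent.key_down(key)
    try:
        yield
    finally:
        # Runs even if the body raises, so the modifier never stays stuck.
        agent.key_up(key)


# Usage (inside an open VisionAgent context):
# with held_key(agent, "shift"):
#     agent.click("first item")
#     agent.click("last item")
```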

Key Combination Best Practices

- Use standard shortcuts that work across platforms
- Allow time between keystrokes for complex sequences
- Test keyboard operations across different OS environments
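
To illustrate pacing a longer sequence, the sketch below sends one keystroke at a time and pauses between them. It uses the `agent.keyboard(key, modifier_keys=...)` call from this guide; the helper `press_sequence`, its parameters, and the 0.2 second default are assumptions to tune per application.

```python
import time


def press_sequence(agent, steps, pause: float = 0.2, sleep=time.sleep):
    """Send each (key, modifiers) pair in order, pausing between keystrokes.

    `steps` is an iterable of (key, modifiers) tuples; `modifiers` may be empty.
    `sleep` is injectable so the pacing can be tested without real delays.
    """
    for index, (key, modifiers) in enumerate(steps):
        if index > 0:
            sleep(pause)  # give the application time to react
        # Pass None instead of an empty list when no modifier is needed.
        agent.keyboard(key, modifier_keys=list(modifiers) or None)


# Usage (inside an open VisionAgent context):
# press_sequence(agent, [("a", ["control"]), ("c", ["control"])])
```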

Tool Usage Guidelines

Tool Selection

- OS Tools: Use for low-level system operations
- Browser Tools: Use for web-specific tasks
- Clipboard Tools: Use for cross-application data transfer
- Native Methods: Prefer these for standard interactions

Tool Combination

from askui import VisionAgent

# Example: Combining tools effectively
# (Windows/Linux shortcuts shown; use "command" instead of "control" on macOS)
with VisionAgent() as agent:
    # Copy from one application: select all, then copy
    agent.click("data field")
    agent.keyboard("a", modifier_keys=["control"])
    agent.keyboard("c", modifier_keys=["control"])

    # Switch application and paste
    agent.keyboard("tab", modifier_keys=["alt"])
    agent.click("input field")
    agent.keyboard("v", modifier_keys=["control"])

Information Extraction

Query Specificity

- Be specific in your questions to get accurate results
- Use structured schemas for complex data extraction
- Validate extracted data before using it

from askui import ResponseSchemaBase, VisionAgent

class ProductInfo(ResponseSchemaBase):
    name: str
    price: float
    in_stock: bool

with VisionAgent() as agent:
    # Specific query with schema
    product = agent.get(
        "What is the product name, price, and stock status?",
        response_schema=ProductInfo,
    )
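
A schema guarantees types, not plausibility, so a short validation step before using the result can catch misread values. The sketch below checks any object with `name` and `price` attributes, such as a `ProductInfo` instance; `product_problems` and its rules are hypothetical examples, not AskUI functionality.

```python
import math


def product_problems(product) -> list[str]:
    """Return human-readable problems; an empty list means the data looks usable."""
    problems = []
    if not product.name.strip():
        problems.append("name is empty")
    # A NaN or negative price usually means the value was misread from the screen.
    if math.isnan(product.price) or product.price < 0:
        problems.append(f"price is not plausible: {product.price}")
    return problems


# Usage:
# issues = product_problems(product)
# if issues:
#     raise ValueError(f"Extraction looks wrong: {issues}")
```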

Performance Considerations

- Cache extracted information when used multiple times
- Batch queries when extracting multiple data points
- Use appropriate timeouts for extraction operations

Timing and Synchronization

Wait Strategies

- Implicit waits: Built into AskUI operations
- Explicit waits: Use when needed for dynamic content
- Smart waits: Wait for specific conditions

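# Retry the click until the element appears (up to 10 attempts).
# Assumes `agent` is an open VisionAgent.
for _ in range(10):
    try:
        agent.click("Dynamic Button")
        break
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        agent.wait(1)
else:
    # The loop finished without a successful click
    raise TimeoutError("Dynamic Button did not appear after 10 attempts")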
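
A "smart wait" can be expressed as a generic polling helper that checks a condition until it holds or a deadline passes. The following is a minimal sketch using only the standard library; `wait_until` and its parameters are hypothetical names, and the condition passed in is up to the caller.

```python
import time


def wait_until(condition, timeout: float = 10.0, interval: float = 0.5,
               sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll `condition()` until it returns a truthy value or `timeout` elapses.

    Returns True if the condition was met, False on timeout.
    `sleep` and `clock` are injectable to keep the helper easy to test.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage (assumes agent.get accepts response_schema=bool):
# ready = wait_until(
#     lambda: agent.get("Is the 'Dynamic Button' visible?", response_schema=bool),
#     timeout=15,
# )
```

Using a monotonic clock keeps the deadline correct even if the system time changes during the run.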

Performance Optimization

- Minimize unnecessary waits
- Use element detection to verify readiness
- Batch operations when possible
- Monitor execution time for optimization opportunities
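
Execution time can be monitored with a small timing context manager that records how long each labeled step takes. This is a standard-library sketch; `timed` and the `results` dictionary layout are assumptions for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the duration of the with-block in results[label] (seconds)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the step failed.
        results.setdefault(label, []).append(time.perf_counter() - start)


# Usage:
# durations = {}
# with timed("login", durations):
#     agent.click("Login")
# slowest = max(durations, key=lambda name: sum(durations[name]))
```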

Data Extraction Performance

When extracting data from UI elements, consider these optimization strategies:

Batch Extraction

Instead of issuing several calls, extract related fields in a single query:

from askui import ResponseSchemaBase

# Inefficient: three separate extractions
name = agent.get("What is the username?", response_schema=str)
email = agent.get("What is the email?", response_schema=str)
role = agent.get("What is the role?", response_schema=str)

# Efficient: a single extraction with a structured schema
class UserInfo(ResponseSchemaBase):
    name: str
    email: str
    role: str

user = agent.get("Extract user information", response_schema=UserInfo)

Caching Results

from typing import List

class DataCache:
    """Stores extraction results so each key is extracted only once."""

    def __init__(self):
        self._cache = {}

    def get_or_extract(self, agent, key, question, schema):
        if key not in self._cache:
            self._cache[key] = agent.get(question, response_schema=schema)
        return self._cache[key]

cache = DataCache()
# Reuse expensive extractions.
# Assumes the installed AskUI version accepts List[dict] as a response schema.
table_data = cache.get_or_extract(
    agent,
    'main_table',
    "Extract the main data table",
    List[dict]
)
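
Cached values go stale once the screen changes, so a cache used during a longer run may need an expiry or an explicit reset. The sketch below is one possible variant with a maximum age per entry; `ExpiringCache` is hypothetical, and the `extract` argument is any zero-argument callable, for example a lambda wrapping `agent.get`.

```python
import time


class ExpiringCache:
    """Cache extraction results, re-extracting once an entry is too old."""

    def __init__(self, max_age: float, clock=time.monotonic):
        self._max_age = max_age
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_extract(self, key, extract):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # still fresh
        value = extract()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key=None):
        """Drop one entry, or everything when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)


# Usage:
# cache = ExpiringCache(max_age=30)
# title = cache.get_or_extract(
#     "title", lambda: agent.get("What is the page title?", response_schema=str)
# )
```

Calling `invalidate()` after an action that changes the screen, such as a navigation click, avoids reading outdated values.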

Cross-Platform Considerations

Platform-Specific Handling

import platform

from askui import VisionAgent

with VisionAgent() as agent:
    # "Select all" uses a different modifier depending on the OS
    if platform.system() == "Darwin":  # macOS
        agent.keyboard("a", modifier_keys=["command"])
    else:  # Windows/Linux
        agent.keyboard("a", modifier_keys=["control"])
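
To avoid repeating the platform check at every shortcut, the choice of modifier can be isolated in one function. This is a small sketch using the same `"command"` and `"control"` key names as the example above; `primary_modifier` is a hypothetical helper.

```python
import platform
from typing import Optional


def primary_modifier(system: Optional[str] = None) -> str:
    """Return the main shortcut modifier for the given or current OS."""
    system = system or platform.system()
    # macOS reports itself as "Darwin"; everything else is treated like Windows/Linux.
    return "command" if system == "Darwin" else "control"


# Usage (inside an open VisionAgent context):
# agent.keyboard("c", modifier_keys=[primary_modifier()])
```

Accepting `system` as an argument makes the mapping testable on any machine.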

Resolution Independence

- Use relative positioning instead of absolute coordinates
- Test across different screen resolutions
- Account for scaling factors on high-DPI displays
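
Where coordinates cannot be avoided, storing them as fractions of the screen size and converting at run time keeps a script independent of resolution. The sketch below shows only the arithmetic; `to_pixels` is a hypothetical helper, and obtaining the actual screen width and height is left to the caller.

```python
def to_pixels(rel_x: float, rel_y: float, width: int, height: int) -> tuple[int, int]:
    """Convert relative coordinates (0.0 to 1.0) to pixel coordinates."""
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("relative coordinates must be between 0.0 and 1.0")
    # Clamp so that 1.0 maps to the last valid pixel instead of one past it.
    x = min(int(rel_x * width), width - 1)
    y = min(int(rel_y * height), height - 1)
    return x, y


# The same relative point resolves correctly on different screens:
# to_pixels(0.5, 0.5, 1920, 1080)  -> (960, 540)
# to_pixels(0.5, 0.5, 3840, 2160)  -> (1920, 1080)
```

Pass the width and height in the same units (logical or physical pixels) that the coordinate-based call you use expects, since the two differ by the scaling factor on high-DPI displays.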

Debugging and Troubleshooting

Logging Best Practices

from askui import VisionAgent

with VisionAgent(log_level="DEBUG") as agent:
    # Detailed logging for troubleshooting
    agent.click("problematic button")

Visual Debugging

- Take screenshots before and after critical actions
- Use annotation features to verify element detection
- Enable visual feedback during development
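
Capturing the screen before and after a critical action can be made routine with a small wrapper. The sketch below deliberately does not assume any particular AskUI screenshot call: `capture` is a caller-supplied function that saves a screenshot under the given name, and `with_snapshots` is a hypothetical helper.

```python
def with_snapshots(action, capture, label: str):
    """Run `action()`, calling capture(name) before and after it.

    `capture` is any callable that stores a screenshot under `name`;
    how the image is taken is up to the caller.
    """
    capture(f"{label}-before")
    try:
        return action()
    finally:
        # Captured even when the action fails, which is when it is most useful.
        capture(f"{label}-after")


# Usage (save_screenshot is a placeholder for your own capture function):
# with_snapshots(lambda: agent.click("Submit"), save_screenshot, "submit")
```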

Summary

Following these best practices ensures:

- Reliability: Actions work consistently across environments
- Maintainability: Code is easy to understand and modify
- Performance: Operations execute efficiently
- Robustness: Automation handles edge cases gracefully

For practical implementation details, see the how-to guide on interacting with elements.
