Your First Agent

In this tutorial, you’ll create your first AskUI agent that automates a real-world task: searching for products on Amazon.

Prerequisites: Make sure you’ve completed the installation before starting this tutorial.

What You’ll Build

You’ll create an agent that:

Opens Amazon in a web browser
Searches for a product
Verifies the search results
Generates a report of the automation

Building Your Agent

Create Your Agent Script

Create a new Python file named amazon_shopping.py and add the following code:

import logging

from askui import VisionAgent
from askui import locators as loc
from askui.reporting import SimpleHtmlReporter

# Initialize your agent with logging and reporting
with VisionAgent(
    log_level=logging.DEBUG,
    reporters=[SimpleHtmlReporter()]
) as agent:
    # Open Amazon website
    agent.tools.webbrowser.open_new("http://www.amazon.com")
    agent.wait(3)  # Wait for page to load

    # Search for a product
    agent.click(loc.Element("textfield"))
    agent.type("nike shoes")
    agent.keyboard('enter')
    agent.wait(2)  # Wait for search results

    # Verify the search results
    page_status = agent.get("Are Nike shoes visible on the screen?")
    print(f"Page status: {page_status}")

Run the script:

python amazon_shopping.py

The script will:

Open Amazon in your default browser
Search for “nike shoes”
Verify that the search results show Nike shoes
Generate an HTML report of the automation

Understanding Your Code

Let’s break down what each part does:

Agent Initialization

with VisionAgent(log_level=logging.DEBUG, reporters=[SimpleHtmlReporter()]) as agent:

Creates a vision agent that can see and interact with your screen
Enables debug logging to see what’s happening
Sets up HTML reporting to review the automation later

Browser Control

agent.tools.webbrowser.open_new("http://www.amazon.com")
agent.wait(3)

Opens a new browser window with Amazon
Waits for the page to load
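
A fixed wait such as agent.wait(3) can be too short on a slow connection and needlessly long on a fast one. One possible alternative is to poll a condition until it holds or a timeout expires. The sketch below is plain Python and not part of AskUI; the helper name wait_until and its parameters are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper, not an AskUI API. Returns True if the condition
    held before the deadline and False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Possible use inside the `with VisionAgent(...) as agent:` block
# (assumes agent.get returns text, as described in this tutorial):
#
# wait_until(
#     lambda: "yes" in str(agent.get("Is the Amazon search box visible?")).lower(),
#     timeout=15,
# )
```

Each poll that calls agent.get sends a new question to the model, so a longer interval may be preferable to keep the number of requests low.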

Element Interaction

agent.click(loc.Element("textfield"))
agent.type("nike shoes")
agent.keyboard('enter')

Finds and clicks the search box
Types the search query
Presses Enter to search

Information Extraction

page_status = agent.get("Are Nike shoes visible on the screen?")

Uses AI to understand what’s on the screen
Returns a natural language response
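
Because the answer is free text, a script that needs to branch on it has to interpret the wording. The sketch below shows one cautious way to turn such an answer into a boolean. It assumes the answer is a string that begins with a yes-or-no word, which the model does not guarantee; is_affirmative is a hypothetical helper, not an AskUI function.

```python
import re


def is_affirmative(answer: str) -> bool:
    """Return True if a free-text answer starts with a clear 'yes'.

    Hypothetical helper: only the first word is inspected, so an answer
    like "There are Nike shoes" is treated as not affirmative.
    """
    first_word = re.match(r"\s*([A-Za-z]+)", answer)
    return bool(first_word) and first_word.group(1).lower() in {"yes", "yeah", "yep", "true"}


# Possible use with the tutorial's variable:
#
# if not is_affirmative(str(page_status)):
#     print("Search results did not show Nike shoes")
```

For anything beyond a quick check, a structured response (see "3. Extract Product Information") is likely more dependable than parsing prose.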

View the Report

After running your agent, open the generated HTML report:

# The report will be in the same directory as your script
# Look for: report_YYYY-MM-DD_HH-MM-SS.html

The report shows:

Screenshots of each step
Actions performed
Execution time
Any errors encountered
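
After several runs the directory may hold many reports. The sketch below picks the most recent one, assuming the file names follow the report_YYYY-MM-DD_HH-MM-SS.html pattern shown above, so that sorting by name also sorts by time. latest_report is a hypothetical helper written with the standard library only.

```python
from pathlib import Path
from typing import Optional


def latest_report(directory: str = ".") -> Optional[Path]:
    """Return the newest report_*.html file in `directory`, or None.

    Relies on the timestamp in the file name: zero-padded dates and times
    sort chronologically when sorted as plain strings.
    """
    reports = sorted(Path(directory).glob("report_*.html"))
    return reports[-1] if reports else None


# Possible use after a run:
#
# import webbrowser
# report = latest_report()
# if report is not None:
#     webbrowser.open(report.resolve().as_uri())
```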

Enhancing Your Agent

Try these modifications to learn more:

1. Add Product to Cart

# After searching, click on the first product
agent.click("first product image")
agent.wait(2)

# Add to cart
agent.click("Add to Cart button")
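
A click can fail if the product page has not finished rendering when the agent looks for the button. One way to make such a step more tolerant is to retry it a few times. The sketch below is generic Python; the helper name retry is an assumption, and it catches every exception because this tutorial does not say which exception AskUI raises when an element is missing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call `action` until it succeeds, at most `attempts` times.

    Hypothetical helper, not an AskUI API. Sleeps `delay` seconds between
    tries and re-raises the last exception if every try fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")


# Possible use inside the agent block:
#
# retry(lambda: agent.click("Add to Cart button"), attempts=3, delay=2.0)
```

Retrying suits idempotent lookups best. A click that did succeed but raised afterwards could be repeated, so it may be worth verifying the cart state with agent.get after the call.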

2. Use Different Selectors

# Using text locator
agent.click(loc.Text("Search"))

# Using relative positioning
search_icon = loc.Element().right_of(loc.Element("textfield"))
agent.click(search_icon)

3. Extract Product Information

from askui import ResponseSchemaBase

class ProductInfo(ResponseSchemaBase):
    name: str
    price: float
    rating: float

product = agent.get(
    "What is the name, price, and rating of the first product?",
    response_schema=ProductInfo
)
print(f"Found: {product.name} - ${product.price} ({product.rating} stars)")
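
Values read from a screenshot can be wrong even when they match the schema's types, so a sanity check before using them may be worthwhile. The sketch below uses a plain dataclass as a stand-in so it runs without AskUI; ProductRecord and find_problems are hypothetical names, and the 0 to 5 rating range is an assumption about star ratings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProductRecord:
    """Hypothetical stand-in mirroring the fields of ProductInfo."""
    name: str
    price: float
    rating: float


def find_problems(product: ProductRecord) -> List[str]:
    """Return a list of human-readable problems; empty means plausible."""
    problems = []
    if not product.name.strip():
        problems.append("name is empty")
    if product.price <= 0:
        problems.append("price is not positive")
    if not 0.0 <= product.rating <= 5.0:
        problems.append("rating is outside 0-5")
    return problems


# Possible use with the extracted object from the example above:
#
# record = ProductRecord(product.name, product.price, product.rating)
# for problem in find_problems(record):
#     print(f"Suspicious value: {problem}")
```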

Common Issues and Solutions

Browser doesn't open

Element not found

Script runs too fast

What You’ve Learned

Congratulations! You’ve successfully:

✅ Created your first AskUI agent
✅ Automated browser interactions
✅ Used AI to verify screen content
✅ Generated automation reports

Next Steps

Element Selection

Learn advanced techniques for finding and selecting UI elements

Data Extraction

Extract structured data from any UI

Configure AI Models

Use different AI models for specific tasks

Best Practices

Learn patterns for reliable automation
