Text Extraction

Extract text content from your UI by passing a string response schema (response_schema=str) to agent.get(). This is useful for reading labels, messages, form values, and any other textual content.

Basic Usage

from askui import VisionAgent

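with VisionAgent() as agent:
    # Open the page and give it a few seconds to load
    agent.tools.webbrowser.open_new("http://www.example.com")
    agent.wait(3)

    # Ask for the text and request a plain string back
    text = agent.get("What is the main heading?", response_schema=str)
    print(f"Main heading: {text}")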

Best Practices

Be Specific About Location: Mention where the text is located

# Good - specific location
header = agent.get("What is the text in the page header?", response_schema=str)

# Less specific
text = agent.get("What text is shown?", response_schema=str)

Handle Empty or Missing Text: Consider that text might not exist

from typing import Optional

# Text might not be present
optional_text = agent.get("What is the subtitle, if any?", response_schema=Optional[str])

if optional_text:
    print(f"Subtitle: {optional_text}")
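
Extracted text can also come back as an empty or whitespace-only string rather than as a missing value. The sketch below shows one way to treat both cases the same. The helper name normalize_text is hypothetical and not part of AskUI; it only post-processes whatever value the query returned.

```python
from typing import Optional


def normalize_text(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper (not an AskUI function).

    Strips surrounding whitespace and maps empty results to None,
    so callers only need a single "is there text?" check.
    """
    if raw is None:
        return None
    cleaned = raw.strip()
    # An empty string after stripping is treated as "no text found"
    return cleaned or None
```

With this in place, a check such as `if normalize_text(optional_text):` behaves the same whether the subtitle is absent or blank.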

Clean and Validate Extracted Text: Post-process extracted text as needed

# Extract and clean price
price_text = agent.get("What is the price?", response_schema=str)
price_value = float(price_text.replace("$", "").replace(",", ""))
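
The direct float() conversion above raises ValueError when the extracted text contains anything besides the number, "$", and ",", for example "Price: $5" or "N/A". A more defensive variant might look like the following sketch. The helper name parse_price is hypothetical and not part of AskUI, and it assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re
from decimal import Decimal
from typing import Optional

# First run of digits, optionally with thousands separators and a decimal part.
# Assumes US-style number formatting, e.g. "1,299.99".
_PRICE_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(price_text: str) -> Optional[Decimal]:
    """Hypothetical helper (not an AskUI function).

    Returns the first number found in price_text as a Decimal,
    or None if the text contains no digits.
    """
    match = _PRICE_PATTERN.search(price_text)
    if match is None:
        return None
    # Drop thousands separators before converting
    return Decimal(match.group().replace(",", ""))
```

Decimal is used here instead of float to avoid binary rounding surprises with currency values; returning None lets the caller decide how to handle text that does not contain a price.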
