What models is AskUI using?
AskUI uses a layered system of AI models, each with a distinct role in understanding, executing, and interacting with user interfaces. Here’s how we classify and use them (a short usage sketch follows this list):

- Locator Models (Locators)
  - Locator models identify and interact with UI elements on the screen.
- Query Models (Asks)
  - Responsible for answering user queries or generating intelligent responses.
- Action Models (act command, multi-step)
  - Plan and execute full workflows. Responsibilities:
    - Turn a goal into a plan (planning)
    - Delegate element interactions to Locator models
    - Delegate screen questions to Query models
    - Reflect on errors during execution
  - Models: UI-Tars, Computer Use
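To make the three layers concrete, here is a minimal sketch using the AskUI Python Vision Agent. It assumes a `VisionAgent` that exposes `click()` (Locator), `get()` (Query), and `act()` (Action); the element descriptions and instructions are illustrative only.

```python
from askui import VisionAgent

with VisionAgent() as agent:
    # Locator model: identify a UI element on the screen and interact with it.
    agent.click("login button")

    # Query model: answer a question about the current screen.
    title = agent.get("What is the title of the current page?")
    print(title)

    # Action model (act command): hand over a multi-step goal; the model plans,
    # delegates locating and querying, and reflects on errors along the way.
    agent.act("Log in with the demo account and open the settings page")
```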
| Model Type | Model Name | Purpose | Teachable | Online Trainable |
|---|---|---|---|---|
| Locator | UIDT-1 | Locate elements & understand screen | No | Partial |
| Locator | PTA-1 | Convert prompts into one-click actions | No | Yes |
| Query | GPT-4 | Understand & respond to user queries | Yes | No |
| Query | Gemini 2.5 Flash | Understand & respond to user queries | Yes | No |
| Query | Gemini 2.5 Pro | Understand & respond to user queries | Yes | No |
| Query | Computer Use | Understand & respond to user queries | Yes | No |
| Large Action (act) | Computer Use | Plan and execute full workflows | Yes | No |
| Large Action (act) | UI-Tars | Plan and execute full workflows | Yes | No |
Note: See model names here
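If you want to pin a single command to one of the models above instead of relying on the default, here is a minimal sketch. It assumes the Vision Agent commands accept a `model` argument; the identifier strings below are placeholders, so check the model names page for the exact values.

```python
from askui import VisionAgent

with VisionAgent() as agent:
    # Force the PTA-1 locator for this one-click instruction.
    # "askui-pta" is a placeholder identifier, not a confirmed model name.
    agent.click("the blue submit button", model="askui-pta")

    # Route this question to a specific query model (placeholder identifier).
    answer = agent.get("Is the user logged in?", model="gemini-2.5-flash")
    print(answer)
```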