Which models does AskUI use?

AskUI uses a layered system of AI models, each with a distinct role in understanding user interfaces, interacting with them, and executing tasks on them. Here’s how we classify and use them:
  1. Locator Models (Locators)
    • Locator models identify and interact with UI elements on the screen.
  2. Query Models (Asks)
    • Query models answer user queries or generate intelligent responses.
  3. Action Models (act command, multi-step)
    • Responsibilities
      • Planning: turning a goal into a sequence of steps
      • Delegating to locator models
      • Delegating to query models
      • Reflecting on errors
    • Available models
      • UI-Tars
      • Computer Use
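
To make the layering concrete, here is a minimal Python sketch of how an action model could plan a goal, delegate individual steps to a locator or query model, and react to errors. It only illustrates the control flow described above. Every name in it (`plan`, `locate`, `ask`, `act`, `ElementNotFound`) is hypothetical and is not the AskUI API, and the fallback of shortening the element description is an assumption made purely for the demo.

```python
from typing import Dict, List, Tuple

# All names below are hypothetical stand-ins that illustrate the control flow.
# They are NOT part of the AskUI API.


class ElementNotFound(Exception):
    """Raised by the stand-in locator when no element matches."""


def locate(description: str, screen: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Stand-in locator model: map a description to screen coordinates."""
    if description not in screen:
        raise ElementNotFound(description)
    return screen[description]


def ask(question: str, facts: Dict[str, str]) -> str:
    """Stand-in query model: answer a question about the screen."""
    return facts.get(question, "unknown")


def plan(goal: str) -> List[Tuple[str, str]]:
    """Stand-in planner: turn a goal into (kind, argument) steps.

    A real action model would derive the steps from the goal and the
    current screen; this fixed plan is an assumption for the demo.
    """
    return [("click", "Login button"), ("ask", "Is the user logged in?")]


def act(
    goal: str,
    screen: Dict[str, Tuple[int, int]],
    facts: Dict[str, str],
    max_retries: int = 1,
) -> List[str]:
    """Run every planned step, delegating to the locator or query stand-in."""
    log: List[str] = []
    for kind, arg in plan(goal):
        for _ in range(max_retries + 1):
            try:
                if kind == "click":
                    x, y = locate(arg, screen)  # delegate to the locator
                    log.append(f"click {arg} at ({x}, {y})")
                else:
                    log.append(f"answer: {ask(arg, facts)}")  # delegate to the query model
                break
            except ElementNotFound:
                # Reflection on the error: record it, then retry with a
                # shorter description (hypothetical fallback strategy).
                log.append(f"error: could not find {arg}")
                arg = arg.split()[0]
        else:
            # Runs only when every attempt failed (no break).
            log.append(f"gave up on {kind} step")
    return log


if __name__ == "__main__":
    demo_screen = {"Login": (100, 200)}
    demo_facts = {"Is the user logged in?": "yes"}
    for entry in act("log in", demo_screen, demo_facts):
        print(entry)
```

In this sketch the first locate attempt fails, the error is logged, and the retry with the shortened description succeeds. That is a toy version of the "reflecting on errors" responsibility listed above.
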

| Model Type | Model Name | Purpose | Teachable | Online Trainable |
| --- | --- | --- | --- | --- |
| Locator | UIDT-1 | Locate elements & understand screen | No | Partial |
| Locator | PTA-1 | Convert prompts into one-click actions | No | Yes |
| Query | GPT-4 | Understand & respond to user queries | Yes | No |
| Query | Gemini 2.5 Flash | Understand & respond to user queries | Yes | No |
| Query | Gemini 2.5 Pro | Understand & respond to user queries | Yes | No |
| Query | Computer Use | Understand & respond to user queries | Yes | No |
| Large Action (act) | Computer Use | Plan and execute full workflows | Yes | No |
| Large Action (act) | UI-Tars | Plan and execute full workflows | Yes | No |

Note: See model names here