What models is AskUI using?

AskUI uses a layered system of AI models, each with a distinct role in understanding, executing, and interacting with user interfaces. Here’s how we classify and use them:
  1. Locator Models (Locators)
    • Locator models identify and interact with UI elements on the screen.
  2. Query Models (Asks)
    • Responsible for answering user queries or generating intelligent responses.
  3. Action Models (act command, multi-step)
    • Responsibilities
      • Turn a goal into a plan
      • Delegate to locator models
      • Delegate to query models
      • Reflect on errors
    • UI-Tars
    • Computer Use
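
The layering described above (an action model that plans, delegates to locator and query models, and reflects on errors) can be sketched roughly as follows. This is a toy illustration only: `FakeScreen`, `locator_model`, `query_model`, `ActionModel`, and `toy_planner` are hypothetical names invented for this sketch and are not part of the AskUI SDK, and the "models" are plain lookups rather than real AI models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins; none of these names come from the AskUI SDK.
Point = Tuple[int, int]
Step = Tuple[str, str]  # (kind, argument), e.g. ("click", "login button")


class LocatorError(Exception):
    """Raised when the locator cannot find the described element."""


@dataclass
class FakeScreen:
    """Toy screen: maps element descriptions to click coordinates."""
    elements: Dict[str, Point]
    text: str = ""


def locator_model(screen: FakeScreen, description: str) -> Point:
    """Locator role: find a UI element on the screen."""
    try:
        return screen.elements[description]
    except KeyError:
        raise LocatorError(f"element not found: {description!r}") from None


def query_model(screen: FakeScreen, question: str) -> str:
    """Query role: answer a question about the screen.

    This stub ignores the question and returns the screen text.
    """
    return screen.text


@dataclass
class ActionModel:
    """Action role: plan, delegate, and reflect on errors."""
    planner: Callable[[str], List[Step]]
    log: List[str] = field(default_factory=list)

    def act(self, screen: FakeScreen, goal: str) -> List[str]:
        for kind, arg in self.planner(goal):  # goal -> plan
            try:
                if kind == "click":
                    x, y = locator_model(screen, arg)  # delegate to locator
                    self.log.append(f"click {arg} at ({x}, {y})")
                elif kind == "ask":
                    answer = query_model(screen, arg)  # delegate to query
                    self.log.append(f"ask {arg!r} -> {answer!r}")
            except LocatorError as err:
                # Reflection, reduced to its simplest form: record the
                # failure and continue with the next planned step.
                self.log.append(f"reflect: {err}")
        return self.log


def toy_planner(goal: str) -> List[Step]:
    """Fixed plan used for the demo; a real planner would depend on the goal."""
    return [
        ("click", "login button"),
        ("click", "missing link"),
        ("ask", "What is the page title?"),
    ]


screen = FakeScreen(elements={"login button": (120, 48)}, text="Welcome")
result = ActionModel(planner=toy_planner).act(screen, "log in and read the title")
for line in result:
    print(line)
```

In this sketch the second step fails on purpose, so the log shows one successful locator call, one reflection entry, and one query answer. A real action model would presumably re-plan after a failure instead of just logging it.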
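
| Model Type         | Model Name   | Purpose                                | Teachable | Online Trainable |
|--------------------|--------------|----------------------------------------|-----------|------------------|
| Locator            | UIDT-1       | Locate elements & understand screen    | No        | Partial          |
| Locator            | PTA-1        | Convert prompts into one-click actions | No        | Yes              |
| Query              | GPT-4        | Understand & respond to user queries   | Yes       | No               |
| Query              | Computer Use | Understand & respond to user queries   | Yes       | No               |
| Large Action (act) | Computer Use | Plan and execute full workflows        | Yes       | No               |
| Large Action (act) | UI-Tars      | Plan and execute full workflows        | Yes       | No               |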
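
For scripting purposes, the model overview can also be held as plain data and filtered by type or capability. The following is a minimal sketch that transcribes the rows listed in this section; `ModelInfo`, `MODELS`, `models_for`, and `teachable_models` are hypothetical helper names, not part of any AskUI package.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelInfo:
    model_type: str
    name: str
    purpose: str
    teachable: bool
    online_trainable: str  # "Yes", "No", or "Partial", as listed above


# Rows transcribed from the model overview in this section.
MODELS: List[ModelInfo] = [
    ModelInfo("Locator", "UIDT-1", "Locate elements & understand screen", False, "Partial"),
    ModelInfo("Locator", "PTA-1", "Convert prompts into one-click actions", False, "Yes"),
    ModelInfo("Query", "GPT-4", "Understand & respond to user queries", True, "No"),
    ModelInfo("Query", "Computer Use", "Understand & respond to user queries", True, "No"),
    ModelInfo("Large Action (act)", "Computer Use", "Plan and execute full workflows", True, "No"),
    ModelInfo("Large Action (act)", "UI-Tars", "Plan and execute full workflows", True, "No"),
]


def models_for(model_type: str) -> List[str]:
    """Return model names of the given type, in listed order."""
    return [m.name for m in MODELS if m.model_type == model_type]


def teachable_models() -> List[str]:
    """Return the de-duplicated, sorted names of teachable models."""
    return sorted({m.name for m in MODELS if m.teachable})


print(models_for("Locator"))
print(teachable_models())
```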