Single-Step Actions in AskUI

AskUI provides single-step actions for interacting with UI elements on your screen. They range from basic mouse clicks to keyboard combinations and system operations.

Core Interaction Commands

| Command | Description | Models | Example (Python) |
|---|---|---|---|
| click() | Clicks on an element described by text | All | agent.click('Login button') |
| type() | Types text into a focused element | All | agent.type('username@example.com') |
| mouse_move() | Moves the mouse over the located element | All | agent.mouse_move('red button') |

Detailed usage of click() is covered in its own documentation page.
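
The three commands in the table are typically chained into a short flow. The following is a minimal sketch of such a flow, not the library's own example: `RecordingAgent` is a hypothetical stand-in that only records calls so the snippet runs without a screen, `submit_login` is an illustrative helper, and "Email field" is an assumed element description. Only the method names `click`, `type`, and `mouse_move` come from the table above.

```python
class RecordingAgent:
    """Hypothetical stand-in for an AskUI agent: records calls, touches no screen."""

    def __init__(self):
        self.calls = []

    def click(self, locator):
        self.calls.append(("click", locator))

    def type(self, text):
        self.calls.append(("type", text))

    def mouse_move(self, locator):
        self.calls.append(("mouse_move", locator))


def submit_login(agent, email):
    """Focus the email field, type the address, then hover and click the login button."""
    agent.click("Email field")        # assumed description of the input field
    agent.type(email)                 # type() targets whichever element has focus
    agent.mouse_move("Login button")  # hover before clicking
    agent.click("Login button")


if __name__ == "__main__":
    dry_run = RecordingAgent()
    submit_login(dry_run, "username@example.com")
    for name, argument in dry_run.calls:
        print(name, repr(argument))
```

Because `submit_login` only relies on the three method names, the same function should work unchanged when it is handed a real agent object instead of the recording stub.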

Tools

AskUI provides several built-in tools to interact with the operating system and applications:

| Tool | Description | Example (Python) |
|---|---|---|
| os | Provides OS-level operations | agent.tools.os.keyboard_release() |
| webbrowser | Controls web browser operations | agent.tools.webbrowser.open_new("https://askui.com") |
| clipboard | Manages clipboard operations | agent.tools.clipboard.copy("Text to copy") |
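
As a rough illustration of how the tools hang off `agent.tools`, the sketch below opens a page and copies its URL to the clipboard. It assumes only the two call shapes shown in the table (`webbrowser.open_new` and `clipboard.copy`). `make_recording_agent` and `open_page_and_copy_url` are hypothetical helpers, and the agent here is a recording stub rather than a real AskUI agent.

```python
from types import SimpleNamespace


def make_recording_agent():
    """Build a hypothetical stub exposing agent.tools.webbrowser and agent.tools.clipboard."""
    calls = []

    def recorder(name):
        def record(*args):
            calls.append((name, args))
        return record

    tools = SimpleNamespace(
        webbrowser=SimpleNamespace(open_new=recorder("webbrowser.open_new")),
        clipboard=SimpleNamespace(copy=recorder("clipboard.copy")),
    )
    return SimpleNamespace(tools=tools, calls=calls)


def open_page_and_copy_url(agent, url):
    """Open the URL in a new browser window, then place the same URL on the clipboard."""
    agent.tools.webbrowser.open_new(url)
    agent.tools.clipboard.copy(url)


if __name__ == "__main__":
    stub = make_recording_agent()
    open_page_and_copy_url(stub, "https://askui.com")
    print(stub.calls)
```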

Assertion and Information Commands

| Command | Description | Models | Example (Python) |
|---|---|---|---|
| get() | Extracts text or information from the screen | sonnet-3.5-latest | text = agent.get('What is the value in the total field?') |
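
Since get() hands back free-form text, an assertion usually needs a parsing step first. The sketch below shows one way to do that, assuming get() returns a plain string as in the table's example. `CannedAgent` is a hypothetical stub that returns a fixed answer, and `read_total` is an illustrative helper, not part of AskUI.

```python
import re
from decimal import Decimal


class CannedAgent:
    """Hypothetical stub: answers every get() question with a fixed string."""

    def __init__(self, answer):
        self.answer = answer
        self.questions = []

    def get(self, question):
        self.questions.append(question)
        return self.answer


def read_total(agent):
    """Ask for the total and convert the first number in the answer to a Decimal."""
    text = agent.get("What is the value in the total field?")
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    # Drop thousands separators before converting.
    return Decimal(match.group().replace(",", ""))


if __name__ == "__main__":
    total = read_total(CannedAgent("Total: $1,234.50"))
    assert total == Decimal("1234.50")
    print(total)
```

Parsing into `Decimal` rather than `float` keeps currency comparisons exact, which matters when the extracted value feeds an assertion.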

Supported Keyboard Keys

AskUI Agent OS supports a wide range of keyboard keys for automation. This reference lists all supported keys organized by category, along with their status and description.

Modifier Keys

These keys modify the behavior of other keys when pressed in combination.

| Key Name | Description | Status |
|---|---|---|
| alt | ALT key | Special case |
| capslock | CAPS LOCK key | Special case |
| command | Windows key (both left and right) | Special case |
| control | CTRL key | Special case |
| left_control | Left CONTROL key | Special case |
| numpad_lock | NUM LOCK key | Special case |
| right_alt | Right ALT key | Special case |
| right_control | Right CONTROL key | Special case |
| right_shift | Right SHIFT key | Special case |
| shift | SHIFT key | Special case |
| (Left ALT key) | Left ALT key | Special case |
| (Left SHIFT key) | Left SHIFT key | Special case |

Alphanumeric Keys

Standard alphanumeric keys for text input.

| Key Name | Description | Press | Release |
|---|---|---|---|
| 0 - 9 | Numeric keys 0-9 | | |
| a - z | Alphabetic keys A-Z | | |

Control Keys

Basic control keys for text editing and navigation.

| Key Name | Description | Press | Release |
|---|---|---|---|
| backspace | BACKSPACE key | | |
| delete | DEL key | | |
| enter | ENTER key | | |
| escape | ESC key | | |
| space | SPACEBAR key | | |
| tab | TAB key | | |

Arrow Keys

Directional arrow keys for navigation.

| Key Name | Description | Press | Release |
|---|---|---|---|
| down | DOWN ARROW key | | |
| left | LEFT ARROW key | | |
| right | RIGHT ARROW key | | |
| up | UP ARROW key | | |

Function Keys

Function keys F1-F24 for special operations.

| Key Name | Description | Press | Release |
|---|---|---|---|
| f1 - f9 | Function keys F1-F9 | ✅ (except F10) | ✅ (except F10) |
| f10 | F10 key | | |
| f11 - f24 | Function keys F11-F24 | | |

Media Control Keys

Keys for controlling media playback and volume.

| Key Name | Description | Press | Release |
|---|---|---|---|
| audio_mute | Volume Mute key | | |
| audio_next | Next Track key | | |
| audio_play | Play/Pause Media key | | |
| audio_prev | Previous Track key | | |
| audio_stop | Stop Media key | | |
| audio_vol_down | Volume Down key | | |
| audio_vol_up | Volume Up key | | |

Keypad Keys

Numeric keypad keys for numeric input.

| Key Name | Description | Press | Release |
|---|---|---|---|
| numpad_* | Multiply key | | |
| numpad_+ | Add key | | |
| numpad_- | Subtract key | | |
| numpad_. | Decimal key | | |
| numpad_/ | Divide key | | |
| numpad_0 - numpad_9 | Numeric keypad 0-9 keys | ✅ (except 5) | ✅ (except 5) |
| numpad_5 | Numeric keypad 5 key | | |

Special Purpose Keys

Additional keys for navigation and special functions.

| Key Name | Description | Press | Release |
|---|---|---|---|
| end | END key | | |
| home | HOME key | | |
| insert | INS key | | |
| pagedown | PAGE DOWN key | | |
| pageup | PAGE UP key | | |
| printscreen | PRINT SCREEN key | | |
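
When key names come from configuration or user input, it can help to check them against this reference before sending them to the agent. The following is a sketch of such a lookup, with the key names transcribed from the tables above and the ranges (0-9, a-z, f1-f24, numpad_0-numpad_9) expanded. `KEYS_BY_CATEGORY` and `key_category` are hypothetical names, not part of AskUI. The lookup only says whether a name is listed. It does not encode press/release status, so rows with caveats (F10, numpad_5) still need a look at the tables. The left ALT and left SHIFT rows are omitted because their key names are not given above.

```python
import string

# Key names as listed in the reference tables, grouped by category.
KEYS_BY_CATEGORY = {
    "modifier": {
        "alt", "capslock", "command", "control", "left_control",
        "numpad_lock", "right_alt", "right_control", "right_shift", "shift",
    },
    "alphanumeric": set(string.digits) | set(string.ascii_lowercase),
    "control": {"backspace", "delete", "enter", "escape", "space", "tab"},
    "arrow": {"down", "left", "right", "up"},
    "function": {f"f{n}" for n in range(1, 25)},
    "media": {
        "audio_mute", "audio_next", "audio_play", "audio_prev",
        "audio_stop", "audio_vol_down", "audio_vol_up",
    },
    "keypad": {f"numpad_{c}" for c in "*+-./0123456789"},
    "special": {"end", "home", "insert", "pagedown", "pageup", "printscreen"},
}


def key_category(name):
    """Return the category a key name is listed under, or None if it is not listed."""
    for category, keys in KEYS_BY_CATEGORY.items():
        if name in keys:
            return category
    return None


if __name__ == "__main__":
    for candidate in ("enter", "f24", "numpad_+", "f25"):
        print(candidate, "->", key_category(candidate))
```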

Unsupported Key Categories

The following key categories are currently unsupported:

  • #️⃣ Symbol Keys: Various symbol and punctuation keys
  • 🈯️ International Keys: IME and language-specific keys
  • 🖱️ Mouse Keys: Mouse button emulation keys
  • Various browser control keys
  • Application launch keys
  • System keys (Sleep, Help, etc.)