1. Misspellings of Words

Problem: The OCR model sometimes misreads characters, especially in certain fonts or noisy images. This can result in words being misclassified or misspelled, which then causes the automation to fail when it searches for exact matches.

Example:

Expected (Truth)

Hallo

OCR Output (Prediction)

HaII0
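
To make the failure mode concrete, here is a minimal sketch in plain TypeScript that does not call AskUI at all. It compares the expected text with the OCR output, first exactly and then after mapping commonly confused characters onto one representative. The `CONFUSABLE` map and the `normalizeConfusables` helper are hypothetical names introduced for illustration only; they are not part of the AskUI API.

```typescript
// Hypothetical helper for illustration; not part of the AskUI API.
// Maps characters that OCR commonly confuses onto one representative.
const CONFUSABLE: Record<string, string> = {
  I: "l", // capital i read instead of lowercase L
  "1": "l",
  "0": "o", // zero read instead of the letter o
  O: "o",
};

function normalizeConfusables(text: string): string {
  return text
    .split("")
    .map((ch) => CONFUSABLE[ch] ?? ch.toLowerCase())
    .join("");
}

const expected: string = "Hallo";
const predicted: string = "HaII0";

// An exact comparison fails, which is why an exact text search finds nothing.
const exactMatch = expected === predicted;

// After normalization both strings collapse to the same form.
const looseMatch =
  normalizeConfusables(expected) === normalizeConfusables(predicted);

console.log({ exactMatch, looseMatch });
```

A comparison like this can help confirm that a failing step is caused by a misread character rather than by a missing element. It does not replace re-teaching the model.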

Solution 1: Re-Teach the OCR Model Using AskUI’s OCR Re-Teaching App

You can directly correct OCR predictions and improve model accuracy by training your workspace-specific model.

Steps:

Start the AskUI shell:
```bash
askui-shell
```
Launch the OCR Teaching App:
```bash
AskUI-StartOCRTeaching
```
Upload a screenshot containing the misclassified word (e.g., “Hallo”).
Switch to Character-Level Mode for precise corrections.
Select the wrongly detected word (HaII0) and replace it with the correct label: Hallo.
Click “Copy Model” to copy the newly trained model ID.
In your automation code (e.g., askui-helper.ts), update the exec() call to use the new model:

Example: Re-Teach the OCR Model

This example uses character-level OCR with AskUI Vision Agent.

await aui.click().text("Hallo").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["char_level"]
}])
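
Because the same model descriptor is passed to `exec()` in every example of this guide, it may help to build it in one place. The following is a sketch under stated assumptions: the field names and values are copied from the example above, while the `OcrTag` and `OcrModel` types, the `buildOcrModel` function, and the ID `my-workspace-id` are hypothetical and not part of the AskUI API.

```typescript
// Hypothetical type and helper for illustration; only the field names and
// values are taken from the example in this guide.
type OcrTag = "char_level" | "word_level";

interface OcrModel {
  task: string;
  architecture: string;
  version: string;
  interface: string;
  useCase: string;
  tags: OcrTag[];
}

function buildOcrModel(useCase: string, tag: OcrTag): OcrModel[] {
  // Reject an empty value or a placeholder such as "<your-workspace-id>".
  if (useCase.length === 0 || useCase.charAt(0) === "<") {
    throw new Error("Replace the placeholder with your workspace or model ID.");
  }
  return [
    {
      task: "e2e_ocr",
      architecture: "easy_ocr",
      version: "1",
      interface: "online_learning",
      useCase: useCase,
      tags: [tag],
    },
  ];
}

// "my-workspace-id" is a made-up value standing in for the copied model ID.
const charLevelModel = buildOcrModel("my-workspace-id", "char_level");
console.log(JSON.stringify(charLevelModel, null, 2));
```

The returned array could then be passed to `exec()` in place of the inline literal, assuming `exec()` accepts it in the same way as in the example above.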

2. Text Detection Issues

1. Icon Text Merging

Problem: Sometimes the Text Detector/annotation tool merges an icon and the adjacent text into one element, even though they look separate on screen.

Example: Say you want to click just the name field (“Alice Johnson”) or just the position field in an interface, but the OCR detects the icon and the adjacent text as one element:

✅ Expected Behavior

🖼️ Icon and Text are detected separately:

🧑 ✅ Name ✅ 🤖 ✅ Role ✅

👍 Works with click().text("Name") or click().text("Role")

❌ Actual Issue

🖼️ Icon and text are detected together:

🧑 Name ❌ 🤖 ✅ Role ✅

👎 Can’t find click().text("Name").

Solution 1: Re-Teach the OCR Model to ignore the icon.

Start the OCR Re-Teaching App.
Teach the text recognition to ignore the icon.
Use the custom model.

Example: Re-Teach the OCR Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["word_level"]
}])

Solution 2: Use Default Word-Level Detection (Best Practice)

Example: Default Word-Level Model

This example uses the default workspace ID (all zeros) with word-level detection.

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "00000000_0000_0000_0000_000000000000",
    "tags": ["word_level"]
}])

Solution 3: Use Custom Model Word-Level Detection

Example: Custom Word-Level Model

This example uses your own workspace ID with word-level detection.

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["word_level"]
}])
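
Solutions 2 and 3 differ only in the `useCase` value. A small helper can make that choice explicit. This is a sketch that assumes the all-zero ID from Solution 2 refers to the default model; `DEFAULT_USE_CASE`, `resolveUseCase`, and `my-workspace-id` are hypothetical names, not part of the AskUI API.

```typescript
// Assumption: the all-zero ID from Solution 2 refers to the default model.
const DEFAULT_USE_CASE = "00000000_0000_0000_0000_000000000000";

// Hypothetical helper: fall back to the default ID when no custom
// workspace ID is configured.
function resolveUseCase(customWorkspaceId?: string): string {
  const trimmed = (customWorkspaceId || "").trim();
  return trimmed.length > 0 ? trimmed : DEFAULT_USE_CASE;
}

console.log(resolveUseCase()); // logs the default ID
console.log(resolveUseCase("my-workspace-id")); // logs the custom ID
```

The resolved value would go into the `useCase` field of the model descriptor shown in the solutions above.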

2. Merged Texts

Problem: Sometimes the Text Detector/annotation tool merges two separate texts into one, even though they are clearly split on screen.

Example: Say you want to click just the name field (“Alice Johnson”) or just the position field in an interface, but the OCR detects them as one long string:

✅ Expected Behavior

🖼️ Text fields detected separately:

Alice Johnson ✅ Software Engineer ✅

👍 Works with text("Alice Johnson") or text("Software Engineer")

❌ Actual Issue

🖼️ Texts merged into one block:

Alice Johnson Software Engineer❌

👎 Can’t find either one on its own.
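
The difference between an exact match and a substring match shows why the search fails. The sketch below models detections as plain strings and does not call AskUI; `findExact` and `findContaining` are hypothetical helpers introduced only for illustration.

```typescript
// Toy model of OCR detections as plain strings; no AskUI calls involved.
function findExact(detections: string[], query: string): string[] {
  return detections.filter((text) => text === query);
}

function findContaining(detections: string[], query: string): string[] {
  return detections.filter((text) => text.indexOf(query) !== -1);
}

const separateDetections = ["Alice Johnson", "Software Engineer"];
const mergedDetections = ["Alice Johnson Software Engineer"];

// Separate detections: the exact search finds the name.
console.log(findExact(separateDetections, "Alice Johnson"));

// Merged detection: the exact search finds nothing.
console.log(findExact(mergedDetections, "Alice Johnson"));

// A substring search matches, but only the whole merged block.
console.log(findContaining(mergedDetections, "Alice Johnson"));
```

Note that the substring match returns the full merged block, so an action on it would target the whole block rather than the name alone. That is one reason the solutions below start with word-level detection.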

Solution 1: Use Default Word-Level Detection (Best Practice)

Example: Default Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "00000000_0000_0000_0000_000000000000",
    "tags": ["word_level"]
}])

Solution 2: Use Custom Model Word-Level Detection

Example: Custom Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["word_level"]
}])

Solution 3: Use Relative Positioning (Fallback)

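// Offsets are x and y in pixels relative to the matched element; the values here are placeholders.
await aui.moveMouseRelativelyTo(-50, 0).containsText("Name").exec()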

3. Text Separation

Problem: Sometimes the Text Detector/annotation tool separates one text into two, even though it clearly looks like a single text on screen.

Example: Say you want to click the name field (“Alice Johnson”) in an interface, but the OCR detects it as two separate words:

✅ Expected Behavior

🖼️ Words are detected as one sentence:

Alice Johnson ✅

👍 Works with text("Alice Johnson")

❌ Actual Issue

🖼️ Words are detected as separated texts:

Alice❌ Johnson❌

👎 Can’t find text("Alice Johnson").

Solution 1: Use Default Word-Level Detection (Best Practice)

Example: Default Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "00000000_0000_0000_0000_000000000000",
    "tags": ["word_level"]
}])

Solution 2: Use Custom Model Word-Level Detection

Example: Custom Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["word_level"]
}])

4. Vertical Text Merging

Problem: Sometimes the Text Detector/annotation tool merges two lines into one text, even though they clearly appear as two lines on screen.

Example: Say you want to click just the name field (“Alice Johnson”) or just the position field in an interface, but the OCR detects both lines as one text:

✅ Expected Behavior

🖼️ Texts are detected as two lines:

Alice Johnson ✅

👍 Works with text("Alice Johnson")

❌ Actual Issue

🖼️ Texts are detected as one text:

<no words recognized>❌

👎 Can’t find text("Alice Johnson").

Solution 1: Use Default Word-Level Detection (Best Practice)

Example: Default Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "00000000_0000_0000_0000_000000000000",
    "tags": ["word_level"]
}])

Solution 2: Use Custom Model Word-Level Detection

Example: Custom Word-Level Model

await aui.click().text("Alice Johnson").exec([{
    "task": "e2e_ocr",
    "architecture": "easy_ocr",
    "version": "1",
    "interface": "online_learning",
    "useCase": "<your-workspace-id>",
    "tags": ["word_level"]
}])

5. Single Character not Detected

Problem: Sometimes the Text Detector/annotation tool does not detect single characters, even though they are clearly visible on screen.

Example: Say you want to click just the character “2”, but the OCR does not detect it:

✅ Expected Behavior

🖼️ Single chars are detected:

1 ✅ 2 ✅ 3 ✅

👍 Works with text("2")

❌ Actual Issue

🖼️ Char 2 is not detected:

1 ✅ 2 ❌ 3 ✅

👎 Can’t find text("2").

Solution 1: AI Element

Follow tutorial here.

6. Text not Detected

Problem: Sometimes, for no apparent reason, the Text Detector/annotation tool does not detect a text, even though you can see it clearly on screen.

Example: Say you want to click just the name field (“Alice Johnson”), but the OCR does not detect the text at all:

✅ Expected Behavior

🖼️ Text was detected:

Alice Johnson ✅

👍 Works with text("Alice Johnson")

❌ Actual Issue

🖼️ Text wasn’t detected:

Alice Johnson ❌

👎 Can’t find text("Alice Johnson").

Solution 1: AI Element

Select the text as an AI Element.
