
Overview

AI steps are natural-language instructions that tell Docket’s intelligent agent what to do or verify in your application.
Unlike cached steps, which replay recorded coordinates, AI steps use Docket’s vision and reasoning models to dynamically interpret your instructions and act accordingly.
There are two main types of AI steps in Docket:
  1. AI Step – perform an action
  2. Assert – verify a condition
You can mix and match AI steps and cached steps to build powerful, adaptive, and self-healing tests.
Getting Started: New to Docket? Start with Creating a Test to learn the basics.

AI Step

An AI Step tells Docket to perform an action in your application. You can write these in two styles depending on how explicit or goal-oriented you want to be.

Explicit (Step-by-Step)

This style tells Docket exactly what to do, one action at a time. It’s ideal for debugging or when you need strict determinism. Example:
  • Click on the “Login” button.
  • Click on the “Email” input field.
  • Type “nishant@docketqa.com”.
  • Click on the “Password” input field.
  • Type “password123”.
  • Click on the “Sign In” button.
Pros:
  • Predictable and repeatable
Cons:
  • Cached steps are faster and more accurate for this kind of fixed sequence
  • Can be verbose for longer flows
  • May need updating after UI redesigns

Objective-Based

This style focuses on what you want to achieve rather than how to achieve it.
It’s useful for dynamic interfaces, forms, or multi-step workflows that may change layout.
Example:
  • Login using “nishant@docketqa.com” and “password123”.
Pros:
  • Short and human-readable
  • More resilient to UI changes
  • Uses full page context and reasoning
Cons:
  • Less deterministic when multiple valid paths exist
  • Can be harder to debug exact click paths

Special Capabilities

Certain actions — such as sending or checking emails and uploading files — can only be performed using AI steps. These actions require contextual reasoning and browser-level access that cached steps don’t support. Examples:
  • Upload the file “contract.pdf” to the upload input on the page.
  • Check the inbox for a verification code and enter it into the confirmation field.
Docket’s agent will handle the entire sequence intelligently, linking file uploads and email variables (like @registration_email) to the current test run.

Assert Step

An Assert step tells Docket to verify that a condition holds true at the current state of your application.
When you write an assertion, Docket takes a screenshot, analyzes the page using its vision model, and uses prior test history to understand context.
Examples:
  • Assert that the dashboard shows at least 3 recent orders.
  • Assert that a success message appears saying “Account created successfully.”
If the assertion fails, Docket will mark the test as failed, capture a screenshot and reasoning log, and continue or stop depending on your test configuration.

Writing Good Assertions

Well-written assertions make your tests reliable and meaningful. Poorly defined ones can make results ambiguous.
Follow these guidelines for clarity and precision:
| Problem | Bad Example | Improved Example |
| --- | --- | --- |
| Missing assert keyword | Look for search results on the main body | Assert that there are search results displayed on the main body |
| Ambiguous goal | Assert that results are correct | Assert that search results are sorted by date |
| Unverifiable statement | Assert that the page looks good | Assert that no text in the table overflows its cell |
| Too precise for minor details | Assert that the divider color is #ff5733 | Assert that the divider appears red |
| Time-based check | Assert that the video plays a 10-second ad | Avoid timing-based assertions |
Note
Writing words like verify, check, ensure, assert, or validate in an AI step may also trigger an assertion action, but we recommend using explicit Assert steps for consistency and accuracy.
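The guidelines above are mechanical enough to lint for. Here is a hypothetical helper (not a Docket feature) that flags assertion strings violating them; the warning labels and regexes are our own illustrative choices.

```python
import re

# Keywords that signal an explicit assertion (see the Note above).
ASSERT_KEYWORDS = ("assert", "verify", "check", "ensure", "validate")

def lint_assertion(step: str) -> list[str]:
    """Return warnings for a natural-language assertion, per the table above."""
    warnings = []
    lowered = step.lower()
    if not lowered.startswith(ASSERT_KEYWORDS):
        warnings.append("missing assert keyword")
    if re.search(r"#[0-9a-f]{6}\b", lowered):
        warnings.append("too precise: exact hex color")
    if re.search(r"\b\d+[- ]second\b", lowered):
        warnings.append("timing-based check")
    if re.search(r"\blooks (good|right|fine)\b", lowered):
        warnings.append("unverifiable statement")
    return warnings

print(lint_assertion("Look for search results on the main body"))
# → ['missing assert keyword']
print(lint_assertion("Assert that the divider appears red"))
# → []
```

A check like this could run in a review step before tests are saved, catching the most common ambiguity problems early.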

Early Termination

Docket can mark a test as failed without completing the entire flow if, during self-healing or AI steps, it detects usability issues that would frustrate a real user. The agent will fail the test early if it encounters any of the following:
  • The feature is not easily findable and requires navigating through multiple menus or sections
  • The interface uses non-standard or unintuitive patterns
  • The task takes significantly more steps or retries than expected
  • There are visible errors, missing features, or unresponsive UI elements
  • The interface is confusing due to unclear labeling, layout, or navigation
  • The tester is unsure how to proceed because the next step isn’t obvious or discoverable
Even without an explicit assertion, encountering any of the above will cause the test to fail unless you specify otherwise. You can craft your prompt accordingly if you’d like to override this behavior.
Tip: This behavior ensures Docket catches not just broken functionality, but also poor user experiences that could lead to user frustration or abandonment.

Testing AI Steps in the Remote Browser

When working with AI steps during test creation, you can test them in real-time using the remote browser. Simply write an AI step and click the play button next to it while the remote browser is running. Docket will execute that step immediately in the browser, allowing you to:
  • Verify the step works as expected
  • See the agent’s reasoning in real-time
  • Iterate quickly on your test instructions
  • Observe how the agent interprets your natural language commands
This live testing capability helps you refine your AI steps before running the full test, ensuring your instructions are clear and effective.

Combining Cached and AI Steps

You can combine cached steps and AI steps to achieve a balance of precision and flexibility.
| Use Case | Recommended Step |
| --- | --- |
| Deterministic clicks or scrolls | Cached Step |
| Dynamic workflows or variable layouts | AI Step |
| Page validations or checks | Assert Step |
| File uploads or email actions | AI Step |
Example Combined Flow:
  • [Cached Step] Click “Sign In”
  • [AI Step] Login using “nishant@docketqa.com” and “password123”
  • [Assert Step] Assert that the dashboard loads successfully
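If it helps to think about mixed flows programmatically, the combined flow above can be modeled as a simple list of typed steps. This is purely illustrative — Docket tests are authored in its UI, and this step-dictionary shape is our own invention, not a Docket API.

```python
# Hypothetical in-code model of the combined flow above.
flow = [
    {"type": "cached", "instruction": 'Click "Sign In"'},
    {"type": "ai", "instruction": 'Login using "nishant@docketqa.com" and "password123"'},
    {"type": "assert", "instruction": "Assert that the dashboard loads successfully"},
]

def describe(step: dict) -> str:
    """Render a step with the same labels used in the example flow."""
    labels = {"cached": "[Cached Step]", "ai": "[AI Step]", "assert": "[Assert Step]"}
    return f"{labels[step['type']]} {step['instruction']}"

for step in flow:
    print(describe(step))
```

The ordering mirrors the recommendation in the table: deterministic entry points as cached steps, flexible multi-action work as AI steps, and a final assert to pin down the expected end state.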

Important: Page State After AI Steps

When using AI steps followed by cached steps, it’s critical to consider where your AI step will leave the page once it completes. Cached steps rely on specific screen coordinates, so if an AI step ends on an unexpected page state, scroll position, or modal, subsequent cached actions may fail or become flaky.
Best Practices:
  • Be explicit about the final state in your AI step (e.g., “Login and ensure you’re on the dashboard”)
  • Verify the page has fully loaded before transitioning to cached steps
  • If using modals or overlays, explicitly close them in the AI step if cached steps follow
  • Consider adding a brief assertion to confirm the expected page state before cached steps execute
Example:
❌ Bad: "Login" (leaves page state ambiguous)
✅ Good: "Login using credentials and wait for the dashboard to fully load"
This ensures cached steps start from a predictable, consistent state and reduces test flakiness.
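The last best practice — confirming page state before cached steps run — can be checked for mechanically. The sketch below is a hypothetical pre-flight helper, not a Docket feature, reusing the made-up step-dictionary shape to flag AI steps that hand off directly to a cached step with no intervening assert.

```python
# Hypothetical check: find AI steps immediately followed by a cached step,
# i.e. handoffs where no assert confirms the page state in between.
def handoff_warnings(flow: list[dict]) -> list[int]:
    """Return indices of AI steps followed directly by a cached step."""
    risky = []
    for i in range(len(flow) - 1):
        if flow[i]["type"] == "ai" and flow[i + 1]["type"] == "cached":
            risky.append(i)
    return risky

flow = [
    {"type": "ai", "instruction": "Login"},
    {"type": "cached", "instruction": 'Click "Reports"'},
]
print(handoff_warnings(flow))  # → [0]
```

Inserting an assert step (e.g. "Assert that the dashboard has loaded") between the two would clear the warning and give cached steps a predictable starting point.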
Learn more: See Cached Steps to understand how coordinate-based actions work and when to use them.