docs/user-testing.md (10 additions, 22 deletions)
@@ -41,13 +41,10 @@ For manual testing with real people:
 
 ### 2. AI Agent Test Script
 For automated testing with AI using **real browsers**:
-- AI drives browser like a human (built-in IDE browser, Chrome, etc.)
-- **Discovers UI without source code access** - figures out what to click by looking at the page
-- Interacts with real UI by clicking, typing, scrolling
-- Screenshot capture at checkpoints from browser viewport
-- Persona-based behavioral variation
+- Drives browser like a human, discovers UI by looking (no source code access)
+- Screenshots at checkpoints, persona-based behavior
 
-**Why not Playwright/Puppeteer?** Those frameworks require pre-existing knowledge of selectors (`page.click('#submit')`). AI agents discover the UI the same way users do - by looking at what's visible - validating that your UI is actually discoverable and understandable.
+**Why not Playwright/Puppeteer?** Those frameworks require pre-existing knowledge of selectors (`page.click('#submit')`). AI agents discover the UI the same way users do - validating that your UI is actually discoverable.
 
 Both scripts test the **same journey** with **identical success criteria**, allowing you to:
 - Compare human vs. AI agent behavior
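An aside on the selector-coupling point this hunk keeps: the contrast can be sketched in a few lines of Python. This is purely illustrative (it is not the project's agent code, and the element structure and names are hypothetical) — a scripted test must know a selector like `#submit` in advance, while an agent-style lookup works from visible text, the way a user does.

```python
# Illustrative sketch only: selector-based lookup (Playwright-style scripting)
# vs. discovery by visible text (agent/user-style). All names are hypothetical.

def find_by_selector(elements, selector):
    """Scripted tests need the element id ('#submit') known in advance."""
    return next((e for e in elements if e.get("id") == selector.lstrip("#")), None)

def find_by_visible_text(elements, label):
    """An agent looks at what is rendered, the way a user does."""
    return next(
        (e for e in elements if label.lower() in e.get("text", "").lower()), None
    )

page = [
    {"id": "btn-7f3a", "text": "Place order"},   # id is obscure; text is clear
    {"id": "promo", "text": "Apply coupon"},
]

# The scripted approach finds nothing if the id is not what it expects:
assert find_by_selector(page, "#submit") is None
# Discovery by visible text finds the button the way a user would:
assert find_by_visible_text(page, "place order")["id"] == "btn-7f3a"
```

The same mismatch is why a passing Playwright suite does not prove the UI is discoverable: the script encodes knowledge a first-time user does not have.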
@@ -101,7 +98,7 @@ This outputs:
 
 ### 3. Run Human Tests
 
-Recruit 3-5 participants matching your persona:
+Recruit participants matching your persona:
 
 1. **Setup**: Screen recording software, test environment
 2. **Brief**: Explain think-aloud protocol (say what you're thinking)
@@ -115,23 +112,14 @@ Recruit 3-5 participants matching your persona:
 /run-test checkout-journey-agent.md
 ```
 
-AI agents drive a real browser like a human would:
-- Navigate to your application in a browser
-- **Discover what's on the page without privileged access to source code**
-- Click, type, scroll through actual UI elements based on what they see
-- Execute the journey with persona-based behavior
-- Capture screenshots from browser viewport at checkpoints and failures
-- Generate feedback on difficulty and expectations
-- Report blockers and completed steps
-
-**Important**: Agents have no pre-existing knowledge of your UI - they figure out what to click the same way a real user does. This validates that your interface is actually discoverable, not just technically functional.
+Agents discover what to click by looking (no source code access), execute the journey with persona-based behavior, and capture screenshots at checkpoints/failures. This validates UI discoverability, not just technical functionality.
 
 ### 5. Compare & Iterate
 
 - **Review human videos** for genuine confusion and unexpected behavior
 - **Review agent reports** for systematic failures and patterns
 - **Fix the highest-impact issues** (severity × frequency)
-- **Test again** with a new batch of 3-5 users
+- **Test again**
 
 ## Best Practices
 
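An aside on the "severity × frequency" rule this hunk leaves unchanged: the prioritization it describes is a straightforward product score. A minimal sketch, assuming a 1-3 severity scale and per-participant frequency counts (the issue names, scale, and fields below are hypothetical, not from the doc):

```python
# Hypothetical sketch of "Fix the highest-impact issues (severity × frequency)".
# Severity: 1 (cosmetic) .. 3 (blocker); frequency: how many participants hit it.

issues = [
    {"name": "coupon field hidden",  "severity": 2, "frequency": 4},
    {"name": "checkout button 404s", "severity": 3, "frequency": 2},
    {"name": "typo in footer",       "severity": 1, "frequency": 5},
]

# Impact = severity * frequency; highest-impact issues come first.
ranked = sorted(issues, key=lambda i: i["severity"] * i["frequency"], reverse=True)

assert [i["name"] for i in ranked] == [
    "coupon field hidden",   # 2 * 4 = 8
    "checkout button 404s",  # 3 * 2 = 6
    "typo in footer",        # 1 * 5 = 5
]
```

Note that a widely hit moderate issue can outrank a rare blocker, which is the point of multiplying rather than sorting by severity alone.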
@@ -233,10 +221,10 @@ This cadence beats testing with 20 users once.
 
 ### Combining Human + Agent Tests
 
-1. **Initial discovery**: 3-5 human tests to find major issues
+1. **Initial discovery**: Human tests to find major issues
 2. **Verify fixes**: AI agent tests after each fix
 3. **Regression testing**: AI agents test all journeys before releases
-4. **Validation**: 3-5 human tests to confirm fixes landed
+4. **Validation**: Human tests to confirm fixes landed
 
 ## Resources
 
@@ -248,9 +236,9 @@ This cadence beats testing with 20 users once.
 
 1. Create your first user journey with `/discover`
 2. Generate test scripts with `/user-test`
-3. Run 3-5 human tests
+3. Run human tests
 4. Fix the highest-impact issues
 5. Validate fixes with AI agent tests
 6. Iterate
 
-Remember: **Small, frequent testing beats large, infrequent testing.** Start today with just 3 users.
+Remember: **Small, frequent testing beats large, infrequent testing.** Start today.