Shelley E2E Tests with Playwright
This directory contains end-to-end tests for the Shelley web interface using Playwright.
Features
- Mobile-focused testing: Primary focus on mobile viewports (iPhone, Pixel)
- Predictable LLM: Uses the predictable LLM model for deterministic testing
- Screenshot capture: Automatic screenshot generation for visual inspection
- Tool testing: Tests bash tool, think tool, and patch tool interactions
- Multi-browser support: Tests across Chrome, Firefox, Safari, and mobile variants
Running Tests
Install Dependencies
cd ui/
pnpm install
pnpm exec playwright install
Run All Tests
pnpm run test:e2e
Run Specific Tests
# Run only mobile Chrome tests
pnpm run test:e2e -- --project="Mobile Chrome"
# Run specific test
pnpm run test:e2e -- --grep "should load the main page"
# Run with headed browser (visible)
pnpm run test:e2e:headed
# Open UI mode
pnpm run test:e2e:ui
Debug Failed Tests
# View HTML report
pnpm exec playwright show-report
# View screenshots
ls -la test-results/*/
Test Structure
Basic Interactions (basic-interactions.spec.ts)
- Page loading
- Starting conversations
- Tool usage
- Conversation history
- Responsive design
Mobile-Focused Tests (mobile-focused.spec.ts)
- Mobile layout verification
- Touch interactions
- Text input on mobile
- Scrolling behavior
- Mobile-specific UI patterns
Predictable Behavior (predictable-behavior.spec.ts)
- Deterministic LLM responses
- Tool interaction patterns
- Error handling
- Multi-turn conversations
Screenshot Inspection
Screenshots are automatically saved in test-results/ directory:
- Failed tests: Screenshots at failure point
- All tests: Screenshots at key interaction points
- Mobile-optimized: Focus on mobile viewport sizes
Predictable LLM
The tests use Shelley's predictable LLM model which provides:
- Consistent responses for the same inputs
- Deterministic tool usage
- Predictable conversation flows
- Special test commands (
echo,error,tool)
Configuration
Playwright configuration is in playwright.config.ts:
- Auto-starts Shelley server with predictable model
- Configures mobile-first viewports
- Sets up screenshot and video capture
- Handles test timeouts and retries
Tips
- Mobile First: Most tests are designed for mobile viewports
- Screenshots: Check
e2e/screenshots/for visual debugging - Deterministic: All tests should be repeatable and deterministic
- Fast Feedback: Tests are designed to fail fast with meaningful errors