# Shelley E2E Tests with Playwright

This directory contains end-to-end tests for the Shelley web interface using Playwright.

## Features

- **Mobile-focused testing**: Primary focus on mobile viewports (iPhone, Pixel)
- **Predictable LLM**: Uses the predictable LLM model for deterministic testing
- **Screenshot capture**: Automatic screenshot generation for visual inspection
- **Tool testing**: Tests bash tool, think tool, and patch tool interactions
- **Multi-browser support**: Tests across Chrome, Firefox, Safari, and mobile variants

## Running Tests

### Install Dependencies
```bash
cd ui/
pnpm install
pnpm exec playwright install
```

### Run All Tests
```bash
pnpm run test:e2e
```

### Run Specific Tests
```bash
# Run only mobile Chrome tests
pnpm run test:e2e -- --project="Mobile Chrome"

# Run specific test
pnpm run test:e2e -- --grep "should load the main page"

# Run with headed browser (visible)
pnpm run test:e2e:headed

# Open UI mode
pnpm run test:e2e:ui
```

### Debug Failed Tests
```bash
# View HTML report
pnpm exec playwright show-report

# View screenshots
ls -la test-results/*/
```

## Test Structure

### Basic Interactions (`basic-interactions.spec.ts`)
- Page loading
- Starting conversations
- Tool usage
- Conversation history
- Responsive design

### Mobile-Focused Tests (`mobile-focused.spec.ts`)
- Mobile layout verification
- Touch interactions
- Text input on mobile
- Scrolling behavior
- Mobile-specific UI patterns

### Predictable Behavior (`predictable-behavior.spec.ts`)
- Deterministic LLM responses
- Tool interaction patterns
- Error handling
- Multi-turn conversations

## Screenshot Inspection

Screenshots are saved automatically in the `test-results/` directory:
- Failed tests: Screenshot taken at the point of failure
- All tests: Screenshots taken at key interaction points
- Mobile-optimized: Captured primarily at mobile viewport sizes
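
As a rough illustration of capturing a screenshot at a key interaction point, the sketch below wraps Playwright's `page.screenshot()` and `testInfo.outputPath()` in a small helper. The helper names `captureStep` and `screenshotFileName` are hypothetical, and the suite's real tests may capture screenshots differently. The two interfaces are minimal stand-ins that Playwright's `Page` and `TestInfo` objects should satisfy, so the snippet compiles without Playwright installed.

```typescript
// Minimal structural types (assumptions for this sketch): Playwright's
// Page and TestInfo expose methods with these shapes.
interface ScreenshotTarget {
  screenshot(options: { path: string; fullPage?: boolean }): Promise<unknown>;
}
interface OutputLocator {
  outputPath(...segments: string[]): string;
}

// Turn a free-form step label into a filesystem-safe PNG file name.
function screenshotFileName(step: string): string {
  const slug = step
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into "-"
    .replace(/^-+|-+$/g, '');    // trim leading and trailing dashes
  return `${slug || 'step'}.png`;
}

// Save a full-page screenshot into the current test's output directory
// and return the path that was written.
async function captureStep(
  page: ScreenshotTarget,
  testInfo: OutputLocator,
  step: string,
): Promise<string> {
  const path = testInfo.outputPath(screenshotFileName(step));
  await page.screenshot({ path, fullPage: true });
  return path;
}
```

Inside a spec, a call might look like `await captureStep(page, testInfo, 'main page loaded')`, where `page` and `testInfo` are the arguments Playwright passes to the test function.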

## Predictable LLM

The tests use Shelley's predictable LLM model, which provides:
- Consistent responses for the same inputs
- Deterministic tool usage
- Predictable conversation flows
- Special test commands (`echo`, `error`, `tool`)

## Configuration

Playwright configuration is in `playwright.config.ts`:
- Auto-starts Shelley server with predictable model
- Configures mobile-first viewports
- Sets up screenshot and video capture
- Handles test timeouts and retries
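
For orientation, a configuration covering the four points above might look roughly like the sketch below. It is not the project's actual `playwright.config.ts`: the port, the server start command, the `./e2e` test directory, the timeout, and the retry counts are all placeholder assumptions. The `defineConfig`, `devices`, `webServer`, `screenshot`, and `video` options are standard Playwright settings.

```typescript
// Illustrative sketch only; values marked "assumption" are not taken from the project.
import { defineConfig, devices } from '@playwright/test';

const PORT = 9000; // assumption: use whatever port the Shelley server listens on
// assumption: placeholder for the command that starts Shelley with the predictable model
const START_COMMAND = 'echo "replace with the real Shelley start command"';

export default defineConfig({
  testDir: './e2e',                 // assumption: location of the spec files
  timeout: 30_000,                  // per-test timeout in milliseconds (assumption)
  retries: process.env.CI ? 2 : 0,  // retry only on CI (assumption)
  use: {
    baseURL: `http://localhost:${PORT}`,
    screenshot: 'only-on-failure',  // keep a screenshot when a test fails
    video: 'retain-on-failure',     // keep video only for failed tests
  },
  projects: [
    // Mobile projects listed first to reflect the mobile-first focus.
    { name: 'Mobile Chrome', use: { ...devices['Pixel 5'] } },
    { name: 'Mobile Safari', use: { ...devices['iPhone 12'] } },
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
  webServer: {
    command: START_COMMAND,
    url: `http://localhost:${PORT}`,
    reuseExistingServer: !process.env.CI,
  },
});
```

With a `webServer` entry, Playwright starts the command before the tests run and waits for the URL to respond, so the server does not need to be launched by hand.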

## Tips

1. **Mobile First**: Most tests are designed for mobile viewports
2. **Screenshots**: Check `e2e/screenshots/` for visual debugging
3. **Deterministic**: All tests should be repeatable and deterministic
4. **Fast Feedback**: Tests are designed to fail fast with meaningful errors