AGENTS.md

# Soft Serve Development Documentation

## Important Instructions

- Use the gopls MCP tooling as much as appropriate. It provides dense, highly-relevant symbolic information. Before making any edit, use the workspace tool, check the relevant APIs and references if necessary, and _then_ proceed with edits.

## Code Style Guidelines

### Formatting

- **Go files**: Use tabs for indentation
- **Other files**: Use 2 spaces for indentation
- Use LF line endings
- Insert final newline
- Trim trailing whitespace
- UTF-8 character encoding

### Go Code Style

- Format with `gofumpt` and `goimports` (configured in `.golangci.yml`)
- Use standard Go naming conventions (PascalCase for exported, camelCase for unexported)
- Wrap exported types with helper methods rather than exposing underlying library types directly
- Implement standard interfaces where appropriate (e.g., `sort.Interface`)

### Comments

- Exported functions, types, and methods must have godoc comments
- Comments should end with proper punctuation (enforced by `godot` linter)
- Start comments with the name of the item being documented

### Error Handling

- Wrap errors with context using `wrapcheck` patterns
- Check all error returns (enforced by various linters)
- Close resources properly (enforced by `bodyclose`, `sqlclosecheck`, `rowserrcheck`)

## How to Write Unit Tests

### Tooling

- Unit tests: Go's built-in `testing` package with `matryer/is` for assertions
- Integration tests: `testscript` framework (see `testscript/` directory)

### Running Tests

```bash
# Run all tests
go test ./...

# Run tests for specific package
go test ./pkg/backend

# Run integration tests
cd testscript && go test -v

# Run with PostgreSQL
SOFT_SERVE_DB_DRIVER=postgres \
SOFT_SERVE_DB_DATA_SOURCE="postgres://postgres:postgres@localhost/postgres?sslmode=disable" \
go test ./...

# Disable race detection (for slower systems)
SOFT_SERVE_DISABLE_RACE_CHECKS=1 go test ./...
```

### Writing Unit Tests

- Place test files alongside the code being tested (e.g., `tree_test.go` next to `tree.go`)
- Use `_test.go` suffix for test files
- Use `matryer/is` for simple, readable assertions
- Mock databases using transactions that rollback in deferred functions
- Test both SQLite and PostgreSQL code paths for database-related changes
- Use `pkg/test/test.go` helpers for fixtures (e.g., `test.RandomPort()`)

### Writing Integration Tests

- Add `.txtar` files to `testscript/testdata/`
- Each test spins up a complete Soft Serve instance with randomized ports
- Available commands: `soft`, `usoft`, `git`, `ugit`, `ui`, `uui`, `curl`, `mkfile`, `readfile`, `envfile`
- Use `ensureserverrunning`/`ensureservernotrunning` to wait for server state

## Development Commands and Environment

### Build and Development Commands

#### Building

```bash
go build -o soft ./cmd/soft
```

### Environment Variables

All configuration can be set via environment variables with the `SOFT_SERVE_` prefix:
- `SOFT_SERVE_DATA_PATH`: Data directory location (default: `data`)
- `SOFT_SERVE_CONFIG_LOCATION`: Custom config file path
- `SOFT_SERVE_INITIAL_ADMIN_KEYS`: SSH keys for initial admin (setup only, newline-separated)
- `SOFT_SERVE_SSH_LISTEN_ADDR`: SSH server address (default: `:23231`)
- `SOFT_SERVE_HTTP_LISTEN_ADDR`: HTTP server address (default: `:23232`)
- `SOFT_SERVE_GIT_LISTEN_ADDR`: Git daemon address (default: `:9418`)
- `SOFT_SERVE_DB_DRIVER`: Database driver (`sqlite` or `postgres`)
- `SOFT_SERVE_DB_DATA_SOURCE`: Database connection string
- `SOFT_SERVE_NO_COLOR`: Disable color output (set to any truthy value)
- `SOFT_SERVE_DEBUG`: Enable debug logging (enables auth callback logging)
- `SOFT_SERVE_VERBOSE`: Enable verbose logging (logs DB queries, SSH envs/keys; requires `SOFT_SERVE_DEBUG` to also be set)

Environment variables always override config file settings via `cfg.ParseEnv()`.

## Advanced Features and Storage

### TUI Architecture

The TUI (`pkg/ui/`) is built with Bubble Tea v2 and organized as:
- `pages/`: Main views (repo browser, file viewer, log viewer, refs viewer)
- `components/`: Reusable UI elements (code viewer, tabs, header, footer, viewport)
- `common/`: Shared state and utilities including `common.Common` struct passed to all components
- `styles/`: Lipgloss style definitions
- `keymap/`: Keyboard shortcuts

The TUI runs in PTY sessions and uses the Bubble Tea middleware (`bm.MiddlewareWithProgramHandler`).

### Git Protocol Version

The git daemon and SSH/HTTP handlers support git protocol v2. The protocol version is negotiated by the client:
- HTTP: Client sends `Git-Protocol` header (e.g., `Git-Protocol: version=2`)
- SSH: Client sets `GIT_PROTOCOL` in session environment
- Git daemon: Client specifies protocol in pktline extra parameters

The server passes the `GIT_PROTOCOL` environment variable through to git processes.

## Architecture and Core Components

### Project Overview

Soft Serve is a self-hostable Git server with SSH/HTTP/Git protocol support, featuring a terminal UI (TUI), Git LFS support, webhooks, and access control. It's built with Go and uses Bubble Tea for the TUI.

### Core Components

1. **Backend** (`pkg/backend/`): Central orchestrator that manages all business logic. The Backend struct ties together the database, store, config, cache, and task manager. It's the primary interface for repository, user, and settings operations. Key responsibilities:
   - Repository CRUD operations with cache management
   - User authentication and authorization
   - Hook execution (implements `hooks.Hooks` interface)
   - LFS object management
   - Webhook event creation

2. **Store** (`pkg/store/`): Database abstraction layer implementing repository, user, collaborator, LFS, access token, webhook, and settings persistence. The actual implementation lives in `pkg/store/database/`. This is a pure data layer with no business logic.

3. **Database** (`pkg/db/`): Low-level database wrapper around `sqlx` supporting both SQLite (default) and PostgreSQL. Provides transaction management and migrations (in `pkg/db/migrate/`). Uses context-aware logging when verbose mode is enabled.

4. **Config** (`pkg/config/`): Configuration management supporting both YAML files and environment variables. All `SOFT_SERVE_*` env vars override config file settings. Config is immutable once loaded and passed through context.

5. **Git Operations** (`git/` package): Git command execution and repository manipulation using `aymanbagabas/git-module`. Handles commits, tags, references, patches, trees, and server-side hooks. This is a separate top-level package, not under `pkg/`.

### Server Protocols

1. **SSH Server** (`pkg/ssh/`): Provides both interactive TUI access and git protocol operations. Uses a middleware chain pattern:
   - `AuthenticationMiddleware`: Validates SSH public key fingerprint matches approved key
   - `ContextMiddleware`: Injects config, backend, store, DB, logger into session context
   - `CommandMiddleware`: Routes non-PTY sessions to CLI commands using Cobra
   - `LoggingMiddleware`: Logs connection, commands, and duration

   CLI commands live in `pkg/ssh/cmd/` as Cobra commands.

2. **HTTP Server** (`pkg/web/`): Serves git HTTP protocol, Git LFS endpoints, and provides basic auth using access tokens. Uses Gorilla Mux for routing with these controllers:
   - `GitController`: Git smart HTTP protocol
   - `HealthController`: Health check endpoint

   It also uses a middleware chain for compression, recovery, logging, and context.

3. **Git Daemon** (`pkg/daemon/`): Native git:// protocol support for anonymous read access. Implements connection pooling with configurable max connections and timeouts.

4. **Stats Server**: Prometheus metrics endpoint for monitoring.

### Access Control

The `pkg/access/` package implements a four-level access system defined as iota constants:
- `NoAccess` (0): Denies all access
- `ReadOnlyAccess` (1): Can read public repos
- `ReadWriteAccess` (2): Full control of repos
- `AdminAccess` (3): Full server control

Anonymous access is controlled by two settings:
- `allow-keyless`: Whether to allow connections without SSH keys (affects HTTP/git:// protocols)
- `anon-access`: Default access level for anonymous users (defaults to read-only)

Access evaluation order:
1. Admin users (via `initial_admin_keys` or user.is_admin flag) → AdminAccess
2. Repository owner (user_id matches repo.user_id) → ReadWriteAccess
3. Explicit collaborators → their assigned level
4. Authenticated users → read-only for public repos
5. Anonymous users → `anon-access` setting (if `allow-keyless` is true)
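
The levels and the evaluation order can be modelled in a few lines. The sketch below is an assumption-laden illustration: the constant names come from the list above, while the `request` struct and the `evaluate` function are hypothetical and cover only the five rules as written, not the full logic in `pkg/access/` or the backend.

```go
package main

import "fmt"

// AccessLevel mirrors the four levels listed above.
type AccessLevel int

const (
	NoAccess AccessLevel = iota
	ReadOnlyAccess
	ReadWriteAccess
	AdminAccess
)

// request is a hypothetical bundle of the facts the evaluation needs.
type request struct {
	authenticated bool
	isAdmin       bool
	isOwner       bool
	collabLevel   AccessLevel // NoAccess when the user is not a collaborator
	repoPublic    bool
	allowKeyless  bool
	anonAccess    AccessLevel
}

// evaluate walks the rules in the documented order and returns the first match.
func evaluate(r request) AccessLevel {
	switch {
	case r.authenticated && r.isAdmin:
		return AdminAccess
	case r.authenticated && r.isOwner:
		return ReadWriteAccess
	case r.authenticated && r.collabLevel > NoAccess:
		return r.collabLevel
	case r.authenticated && r.repoPublic:
		return ReadOnlyAccess
	case !r.authenticated && r.allowKeyless:
		return r.anonAccess
	}
	return NoAccess
}

func main() {
	owner := request{authenticated: true, isOwner: true}
	anon := request{allowKeyless: true, anonAccess: ReadOnlyAccess}
	fmt.Println(evaluate(owner) == ReadWriteAccess, evaluate(anon) == ReadOnlyAccess) // prints "true true"
}
```

Because the constants are ordered by `iota`, levels can be compared with `<` and `>`, which is what the collaborator check relies on.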

## Data Flow and Git Hooks

### Key Data Flow

1. **Context Propagation**: Config, Backend, Store, and DB are passed through `context.Context` using type-safe context helpers (e.g., `config.FromContext(ctx)`). Each package defines its own context key as an unexported type to prevent collisions.

2. **Initialization Sequence** (see `cmd/cmd.go:InitBackendContext`):
   ```
   Config → DB → Store → Backend → Context
   ```

   This happens in `InitBackendContext`, which is used as a `PersistentPreRunE` hook in commands that need backend access.

3. **Request Flow**:
   ```
   SSH/HTTP/Git Protocol → Middleware → Backend → Store → Database
                                   ↓
                                Cache (LRU, 1000 items)
   ```

4. **SSH Command Flow**:
   ```
   SSH Session → CommandMiddleware → Cobra Command → Backend Method → Store → DB
   ```

   Non-PTY sessions are routed to CLI commands. PTY sessions go to the Bubble Tea TUI.

### Git Hooks System

Git hooks use a two-tier system:

1. **Hook Dispatcher** (generated in each repo's `hooks/` directory): Bash script that reads stdin, then calls all executables in the `hooks/<hookname>.d/` directory, passing both stdin and arguments. Exits on the first non-zero exit code.

2. **Soft Serve Hook** (in `hooks/<hookname>.d/soft-serve`): Calls back into the `soft` binary via the `soft hook <hookname>` command, which invokes Backend methods that implement the `hooks.Hooks` interface.

Hook generation (`pkg/hooks/gen.go`):
- Creates hook dispatcher at `hooks/<hookname>`
- Creates `hooks/<hookname>.d/` directory
- Generates `hooks/<hookname>.d/soft-serve` that calls `${SOFT_SERVE_BIN_PATH} hook <hookname>`
- Supports: pre-receive, update, post-receive, post-update

This design allows:
- Multiple hooks per event (run in alphabetical order from `.d/` directory)
- User-defined hooks alongside Soft Serve's built-in hooks in each repo's `.d/` directory

### Hook Environment Variables

Git hooks receive environment variables set in the git daemon, SSH middleware, or HTTP handler:
- `SOFT_SERVE_REPO_NAME`: Repository name (without .git)
- `SOFT_SERVE_REPO_PATH`: Full path to repository
- `SOFT_SERVE_PUBLIC_KEY`: The authorized SSH key (in authorized_keys format)
- `SOFT_SERVE_USERNAME`: Username for authenticated users
- `SOFT_SERVE_BIN_PATH`: Path to soft binary (defaults to "soft" if not set)
- `SOFT_SERVE_HOST`: Hostname from git protocol handshake
- `SOFT_SERVE_LOG_PATH`: Path to hooks log file
- `GIT_DIR`: Standard git variable
- `GIT_PROTOCOL`: Git protocol version and capabilities

The backend reads `SOFT_SERVE_PUBLIC_KEY` and `SOFT_SERVE_USERNAME` to identify users in hooks.

## Database Schema and Migrations

### Database Schema

Key tables (see `pkg/db/models/`):
- `users`: User accounts with admin flags, created_at, updated_at
- `public_keys`: SSH public keys linked to users (one-to-many)
- `repos`: Repository metadata with user_id owner (required)
- `collabs`: Repository collaborators with access levels (many-to-many users↔repos)
- `settings`: Server-wide settings (key-value store)
- `lfs_objects`: Git LFS object metadata linked to repos
- `lfs_locks`: Git LFS file locks with path and user
- `access_tokens`: User access tokens for HTTP auth with optional expiry
- `webhooks`: Repository webhook configurations with secret and events
- `webhook_events`: Webhook event type subscriptions (many-to-many webhooks↔events)
- `webhook_deliveries`: Webhook delivery attempts with request/response logs

Important schema notes:
- `repos.user_id` is `NOT NULL` (legacy repos are assigned to first admin during migration)
- `public_keys` has a unique constraint on the key content (prevents duplicate keys)
- `collabs` has a composite unique key on (user_id, repo_id)
- `lfs_objects` has a composite unique key on (oid, repo_id)
- `lfs_locks` has a composite unique key on (repo_id, path)
- `webhook_events` has a composite unique key on (webhook_id, event)
- Foreign key constraints are enabled and critical for data integrity

### Database Migrations

Migrations live in `pkg/db/migrate/` with separate SQL files for SQLite and PostgreSQL:
- `NNNN_migration_name_sqlite.up.sql` / `.down.sql`
- `NNNN_migration_name_postgres.up.sql` / `.down.sql`
- Migration number prefixes must be unique and sequential
- Go files (`NNNN_migration_name.go`) can handle data migrations

The migration system in `migrate.go` automatically runs pending migrations on startup.

### Transaction Patterns

Always use `db.TransactionContext` for database operations:

```go
err := d.db.TransactionContext(ctx, func(tx *db.Tx) error {
    // All operations use tx, not db
    return d.store.CreateRepo(ctx, tx, name, ...)
})
return db.WrapError(err) // Wraps db errors as proto errors
```

The transaction automatically:
- Commits on `nil` return
- Rolls back on error return
- Handles `sql.ErrTxDone` gracefully

### Common Gotchas

#### SQLite Foreign Keys

SQLite requires `_pragma=foreign_keys(1)` in the DSN. The default config includes this, but custom configs must include it:
```
soft-serve.db?_pragma=busy_timeout(5000)&_pragma=foreign_keys(1)
```

Without this, cascade deletes won't work and referential integrity is not enforced.

#### Transaction Context

When creating database transactions, use `TransactionContext(ctx, fn)` to properly propagate context with logger and other values. The `Transaction(fn)` method uses `context.Background()` which loses context values.

## Development Workflow Guides

### Adding a New SSH Command

1. Create command file in `pkg/ssh/cmd/` (e.g., `mycommand.go`)
2. Define Cobra command with `RunE` function
3. Use `checkIfReadable`, `checkIfCollab`, or `checkIfAdmin` for authorization
4. Add command to `CommandMiddleware` in `pkg/ssh/middleware.go`
5. Test with testscript in `testscript/testdata/`

### Adding a New Database Table

1. Create model in `pkg/db/models/`
2. Create migration files in `pkg/db/migrate/` (both SQLite and Postgres)
3. Add store interface methods to `pkg/store/` and implement in `pkg/store/database/`
4. Update `pkg/backend/` to expose operations
5. Test migrations with both databases

### Adding a New Webhook Event

1. Define event type in `pkg/webhook/event.go`
2. Create event constructor in `pkg/webhook/` (e.g., `NewMyEvent`)
3. Call `webhook.SendEvent(ctx, event)` where event should fire
4. Add test in `testscript/testdata/`

### Working with Git Operations

All git operations should:
- Use the `git` package, not `exec.Command` directly
- Set appropriate timeouts (`CommandOptions.Timeout`)
- Pass context for cancellation
- Handle `git.ErrInvalidRepo` for non-existent repos
- Use `git.EnsureWithin()` to prevent directory traversal attacks

## Patterns and Conventions

### Context Usage

Always retrieve dependencies from context rather than passing them explicitly:
```go
cfg := config.FromContext(ctx)
be := backend.FromContext(ctx)
dbx := db.FromContext(ctx)
store := store.FromContext(ctx)
logger := log.FromContext(ctx)
user := proto.UserFromContext(ctx)
```

### Repository Naming

- User-facing repo names omit the `.git` extension (e.g., `dotfiles`)
- On-disk repos are always `<name>.git` (e.g., `dotfiles.git`)
- Repos are stored in `<data_path>/repos/<name>.git`
- Nested repos are supported (e.g., `user/project.git` → `repos/user/project.git`)
- Always use `utils.SanitizeRepo()` before using user input as repo name
- Always validate with `utils.ValidateRepo()` before creating/renaming repos

### Cache Invalidation

The backend maintains an LRU cache (1000 items) of repository objects. **Critical**: Always invalidate cache after mutations:

```go
// Delete the cache entry before the DB transaction.
d.cache.Delete(name)

return d.db.TransactionContext(ctx, func(tx *db.Tx) error {
    // ... database operations
})
```

Cache invalidation happens in:
- `DeleteRepository`: After deletion
- `RenameRepository`: For old name
- `SetDescription`, `SetPrivate`, `SetHidden`, `SetProjectName`: Before update

Failure to invalidate cache will cause stale data to be served until cache eviction.

### SSH Authentication Flow

1. Client connects with public key
2. `PublicKeyHandler` checks if key belongs to a user, stores fingerprint in `Permissions.Extensions["pubkey-fp"]`
3. If all public keys fail, `KeyboardInteractiveHandler` checks `allow-keyless` setting
4. `AuthenticationMiddleware` validates that the public key fingerprint in context matches the actual auth key

This double-check is needed because gossh's `PublicKeyCallback` does not guarantee that the approved key is the one actually used.

### Task Manager for Long-Running Operations

Repository imports run asynchronously using `pkg/task`:

```go
tid := "import:" + name
d.manager.Add(tid, func(ctx context.Context) error {
    // Long-running operation
    return nil
})

done := make(chan error, 1)
go d.manager.Run(tid, done)
err := <-done
```

Tasks are identified by string ID and automatically cleaned up on completion. The manager respects context cancellation.

### Repository Path Construction

Always use `backend.repoPath()` or construct paths consistently:
```go
// Correct
rp := filepath.Join(cfg.DataPath, "repos", repoName+".git")

// Wrong - missing .git suffix
rp := filepath.Join(cfg.DataPath, "repos", repoName)
```

The `.git` suffix is mandatory for git to recognize bare repositories.