# Rúmilo

Dedicated CLI research agent.
Query `rumilo` (or make an `rml` alias) in web or repo mode to have an LLM
find an answer. Ask about ... anything! It knows the current date and time, so it
can do a reasonable job of finding the latest version of $thing. It can find a
tool's docs, or read a library's source code, and provide usage instructions.
Use web mode to search the internet and read webpages. Provide a URL with `-u`
to pre-fetch and attach that page. Rúmilo can still search and fetch if it decides
that page isn't enough to answer the query.
Use repo mode with a provided clone URI to clone the repo into a temporary
directory and let the LLM do basic searching/listing/reading/etc. inside. If you
or the main agent don't know the clone URI, use web mode to find it.
You can use any model from any provider, remote or local. In theory. Rúmilo
relies on Pi for all the agentic LLM plumbing.

Web search currently goes through Kagi only, via kagi-ken. Kagi normally
charges for API access; kagi-ken uses your existing subscription instead.

Webpage fetching is currently handled only by Mozilla's Tabstack service. It
prefers cached copies when available, avoiding additional load on the
destination site where possible.

I intend to add some sort of extension system so it's easier to add support
for other search and fetch providers.
## Usage

```shell
rumilo web "how do llm model loop werk?"
rumilo web -u https://dev.synthetic.new/docs/openai/models "how should my client parse reasoning content?"
rumilo repo -u https://github.com/synthetic-lab/octofriend "how does octo apply edits?"
```
## Requirements

- Bun
- Git (for repo mode)
- Kagi session token (copy the session link from the user details page, paste
  it somewhere; your token is the value after `token=`)
- Tabstack API key (log into the console and create a new API key)
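The Kagi step above can be scripted. A minimal sketch, assuming the session
link carries the token as its final `token=` query parameter (the link below is
a made-up placeholder, not a real session link):

```shell
# Hypothetical session link; real ones come from the Kagi user details page.
session_link="https://kagi.com/search?token=abc123"

# Everything after the last "token=" is the session token.
export KAGI_SESSION_TOKEN="${session_link##*token=}"
echo "$KAGI_SESSION_TOKEN"  # prints abc123
```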
## Configuration

Rúmilo reads configuration from `$XDG_CONFIG_HOME/rumilo/config.toml`.

Example:

```toml
[defaults]
model = "anthropic:claude-sonnet-4-5"
cleanup = true

[web]
model = "anthropic:claude-haiku-4-5"
```
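If `$XDG_CONFIG_HOME` is unset, the conventional XDG default is `~/.config`
(whether Rúmilo itself applies that fallback is an assumption here). A quick
way to create the expected directory and print the config path:

```shell
# XDG_CONFIG_HOME conventionally defaults to ~/.config when unset.
config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/rumilo"
mkdir -p "$config_dir"
echo "$config_dir/config.toml"
```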
### Custom Models

You can define models served from custom OpenAI-compatible endpoints (Ollama,
vLLM, or other self-hosted setups) in the `[custom_models]` section:
```toml
[custom_models.syn_glm47]
provider = "synthetic"
api = "openai-completions"
base_url = "https://api.synthetic.new/openai/v1"
api_key = "!fnox get SYNTHETIC_API_KEY"
id = "hf:zai-org/GLM-4.7"
name = "GLM 4.7"
reasoning = true
input = ["text"]
cost = { input = 0, output = 0 }
context_window = 198000
max_tokens = 64000
```
Use custom models with the `custom:` prefix:

```shell
rumilo web "query" --model custom:syn_glm47
rumilo repo -u <uri> "query" --model custom:syn_glm47
```
#### Custom Model Fields

All custom model fields — including provider/API configuration, value
resolution for `api_key` and `headers`, and OpenAI compatibility flags — are
documented in pi-mono's model configuration docs.

That doc uses JSON; in Rúmilo's TOML config, field names use snake_case
instead of camelCase (e.g. `base_url` instead of `baseUrl`, `context_window`
instead of `contextWindow`).
Compatibility flags go in a `[custom_models.<name>.compat]` sub-table:

```toml
[custom_models.mistral.compat]
max_tokens_field = "max_tokens"
requires_tool_result_name = true
requires_thinking_as_text = true
requires_mistral_tool_ids = true
```
Note on `supports_developer_role`: OpenAI's Responses API uses
`role: "developer"` instead of `"system"` for reasoning models. Most
OpenAI-compatible endpoints do not support this and will return "Incorrect
role information" errors. If your custom model has `reasoning = true` and you
encounter this error, set `supports_developer_role = false` in the compat
section to use `"system"` instead.
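For example, applying that fix to the `syn_glm47` model defined earlier (a
sketch; only needed if you actually hit the role error):

```toml
[custom_models.syn_glm47.compat]
supports_developer_role = false
```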
## Credentials

Set credentials either via config (above) or environment:

- `KAGI_SESSION_TOKEN`: Kagi session token
- `TABSTACK_API_KEY`: Tabstack API key
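In a shell profile, the environment route looks like this (placeholder values,
not real credentials):

```shell
# Substitute your real credentials; these are placeholders.
export KAGI_SESSION_TOKEN="your-kagi-session-token"
export TABSTACK_API_KEY="your-tabstack-api-key"
```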