# Rúmilo

Dedicated CLI research agent.
Query `rumilo` (or make an `rml` alias) in web or repo mode to have an LLM
find an answer. Ask about ... anything! It knows the current date and time, so it
can do a reasonable job of finding the latest version of $thing. It can find a
tool's docs, or read a library's source code, and provide usage instructions.
Use web mode to search the internet and read webpages. Provide a URL with `-u`
to pre-fetch and attach that page. Rúmilo can still search and fetch if it decides
that page isn't enough to answer the query.
Use repo mode with a provided clone URI to clone the repo into a temporary
directory and let the LLM do basic searching/listing/reading/etc. inside. If you
or the main agent don't know the clone URI, use web mode to find it.
You can use any model from any provider, remote or local. In theory. Rúmilo
relies on Pi for all the agentic LLM plumbing.

Web search currently goes through Kagi only, via kagi-ken. Kagi normally
charges for API access; kagi-ken uses your existing subscription instead.

Webpage fetching is currently handled only by Mozilla's Tabstack service. It
prefers cached copies when available, avoiding additional load on the
destination site where possible.

I intend to add some sort of extension system so it's easier to add support
for other search and fetch providers.
## Usage

```shell
rumilo web "how do llm model loop werk?"
rumilo web -u https://dev.synthetic.new/docs/openai/models "how should my client parse reasoning content?"
rumilo repo -u https://github.com/synthetic-lab/octofriend "how does octo apply edits?"
```
## Requirements

- Bun
- Git (for repo mode)
- Kagi session token (copy the session link from the user details page, paste
  it somewhere; your token is the value after `token=`)
- Tabstack API key (log into the console and create a new API key)
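The Kagi step above can be scripted. A minimal sketch, assuming the session
link carries the token as its final `token=` query parameter (the link below is
a made-up placeholder, not a real session link):

```shell
# Hypothetical session link; real ones come from the Kagi user details page.
session_link="https://kagi.com/search?token=abc123"

# Everything after the last "token=" is the session token.
export KAGI_SESSION_TOKEN="${session_link##*token=}"
echo "$KAGI_SESSION_TOKEN"  # prints abc123
```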
## Configuration

Rúmilo reads configuration from `$XDG_CONFIG_HOME/rumilo/config.toml`.

Example:

```toml
[defaults]
model = "anthropic:claude-sonnet-4-5"
cleanup = true

[web]
model = "anthropic:claude-haiku-4-5"
```
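If `$XDG_CONFIG_HOME` is unset, the conventional XDG default is `~/.config`
(whether Rúmilo itself applies that fallback is an assumption here). A quick
way to create the expected directory and print the config path:

```shell
# XDG_CONFIG_HOME conventionally defaults to ~/.config when unset.
config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/rumilo"
mkdir -p "$config_dir"
echo "$config_dir/config.toml"
```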
### Custom Models

You can define models served from custom OpenAI-compatible endpoints (Ollama,
vLLM, or other self-hosted setups) in the `[custom_models]` section:
```toml
[custom_models.syn_glm47]
provider = "synthetic"
api = "openai-completions"
base_url = "https://api.synthetic.new/openai/v1"
api_key = "!fnox get SYNTHETIC_API_KEY"
id = "hf:zai-org/GLM-4.7"
name = "GLM 4.7"
reasoning = true
input = ["text"]
cost = { input = 0, output = 0 }
context_window = 198000
max_tokens = 64000
```
Use custom models with the `custom:` prefix:

```shell
rumilo web "query" --model custom:syn_glm47
rumilo repo -u <uri> "query" --model custom:syn_glm47
```
#### Custom Model Fields

All custom model fields — including provider/API configuration, value
resolution for `api_key` and `headers`, and OpenAI compatibility flags — are
documented in pi-mono's model configuration docs.

That doc uses JSON; in Rúmilo's TOML config, field names use snake_case
instead of camelCase (e.g. `base_url` instead of `baseUrl`, `context_window`
instead of `contextWindow`).
Compatibility flags go in a `[custom_models.<name>.compat]` sub-table:

```toml
[custom_models.mistral.compat]
max_tokens_field = "max_tokens"
requires_tool_result_name = true
requires_thinking_as_text = true
requires_mistral_tool_ids = true
```
Note on `supports_developer_role`: OpenAI's Responses API uses
`role: "developer"` instead of `"system"` for reasoning models. Most
OpenAI-compatible endpoints do not support this and will return "Incorrect
role information" errors. If your custom model has `reasoning = true` and you
encounter this error, set `supports_developer_role = false` in the compat
section to use `"system"` instead.
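For example, applying that fix to the `syn_glm47` model defined earlier (a
sketch; only needed if you actually hit the role error):

```toml
[custom_models.syn_glm47.compat]
supports_developer_role = false
```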
## Credentials

Set credentials either via config (above) or environment:

- `KAGI_SESSION_TOKEN`: Kagi session token
- `TABSTACK_API_KEY`: Tabstack API key
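In a shell profile, the environment route looks like this (placeholder values,
not real credentials):

```shell
# Substitute your real credentials; these are placeholders.
export KAGI_SESSION_TOKEN="your-kagi-session-token"
export TABSTACK_API_KEY="your-tabstack-api-key"
```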