Jan CLI

Available since Jan 0.7.8 (March 2026).

The jan CLI lets you serve local AI models and launch autonomous agents from your terminal — no cloud account, no usage fees, full privacy.


       ██╗ █████╗ ███╗  ██╗
       ██║██╔══██╗████╗ ██║
       ██║███████║██╔██╗██║
  ██   ██║██╔══██║██║╚████║
  ╚█████╔╝██║  ██║██║ ╚███║
   ╚════╝ ╚═╝  ╚═╝╚═╝  ╚══╝
Jan runs local AI models (LlamaCPP / MLX) and exposes them via an
OpenAI-compatible API, then wires AI coding agent like Claude Code
directly to your own hardware — no cloud account, no usage fees, full privacy.
Models downloaded in the Jan desktop app are automatically available here.
Usage: jan <COMMAND>
Commands:
  serve    Load a local model and expose it at localhost:6767/v1 (auto-detects LlamaCPP or MLX)
  launch   Start a local model, then launch an AI agent with it pre-wired (env vars set automatically)
  threads  List and inspect conversation threads saved by the Jan app
  models   List and load models installed in the Jan data folder
  help     Print this message or the help of the given subcommand(s)
Options:
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version
Examples:
  jan launch claude                                      # pick a model, then run Claude Code against it
  jan launch claude --model janhq/Jan-code-4b-gguf       # use a specific model
  jan serve janhq/Jan-code-4b-gguf                       # expose a model at localhost:6767/v1
  jan serve janhq/Jan-code-4b-gguf --fit                 # auto-fit context to available VRAM
  jan serve janhq/Jan-code-4b-gguf --detach              # run in the background
  jan models list                                        # show all installed models

Models downloaded in the Jan desktop app are automatically available to the CLI.

Installation

Jan CLI is installed automatically when you launch the Jan desktop app for the first time — no extra steps needed. You can uninstall or reinstall it at any time from Settings > General > Jan CLI.

On macOS/Linux the installer writes the binary to /usr/local/bin/jan when that directory is writable, otherwise it falls back to ~/.local/bin/jan. Make sure the chosen directory is on your $PATH so the jan command works from any terminal.

The Jan desktop app helps you manage inference backends (LlamaCPP, MLX) and models — or use the CLI to manage them. Both share the same data folder, so models installed via either interface are available to both.

Quick Start

Getting started takes a single command:


jan launch

Jan will ask you to pick an agent (Claude Code or OpenClaw), then automatically download and set up Jan's foundation model and wire it to the agent for you. No config files, no API keys, no cloud — your agent runs entirely on your own hardware.

Commands

`jan serve`

Load a local model and expose it at localhost:6767/v1 as an OpenAI-compatible API. Auto-detects LlamaCPP or MLX.


jan serve [MODEL_ID] [OPTIONS]

Option	Description	Default
`MODEL_ID`	Model ID to load (omit to pick interactively). Can be a local model ID or a HuggingFace repo ID (e.g., `unsloth/Qwen3.5-9B-GGUF`)	—
`--port`	Port to listen on (0 = random free port)	`6767`
`--api-key`	API key required by clients	`""`
`-d, --detach`	Run in background, print PID	—
`--embedding`	Treat model as an embedding model (not yet supported with llamacpp on the CLI — load embedding models from the desktop app)	—
`--timeout`	Seconds to wait for the model server to become ready	`120`
`-v, --verbose`	Print full server logs	—

⚠️

Since Jan 0.8.0, llama-server runs in router mode under a single process. The flags --ctx-size, --n-gpu-layers, --threads, and --fit are accepted by the CLI for backwards compatibility but are currently ignored — per-model inference settings come from the router preset (<data-folder>/llamacpp/router.preset.ini) generated by the desktop app. Tune them from the desktop UI for now.

When no model ID is provided, an interactive selector is shown. If no models are installed yet, Jan will automatically download its default foundation model to get you started:


$ jan serve
━━━ Select Model ━━━
Choose a model:
> janhq/Jan-v3-4B-base-instruct-gguf [LlamaCPP]
  sentence-transformer-mini [LlamaCPP]
  Jan-v3-4B-base-instruct-4bit [MLX]

Examples:


jan serve                              # pick a model interactively
jan serve qwen3.5-35b-a3b             # serve a specific model
jan serve qwen3.5-35b-a3b --fit       # auto-fit context to available VRAM
jan serve qwen3.5-35b-a3b --detach    # run in background
jan serve qwen3.5-35b-a3b --port 8080 # serve on a custom port
jan serve unsloth/Qwen3.5-9B-GGUF     # download and serve a HuggingFace model

When using a HuggingFace repo ID (e.g., unsloth/Qwen3.5-9B-GGUF), if the model isn't downloaded yet, Jan will automatically download it from HuggingFace.

LlamaCPP Server Arguments

Environment variables with the LLAMA_ARG_ prefix are forwarded to the underlying llama-server router process at startup. Per-model defaults (context size, GPU layers, threads, flash-attn, KV cache type, etc.) come from the router preset generated by the desktop app at <data-folder>/llamacpp/router.preset.ini — edit a model's settings in the desktop UI to change them.


export LLAMA_ARG_HOST=0.0.0.0          # bind the router to a specific host
jan serve qwen3.5-35b-a3b

`jan launch`

Start a local model, then launch an AI agent with it pre-wired — environment variables are set automatically so the agent connects to your local model.


jan launch [PROGRAM] [OPTIONS]

Option	Description	Default
`PROGRAM`	Agent to launch: `claude`, `openclaw` (omit to pick interactively)	—
`--model`	Model ID to load (omit to pick interactively)	—
`--port`	Port for the model server	`6767`
`--api-key`	API key (exported as `OPENAI_API_KEY` and `ANTHROPIC_AUTH_TOKEN`)	`jan`
`-v, --verbose`	Print full server logs	—

As with jan serve, the --ctx-size, --n-gpu-layers, and --fit flags are accepted but currently ignored in router mode. Adjust per-model inference parameters from the desktop app.

When no agent or model is specified, interactive selectors are shown. If no models are installed, Jan will automatically download its default foundation model before launching the agent:


$ jan launch
━━━ Select Agent ━━━
Choose an agent to launch:
> Claude Code  — Anthropic's AI coding agent
  OpenClaw     — Open-source autonomous AI agent

Examples:


jan launch claude                                    # pick a model, then run Claude Code
jan launch claude --model qwen3.5-35b-a3b           # use a specific model with Claude Code
jan launch openclaw --model qwen3.5-35b-a3b         # wire OpenClaw to a local model

`jan models`

List and manage models installed in the Jan data folder.


jan models list              # list all installed models
jan models load <MODEL_ID>   # serve a model (alias for jan serve)
jan models load-mlx <ID>     # load an MLX model (macOS / Apple Silicon only)

`jan threads`

List and inspect conversation threads saved by the Jan desktop app.


jan threads list                    # list all threads
jan threads get <ID>                # get a thread's metadata
jan threads messages <THREAD_ID>    # list all messages in a thread
jan threads delete <ID>             # permanently delete a thread

Common Workflows

Serve a model for use with any OpenAI-compatible client:


jan serve jan-code-4b --fit

Launch Claude Code against a local model:


jan launch claude --model jan-code-4b

Run a model in the background:


jan serve jan-code-4b --detach

List all installed models:


jan models list

Troubleshooting

For common issues — including Windows-specific problems like jan opening the desktop app instead of the CLI — see the Troubleshooting guide.

API Reference Model Context Protocol