# HugstonOne Help Book (2026-03-30)

**Version:** HugstonOne Enterprise Edition  


## 1. Executive summary

HugstonOne is a local-privacy-first desktop app for Windows. In this build, the app handles the important runtime behaviors correctly:


- CLI follow-up turns no longer slow down because of a full blind transcript replay.
- Both CLI and server can be launched with raw extra flags and raw disabled flags.
- RAG, Agent mode, Online Search and terminal command mode are available from the UI.

## 2. What this guide covers

This guide documents the current build as it actually behaves, including its limits. Controls that are not yet finished are called out as such.

## 3. Five-minute setup

Download and install the `.exe` or `.msi`, or run the portable version, then follow these steps.

### 1. Pick a model

Use Pick model to choose a local GGUF or other supported model file. The app uses the file you pick and keeps that directory as the model directory.

### 2. Optionally pick an MMProj

Use MMProj when you are loading a multimodal model that requires a projector. The app also tries to auto-discover a nearby mmproj/projector file.

### 3. Choose a runtime

Use Load CLI for single-user interactive work or Load Server for a local OpenAI-style HTTP runtime.

### 4. Keep Memory on if you want continuity

When Memory is on, the app saves history per session and per tab, so stopping and continuing in the same tab keeps context.

### 5. Add knowledge only when needed

Turn on RAG for local documents and Agent mode for long CLI task continuity. Turn on Online Search only when you deliberately want live web context.

### 6. Use Offline mode for air-gapped sessions

Offline mode blocks outbound HTTP(S) except loopback, so local CLI, local server and local API still work.
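The loopback exception can be pictured as a host check along these lines. This is only a sketch; the function name and exact rules are illustrative, not the app's actual code:

```python
import ipaddress
from urllib.parse import urlparse

def is_loopback_url(url: str) -> bool:
    """Return True if the URL targets a loopback host (allowed in Offline mode)."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback  # 127.0.0.0/8 and ::1
    except ValueError:
        return False  # non-IP hostnames other than localhost are blocked

print(is_loopback_url("http://127.0.0.1:8000/v1/models"))  # True
print(is_loopback_url("https://example.com/search"))       # False
```

This is why the local CLI, local server and Local API keep working while every external fetch is refused.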


## 4. Modes

### CLI

**What it does:** Runs llama-cli or llama-mtmd-cli as the main interactive runtime.

**Best use:** Highest privacy, long coding sessions, agent continuity, terminal-driven workflows.

**Important notes:** Best-effort soft stop on Windows: the app first tries Ctrl+C / SIGINT / SIGBREAK, then hard-kills after a short timeout if needed.

### Server

**What it does:** Runs llama-server and sends requests through the API-compatible local HTTP endpoint.

**Best use:** Parallel-ish local service usage, multimodal server requests, local integrations.

**Important notes:** Memory is now preserved by session + tab when Memory is on. Stop cancels the request, not the whole server profile.

### Local API

**What it does:** Runs an OpenAI-style API bridge at `http://localhost:8000/v1` by default.

**Best use:** Connecting local tools, scripts or editors to whichever local runtime is active.

**Important notes:** Accepts `messages` or `prompt`; routes to the CLI first if it is loaded, otherwise to llama-server if ready.
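As a sketch, a local tool could talk to the bridge like this. The chat-completions path and the model name follow the usual OpenAI-style convention and are assumptions, not confirmed specifics of this build:

```python
import json
import urllib.request

# Hypothetical request against the default local bridge; adjust port and model.
payload = {
    "model": "local-gguf",  # placeholder; the bridge answers with the loaded model
    "messages": [{"role": "user", "content": "Summarize this repo in one line."}],
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment while the Local API is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```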

## 6. File support

### Images (.png/.jpg/.jpeg/.gif/.webp/.svg)

- Accepted: Yes
- Behavior: Previewed inline and sent to runtime. CLI sends /image to multimodal CLI; server queues image for next request.
- RAG-ready: Indirectly
- Notes: Image path is the strongest multimodal flow and was intentionally left intact.

### Audio / video (.mp3/.wav/.m4a/.mp4/.mov/.webm)

- Accepted: Yes
- Behavior: Previewed inline and stored.
- RAG-ready: No dedicated transcript path
- Notes: There is no built-in offline transcription pipeline in this patch.

### Plain text and code (.txt/.md/.json/.csv/.xml/.html/.js/.ts/.tsx/.py/.sql etc.)

- Accepted: Yes
- Behavior: Opened as text, previewed, and sent to the model.
- RAG-ready: Yes
- Notes: Best category for coding, analysis and workspace indexing.

### PDF

- Accepted: Yes
- Behavior: Text extraction uses local pdf.js and the PDF text layer only.
- RAG-ready: Yes
- Notes: Scanned PDFs without selectable text will extract poorly until OCR is added.

### DOCX / XLSX / PPTX

- Accepted: Yes
- Behavior: Text extraction uses JSZip on OOXML internals.
- RAG-ready: Yes
- Notes: Great for text-centric retrieval, not a perfect layout parser.
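In Python terms, the OOXML approach amounts to unzipping the file and stripping tags from `word/document.xml`, roughly like this sketch (stdlib `zipfile` standing in for JSZip; the helper name is illustrative):

```python
import re
import zipfile

def docx_text(source) -> str:
    """Extract plain text from a .docx (path or file-like) via word/document.xml."""
    with zipfile.ZipFile(source) as zf:
        xml = zf.read("word/document.xml").decode("utf-8")
    xml = xml.replace("</w:p>", "\n")           # paragraph ends become newlines
    return re.sub(r"<[^>]+>", "", xml).strip()  # drop every remaining tag
```

Real documents also carry text in headers, footers and footnotes (separate XML parts inside the archive), which is one reason this is "not a perfect layout parser".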

### ZIP

- Accepted: Yes
- Behavior: Lists archive manifest and extracts text-like contents when possible.
- RAG-ready: Yes
- Notes: Useful for code bundles and structured document packs.

### RAR / 7z

- Accepted: Yes
- Behavior: Currently returns a stub note instead of full extraction.
- RAG-ready: Not yet
- Notes: Add your own local unrar/7z parser to make this fully offline.

### Folders

- Accepted: Yes via Add folder
- Behavior: Recursive file walk; the current folder-add path indexes at most the first 500 files found in the chosen folder tree.
- RAG-ready: Yes
- Notes: Best for codebases, notes, and document repositories.
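The 500-file cap behaves like this sketch (the function name and exact traversal order are illustrative, not the app's code):

```python
import os

MAX_FILES = 500  # mirrors the current folder-add cap

def walk_capped(root: str, limit: int = MAX_FILES) -> list[str]:
    """Recursively collect file paths, stopping once `limit` files have been seen."""
    out: list[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            out.append(os.path.join(dirpath, name))
            if len(out) >= limit:
                return out
    return out
```

For repositories larger than the cap, pick the subfolder you care about so the 500 indexed files are the relevant ones.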

## 7. Terminal commands

- `help` - Shows the terminal command list.
- `status` - Shows network mode, runtime state, current session/tab and RAG counts.
- `load-cli` - Clicks Load CLI with the current model and CLI settings.
- `unload-cli` - Clicks Unload CLI.
- `load-server` - Starts llama-server with the current server profile unless already active.
- `stop-server` - Stops llama-server if it is active.
- `send <message>` - Places text in the prompt box and sends it.
- `model <name-fragment>` - Selects the first model whose filename contains the fragment.
- `flags-cli <raw flags>` - Replaces CLI extra flags.
- `flags-server <raw flags>` - Replaces server extra flags.
- `disable-cli <flags>` - Sets the list of default CLI flags to remove.
- `disable-server <flags>` - Sets the list of default server flags to remove.
- `rag on / off` - Enables or disables RAG mode.
- `rag-files` - Opens the Add files picker for RAG indexing.
- `rag-folder` - Opens the folder picker and recursively indexes the folder.
- `agent on / off` - Enables or disables Agent mode.
- `coding on / off` - Enables or disables Coding mode.
- `online on / off` - Enables or disables Online Search mode.
- `clear` - Clears the visible terminal output area.
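For example, a typical session that loads a model, indexes a folder and asks a question might look like this (the model fragment and flag values are illustrative; `--ctx-size` and `--temp` are standard llama.cpp flags):

```text
model qwen
flags-cli --ctx-size 8192 --temp 0.7
load-cli
rag on
rag-folder
send Summarize the indexed folder in five bullet points.
status
```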

## 8. Privacy

- Offline mode blocks outbound HTTP(S) except loopback addresses such as localhost and 127.0.0.1. That means your local llama-server and Local API still work while external web access is blocked.
- Online Search is opt-in only. It does nothing until you enable it and provide one or more URL templates containing `{q}`.
- The online retrieval path fetches a limited snippet from each configured URL, strips scripts/styles/markup and injects only excerpt text into the model prompt.
- The service worker caches same-origin static assets only. It does not cache localhost API calls and does not cache cross-origin content.
- The app is local-privacy-first, but it is not air-gapped unless you enable Offline mode and, ideally, pair it with a firewall.
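A URL template is just a string with a `{q}` placeholder; substitution presumably URL-encodes the query, as in this sketch (`example.com` is a placeholder and `fill_template` an illustrative name, not the app's code):

```python
from urllib.parse import quote_plus

def fill_template(template: str, query: str) -> str:
    """Substitute {q} in an Online Search URL template with the URL-encoded query."""
    return template.replace("{q}", quote_plus(query))

print(fill_template("https://example.com/search?q={q}", "local llm privacy"))
# https://example.com/search?q=local+llm+privacy
```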

## 9. Troubleshooting

### CLI stop does not stop instantly on Windows

Expected sometimes. The app first tries a soft interrupt and then hard-kills after about 1.5 seconds if the process does not exit.

### Server forgets context

Make sure Memory is on and stay in the same session and tab. The patch keys history by sessionId + tabId.
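Conceptually, keying by sessionId + tabId means each (session, tab) pair owns its own transcript, as in this sketch (all names are illustrative):

```python
from collections import defaultdict

# Transcript store keyed by (sessionId, tabId): staying in the same tab keeps
# context; a different tab gets a separate history.
history: dict[tuple[str, str], list[dict]] = defaultdict(list)

def remember(session_id: str, tab_id: str, role: str, text: str) -> None:
    history[(session_id, tab_id)].append({"role": role, "content": text})

remember("s1", "tab-a", "user", "hello")
remember("s1", "tab-b", "user", "hi")  # tab-b does not see tab-a's history
```

So if the server seems to forget context, check first that you have not silently moved to a new tab or session.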


### PDF or Office retrieval is weak

The current extraction is text-centric. Scanned PDFs need OCR; Office files are extracted through OOXML text, not a full semantic parser.


### What does the Newsfeed button do?

The button loads ./newsfeed.html into the preview frame if the file exists. If it does not, the UI alerts. The renderer also uses the button as a shortcut to Online Sources.

