# Chat toolkit — user's guide

*Generated 2026-06-09T10:16:30*

This guide covers `bin/chat.js` (the CLI driver) and the OpenAI-interfacing libraries in `lib/`. The CLI is a thin shell over the libs — anything you can do from the command line you can do from Node by importing the lib function directly.

## Backends overview

`bin/chat.js` auto-discovers any `lib/chatgpt*.js` file (excluding `chatgpt_rest_sn.js`, the ServiceNow-only port) and registers its first exported function under the filename stem. Every backend exports the same shape: `chatgpt(prompt, opts) => { reply, messages }`.

| Backend | OpenAI API | Transport | Tools (file_search) | Multi-turn cache (CLI) |
|---|---|---|---|---|
| `chatgpt` | Chat Completions | `openai` SDK | — | — |
| `chatgpt_rest_completions` | Chat Completions | raw `fetch()` | — | — |
| `chatgpt_rest_responses` | Responses | raw `fetch()` | yes | yes |

The raw-`fetch()` variants exist so the code can be ported to environments without the openai SDK — most notably ServiceNow Script Includes. See `lib/chatgpt_rest_sn.js` for the ServiceNow port, which covers both the Completions and Responses APIs in one class; the [ServiceNow usage](#servicenow-usage) section below documents it.

Why pick one over another?

- `chatgpt` — quickest path for SDK-based prototypes that already use `openai`.
- `chatgpt_rest_completions` — same behavior as `chatgpt`, no SDK dependency, trivially portable to ServiceNow.
- `chatgpt_rest_responses` — required for File Search (RAG over uploaded knowledge files), required for the top-level `instructions` field, and the only backend the CLI persists conversation state for between calls.

## CLI: bin/chat.js

```bash
./bin/chat.js -a [opts] <<< 'prompt text'
```

The prompt is read from stdin so here-strings, here-docs, and pipes all work. Reply goes to stdout; progress and errors go to stderr.
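As a rough illustration of the auto-discovery rule from the Backends overview, the filename filtering can be sketched as a pure function. This is a minimal sketch, not the code in `bin/chat.js`: `backendNames` is a hypothetical helper, it only derives names from filenames, and it skips the step of importing each file and registering its first exported function.

```js
// Hypothetical helper (not part of the repo): maps lib/ filenames to backend names.
function backendNames(filenames) {
  return filenames
    .filter((f) => /^chatgpt.*\.js$/.test(f))   // matches the lib/chatgpt*.js pattern
    .filter((f) => f !== 'chatgpt_rest_sn.js')  // the ServiceNow-only port is excluded
    .map((f) => f.slice(0, -'.js'.length))      // the filename stem becomes the name
    .sort();
}

// Example listing of lib/ (assumed for illustration).
const discovered = backendNames([
  'chatgpt.js',
  'chatgpt_rest_completions.js',
  'chatgpt_rest_responses.js',
  'chatgpt_rest_sn.js',
  'util.js',
]);
console.log(discovered);
```

The resulting names are the values `-a` accepts; `-h` remains the authoritative list, since it reflects what is actually registered.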
### Flag reference

| Flag | Applies to | Notes |
|---|---|---|
| `-a ` | all (required) | Backend name; `-h` shows the live list. |
| `-m`, `--model ` | all | Overrides `$OPENAI_MODEL` and the backend default. |
| `--system ` | all | System prompt prepended to `messages` / `input[]`. |
| `-i`, `--instructions ` | responses only | Top-level `instructions` field on the request. |
| `--vector-store ` | responses only | Adds to `file_search` vector_store_ids; repeatable. |
| `-z`, `--zap` | responses only | Discards the cached session before this call. |
| `-h`, `--help` | — | Live help listing registered backends and the cache path. |

Env vars: `OPENAI_API_KEY` (required) and `OPENAI_MODEL` (optional override).

### Examples

Basic single-turn calls, one per backend:

```bash
source .env
./bin/chat.js -a chatgpt <<< 'What is the capital of France?'
./bin/chat.js -a chatgpt_rest_completions <<< 'What is the capital of France?'
./bin/chat.js -a chatgpt_rest_responses <<< 'What is the capital of France?'
```

Model + system prompt:

```bash
./bin/chat.js -a chatgpt_rest_responses --model gpt-4o --system 'Be terse.' \
  <<< 'Explain closures briefly.'
```

Responses-only top-level instructions (independent of `--system`; both can be used together):

```bash
./bin/chat.js -a chatgpt_rest_responses --instructions 'Reply in all caps.' <<< 'Hi.'
```

## Multi-turn conversations

The `chatgpt_rest_responses` backend persists conversation history between invocations at `~/.cache/ai-support/responses-session.json`. Each call appends its prompt + reply, and the next call reloads the cached `messages` array and passes it back to the Responses API. (The stateless `input` form is used; this repo does not currently use `previous_response_id` chaining.)

```bash
./bin/chat.js -a chatgpt_rest_responses <<< 'My favorite Polish word is "dziękuję".'
./bin/chat.js -a chatgpt_rest_responses <<< 'What word did I tell you?'   # remembers
```

Pass `-z` to start a fresh conversation (the cache file is deleted before this turn):

```bash
./bin/chat.js -a chatgpt_rest_responses -z <<< 'New topic: tell me about Polish nasal vowels.'
```

`-z` is only valid with `chatgpt_rest_responses`; other backends reject it. The Completions backends do not carry state between CLI invocations — call the lib function programmatically and pass `messages` yourself if you need multi-turn there.

## Knowledge files via File Search

`chatgpt_rest_responses` can ground replies in a corpus you upload to OpenAI's File Search vector store, exposed in the Responses API as the `file_search` tool. The end-to-end workflow is two scripts in this repo:

1. `bin/stageKnowledge.js` uploads a file, creates (or reuses) a vector store, attaches the file, polls until indexing completes, and prints the vector store ID on stdout.
2. `./bin/chat.js -a chatgpt_rest_responses --vector-store ` references the store from chat calls.

### Prerequisites

- `OPENAI_API_KEY` exported (or sourced from `.env`).
- A file in a File Search–supported format (markdown, plain text, PDF, HTML, several Office formats). Check OpenAI's docs for the current list and the per-file size limit.

### Step 1 — Stage the knowledge file

`bin/stageKnowledge.js` performs:

1. `POST /v1/files` with `purpose=assistants` — uploads the raw file.
2. `POST /v1/vector_stores` — creates the store (skipped when `--reuse` is given).
3. `POST /v1/vector_stores/{id}/files` — attaches the file, carrying any `--attr key=value` metadata as the file's `attributes` (used both on the new-store and `--reuse` paths).
4. `GET /v1/vector_stores/{id}/files/{file_id}` polled until `status` is `completed` or `failed`.
5. Vector store ID written to stdout; progress lines go to stderr so the ID can be captured cleanly.

```bash
VS_ID=$(./bin/stageKnowledge.js path/to/notes.md)
echo "$VS_ID"   # e.g. vs_6a1ce825e40c8191a52e8dbf44ee3270
```

Variants:

```bash
# Custom vector store name (default is the file stem):
./bin/stageKnowledge.js notes.md --name 'polish-notes-2026-05'

# Attach an additional file to an existing vector store:
./bin/stageKnowledge.js extra-notes.md --reuse "$VS_ID"

# Tag the file with metadata attributes (repeatable) for later file_search filters:
./bin/stageKnowledge.js template-rest.md --reuse "$VS_ID" \
  --attr kind=template --attr scenario=rest-integration

# Show full usage:
./bin/stageKnowledge.js -h
```

`--attr` values are coerced: `true`/`false` become booleans, numeric strings become numbers, everything else stays a string — matching the value types the `file_search` `filters` comparators expect. Attributes set here are what the `filters` option on `chatResponses` (and the raw `file_search` tool) match against at query time.

### Step 2 — Reference from chat.js

```bash
./bin/chat.js -a chatgpt_rest_responses --vector-store "$VS_ID" \
  <<< 'Tell me my top 3 Polish pronunciation difficulties.'
```

Variants:

```bash
# Multiple knowledge bases — flag is repeatable:
... --vector-store "$VS_POLISH" --vector-store "$VS_GERMAN"

# Combine with system or instructions:
... --system 'Be terse.' --vector-store "$VS_ID"
... --instructions 'Cite source line numbers.' --vector-store "$VS_ID"

# Nudge the model to actually invoke file_search (it can choose not to):
... --instructions 'Use the file_search tool to ground your answer in my notes.' \
    --vector-store "$VS_ID"

# Override model:
... --model gpt-4o --vector-store "$VS_ID"
```

Multi-turn + File Search compose naturally — the session cache threads prior turns through while each new turn can still retrieve fresh chunks from the vector store(s).

### Metadata attributes and filtering

File Search can restrict retrieval to a subset of a vector store using per-file metadata — a two-phase workflow:

1. **Tag at stage time** — `bin/stageKnowledge.js … --attr key=value` (repeatable) records attributes on the file's vector-store attachment.
2. **Filter at query time** — pass a `filters` object on the `file_search` tool; only files whose attributes match are searched, and ranking then runs over that narrowed set.

A filter is a single comparison or a boolean combination of them:

```js
// single comparison
{ type: 'eq', key: 'kind', value: 'template' }

// compound
{
  type: 'and',
  filters: [
    { type: 'eq', key: 'kind', value: 'template' },
    { type: 'eq', key: 'scenario', value: 'rest-integration' },
  ],
}
```

Comparators: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`, combined with `and` / `or`. Keys and value types must match what `--attr` wrote — which is why the flag coerces `true`/`false` to boolean and numeric strings to number.

**Where filters work:** programmatically — the lib functions or the ServiceNow `ChatGPT.chatResponses` `filters` option — **not** from the `bin/chat.js` CLI, which has no `--filter` flag (it only forwards `--vector-store`). Reach for filtering when you want *deterministic* scoping ("search only the templates") instead of trusting the model's semantic query to stay in the right lane.

```js
// ServiceNow: search only template-tagged files in a combined store
bot.chatResponses(request, {
  vectorStoreIds: [STORE_ALL],
  filters: { type: 'eq', key: 'kind', value: 'template' },
});
```

### Cleanup / management

`bin/stageKnowledge.js` only creates artifacts; nothing deletes them. Vector stores and their underlying files persist on your OpenAI account and continue to incur storage costs until removed. Use the OpenAI platform dashboard at https://platform.openai.com/storage, or the REST API directly:

```bash
# List vector stores
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/vector_stores

# Inspect one
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/vector_stores/$VS_ID"

# Delete a vector store (detaches files; the underlying /v1/files records
# remain and must be deleted separately if no longer needed)
curl -sS -X DELETE -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/vector_stores/$VS_ID"

# List uploaded files
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/files

# Delete an uploaded file
curl -sS -X DELETE -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/files/$FILE_ID"
```

### File Search gotchas

- **Indexing is asynchronous** — `bin/stageKnowledge.js` blocks until the file's per-file status reaches `completed` or `failed`. Don't query a vector store before this finishes; `file_search` will return nothing.
- **The aggregate `file_counts` is eventually consistent** — right after attaching a file, the aggregate counters can briefly report `in_progress: 0` before the attachment is registered. That's why the script polls the per-file endpoint instead.
- **Retrieval is non-deterministic** — the model may choose not to invoke `file_search` on a given turn. If a query that should hit the knowledge base doesn't, add an `--instructions` nudge as shown above.
- **File Search vs ChatGPT Projects** — the ChatGPT UI's "project source files" are a separate, UI-only mechanism. They are *not* accessible from `/v1/responses` API calls. This workflow is the API-side equivalent.

## Programmatic use

Each backend lib exports a single `chatgpt(prompt, opts) => { reply, messages }` function. Import the one you want and call it directly:

```js
import { chatgpt } from '../lib/chatgpt_rest_responses.js';

const { reply, messages } = await chatgpt('Hello.', {
  model: 'gpt-4o-mini',
  systemPrompt: 'Be terse.',
  instructions: 'Reply in all caps.',
  vectorStoreIds: [process.env.VS_ID],
  // messages: priorHistory,   // pass prior turn's messages to continue
});
```

The `opts` keys map cleanly onto CLI flags: `model` ↔ `--model`, `systemPrompt` ↔ `--system`, `instructions` ↔ `--instructions`, `vectorStoreIds` ↔ repeated `--vector-store`. The session cache file used by the CLI is purely a CLI concern — when calling the lib directly you manage `messages` yourself by threading the returned array back into the next call.

Backend-by-backend opts cheat-sheet:

| Backend | Accepted opts |
|---|---|
| `chatgpt` | `model`, `systemPrompt`, `messages`, `apiKey` |
| `chatgpt_rest_completions` | `model`, `systemPrompt`, `messages`, `apiKey` |
| `chatgpt_rest_responses` | + `instructions`, `vectorStoreIds` |

All backends accept and return the same `messages` shape (`{ role, content }[]`), so they're interchangeable for stateless multi-turn use as long as you don't depend on Responses-only fields.

## ServiceNow usage

`lib/chatgpt_rest_sn.js` is the ServiceNow port: an es_latest scoped Script Include exporting a `ChatGPT` class that talks to OpenAI through `sn_ws.RESTMessageV2` instead of `fetch()`. One class covers both APIs — `chatResponses()` and `chatCompletions()` — so it's the platform-side equivalent of `chatgpt_rest_responses` and `chatgpt_rest_completions` combined.

It is **not** runnable in Node; run the examples below from a background script (System Definition → Scripts - Background) in the application's scope, or call the class from any server-side script. From another scope, qualify the name — `new x_yourscope_yourapp.ChatGPT(...)`.
### Prerequisites

- A system property holding the API key, named by prefixing `.openai.api_key` with the application scope (e.g. `x_488706_blaine.openai.api_key`). The class resolves the property name at runtime with `gs.getCurrentScopeName()`, so it is portable across scopes with no code change.
- An outbound REST message to `api.openai.com` must be permitted by the instance's egress controls (IP Address Access Control / any outbound proxy).

### Configuration vs. per-call parameters

Constructor options are stable config; the API-specific knobs are per-call parameters, mirroring the CLI/lib split:

| Constructor option | Purpose |
|---|---|
| `model` | Model ID (default `gpt-4o-mini`). |
| `systemPrompt` | System message for new conversations. |
| `apiKey` | Override the scoped `openai.api_key` property. |
| `sessionId` | Resume a Responses conversation (see below). |

`chatResponses(prompt, { instructions, vectorStoreIds, maxNumResults, filters, rankingOptions })` returns `{ reply, sessionId }`; `chatCompletions(prompt, messages)` returns `{ reply, messages }`; `clearSessionId()` drops the continuation token. The last three `chatResponses` options tune the `file_search` tool — `maxNumResults` (chunks retrieved), `filters` (metadata attribute filtering), and `rankingOptions` (e.g. `{ ranker, score_threshold }`) — and are ignored when no `vectorStoreIds` are given.

`listVectorStores(limit)` and `listFiles(limit)` return trimmed inventory arrays — the platform-side equivalent of the `curl` listing under [Cleanup](#cleanup--management), useful since a scoped script can't shell out to `curl`.

### Multi-turn: server-side continuation

Unlike the CLI (which replays a cached `messages` array), the Script Include continues a Responses conversation **server-side** via `previous_response_id`. The returned `sessionId` is that response ID. Within one transaction, reuse the same instance; the second turn sends only the new prompt and OpenAI supplies the prior context:

```js
const bot = new ChatGPT({ systemPrompt: 'You are a Polish tutor.' });
bot.chatResponses('My favorite Polish word is "dziękuję".');
const second = bot.chatResponses('What word did I tell you?'); // remembers
gs.info(second.reply);
```

To continue across transactions (e.g. a Virtual Agent topic or a series of event-driven jobs), persist the returned `sessionId` — in a record field, user preference, or system property — and hand it to a fresh instance later:

```js
const first = new ChatGPT().chatResponses('Remember the number 42.');
// ... store first.sessionId somewhere durable ...
const resumed = new ChatGPT({ sessionId: first.sessionId });
gs.info(resumed.chatResponses('What number did I give you?').reply);
```

`clearSessionId()` (or simply a new instance with no `sessionId`) starts fresh.

### Example calls

Single-turn, Responses API:

```js
const bot = new ChatGPT();
gs.info(bot.chatResponses('What is the capital of France?').reply);
```

Model, system prompt, top-level instructions, and File Search over a staged vector store (see [Knowledge files](#knowledge-files-via-file-search) for how to create one — `bin/stageKnowledge.js` runs from your workstation, and the vector store ID it prints is what you pass here):

```js
const bot = new ChatGPT({ model: 'gpt-4o', systemPrompt: 'Be terse.' });
const res = bot.chatResponses('Summarize my notes.', {
  instructions: 'Use the file_search tool; cite source line numbers.',
  vectorStoreIds: ['vs_6a1ce825e40c8191a52e8dbf44ee3270'],
});
gs.info(res.reply);
```

Completions API (stateless — thread `messages` yourself to continue):

```js
const bot = new ChatGPT();
const first = bot.chatCompletions('Explain JavaScript closures.');
const next = bot.chatCompletions('Now give a one-line example.', first.messages);
gs.info(next.reply);
```

Inventory vector stores and uploaded files (newest first; `limit` caps at 100):

```js
const bot = new ChatGPT();
for (const vs of bot.listVectorStores())
  gs.info(`${vs.id} ${vs.name} files=${vs.fileCount} ${vs.status}`);
for (const f of bot.listFiles())
  gs.info(`${f.id} ${f.filename} ${f.bytes}B ${f.purpose}`);
```

Each store record carries `{ id, name, status, fileCount, bytes, createdAt }` and each file record `{ id, filename, bytes, purpose, status, createdAt }` (`createdAt` is OpenAI's unix-epoch-seconds value).

### ServiceNow notes

- **No proxy env vars** — the `http_proxy`/`https_proxy` mechanism described under [General notes](#general-notes) is Node-only. `RESTMessageV2` routes through the instance's own outbound HTTP proxy / MID server configuration instead.
- **Responses-only knobs** — `instructions` and `vectorStoreIds` exist only on `chatResponses`; `chatCompletions` has no such parameters. `sessionId` is likewise only meaningful for `chatResponses`.
- **Synchronous** — `RESTMessageV2.execute()` blocks, so these calls count against transaction time limits. For long batches, drive them from a scheduled job or async business rule rather than an interactive transaction.

## Worked example — rules-driven code generation

End-to-end walkthrough of a code generator grounded in a small policy corpus: one conventions file, two rule files, and three template files, all in a single vector store, tagged so retrieval can be scoped.
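To make "tagged so retrieval can be scoped" concrete, the sketch below models locally how an attribute filter of the kind described under [Metadata attributes and filtering](#metadata-attributes-and-filtering) would select files from a tagged corpus. It is an illustration of the intended semantics only: the real matching happens inside OpenAI's service, `matches` is a hypothetical helper, and treating a missing key as "not equal" is an assumption rather than documented behavior.

```js
// Local model of file_search filter semantics (assumed, for illustration only).
function matches(filter, attrs) {
  switch (filter.type) {
    case 'and': return filter.filters.every((f) => matches(f, attrs));
    case 'or':  return filter.filters.some((f) => matches(f, attrs));
    case 'eq':  return attrs[filter.key] === filter.value;
    case 'ne':  return attrs[filter.key] !== filter.value; // missing key counts as "not equal" here
    case 'gt':  return attrs[filter.key] > filter.value;
    case 'gte': return attrs[filter.key] >= filter.value;
    case 'lt':  return attrs[filter.key] < filter.value;
    case 'lte': return attrs[filter.key] <= filter.value;
    default: throw new Error(`unknown filter type: ${filter.type}`);
  }
}

// A reduced corpus with the same style of tags as the walkthrough.
const files = [
  { name: 'codingConventions.md', attrs: { kind: 'convention' } },
  { name: 'templateRules-core.md', attrs: { kind: 'rule' } },
  { name: 'template-rest.md', attrs: { kind: 'template', scenario: 'rest' } },
  { name: 'template-scheduledJob.md', attrs: { kind: 'template', scenario: 'scheduled-job' } },
];

// Keep every non-template file, plus only the scheduled-job template.
const filter = {
  type: 'or',
  filters: [
    { type: 'ne', key: 'kind', value: 'template' },
    { type: 'eq', key: 'scenario', value: 'scheduled-job' },
  ],
};

const selected = files.filter((f) => matches(filter, f.attrs)).map((f) => f.name);
console.log(selected);
```

Under these assumptions the competing `template-rest.md` is the only file left out, which is the scoping effect the tags are there to enable.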
### The corpus

| File | `--attr` tags |
|---|---|
| `codingConventions.md` | `kind=convention` |
| `templateRules-core.md` | `kind=rule` |
| `templateRules-integration.md` | `kind=rule` |
| `template-rest.md` | `kind=template`, `scenario=rest` |
| `template-scheduledJob.md` | `kind=template`, `scenario=scheduled-job` |
| `template-businessRule.md` | `kind=template`, `scenario=business-rule` |

### Step 1 — Stage all six into one store (workstation)

The first call creates the store and prints its ID; the rest attach with `--reuse`. Progress goes to stderr, so the `$(…)` capture gets only the ID.

```bash
source .env
VS=$(./bin/stageKnowledge.js codingConventions.md --name codegen-kb --attr kind=convention)
./bin/stageKnowledge.js templateRules-core.md --reuse "$VS" --attr kind=rule
./bin/stageKnowledge.js templateRules-integration.md --reuse "$VS" --attr kind=rule
./bin/stageKnowledge.js template-rest.md --reuse "$VS" --attr kind=template --attr scenario=rest
./bin/stageKnowledge.js template-scheduledJob.md --reuse "$VS" --attr kind=template --attr scenario=scheduled-job
./bin/stageKnowledge.js template-businessRule.md --reuse "$VS" --attr kind=template --attr scenario=business-rule
echo "Vector store: $VS"   # -> vs_...; record this for the generator
```

### Step 2 — The instructions (the manifest + procedure)

A thin filename/purpose manifest plus the procedure and precedence rules. This is stable config — it names what each file is *for*, never restating the content inside, so rule edits don't force instruction edits.

```text
ROLE
You generate ServiceNow scripts that strictly conform to the user's standards.
A file_search tool exposes their knowledge base; search it before writing any code.

KNOWLEDGE BASE (search by these filenames and the headings inside them)
- codingConventions.md — house style: naming, structure, error handling, logging.
  Always applies.
- templateRules-*.md — rules that decide WHICH template to use and HOW to adapt it.
  Each rule has an ID heading and may name the template file(s) it governs.
- template-*.md — the actual templates. Each file is one template; its top heading
  names the scenario it covers.

PROCEDURE (every request)
1. Search templateRules-*.md to determine the scenario and which template-*.md the
   rules direct you to. Rules are authoritative for this choice.
2. Retrieve that template-* file and treat its content as the verbatim starting point.
   If the rules name no matching template, say so and stop — do not improvise one.
3. Search codingConventions.md and apply every relevant convention.
4. Re-check every applicable rule and ensure compliance. Precedence on conflict:
   templateRules > codingConventions > the template's own defaults. Preserve the
   template's structure; adjust contents to comply and note each deviation in a comment.
5. Cite the source filename and heading for the template and each rule/convention
   applied (e.g. "templateRules-integration.md → R-014").
6. If a needed rule, convention, or template is absent from the knowledge base, state
   exactly what is missing rather than guessing.

OUTPUT
The final script only, unless asked to explain — plus the compliance/deviation comments.
```

### Step 3 — Generate (ServiceNow, model picks the template)

Let the model choose the template via its own search. Bump `maxNumResults` so a single turn can pull conventions + the relevant rules + a template together.

```js
// INSTRUCTIONS is assumed to hold the Step 2 text as a string.
const STORE = 'vs_...'; // the codegen-kb ID from Step 1
const bot = new ChatGPT({ model: 'gpt-4o', systemPrompt: 'You generate ServiceNow scripts.' });
const { reply } = bot.chatResponses(
  'Write a scheduled job that purges x_myapp_log records older than 30 days.',
  { instructions: INSTRUCTIONS, vectorStoreIds: [STORE], maxNumResults: 20 },
);
gs.info(reply);
```

### Step 3 (variant) — Pre-select the template with a filter

When your app already knows the scenario, scope retrieval deterministically instead of trusting the semantic query: keep all non-template files, and of the templates admit only the chosen scenario.

```js
// bot, STORE, and INSTRUCTIONS as in Step 3; userRequest is the caller's prompt text.
const SCENARIO = 'scheduled-job';
const { reply } = bot.chatResponses(userRequest, {
  instructions: INSTRUCTIONS,
  vectorStoreIds: [STORE],
  filters: {
    type: 'or',
    filters: [
      { type: 'ne', key: 'kind', value: 'template' },    // conventions + rules
      { type: 'eq', key: 'scenario', value: SCENARIO },  // + only this template
    ],
  },
});
```

This guarantees the model can't retrieve a competing template, while still seeing every convention and rule.

### What to verify

- **Indexing finished** — `stageKnowledge.js` blocks until each file reaches `completed`, so by the time `$VS` is captured the store is queryable.
- **Citations** — the procedure's step 5 makes the model name its sources; spot-check that the cited rule/template files are the ones you expect. If it cites nothing, the model skipped retrieval — strengthen the step-1 search directive or use the filter variant.
- **Template fidelity** — file_search returns *chunks*, so a large template can come back fragmented. If a template must be emitted verbatim, keep each one small enough to fit a single chunk (raise that file's chunk size at stage time), or fetch it deterministically rather than relying on retrieval.

## General notes

- **HTTP proxy** — `bin/chat.js` and `bin/stageKnowledge.js` install undici's `EnvHttpProxyAgent` at startup, so Node's built-in `fetch()` honors the standard proxy env vars automatically. The vars are `http_proxy` / `HTTP_PROXY`, `https_proxy` / `HTTPS_PROXY`, and `no_proxy` / `NO_PROXY`; both cases are read, with the lowercase form taking precedence if both are set. For OpenAI calls (all HTTPS), only `https_proxy` actually matters; add `no_proxy` for hosts you want to bypass. (Programmatic users of the libs are responsible for their own dispatcher setup.)
- **API key project scoping** — `sk-proj-...` keys are scoped to one OpenAI Project. Files, vector stores, and conversation state created with a given key are accessible only to keys in the same project. Pass `apiKey` to the lib (or set `OPENAI_API_KEY`) to switch contexts. See also OpenAI's Platform Project membership for sharing across users.
- **Default model** — every backend defaults to `gpt-4o-mini`. Override per call with `--model `, or set `OPENAI_MODEL` to change the default for the shell session.
- **Multi-turn caching is CLI-only** — only `bin/chat.js` persists session state between CLI calls, and only for `chatgpt_rest_responses`. The lib functions themselves are stateless; they accept and return `messages`, and it's up to the caller to thread it through.
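
The caller-side threading described in the last note can be sketched with a stand-in backend, so it runs without an API key. Everything here is illustrative: `fakeBackend` and `converse` are hypothetical names, and the sketch assumes the returned `messages` array already contains the new prompt and reply, so it can be passed straight into the next call.

```js
// Stand-in backend (hypothetical): same (prompt, opts) => { reply, messages } shape
// as the lib functions, but it answers locally instead of calling OpenAI.
async function fakeBackend(prompt, opts = {}) {
  const messages = [...(opts.messages ?? []), { role: 'user', content: prompt }];
  const turn = messages.filter((m) => m.role === 'user').length;
  const reply = `turn ${turn}: ${prompt}`;
  messages.push({ role: 'assistant', content: reply });
  return { reply, messages };
}

// Runs prompts in order, feeding each call's returned messages into the next call.
async function converse(backend, prompts, opts = {}) {
  let messages = opts.messages ?? [];
  const replies = [];
  for (const prompt of prompts) {
    const result = await backend(prompt, { ...opts, messages });
    messages = result.messages; // thread the history through
    replies.push(result.reply);
  }
  return { replies, messages };
}

converse(fakeBackend, ['My favorite number is 42.', 'What number did I give you?'])
  .then(({ replies, messages }) => console.log(replies, messages.length));
```

Swapping `fakeBackend` for an imported `chatgpt` function should work unchanged if that assumption holds for the backend you pick.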