# Chat toolkit — user's guide

*Generated 2026-06-09T10:16:30*

This guide covers `bin/chat.js` (the CLI driver) and the OpenAI-interfacing
libraries in `lib/`. The CLI is a thin shell over the libs — anything you can
do from the command line you can do from Node by importing the lib function
directly.

## Backends overview

`bin/chat.js` auto-discovers any `lib/chatgpt*.js` file (excluding
`chatgpt_rest_sn.js`, the ServiceNow-only port) and registers its first
exported function under the filename stem. Every backend exports the same
shape: `chatgpt(prompt, opts) => { reply, messages }`.

| Backend | OpenAI API | Transport | Tools (file_search) | Multi-turn cache (CLI) |
|---|---|---|---|---|
| `chatgpt` | Chat Completions | `openai` SDK | — | — |
| `chatgpt_rest_completions` | Chat Completions | raw `fetch()` | — | — |
| `chatgpt_rest_responses` | Responses | raw `fetch()` | yes | yes |
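
The discovery rule described above can be sketched as a pure filter over filenames. This is a hypothetical illustration, not the code in `bin/chat.js`: the function name `backendNames` and the `EXCLUDED` set are made up here, and the real driver also has to import each module and pick its first exported function.

```javascript
// Hypothetical sketch of the discovery rule; bin/chat.js may differ in detail.
const EXCLUDED = new Set(['chatgpt_rest_sn.js']); // the ServiceNow-only port

function backendNames(filenames) {
    return filenames
        .filter((f) => /^chatgpt.*\.js$/.test(f) && !EXCLUDED.has(f))
        .map((f) => f.slice(0, -'.js'.length)) // the filename stem becomes the backend name
        .sort();
}

console.log(backendNames([
    'chatgpt.js',
    'chatgpt_rest_completions.js',
    'chatgpt_rest_responses.js',
    'chatgpt_rest_sn.js',
    'util.js',
]));
// [ 'chatgpt', 'chatgpt_rest_completions', 'chatgpt_rest_responses' ]
```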

The raw-`fetch()` variants exist so the code can be ported to environments
without the `openai` SDK, most notably ServiceNow Script Includes. See
`lib/chatgpt_rest_sn.js` for the ServiceNow port, which covers both the
Completions and Responses APIs in one class; the [ServiceNow usage](#servicenow-usage)
section below documents it.

Why pick one over another?

- `chatgpt` — quickest path for SDK-based prototypes that already use `openai`.
- `chatgpt_rest_completions` — same behavior as `chatgpt`, no SDK dependency,
  trivially portable to ServiceNow.
- `chatgpt_rest_responses` — required for File Search (RAG over uploaded
  knowledge files), required for the top-level `instructions` field, and the
  only backend the CLI persists conversation state for between calls.

## CLI: bin/chat.js

```bash
./bin/chat.js -a <backend> [opts] <<< 'prompt text'
```

The prompt is read from stdin so here-strings, here-docs, and pipes all work.
The reply goes to stdout; progress and errors go to stderr.

### Flag reference

| Flag | Applies to | Notes |
|---|---|---|
| `-a <backend>` | all (required) | Backend name; `-h` shows the live list. |
| `-m`, `--model <id>` | all | Overrides `$OPENAI_MODEL` and the backend default. |
| `--system <text>` | all | System prompt prepended to `messages` / `input[]`. |
| `-i`, `--instructions <text>` | responses only | Top-level `instructions` field on the request. |
| `--vector-store <id>` | responses only | Adds to `file_search` vector_store_ids; repeatable. |
| `-z`, `--zap` | responses only | Discards the cached session before this call. |
| `-h`, `--help` | — | Live help listing registered backends and the cache path. |

Env vars: `OPENAI_API_KEY` (required) and `OPENAI_MODEL` (optional override).
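
The model precedence in the flag table (`--model`, then `$OPENAI_MODEL`, then the backend default) can be sketched as below. `resolveModel` is a hypothetical helper, and treating an empty string as "not set" is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical helper; the precedence order is the one documented in the flag table.
const BACKEND_DEFAULT = 'gpt-4o-mini';

function resolveModel({ flag, env = process.env, backendDefault = BACKEND_DEFAULT } = {}) {
    // Empty strings are treated as "not set" (an assumption of this sketch).
    return flag || env.OPENAI_MODEL || backendDefault;
}

console.log(resolveModel({ flag: 'gpt-4o', env: { OPENAI_MODEL: 'gpt-4o-mini' } })); // gpt-4o
```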

### Examples

Basic single-turn calls, one per backend:

```bash
source .env

./bin/chat.js -a chatgpt                  <<< 'What is the capital of France?'
./bin/chat.js -a chatgpt_rest_completions <<< 'What is the capital of France?'
./bin/chat.js -a chatgpt_rest_responses   <<< 'What is the capital of France?'
```

Model + system prompt:

```bash
./bin/chat.js -a chatgpt_rest_responses --model gpt-4o --system 'Be terse.' \
  <<< 'Explain closures briefly.'
```

Responses-only top-level instructions (independent of `--system`; both can be
used together):

```bash
./bin/chat.js -a chatgpt_rest_responses --instructions 'Reply in all caps.' <<< 'Hi.'
```

## Multi-turn conversations

The `chatgpt_rest_responses` backend persists conversation history between
invocations at `~/.cache/ai-support/responses-session.json`. Each call appends
its prompt + reply, and the next call reloads the cached `messages` array and
passes it back to the Responses API. (The stateless `input` form is used; this
repo does not currently use `previous_response_id` chaining.)

```bash
./bin/chat.js -a chatgpt_rest_responses <<< 'My favorite Polish word is "dziękuję".'
./bin/chat.js -a chatgpt_rest_responses <<< 'What word did I tell you?'    # remembers
```
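
The cache round trip can be pictured with two small pure functions. This is a sketch only: file I/O is left out, the function names are hypothetical, and the exact on-disk format of `responses-session.json` is not documented here. The `{ role, content }` message shape is the one the libs use.

```javascript
// Sketch only: the real cache handling in bin/chat.js may differ.

// What gets sent: the whole prior conversation plus the new prompt (stateless `input` form).
function inputForCall(cached, prompt) {
    return [...(cached ?? []), { role: 'user', content: prompt }];
}

// What gets written back after the reply arrives. Returns a new array
// instead of mutating the one loaded from disk.
function nextHistory(cached, prompt, reply) {
    return [
        ...(cached ?? []),
        { role: 'user', content: prompt },
        { role: 'assistant', content: reply },
    ];
}
```

With `-z`, the equivalent of `cached` is simply empty for that call.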

Pass `-z` to start a fresh conversation (the cache file is deleted before this
turn):

```bash
./bin/chat.js -a chatgpt_rest_responses -z <<< 'New topic: tell me about Polish nasal vowels.'
```

`-z` is only valid with `chatgpt_rest_responses`; other backends reject it.
The Completions backends do not carry state between CLI invocations — call the
lib function programmatically and pass `messages` yourself if you need
multi-turn there.

## Knowledge files via File Search

`chatgpt_rest_responses` can ground replies in a corpus you upload to OpenAI's
File Search vector store, exposed in the Responses API as the `file_search`
tool. The end-to-end workflow is two scripts in this repo:

1. `bin/stageKnowledge.js` uploads a file, creates (or reuses) a vector store,
   attaches the file, polls until indexing completes, and prints the vector
   store ID on stdout.
2. `./bin/chat.js -a chatgpt_rest_responses --vector-store <id>` references the
   store from chat calls.

### Prerequisites

- `OPENAI_API_KEY` exported (or sourced from `.env`).
- A file in a File Search–supported format (markdown, plain text, PDF, HTML,
  several Office formats). Check OpenAI's docs for the current list and the
  per-file size limit.

### Step 1 — Stage the knowledge file

`bin/stageKnowledge.js` performs:

1. `POST /v1/files` with `purpose=assistants` — uploads the raw file.
2. `POST /v1/vector_stores` — creates the store (skipped when `--reuse` is
   given).
3. `POST /v1/vector_stores/{id}/files` — attaches the file, carrying any
   `--attr key=value` metadata as the file's `attributes` (used both on the
   new-store and `--reuse` paths).
4. `GET /v1/vector_stores/{id}/files/{file_id}` polled until `status` is
   `completed` or `failed`.
5. Vector store ID written to stdout; progress lines go to stderr so the ID
   can be captured cleanly.

```bash
VS_ID=$(./bin/stageKnowledge.js path/to/notes.md)
echo "$VS_ID"   # e.g. vs_6a1ce825e40c8191a52e8dbf44ee3270
```
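
Step 4 of the list above, the per-file polling loop, might look roughly like this. The names `pollUntilIndexed` and `statusFetcher`, the one-second interval, and the attempt cap are illustrative assumptions; only the endpoint path and the `completed` / `failed` terminal states come from this guide.

```javascript
// Sketch of the polling step; names and defaults are illustrative, not the script's own.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilIndexed(getStatus, { intervalMs = 1000, maxAttempts = 120 } = {}) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const status = await getStatus();
        if (status === 'completed') return status;
        if (status === 'failed') throw new Error('indexing failed');
        await sleep(intervalMs); // any other status: wait and ask again
    }
    throw new Error(`still indexing after ${maxAttempts} attempts`);
}

// One way to supply getStatus using Node's built-in fetch().
function statusFetcher(vsId, fileId, apiKey) {
    const url = `https://api.openai.com/v1/vector_stores/${vsId}/files/${fileId}`;
    return async () => {
        const res = await fetch(url, { headers: { Authorization: `Bearer ${apiKey}` } });
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        return (await res.json()).status;
    };
}
```

Passing the status lookup in as a function keeps the loop testable without network access.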

Variants:

```bash
# Custom vector store name (default is the file stem):
./bin/stageKnowledge.js notes.md --name 'polish-notes-2026-05'

# Attach an additional file to an existing vector store:
./bin/stageKnowledge.js extra-notes.md --reuse "$VS_ID"

# Tag the file with metadata attributes (repeatable) for later file_search filters:
./bin/stageKnowledge.js template-rest.md --reuse "$VS_ID" \
  --attr kind=template --attr scenario=rest-integration

# Show full usage:
./bin/stageKnowledge.js -h
```

`--attr` values are coerced: `true`/`false` become booleans, numeric strings
become numbers, everything else stays a string — matching the value types the
`file_search` `filters` comparators expect. Attributes set here are what the
`filters` option on `chatResponses` (and the raw `file_search` tool) match
against at query time.
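
As an illustration of the coercion rule, here is a hypothetical re-implementation. `parseAttr` is not the script's actual function, and its handling of edge cases (for example, only plain decimal numbers are treated as numeric) is an assumption.

```javascript
// Hypothetical re-implementation of the documented coercion; edge cases may differ.
function parseAttr(pair) {
    const eq = pair.indexOf('=');
    if (eq < 1) throw new Error(`expected key=value, got "${pair}"`);
    const key = pair.slice(0, eq);
    const raw = pair.slice(eq + 1); // everything after the first "=" is the value

    if (raw === 'true') return [key, true];
    if (raw === 'false') return [key, false];
    // Plain decimal numbers only; "1e3" or "0x10" stay strings in this sketch.
    if (/^-?\d+(\.\d+)?$/.test(raw)) return [key, Number(raw)];
    return [key, raw];
}

const attributes = Object.fromEntries(
    ['kind=template', 'priority=3', 'draft=false'].map((pair) => parseAttr(pair)),
);
console.log(attributes); // { kind: 'template', priority: 3, draft: false }
```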

### Step 2 — Reference from chat.js

```bash
./bin/chat.js -a chatgpt_rest_responses --vector-store "$VS_ID" \
  <<< 'Tell me my top 3 Polish pronunciation difficulties.'
```

Variants:

```bash
# Multiple knowledge bases — flag is repeatable:
... --vector-store "$VS_POLISH" --vector-store "$VS_GERMAN"

# Combine with system or instructions:
... --system 'Be terse.' --vector-store "$VS_ID"
... --instructions 'Cite source line numbers.' --vector-store "$VS_ID"

# Nudge the model to actually invoke file_search (it can choose not to):
... --instructions 'Use the file_search tool to ground your answer in my notes.' \
    --vector-store "$VS_ID"

# Override model:
... --model gpt-4o --vector-store "$VS_ID"
```

Multi-turn and File Search compose naturally: the session cache threads prior
turns through, while each new turn can still retrieve fresh chunks from the
vector store(s).

### Metadata attributes and filtering

File Search can restrict retrieval to a subset of a vector store using per-file
metadata — a two-phase workflow:

1. **Tag at stage time** — `bin/stageKnowledge.js … --attr key=value`
   (repeatable) records attributes on the file's vector-store attachment.
2. **Filter at query time** — pass a `filters` object on the `file_search`
   tool; only files whose attributes match are searched, and ranking then runs
   over that narrowed set.

A filter is a single comparison or a boolean combination of them:

```js
// single comparison
{ type: 'eq', key: 'kind', value: 'template' }

// compound
{
  type: 'and',
  filters: [
    { type: 'eq', key: 'kind', value: 'template' },
    { type: 'eq', key: 'scenario', value: 'rest-integration' },
  ],
}
```

Comparators: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`, combined with `and` / `or`.
Keys and value types must match what `--attr` wrote — which is why the flag
coerces `true`/`false` to boolean and numeric strings to number.
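
Hand-writing nested filter objects gets noisy, so a few builder functions can help. These are hypothetical conveniences, not part of the libs; they only assemble the plain objects shown above.

```javascript
// Hypothetical convenience builders; they produce the same plain filter objects.
const cmp = (type) => (key, value) => ({ type, key, value });
const eq = cmp('eq');
const ne = cmp('ne');
const gte = cmp('gte');
const and = (...filters) => ({ type: 'and', filters });
const or = (...filters) => ({ type: 'or', filters });

const compound = and(eq('kind', 'template'), eq('scenario', 'rest-integration'));
console.log(JSON.stringify(compound, null, 2));
```

The printed object is the same compound filter as in the example above.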

**Where filters work:** programmatically — the lib functions or the ServiceNow
`ChatGPT.chatResponses` `filters` option — **not** from the `bin/chat.js` CLI,
which has no `--filter` flag (it only forwards `--vector-store`). Reach for
filtering when you want *deterministic* scoping ("search only the templates")
instead of trusting the model's semantic query to stay in the right lane.

```js
// ServiceNow: search only template-tagged files in a combined store
bot.chatResponses(request, {
    vectorStoreIds: [STORE_ALL],
    filters: { type: 'eq', key: 'kind', value: 'template' },
});
```

### Cleanup / management

`bin/stageKnowledge.js` only creates artifacts; nothing deletes them. Vector
stores and their underlying files persist on your OpenAI account and continue
to incur storage costs until removed. Use the OpenAI platform dashboard at
https://platform.openai.com/storage, or the REST API directly:

```bash
# List vector stores
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/vector_stores

# Inspect one
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/vector_stores/$VS_ID"

# Delete a vector store (detaches files; the underlying /v1/files records
# remain and must be deleted separately if no longer needed)
curl -sS -X DELETE -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/vector_stores/$VS_ID"

# List uploaded files
curl -sS -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/files

# Delete an uploaded file
curl -sS -X DELETE -H "Authorization: Bearer $OPENAI_API_KEY" \
  "https://api.openai.com/v1/files/$FILE_ID"
```
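
The same cleanup can be scripted from Node with the built-in `fetch()`. The sketch below is hypothetical (this repo ships no such helper): it deletes one vector store and then the listed file records, using the same two DELETE endpoints as the `curl` calls. The `fetchImpl` parameter exists only so the function can be exercised without network access.

```javascript
// Hypothetical helper; not part of this repo. Uses the same endpoints as the curl calls.
const API = 'https://api.openai.com/v1';

async function deleteStoreAndFiles(vsId, fileIds, { apiKey, fetchImpl = fetch }) {
    const headers = { Authorization: `Bearer ${apiKey}` };
    // Store first, then the underlying file records, which outlive the store.
    const targets = [
        `${API}/vector_stores/${vsId}`,
        ...fileIds.map((id) => `${API}/files/${id}`),
    ];
    const results = [];
    for (const url of targets) {
        const res = await fetchImpl(url, { method: 'DELETE', headers });
        results.push({ url, ok: res.ok, status: res.status });
    }
    return results; // callers decide how to treat non-OK entries
}
```

Deletion is irreversible, so it is worth listing the store's contents first and passing file IDs explicitly rather than deleting everything the key can see.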

### File Search gotchas

- **Indexing is asynchronous** — `bin/stageKnowledge.js` blocks until the
  file's per-file status reaches `completed` or `failed`. Don't query a vector
  store before this finishes; `file_search` will return nothing.
- **The aggregate `file_counts` is eventually consistent** — right after
  attaching a file, the aggregate counters can briefly report `in_progress: 0`
  before the attachment is registered. That's why the script polls the per-file
  endpoint instead.
- **Retrieval is non-deterministic** — the model may choose not to invoke
  `file_search` on a given turn. If a query that should hit the knowledge base
  doesn't, add an `--instructions` nudge as shown above.
- **File Search vs ChatGPT Projects** — the ChatGPT UI's "project source
  files" are a separate, UI-only mechanism. They are *not* accessible from
  `/v1/responses` API calls. This workflow is the API-side equivalent.

## Programmatic use

Each backend lib exports a single `chatgpt(prompt, opts) => { reply, messages }`
function. Import the one you want and call it directly:

```js
import { chatgpt } from '../lib/chatgpt_rest_responses.js';

const { reply, messages } = await chatgpt('Hello.', {
    model: 'gpt-4o-mini',
    systemPrompt: 'Be terse.',
    instructions: 'Reply in all caps.',
    vectorStoreIds: [process.env.VS_ID],
    // messages: priorHistory,   // pass prior turn's messages to continue
});
```

The `opts` keys map cleanly onto CLI flags: `model` ↔ `--model`, `systemPrompt`
↔ `--system`, `instructions` ↔ `--instructions`, `vectorStoreIds` ↔ repeated
`--vector-store`. The session cache file used by the CLI is purely a CLI
concern — when calling the lib directly you manage `messages` yourself by
threading the returned array back into the next call.
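
Threading `messages` by hand can be wrapped in a small loop. `converse` below is a hypothetical helper, not something the libs export; it assumes the backend treats `messages: undefined` the same as omitting the key.

```javascript
// Hypothetical helper: `chatgpt` is any backend function with the documented
// (prompt, opts) => { reply, messages } shape.
async function converse(chatgpt, prompts, opts = {}) {
    let messages; // undefined on the first turn (assumed to mean "start fresh")
    const replies = [];
    for (const prompt of prompts) {
        const result = await chatgpt(prompt, { ...opts, messages });
        replies.push(result.reply);
        messages = result.messages; // thread the returned history into the next call
    }
    return { replies, messages };
}
```

A call might look like `await converse(chatgpt, ['My name is Ada.', 'What is my name?'], { model: 'gpt-4o-mini' })`, with `chatgpt` imported from any of the backend libs.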

Backend-by-backend opts cheat-sheet:

| Backend                    | Accepted opts                                                  |
|----------------------------|----------------------------------------------------------------|
| `chatgpt`                  | `model`, `systemPrompt`, `messages`, `apiKey`                  |
| `chatgpt_rest_completions` | `model`, `systemPrompt`, `messages`, `apiKey`                  |
| `chatgpt_rest_responses`   | all of the above, plus `instructions`, `vectorStoreIds`        |

All backends accept and return the same `messages` shape
(`{ role, content }[]`), so they're interchangeable for stateless multi-turn
use as long as you don't depend on Responses-only fields.

## ServiceNow usage

`lib/chatgpt_rest_sn.js` is the ServiceNow port: an es_latest scoped Script
Include exporting a `ChatGPT` class that talks to OpenAI through
`sn_ws.RESTMessageV2` instead of `fetch()`. One class covers both APIs —
`chatResponses()` and `chatCompletions()` — so it's the platform-side
equivalent of `chatgpt_rest_responses` and `chatgpt_rest_completions` combined.

It is **not** runnable in Node; run the examples below from a background script
(System Definition → Scripts - Background) in the application's scope, or call
the class from any server-side script. From another scope, qualify the name —
`new x_yourscope_yourapp.ChatGPT(...)`.

### Prerequisites

- System property `<scope>.openai.api_key` holding the API key, where `<scope>`
  is the application scope (e.g. `x_488706_blaine.openai.api_key`). The class
  resolves the property name at runtime with `gs.getCurrentScopeName()`, so it
  is portable across scopes with no code change.
- Outbound REST calls to `api.openai.com` must be permitted by the
  instance's egress controls (IP Address Access Control / any outbound proxy).

### Configuration vs. per-call parameters

Constructor options are stable config; the API-specific knobs are per-call
parameters, mirroring the CLI/lib split:

| Constructor option | Purpose |
|---|---|
| `model` | Model ID (default `gpt-4o-mini`). |
| `systemPrompt` | System message for new conversations. |
| `apiKey` | Override the scoped `openai.api_key` property. |
| `sessionId` | Resume a Responses conversation (see below). |

`chatResponses(prompt, { instructions, vectorStoreIds, maxNumResults, filters,
rankingOptions })` returns `{ reply, sessionId }`; `chatCompletions(prompt,
messages)` returns `{ reply, messages }`; `clearSessionId()` drops the
continuation token. The last three `chatResponses` options tune the
`file_search` tool — `maxNumResults` (chunks retrieved), `filters` (metadata
attribute filtering), and `rankingOptions` (e.g. `{ ranker, score_threshold }`)
— and are ignored when no `vectorStoreIds` are given.
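
To make the option mapping concrete, here is a sketch of how those per-call options could translate into the `tools` array of a Responses request. The function is hypothetical and the Script Include's internals may differ; the snake_case field names are the Responses API's `file_search` tool fields.

```javascript
// Sketch of the option-to-request mapping; the Script Include's internals may differ.
function buildFileSearchTools({ vectorStoreIds, maxNumResults, filters, rankingOptions } = {}) {
    // Without vector stores there is no file_search tool, so the tuning options are ignored.
    if (!vectorStoreIds || vectorStoreIds.length === 0) return [];

    const tool = { type: 'file_search', vector_store_ids: vectorStoreIds };
    if (maxNumResults !== undefined) tool.max_num_results = maxNumResults;
    if (filters !== undefined) tool.filters = filters;
    if (rankingOptions !== undefined) tool.ranking_options = rankingOptions;
    return [tool];
}

console.log(buildFileSearchTools({ vectorStoreIds: ['vs_example'], maxNumResults: 20 }));
```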

`listVectorStores(limit)` and `listFiles(limit)` return trimmed inventory
arrays — the platform-side equivalent of the `curl` listing under
[Cleanup](#cleanup--management), useful since a scoped script can't shell out
to `curl`.

### Multi-turn: server-side continuation

Unlike the CLI (which replays a cached `messages` array), the Script Include
continues a Responses conversation **server-side** via `previous_response_id`.
The returned `sessionId` is that response ID. Within one transaction, reuse the
same instance; the second turn sends only the new prompt and OpenAI supplies
the prior context:

```js
const bot = new ChatGPT({ systemPrompt: 'You are a Polish tutor.' });
bot.chatResponses('My favorite Polish word is "dziękuję".');
const second = bot.chatResponses('What word did I tell you?');   // remembers
gs.info(second.reply);
```

To continue across transactions (e.g. a Virtual Agent topic or a series of
event-driven jobs), persist the returned `sessionId` — in a record field, user
preference, or system property — and hand it to a fresh instance later:

```js
const first = new ChatGPT().chatResponses('Remember the number 42.');
// ... store first.sessionId somewhere durable ...

const resumed = new ChatGPT({ sessionId: first.sessionId });
gs.info(resumed.chatResponses('What number did I give you?').reply);
```

`clearSessionId()` (or simply a new instance with no `sessionId`) starts fresh.
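
The difference between a fresh and a resumed turn shows up in the request body. The sketch below is an assumption about what such a body could look like, not a copy of the class's code: `buildResponsesBody` is a hypothetical name, and sending the system message only on the first turn is inferred from the "for new conversations" wording in the constructor table.

```javascript
// Sketch only: field names follow the Responses API; the class's real body may differ.
function buildResponsesBody({ model = 'gpt-4o-mini', prompt, systemPrompt, sessionId, instructions }) {
    const input = [];
    // Assumption: the system message is sent only when starting a new conversation.
    if (!sessionId && systemPrompt) input.push({ role: 'system', content: systemPrompt });
    input.push({ role: 'user', content: prompt });

    const body = { model, input };
    if (instructions) body.instructions = instructions;
    if (sessionId) body.previous_response_id = sessionId; // OpenAI supplies the prior context
    return body;
}
```

A resumed turn therefore carries one new user message plus the stored response ID, instead of the replayed history the CLI sends.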

### Example calls

Single-turn, Responses API:

```js
const bot = new ChatGPT();
gs.info(bot.chatResponses('What is the capital of France?').reply);
```

Model, system prompt, top-level instructions, and File Search over a staged
vector store (see [Knowledge files](#knowledge-files-via-file-search) for how to
create one — `bin/stageKnowledge.js` runs from your workstation, and the vector
store ID it prints is what you pass here):

```js
const bot = new ChatGPT({ model: 'gpt-4o', systemPrompt: 'Be terse.' });
const res = bot.chatResponses('Summarize my notes.', {
    instructions: 'Use the file_search tool; cite source line numbers.',
    vectorStoreIds: ['vs_6a1ce825e40c8191a52e8dbf44ee3270'],
});
gs.info(res.reply);
```

Completions API (stateless — thread `messages` yourself to continue):

```js
const bot = new ChatGPT();
const first = bot.chatCompletions('Explain JavaScript closures.');
const next = bot.chatCompletions('Now give a one-line example.', first.messages);
gs.info(next.reply);
```

Inventory vector stores and uploaded files (newest first; `limit` caps at 100):

```js
const bot = new ChatGPT();
for (const vs of bot.listVectorStores())
    gs.info(`${vs.id}  ${vs.name}  files=${vs.fileCount}  ${vs.status}`);
for (const f of bot.listFiles())
    gs.info(`${f.id}  ${f.filename}  ${f.bytes}B  ${f.purpose}`);
```

Each store record carries `{ id, name, status, fileCount, bytes, createdAt }`
and each file record `{ id, filename, bytes, purpose, status, createdAt }`
(`createdAt` is OpenAI's unix-epoch-seconds value).

### ServiceNow notes

- **No proxy env vars** — the `http_proxy`/`https_proxy` mechanism described
  under [General notes](#general-notes) is Node-only. `RESTMessageV2` routes
  through the instance's own outbound HTTP proxy / MID server configuration
  instead.
- **Responses-only knobs** — `instructions` and `vectorStoreIds` exist only on
  `chatResponses`; `chatCompletions` has no such parameters. `sessionId` is
  likewise only meaningful for `chatResponses`.
- **Synchronous** — `RESTMessageV2.execute()` blocks, so these calls count
  against transaction time limits. For long batches, drive them from a
  scheduled job or async business rule rather than an interactive transaction.

## Worked example — rules-driven code generation

End-to-end walkthrough of a code generator grounded in a small policy corpus:
one conventions file, two rule files, and three template files, all in a single
vector store, tagged so retrieval can be scoped.

### The corpus

| File | `--attr` tags |
|---|---|
| `codingConventions.md` | `kind=convention` |
| `templateRules-core.md` | `kind=rule` |
| `templateRules-integration.md` | `kind=rule` |
| `template-rest.md` | `kind=template`, `scenario=rest` |
| `template-scheduledJob.md` | `kind=template`, `scenario=scheduled-job` |
| `template-businessRule.md` | `kind=template`, `scenario=business-rule` |

### Step 1 — Stage all six into one store (workstation)

The first call creates the store and prints its ID; the rest attach with
`--reuse`. Progress goes to stderr, so the `$(…)` capture gets only the ID.

```bash
source .env

VS=$(./bin/stageKnowledge.js codingConventions.md --name codegen-kb --attr kind=convention)

./bin/stageKnowledge.js templateRules-core.md        --reuse "$VS" --attr kind=rule
./bin/stageKnowledge.js templateRules-integration.md --reuse "$VS" --attr kind=rule
./bin/stageKnowledge.js template-rest.md         --reuse "$VS" --attr kind=template --attr scenario=rest
./bin/stageKnowledge.js template-scheduledJob.md --reuse "$VS" --attr kind=template --attr scenario=scheduled-job
./bin/stageKnowledge.js template-businessRule.md --reuse "$VS" --attr kind=template --attr scenario=business-rule

echo "Vector store: $VS"   # -> vs_...; record this for the generator
```

### Step 2 — The instructions (the manifest + procedure)

A thin filename/purpose manifest plus the procedure and precedence rules. This
is stable config — it names what each file is *for*, never restating the
content inside, so rule edits don't force instruction edits.

```text
ROLE
You generate ServiceNow scripts that strictly conform to the user's standards. A
file_search tool exposes their knowledge base; search it before writing any code.

KNOWLEDGE BASE (search by these filenames and the headings inside them)
  - codingConventions.md   — house style: naming, structure, error handling, logging.
                             Always applies.
  - templateRules-*.md     — rules that decide WHICH template to use and HOW to adapt
                             it. Each rule has an ID heading and may name the template
                             file(s) it governs.
  - template-*.md          — the actual templates. Each file is one template; its top
                             heading names the scenario it covers.

PROCEDURE (every request)
1. Search templateRules-*.md to determine the scenario and which template-*.md the
   rules direct you to. Rules are authoritative for this choice.
2. Retrieve that template-* file and treat its content as the verbatim starting point.
   If the rules name no matching template, say so and stop — do not improvise one.
3. Search codingConventions.md and apply every relevant convention.
4. Re-check every applicable rule and ensure compliance. Precedence on conflict:
   templateRules > codingConventions > the template's own defaults. Preserve the
   template's structure; adjust contents to comply and note each deviation in a comment.
5. Cite the source filename and heading for the template and each rule/convention applied
   (e.g. "templateRules-integration.md → R-014").
6. If a needed rule, convention, or template is absent from the knowledge base, state
   exactly what is missing rather than guessing.

OUTPUT
The final script only, unless asked to explain — plus the compliance/deviation comments.
```

### Step 3 — Generate (ServiceNow, model picks the template)

Let the model choose the template via its own search. Bump `maxNumResults` so a
single turn can pull conventions + the relevant rules + a template together.

```js
const STORE = 'vs_...';   // the codegen-kb ID from Step 1
// INSTRUCTIONS is assumed to hold the Step 2 text as a string.

const bot = new ChatGPT({ model: 'gpt-4o', systemPrompt: 'You generate ServiceNow scripts.' });
const { reply } = bot.chatResponses(
    'Write a scheduled job that purges x_myapp_log records older than 30 days.',
    { instructions: INSTRUCTIONS, vectorStoreIds: [STORE], maxNumResults: 20 },
);
gs.info(reply);
```

### Step 3 (variant) — Pre-select the template with a filter

When your app already knows the scenario, scope retrieval deterministically
instead of trusting the semantic query: keep all non-template files, and of the
templates admit only the chosen scenario.

```js
// `bot`, `STORE`, and `INSTRUCTIONS` are as in Step 3; `userRequest` is the prompt text.
const SCENARIO = 'scheduled-job';
const { reply } = bot.chatResponses(userRequest, {
    instructions: INSTRUCTIONS,
    vectorStoreIds: [STORE],
    filters: {
        type: 'or',
        filters: [
            { type: 'ne', key: 'kind', value: 'template' },     // conventions + rules
            { type: 'eq', key: 'scenario', value: SCENARIO },   // + only this template
        ],
    },
});
```

This guarantees the model can't retrieve a competing template, while still
seeing every convention and rule.
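
To check a filter before spending API calls on it, you can evaluate it locally against the tags from the corpus table. The evaluator below is a hypothetical illustration of the filter semantics; OpenAI evaluates filters server-side, and how it treats a missing key is an assumption here (this sketch compares against `undefined`).

```javascript
// Local illustration only; OpenAI applies filters server-side.
const COMPARE = {
    eq: (a, b) => a === b,
    ne: (a, b) => a !== b,
    gt: (a, b) => a > b,
    gte: (a, b) => a >= b,
    lt: (a, b) => a < b,
    lte: (a, b) => a <= b,
};

function matches(attrs, filter) {
    if (filter.type === 'and') return filter.filters.every((f) => matches(attrs, f));
    if (filter.type === 'or') return filter.filters.some((f) => matches(attrs, f));
    const compare = COMPARE[filter.type];
    if (!compare) throw new Error(`unknown filter type: ${filter.type}`);
    return compare(attrs[filter.key], filter.value);
}

// Tags copied from the corpus table.
const corpus = {
    'codingConventions.md': { kind: 'convention' },
    'templateRules-core.md': { kind: 'rule' },
    'templateRules-integration.md': { kind: 'rule' },
    'template-rest.md': { kind: 'template', scenario: 'rest' },
    'template-scheduledJob.md': { kind: 'template', scenario: 'scheduled-job' },
    'template-businessRule.md': { kind: 'template', scenario: 'business-rule' },
};

const scoped = {
    type: 'or',
    filters: [
        { type: 'ne', key: 'kind', value: 'template' },
        { type: 'eq', key: 'scenario', value: 'scheduled-job' },
    ],
};

const admitted = Object.keys(corpus).filter((name) => matches(corpus[name], scoped));
console.log(admitted);
// [ 'codingConventions.md', 'templateRules-core.md',
//   'templateRules-integration.md', 'template-scheduledJob.md' ]
```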

### What to verify

- **Indexing finished** — `stageKnowledge.js` blocks until each file reaches
  `completed`, so by the time `$VS` is captured the store is queryable.
- **Citations** — the procedure's step 5 makes the model name its sources;
  spot-check that the cited rule/template files are the ones you expect. If it
  cites nothing, the model skipped retrieval — strengthen the step-1 search
  directive or use the filter variant.
- **Template fidelity** — file_search returns *chunks*, so a large template can
  come back fragmented. If a template must be emitted verbatim, keep each one
  small enough to fit a single chunk (raise that file's chunk size at stage
  time), or fetch it deterministically rather than relying on retrieval.
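
The citation spot-check can be partly automated. The sketch below is hypothetical and assumes the model cites sources by filename ending in `.md`, as the step-5 example format suggests. It reports which expected files were never mentioned and says nothing about whether a citation is accurate.

```javascript
// Hypothetical check; assumes citations mention filenames ending in ".md".
function citedFiles(reply) {
    const found = reply.match(/[\w-]+\.md\b/g) ?? [];
    return [...new Set(found)]; // de-duplicate, keeping first-seen order
}

function missingCitations(reply, expected) {
    const cited = new Set(citedFiles(reply));
    return expected.filter((name) => !cited.has(name));
}

const sample = [
    '// template-scheduledJob.md → Scheduled job',
    '// templateRules-core.md → R-003',
].join('\n');
console.log(missingCitations(sample, ['template-scheduledJob.md', 'codingConventions.md']));
// [ 'codingConventions.md' ]
```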

## General notes

- **HTTP proxy** — `bin/chat.js` and `bin/stageKnowledge.js` install undici's
  `EnvHttpProxyAgent` at startup, so Node's built-in `fetch()` honors the
  standard proxy env vars automatically. The vars are `http_proxy` /
  `HTTP_PROXY`, `https_proxy` / `HTTPS_PROXY`, and `no_proxy` / `NO_PROXY`;
  both cases are read, with the lowercase form taking precedence if both
  are set. For OpenAI calls (all HTTPS), only `https_proxy` actually
  matters; add `no_proxy` for hosts you want to bypass. (Programmatic users
  of the libs are responsible for their own dispatcher setup.)
- **API key project scoping** — `sk-proj-...` keys are scoped to one OpenAI
  Project. Files, vector stores, and conversation state created with a given
  key are accessible only to keys in the same project. Pass `apiKey` to the
  lib (or set `OPENAI_API_KEY`) to switch contexts. To share resources across
  users, see OpenAI's documentation on Platform Project membership.
- **Default model** — every backend defaults to `gpt-4o-mini`. Override per
  call with `--model <id>`, or set `OPENAI_MODEL` to change the default for
  the shell session.
- **Multi-turn caching is CLI-only** — only `bin/chat.js` persists session
  state between CLI calls, and only for `chatgpt_rest_responses`. The lib
  functions themselves are stateless; they accept and return `messages`, and
  it's up to the caller to thread it through.
