Session ID: 6bd840fd-9e4b-4f8d-b02f-3e3dc62d0e67
Date: 2026-03-28 15:12:14
Here are the key findings:
| Doc Type | Best Format | Why |
|---|---|---|
| API reference (method signatures, params, returns) | Markdown | Headers act as retrieval index points; code blocks render well |
| Task/how-to (admin procedures) | Markdown | Step lists, prerequisites, navigation paths are narrative |
| Concept/overview | Markdown | Descriptive prose with section headings |
| Structured catalogs (field lists, property tables) | JSON | More precise for structured queries, consistent schema |
Markdown is the winner for almost everything we’re crawling. JSON only wins for highly tabular/catalog data we don’t yet have.
| Constraint | Value |
|---|---|
| Max files per project | 20 |
| Max tokens per file | 2M (but practical: 50k-150k for good retrieval) |
| Practical safe file size | 1-2 MB text |
| Recommended chunk size | 5-20 KB per topic |
The thread’s own advice says “one method per file, 5-20 KB each” for optimal retrieval — that’s potentially hundreds of files. But ChatGPT Projects caps at 20 files. So we must consolidate: group related content into ~10 files per role (admin, dev), each file being a topical bundle with clear markdown headings as internal index points.
For example, a developer project might have files like:
- server-api-GlideRecord.md (all GlideRecord methods in one file)
- server-api-GlideSystem.md
- client-api-reference.md
- rest-api-reference.md
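Before committing to a bundle layout, a rough size check helps. This is a minimal sketch using the common ~4 characters-per-token heuristic (an approximation, not OpenAI's actual tokenizer) against the 50k-150k practical range mentioned above; the directory path and limit constant are placeholders:

```python
from pathlib import Path

CHARS_PER_TOKEN = 4              # rough heuristic, not an exact tokenizer
PRACTICAL_TOKEN_LIMIT = 150_000  # upper end of the practical range above

def estimate_tokens(path: Path) -> int:
    """Approximate a file's token count from its character count."""
    return len(path.read_text(encoding="utf-8")) // CHARS_PER_TOKEN

def check_bundles(doc_dir: str) -> None:
    """Flag any bundle that blows past the practical token range."""
    for md in sorted(Path(doc_dir).glob("*.md")):
        tokens = estimate_tokens(md)
        flag = "OVER" if tokens > PRACTICAL_TOKEN_LIMIT else "ok"
        print(f"{md.name}: ~{tokens:,} tokens [{flag}]")
```

Good enough for budgeting bundles; swap in a real tokenizer if the estimates matter.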
Don’t design the file-splitting logic yet — we don’t know how ChatGPT actually performs with different file sizes in your specific use case.
I’ll write you a small test-file generator that produces markdown files at specified sizes (e.g., 50KB, 100KB, 500KB, 1MB, 2MB) with realistic ServiceNow-style API doc content. You upload these to a ChatGPT project and test retrieval at each size.
Once you have those numbers, we design the consolidation strategy and output format.
Want me to write that test-file generator?
Yes. And advise me how I can judge “retrieval quality”.
Good question. Here’s what I’d recommend for the test:
What “retrieval quality” means practically: You ask the ChatGPT project a question whose answer is in the uploaded source file, and check whether it (a) finds the right content, (b) answers accurately, and (c) doesn’t hallucinate details that aren’t in the source.
Test protocol:
How to ensure answers must come from retrieval, not training data:
The test files should contain fictional ServiceNow API content — invented class names, made-up method signatures, fake parameter names. Something like:
## GlideVortex.spinUp()
Initiates a vortex computation cycle on the specified table.
Parameters:
- tableName (String): Target table sys_name
- torqueLevel (Number): Spin intensity, 1-10
Returns: GlideVortexResult

If ChatGPT correctly answers “What does GlideVortex.spinUp() return?” with “GlideVortexResult”, that’s successful retrieval. If it says “I don’t see that in my sources”, retrieval failed. If it invents an answer, that’s hallucination.
The test matrix I’ll generate files for:
| File | Size | Content density |
|---|---|---|
| 1 | 50 KB | ~20 fake API methods, heavy headings |
| 2 | 200 KB | ~80 methods |
| 3 | 500 KB | ~200 methods |
| 4 | 1 MB | ~400 methods |
| 5 | 2 MB | ~800 methods |
For each file, I’ll embed 5 “needle” entries at known positions (start, 25%, middle, 75%, end) with distinctive names you can query for. You test whether ChatGPT finds needles at all positions across all file sizes. When retrieval starts missing needles, you’ve found the practical ceiling.
Want me to write the generator now?
yes
Let me verify the needles are actually in the files and properly distributed.
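A verification pass can be sketched like this, assuming needle codes of the form `NEEDLE_<name>_<pct>PCT` (only `NEEDLE_Bravo_25PCT` is confirmed above; the other four codes here follow the same pattern by assumption) in files under `data/test-docs/`:

```python
from pathlib import Path

# Needle codes assumed to be embedded by the generator
NEEDLE_CODES = ["NEEDLE_Alpha_0PCT", "NEEDLE_Bravo_25PCT",
                "NEEDLE_Charlie_50PCT", "NEEDLE_Delta_75PCT",
                "NEEDLE_Echo_100PCT"]

def verify(doc_dir: str) -> dict:
    """Return {filename: {needle_code: fractional_position}} and
    report any needle that is missing from a file."""
    report = {}
    for md in sorted(Path(doc_dir).glob("*.md")):
        text = md.read_text(encoding="utf-8")
        positions = {}
        for code in NEEDLE_CODES:
            at = text.find(code)
            if at == -1:
                print(f"{md.name}: MISSING {code}")
            else:
                positions[code] = round(at / len(text), 2)
        report[md.name] = positions
    return report
```

The fractional positions let you eyeball that each needle actually sits near its intended 0/25/50/75/100% mark.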
All 5 needles properly distributed across all 5 files. Files are at data/test-docs/.
Here’s your testing procedure:
For each file (start with 50k.md, work up):
| # | Question | Expected answer |
|---|---|---|
| 1 | What does GlideZypherAlpha.quantumLeapAlpha() return? | ZypherResult (with leapDistance, resonanceFreq, stabilized) |
| 2 | What is the secret test code for GlideZypherBravo? | NEEDLE_Bravo_25PCT |
| 3 | What error code does GlideZypherCharlie throw? | ZF-CharlieCharlieCharlie |
| 4 | What parameters does quantumLeapDelta take? | warpFactor (Number) and crystalKey (String) |
| 5 | What range of warpFactor requires admin role in GlideZypherEcho? | Values above 50 |
What to record for each file size:
- How many of the 5 questions got correct answers
- Which needle positions failed (start/25%/50%/75%/end)
- Whether ChatGPT cited the source file or appeared to guess
When retrieval starts failing or getting unreliable, that’s your practical ceiling.
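To keep score consistently across runs, the per-question results could be logged to a small CSV and summarized per file. The row layout and field names here are just one possible scheme (the sample rows are hypothetical, filled in by hand as you test):

```python
import csv
from collections import defaultdict

# (file, question #, needle position, answered correctly, cited the source)
RESULTS = [
    ("50k.md", 1, "start", True, True),
    ("50k.md", 2, "25%",   True, True),
    ("50k.md", 3, "50%",   True, False),
]

def save_and_summarize(rows, out_path="retrieval-results.csv"):
    """Write raw results to CSV and return per-file accuracy."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        w = csv.writer(fh)
        w.writerow(["file", "question", "position", "correct", "cited"])
        w.writerows(rows)
    acc = defaultdict(list)
    for file, _q, _pos, correct, _cited in rows:
        acc[file].append(correct)
    return {f: sum(marks) / len(marks) for f, marks in acc.items()}
```

Accuracy per file size, plus which positions failed, is exactly what you need to call the practical ceiling.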