Chunk enrichment: prepend document title to embeddings
Adds enriched_text column to chunks table that prepends document title (and section header when present) to chunk text. Embeddings and FTS now use enriched text for better search relevance. Includes schema migration with backfill for existing data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-03-29
@@ -0,0 +1,29 @@
## Context

The root cobra command in `client/cmd/root.go` uses `cobra.ArbitraryArgs` and its `RunE` handler to catch any arguments not matching a subcommand. Currently, any non-empty args are joined and submitted as a note. This means a single mistyped word (e.g., `kb infow` instead of `kb info`) silently creates a junk note in the knowledge base.

## Goals / Non-Goals

**Goals:**
- Prevent single bare words from being silently ingested as notes
- Provide a clear error message that helps the user correct their input
- Preserve the multi-word implicit note shorthand (`kb remember to update dns`)

**Non-Goals:**
- Detecting "close matches" to real commands (fuzzy matching / did-you-mean)
- Changing how quoted strings work at the shell level (we can't detect quotes after shell expansion)

## Decisions

### Guard on argument count in RunE

When `len(args) == 1`, reject with an error message instead of submitting as a note. When `len(args) > 1`, continue treating as implicit note shorthand.

**Rationale**: This is the simplest reliable heuristic. The shell strips quotes before cobra sees args, so we cannot distinguish `kb "singleword"` from `kb singleword`. However, single-word notes are rare in practice, and the error message tells the user how to work around it (use multiple words or the full note workflow). Multi-word input is almost certainly intentional note text, not a mistyped command.

**Alternative considered**: Checking against a list of known subcommand names. Rejected because it wouldn't catch typos of commands we don't know about and would add a maintenance burden.
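
A minimal sketch of the guard, written as a standalone function rather than the actual `RunE` body. It counts whitespace-separated words instead of checking `len(args)`, because a quoted phrase such as `kb "my note"` reaches the program as a single argument; that choice is an assumption about the intended behavior. The names `implicitNote`, `singleWordError`, and `noteHint`, and the wording of the hint, are hypothetical and not taken from the codebase.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// noteHint is hypothetical wording; the spec only asks for "a hint about note usage".
const noteHint = "To save a note, pass more than one word, e.g. kb remember to update dns"

// singleWordError reports a lone bare word that matched no subcommand.
type singleWordError struct{ word string }

func (e *singleWordError) Error() string {
	return fmt.Sprintf("Unknown command %q. Run 'kb --help' for available commands.\n%s", e.word, noteHint)
}

// implicitNote returns the note text for leftover root-command args.
// Zero words yields ("", nil) so the caller can show help; exactly one
// word yields a *singleWordError; anything longer is joined into a note.
func implicitNote(args []string) (string, error) {
	joined := strings.Join(args, " ")
	words := strings.Fields(joined)
	switch len(words) {
	case 0:
		return "", nil
	case 1:
		return "", &singleWordError{word: words[0]}
	default:
		return joined, nil
	}
}

func main() {
	note, err := implicitNote(os.Args[1:])
	switch {
	case err != nil:
		// Rejected input goes to stderr with a non-zero exit code.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	case note == "":
		fmt.Println("(help text would be printed here)")
	default:
		fmt.Printf("would submit note: %q\n", note)
	}
}
```

In a cobra command, the same function could be called from `RunE` with the `args` slice, returning the error so cobra handles the non-zero exit.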

## Risks / Trade-offs

- **Single-word notes no longer work via shorthand** → Users must use `kb add --note "singleword"` or include additional words. This is an acceptable trade-off since single-word notes are uncommon and the error message is clear.
- **Shell quote stripping means we can't be perfect** → A quoted argument that contains exactly one word after quote removal (e.g., `kb "singleword"`) will be rejected. This is a known limitation, but it is very rare in practice.
@@ -0,0 +1,24 @@
## Why

A single unquoted word passed to `kb` (e.g., `kb infow`) is silently treated as a note and ingested. This is almost always a mistyped command, not an intentional note. Users lose trust when typos pollute their knowledge base.

## What Changes

- The implicit note shorthand will require **more than one argument** to be treated as a note. A single bare word will be rejected with a helpful error suggesting the user check their command or quote a multi-word note.
- This is a **BREAKING** change to the implicit note shorthand: `kb singleword` no longer creates a note. Users must supply more than one word, either quoted (`kb "singleword is important"`) or unquoted.

## Capabilities

### New Capabilities

_(none)_

### Modified Capabilities

- `go-client`: The "Implicit note shorthand" requirement changes to reject single-word bare arguments and print an error instead of submitting them as notes.

## Impact

- **Code**: `client/cmd/root.go`, the `RunE` handler for the root command
- **Tests**: `client/cmd/root_test.go` or equivalent; add/update tests for single-word rejection
- **Users**: Anyone who intentionally used `kb singleword` as a note shorthand will need to use multiple words or `kb add --note`
@@ -0,0 +1,37 @@
## MODIFIED Requirements

### Requirement: Implicit note shorthand

The client SHALL treat bare string arguments (with no subcommand) as an implicit note only when **more than one argument** is provided. `kb "my note"` SHALL behave identically to submitting a note via `POST /api/v1/jobs`. All persistent flags (`--format`, `--engine`, `--api-key`) and the root `--tags` flag SHALL work with the shorthand form. A single bare word SHALL be rejected with an error message.

#### Scenario: Quick note via bare argument
- **WHEN** the user runs `kb "remember to update DNS"`
- **THEN** the client SHALL submit the text as a note via `POST /api/v1/jobs` and print `Queued: note`

#### Scenario: Bare argument with tags
- **WHEN** the user runs `kb "server room is building 3" --tags ops`
- **THEN** the client SHALL submit the note with the specified tags

#### Scenario: Bare argument with JSON output
- **WHEN** the user runs `kb "my note" --format json`
- **THEN** the client SHALL output the raw JSON response from the engine

#### Scenario: Bare argument duplicate detection
- **WHEN** the user runs `kb "my note"` and the engine returns HTTP 409
- **THEN** the client SHALL handle the duplicate response identically to the previous `kb add --note` behaviour

#### Scenario: Multiple unquoted words
- **WHEN** the user runs `kb remember to update dns` (without quotes)
- **THEN** the client SHALL join all arguments into a single note string and submit it

#### Scenario: Single bare word rejected
- **WHEN** the user runs `kb infow` (a single unrecognized word)
- **THEN** the client SHALL print to stderr: `Unknown command "infow". Run 'kb --help' for available commands.` followed by a hint about note usage, and exit with a non-zero code

#### Scenario: No interference with subcommands
- **WHEN** the user runs `kb search "query"` or any other existing subcommand
- **THEN** the client SHALL route to the subcommand as before; the implicit note shorthand SHALL NOT interfere

#### Scenario: No arguments
- **WHEN** the user runs `kb` with no arguments
- **THEN** the client SHALL display the help text
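
The quick-note and duplicate-detection scenarios above could be exercised roughly as follows. This is a sketch under stated assumptions: the JSON field names in `noteJob`, the `ErrDuplicate` sentinel, the `submitNote` and `newFakeEngine` helpers, the 202 success status, and the duplicate message printed by `main` are all hypothetical. The spec only fixes the `POST /api/v1/jobs` endpoint, the HTTP 409 status, and the `Queued: note` output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// ErrDuplicate is a hypothetical sentinel for an HTTP 409 from the engine.
var ErrDuplicate = errors.New("duplicate note")

// noteJob is an assumed request body; the real field names are not in the spec.
type noteJob struct {
	Type string   `json:"type"`
	Text string   `json:"text"`
	Tags []string `json:"tags,omitempty"`
}

// submitNote posts the note to /api/v1/jobs and maps 409 to ErrDuplicate.
func submitNote(client *http.Client, baseURL, text string, tags []string) error {
	body, err := json.Marshal(noteJob{Type: "note", Text: text, Tags: tags})
	if err != nil {
		return err
	}
	resp, err := client.Post(baseURL+"/api/v1/jobs", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	switch {
	case resp.StatusCode == http.StatusConflict:
		return ErrDuplicate
	case resp.StatusCode >= 200 && resp.StatusCode < 300:
		return nil
	default:
		return fmt.Errorf("engine returned %s", resp.Status)
	}
}

// newFakeEngine starts a stand-in server that answers 409 for text it has seen.
func newFakeEngine() *httptest.Server {
	var mu sync.Mutex
	seen := map[string]bool{}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var job noteJob
		if r.Method != http.MethodPost || r.URL.Path != "/api/v1/jobs" ||
			json.NewDecoder(r.Body).Decode(&job) != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		mu.Lock()
		dup := seen[job.Text]
		seen[job.Text] = true
		mu.Unlock()
		if dup {
			w.WriteHeader(http.StatusConflict)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}))
}

func main() {
	srv := newFakeEngine()
	defer srv.Close()
	// Submitting the same note twice triggers the duplicate path on the second call.
	for i := 0; i < 2; i++ {
		err := submitNote(srv.Client(), srv.URL, "my note", nil)
		switch {
		case errors.Is(err, ErrDuplicate):
			fmt.Println("Duplicate: note already exists") // hypothetical wording
		case err != nil:
			fmt.Println("error:", err)
		default:
			fmt.Println("Queued: note")
		}
	}
}
```

Returning a sentinel error keeps the duplicate handling in one place, so the shorthand and `kb add --note` paths could share it.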
@@ -0,0 +1,10 @@
## 1. Core Implementation

- [x] 1.1 Update `RunE` in `client/cmd/root.go` to reject single-word bare arguments with an error message and non-zero exit
- [x] 1.2 Update usage template in `root.go` to reflect that note shorthand requires multiple words

## 2. Tests

- [x] 2.1 Add test: single bare word prints error to stderr and exits non-zero
- [x] 2.2 Add test: multiple bare words are submitted as a note (existing behavior preserved)
- [x] 2.3 Add test: zero arguments shows help (existing behavior preserved)