feat(pierre): add diff chunking and configurable review settings #2

Open
schreifuchs wants to merge 4 commits from feat/context-improvements into main
Owner
No description provided.
schreifuchs added 1 commit 2026-02-13 16:26:21 +01:00
schreifuchs reviewed 2026-02-13 16:28:09 +01:00
schreifuchs reviewed 2026-02-13 16:28:10 +01:00
schreifuchs reviewed 2026-02-13 16:28:10 +01:00
schreifuchs reviewed 2026-02-13 16:28:10 +01:00
schreifuchs reviewed 2026-02-13 16:28:11 +01:00
schreifuchs reviewed 2026-02-13 16:28:11 +01:00
schreifuchs reviewed 2026-02-13 16:28:11 +01:00
schreifuchs reviewed 2026-02-13 16:28:12 +01:00
schreifuchs reviewed 2026-02-13 16:28:12 +01:00
schreifuchs reviewed 2026-02-13 16:28:12 +01:00
schreifuchs reviewed 2026-02-13 16:28:13 +01:00
@@ -40,0 +103,4 @@
for j, h := range hunks {
hseg := h
if j != 0 {
hseg = "@@ " + h
Author
Owner

The splitDiffIntoChunks implementation drops the newline that separates two files when it splits on "\ndiff --git ". After the split the code re‑adds the prefix with "diff --git " + part, which loses the leading line‑break. When the chunks are concatenated the original diff is no longer identical, causing the TestSplitDiffIntoChunks_LargeSingleFile test to fail. Fix by adding the missing newline, e.g. seg = "\n" + "diff --git " + part (or keep the delimiter when splitting with strings.SplitAfter). (Generated by: OpenAI (openai/gpt-oss-120b))

The `splitDiffIntoChunks` implementation drops the newline that separates two files when it splits on `"\ndiff --git "`. After the split the code re‑adds the prefix with `"diff --git " + part`, which loses the leading line‑break. When the chunks are concatenated the original diff is no longer identical, causing the `TestSplitDiffIntoChunks_LargeSingleFile` test to fail. Fix by adding the missing newline, e.g. `seg = "\n" + "diff --git " + part` (or keep the delimiter when splitting with `strings.SplitAfter`). (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:13 +01:00
@@ -0,0 +30,4 @@
if !strings.HasPrefix(chunks[0], "diff --git a/file1.txt") {
t.Fatalf("first chunk does not contain file1 header: %s", chunks[0])
}
if !strings.HasPrefix(chunks[1], "diff --git a/file2.txt") {
Author
Owner

The test TestSplitDiffIntoChunks_LargeSingleFile relies on an exact string equality after re‑joining the chunks. Because of the newline‑loss bug described above the test will currently fail. Once the split function preserves delimiters, the test will pass; otherwise consider comparing the recombined diff with the original using cmp.Diff only (as you already do) rather than a strict equality check. (Generated by: OpenAI (openai/gpt-oss-120b))

The test `TestSplitDiffIntoChunks_LargeSingleFile` relies on an exact string equality after re‑joining the chunks. Because of the newline‑loss bug described above the test will currently fail. Once the split function preserves delimiters, the test will pass; otherwise consider comparing the recombined diff with the original using `cmp.Diff` only (as you already do) rather than a strict equality check. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:13 +01:00
@@ -7,7 +7,9 @@ require (
github.com/alecthomas/kong v1.14.0
github.com/alecthomas/kong-yaml v0.2.0
github.com/google/generative-ai-go v0.20.1
Author
Owner

You added github.com/google/go-cmp v0.7.0 as a direct dependency – good for the new tests. Remember to run go mod tidy so that the go.sum is updated accordingly. (Generated by: OpenAI (openai/gpt-oss-120b))

You added `github.com/google/go-cmp v0.7.0` as a direct dependency – good for the new tests. Remember to run `go mod tidy` so that the `go.sum` is updated accordingly. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:14 +01:00
Author
Owner

The error message now correctly uses response.StatusCode. No further changes needed. (Generated by: OpenAI (openai/gpt-oss-120b))

The error message now correctly uses `response.StatusCode`. No further changes needed. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:14 +01:00
@@ -13,0 +11,4 @@
maxChunkSize int
guidelines []string
git GitAdapter
chat ChatAdapter
Author
Owner

New now takes the extra parameters maxChunkSize and guidelines. Ensure all call‑sites (currently only cmd/pierre/main.go) are updated accordingly. Consider adding a comment that documents the meaning of maxChunkSize (bytes) and that a non‑positive value triggers the built‑in default. (Generated by: OpenAI (openai/gpt-oss-120b))

`New` now takes the extra parameters `maxChunkSize` and `guidelines`. Ensure all call‑sites (currently only `cmd/pierre/main.go`) are updated accordingly. Consider adding a comment that documents the meaning of `maxChunkSize` (bytes) and that a non‑positive value triggers the built‑in default. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:14 +01:00
@@ -40,0 +80,4 @@
// It tries to split on file boundaries ("diff --git") first, then on hunk boundaries (@@),
// and finally on a hard byte limit.
func splitDiffIntoChunks(diff []byte, maxSize int) []string {
if len(diff) <= maxSize {
Author
Owner

The comment that builds baseSystem concatenates the dynamically created guidelinesText directly inside a raw string literal. This makes the final prompt contain a leading tab on every line, which the LLM will treat as part of the instruction. Consider using a plain string without the leading indentation or run strings.TrimSpace on the final prompt. (Generated by: OpenAI (openai/gpt-oss-120b))

The comment that builds `baseSystem` concatenates the dynamically created `guidelinesText` directly inside a raw string literal. This makes the final prompt contain a leading tab on every line, which the LLM will treat as part of the instruction. Consider using a plain string without the leading indentation or run `strings.TrimSpace` on the final prompt. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:14 +01:00
@@ -40,0 +121,4 @@
}
} else {
current.WriteString(seg)
}
Author
Owner

splitDiffIntoChunks may split a diff in the middle of a line (e.g. when a single line exceeds maxSize). Breaking a diff line arbitrarily can corrupt the unified‑diff syntax and confuse the model. Prefer to split only on line boundaries – e.g. split the input with strings.SplitAfter(diff, "\n") and then build chunks while staying under the size limit. (Generated by: OpenAI (openai/gpt-oss-120b))

`splitDiffIntoChunks` may split a diff in the middle of a line (e.g. when a single line exceeds `maxSize`). Breaking a diff line arbitrarily can corrupt the unified‑diff syntax and confuse the model. Prefer to split only on line boundaries – e.g. split the input with `strings.SplitAfter(diff, "\n")` and then build chunks while staying under the size limit. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 16:28:15 +01:00
Author
Owner

cfg.Review.MaxChunkChars is an int that may be zero if the user omits the flag. The service correctly falls back to the default (60000) inside judgePR, but you could also enforce the default earlier (e.g. in the config struct default tag or after flag parsing) to avoid passing a zero value downstream. (Generated by: OpenAI (openai/gpt-oss-120b))

`cfg.Review.MaxChunkChars` is an `int` that may be zero if the user omits the flag. The service correctly falls back to the default (60000) inside `judgePR`, but you could also enforce the default earlier (e.g. in the config struct default tag or after flag parsing) to avoid passing a zero value downstream. (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
root force-pushed feat/context-improvements from 985fac347c to 61d538d4a5 2026-02-13 17:00:28 +01:00 Compare
root force-pushed feat/context-improvements from 61d538d4a5 to ac5ff7aeeb 2026-02-13 17:04:59 +01:00 Compare
root force-pushed feat/context-improvements from ac5ff7aeeb to cc321be658 2026-02-13 17:09:26 +01:00 Compare
root added 1 commit 2026-02-13 17:29:24 +01:00
root force-pushed feat/context-improvements from 2bde5dff47 to b5912d0725 2026-02-13 17:45:39 +01:00 Compare
root force-pushed feat/context-improvements from b5912d0725 to 3b709bc48e 2026-02-13 17:51:10 +01:00 Compare
schreifuchs reviewed 2026-02-13 17:51:25 +01:00
@@ -23,0 +26,4 @@
headSHA, err := s.git.GetPRHeadSHA(ctx, organisation, repo, prID)
if err != nil {
log.Printf("warning: could not fetch PR head SHA (%v); skipping sanity check", err)
} else {
Author
Owner

The condition for posting comments is inverted; it adds comments when disableComments is true. Change if s.disableComments { to if !s.disableComments {. (Reason: The code adds VCS comments only when s.disableComments is true, which contradicts the flag's purpose; flipping the condition fixes the logic.) (Generated by: OpenAI (openai/gpt-oss-120b))

The condition for posting comments is inverted; it adds comments when `disableComments` is true. Change `if s.disableComments {` to `if !s.disableComments {`. (Reason: The code adds VCS comments only when `s.disableComments` is true, which contradicts the flag's purpose; flipping the condition fixes the logic.) (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
schreifuchs reviewed 2026-02-13 17:51:25 +01:00
@@ -23,0 +30,4 @@
filtered := []Comment{}
for _, c := range comments {
// Retrieve full file content at the PR head
fileContent, fErr := s.git.GetFileContent(ctx, organisation, repo, c.File, headSHA)
Author
Owner

Update the comment above the block to reflect that comments are posted only when not in dry‑run mode. (Reason: The comment correctly identifies that the existing comment is misleading—the code posts comments only when disableComments (dry‑run) is true, so the comment should be updated to reflect posting occurs when not in dry‑run mode.) (Generated by: OpenAI (openai/gpt-oss-120b))

Update the comment above the block to reflect that comments are posted only when not in dry‑run mode. (Reason: The comment correctly identifies that the existing comment is misleading—the code posts comments only when `disableComments` (dry‑run) is true, so the comment should be updated to reflect posting occurs when not in dry‑run mode.) (Generated by: OpenAI (openai/gpt-oss-120b))
schreifuchs marked this conversation as resolved
root force-pushed feat/context-improvements from 3b709bc48e to 1ba0e879f9 2026-02-13 17:52:38 +01:00 Compare
schreifuchs reviewed 2026-02-13 17:52:44 +01:00
schreifuchs reviewed 2026-02-13 17:52:44 +01:00
root force-pushed feat/context-improvements from 1ba0e879f9 to 2e12c39786 2026-02-13 18:04:25 +01:00 Compare
root added 1 commit 2026-02-13 18:22:37 +01:00
root force-pushed feat/context-improvements from 9f6c6830a6 to ef55295c4d 2026-02-13 18:32:20 +01:00 Compare
root force-pushed feat/context-improvements from ef55295c4d to 75ef8da1a4 2026-02-13 18:33:29 +01:00 Compare
This pull request can be merged automatically.
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin feat/context-improvements:feat/context-improvements
git checkout feat/context-improvements
Sign in to join this conversation.
No Reviewers
No Label
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: schreifuchs/pierre-bot#2