-
Notifications
You must be signed in to change notification settings - Fork 106
refactor(lib): extract Conversation interface #172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
✅ Preview binaries are ready! To test with modules: |
|
//cc @35C4n0r considering you're working on state serialization/restore. |
mafredri
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solid refactor! Added a few questions/suggestions inline.
| s.emitter.UpdateStatusAndEmitChanges(currentStatus, s.agentType) | ||
| s.emitter.UpdateMessagesAndEmitChanges(s.conversation.Messages()) | ||
| s.emitter.UpdateScreenAndEmitChanges(s.conversation.Screen()) | ||
| s.emitter.UpdateScreenAndEmitChanges(s.conversation.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Are there repercussions to implementing fmt.Stringer here? While Screen may have been too focused on CLI tooling, perhaps another function name like Text would be more appropriate/portable? Or if the intent isn't to support other protocols (like ACP), then we could consider sticking to screen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I'd prefer for the method name to be applicable to be all protocols, so Text() seems appropriate.
| ConversationStatusStable ConversationStatus = "stable" | ||
| ConversationStatusInitializing ConversationStatus = "initializing" | ||
| var ( | ||
| MessageValidationErrorWhitespace = xerrors.New("message must be trimmed of leading and trailing whitespace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider ErrMessageValidationWhitespace naming. There's only 1 other case of this but the pattern already exists:
lib/termexec/termexec.go
169:var ErrNonZeroExitCode = xerrors.New("non-zero exit code")
|
|
||
| type MessagePartText struct { | ||
| Content string | ||
| Alias string | ||
| Hidden bool | ||
| } | ||
|
|
||
| func (p MessagePartText) Do(writer AgentIO) error { | ||
| _, err := writer.Write([]byte(p.Content)) | ||
| return err | ||
| } | ||
|
|
||
| func (p MessagePartText) String() string { | ||
| if p.Hidden { | ||
| return "" | ||
| } | ||
| if p.Alias != "" { | ||
| return p.Alias | ||
| } | ||
| return p.Content | ||
| } | ||
|
|
||
| func PartsToString(parts ...MessagePart) string { | ||
| var sb strings.Builder | ||
| for _, part := range parts { | ||
| sb.WriteString(part.String()) | ||
| } | ||
| return sb.String() | ||
| } | ||
|
|
||
| func ExecuteParts(writer AgentIO, parts ...MessagePart) error { | ||
| for _, part := range parts { | ||
| if err := part.Do(writer); err != nil { | ||
| return xerrors.Errorf("failed to write message part: %w", err) | ||
| } | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| func (c *Conversation) writeMessageWithConfirmation(ctx context.Context, messageParts ...MessagePart) error { | ||
| if c.cfg.SkipWritingMessage { | ||
| return nil | ||
| } | ||
| screenBeforeMessage := c.cfg.AgentIO.ReadScreen() | ||
| if err := ExecuteParts(c.cfg.AgentIO, messageParts...); err != nil { | ||
| return xerrors.Errorf("failed to write message: %w", err) | ||
| } | ||
| // wait for the screen to stabilize after the message is written | ||
| if err := util.WaitFor(ctx, util.WaitTimeout{ | ||
| Timeout: 15 * time.Second, | ||
| MinInterval: 50 * time.Millisecond, | ||
| InitialWait: true, | ||
| }, func() (bool, error) { | ||
| screen := c.cfg.AgentIO.ReadScreen() | ||
| if screen != screenBeforeMessage { | ||
| time.Sleep(1 * time.Second) | ||
| newScreen := c.cfg.AgentIO.ReadScreen() | ||
| return newScreen == screen, nil | ||
| } | ||
| return false, nil | ||
| }); err != nil { | ||
| return xerrors.Errorf("failed to wait for screen to stabilize: %w", err) | ||
| } | ||
|
|
||
| // wait for the screen to change after the carriage return is written | ||
| screenBeforeCarriageReturn := c.cfg.AgentIO.ReadScreen() | ||
| lastCarriageReturnTime := time.Time{} | ||
| if err := util.WaitFor(ctx, util.WaitTimeout{ | ||
| Timeout: 15 * time.Second, | ||
| MinInterval: 25 * time.Millisecond, | ||
| }, func() (bool, error) { | ||
| // we don't want to spam additional carriage returns because the agent may process them | ||
| // (aider does this), but we do want to retry sending one if nothing's | ||
| // happening for a while | ||
| if time.Since(lastCarriageReturnTime) >= 3*time.Second { | ||
| lastCarriageReturnTime = time.Now() | ||
| if _, err := c.cfg.AgentIO.Write([]byte("\r")); err != nil { | ||
| return false, xerrors.Errorf("failed to write carriage return: %w", err) | ||
| } | ||
| } | ||
| time.Sleep(25 * time.Millisecond) | ||
| screen := c.cfg.AgentIO.ReadScreen() | ||
|
|
||
| return screen != screenBeforeCarriageReturn, nil | ||
| }); err != nil { | ||
| return xerrors.Errorf("failed to wait for processing to start: %w", err) | ||
| } | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| var MessageValidationErrorWhitespace = xerrors.New("message must be trimmed of leading and trailing whitespace") | ||
| var MessageValidationErrorEmpty = xerrors.New("message must not be empty") | ||
| var MessageValidationErrorChanging = xerrors.New("message can only be sent when the agent is waiting for user input") | ||
|
|
||
| func (c *Conversation) SendMessage(messageParts ...MessagePart) error { | ||
| c.lock.Lock() | ||
| defer c.lock.Unlock() | ||
|
|
||
| if !c.cfg.SkipSendMessageStatusCheck && c.statusInner() != ConversationStatusStable { | ||
| return MessageValidationErrorChanging | ||
| } | ||
|
|
||
| message := PartsToString(messageParts...) | ||
| if message != msgfmt.TrimWhitespace(message) { | ||
| // msgfmt formatting functions assume this | ||
| return MessageValidationErrorWhitespace | ||
| } | ||
| if message == "" { | ||
| // writeMessageWithConfirmation requires a non-empty message | ||
| return MessageValidationErrorEmpty | ||
| } | ||
|
|
||
| screenBeforeMessage := c.cfg.AgentIO.ReadScreen() | ||
| now := c.cfg.GetTime() | ||
| c.updateLastAgentMessage(screenBeforeMessage, now) | ||
|
|
||
| if err := c.writeMessageWithConfirmation(context.Background(), messageParts...); err != nil { | ||
| return xerrors.Errorf("failed to send message: %w", err) | ||
| } | ||
|
|
||
| c.screenBeforeLastUserMessage = screenBeforeMessage | ||
| c.messages = append(c.messages, ConversationMessage{ | ||
| Id: len(c.messages), | ||
| Message: message, | ||
| Role: ConversationRoleUser, | ||
| Time: now, | ||
| }) | ||
| return nil | ||
| } | ||
|
|
||
| // Assumes that the caller holds the lock | ||
| func (c *Conversation) statusInner() ConversationStatus { | ||
| // sanity checks | ||
| if c.snapshotBuffer.Capacity() != c.stableSnapshotsThreshold { | ||
| panic(fmt.Sprintf("snapshot buffer capacity %d is not equal to snapshot threshold %d. can't check stability", c.snapshotBuffer.Capacity(), c.stableSnapshotsThreshold)) | ||
| } | ||
| if c.stableSnapshotsThreshold == 0 { | ||
| panic("stable snapshots threshold is 0. can't check stability") | ||
| } | ||
|
|
||
| snapshots := c.snapshotBuffer.GetAll() | ||
| if len(c.messages) > 0 && c.messages[len(c.messages)-1].Role == ConversationRoleUser { | ||
| // if the last message is a user message then the snapshot loop hasn't | ||
| // been triggered since the last user message, and we should assume | ||
| // the screen is changing | ||
| return ConversationStatusChanging | ||
| } | ||
|
|
||
| if len(snapshots) != c.stableSnapshotsThreshold { | ||
| return ConversationStatusInitializing | ||
| } | ||
|
|
||
| for i := 1; i < len(snapshots); i++ { | ||
| if snapshots[0].screen != snapshots[i].screen { | ||
| return ConversationStatusChanging | ||
| } | ||
| } | ||
|
|
||
| if !c.InitialPromptSent && !c.ReadyForInitialPrompt { | ||
| if len(snapshots) > 0 && c.cfg.ReadyForInitialPrompt(snapshots[len(snapshots)-1].screen) { | ||
| c.ReadyForInitialPrompt = true | ||
| return ConversationStatusStable | ||
| } | ||
| return ConversationStatusChanging | ||
| } | ||
|
|
||
| return ConversationStatusStable | ||
| } | ||
|
|
||
| func (c *Conversation) Status() ConversationStatus { | ||
| c.lock.Lock() | ||
| defer c.lock.Unlock() | ||
|
|
||
| return c.statusInner() | ||
| } | ||
|
|
||
| func (c *Conversation) Messages() []ConversationMessage { | ||
| c.lock.Lock() | ||
| defer c.lock.Unlock() | ||
|
|
||
| result := make([]ConversationMessage, len(c.messages)) | ||
| copy(result, c.messages) | ||
| return result | ||
| // Conversation allows tracking of a conversation between a user and an agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type doesn't directly "allow", it represents a contract. Could we expand the docs here a bit to better explain how implementers should behave or how it will be used?
| mu sync.RWMutex | ||
| logger *slog.Logger | ||
| conversation *st.Conversation | ||
| conversation *st.PTYConversation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnstcn Shouldn't this be *st.Conversation ? (It's possible that I'm lacking the context/motive behind this refactor, If we have extracted an interface, shouldn't we be using it rather than coupling this to the concrete implementation ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should, but there's some more coupling to be disentangled here before we can do that.
I'm building on top of this branch. |
screentracker.Conversation->screentracker.PTYConversationConversationinterface (some methods renamed for brevity)FindLastMessagetoscreenDiff, moved todiff(_test).goalong with relevant tests.