Skip to content

Conversation

@johnstcn
Copy link
Member

  • Renames screentracker.Conversation -> screentracker.PTYConversation
  • Extracts Conversation interface (some methods renamed for brevity)
  • Renames FindLastMessage to screenDiff, moved to diff(_test).go along with relevant tests.

@johnstcn johnstcn self-assigned this Jan 22, 2026
@github-actions
Copy link

✅ Preview binaries are ready!

To test with modules: agentapi_version = "agentapi_172" or download from: https://github.com/coder/agentapi/releases/tag/agentapi_172

@johnstcn johnstcn marked this pull request as ready for review January 22, 2026 16:39
@mafredri mafredri changed the title chore(lib): extract Conversation interface refactor(lib): extract Conversation interface Jan 23, 2026
@mafredri
Copy link
Member

//cc @35C4n0r considering you're working on state serialization/restore.

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid refactor! Added a few questions/suggestions inline.

s.emitter.UpdateStatusAndEmitChanges(currentStatus, s.agentType)
s.emitter.UpdateMessagesAndEmitChanges(s.conversation.Messages())
s.emitter.UpdateScreenAndEmitChanges(s.conversation.Screen())
s.emitter.UpdateScreenAndEmitChanges(s.conversation.String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Are there repercussions to implementing fmt.Stringer here? While Screen may have been too focused on CLI tooling, perhaps another function name like Text would be more appropriate/portable? Or if the intent isn't to support other protocols (like ACP), then we could consider sticking to screen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'd prefer for the method name to be applicable to be all protocols, so Text() seems appropriate.

ConversationStatusStable ConversationStatus = "stable"
ConversationStatusInitializing ConversationStatus = "initializing"
var (
MessageValidationErrorWhitespace = xerrors.New("message must be trimmed of leading and trailing whitespace")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider ErrMessageValidationWhitespace naming. There's only 1 other case of this but the pattern already exists:

lib/termexec/termexec.go
169:var ErrNonZeroExitCode = xerrors.New("non-zero exit code")

Comment on lines 51 to +52

type MessagePartText struct {
Content string
Alias string
Hidden bool
}

func (p MessagePartText) Do(writer AgentIO) error {
_, err := writer.Write([]byte(p.Content))
return err
}

func (p MessagePartText) String() string {
if p.Hidden {
return ""
}
if p.Alias != "" {
return p.Alias
}
return p.Content
}

func PartsToString(parts ...MessagePart) string {
var sb strings.Builder
for _, part := range parts {
sb.WriteString(part.String())
}
return sb.String()
}

func ExecuteParts(writer AgentIO, parts ...MessagePart) error {
for _, part := range parts {
if err := part.Do(writer); err != nil {
return xerrors.Errorf("failed to write message part: %w", err)
}
}
return nil
}

func (c *Conversation) writeMessageWithConfirmation(ctx context.Context, messageParts ...MessagePart) error {
if c.cfg.SkipWritingMessage {
return nil
}
screenBeforeMessage := c.cfg.AgentIO.ReadScreen()
if err := ExecuteParts(c.cfg.AgentIO, messageParts...); err != nil {
return xerrors.Errorf("failed to write message: %w", err)
}
// wait for the screen to stabilize after the message is written
if err := util.WaitFor(ctx, util.WaitTimeout{
Timeout: 15 * time.Second,
MinInterval: 50 * time.Millisecond,
InitialWait: true,
}, func() (bool, error) {
screen := c.cfg.AgentIO.ReadScreen()
if screen != screenBeforeMessage {
time.Sleep(1 * time.Second)
newScreen := c.cfg.AgentIO.ReadScreen()
return newScreen == screen, nil
}
return false, nil
}); err != nil {
return xerrors.Errorf("failed to wait for screen to stabilize: %w", err)
}

// wait for the screen to change after the carriage return is written
screenBeforeCarriageReturn := c.cfg.AgentIO.ReadScreen()
lastCarriageReturnTime := time.Time{}
if err := util.WaitFor(ctx, util.WaitTimeout{
Timeout: 15 * time.Second,
MinInterval: 25 * time.Millisecond,
}, func() (bool, error) {
// we don't want to spam additional carriage returns because the agent may process them
// (aider does this), but we do want to retry sending one if nothing's
// happening for a while
if time.Since(lastCarriageReturnTime) >= 3*time.Second {
lastCarriageReturnTime = time.Now()
if _, err := c.cfg.AgentIO.Write([]byte("\r")); err != nil {
return false, xerrors.Errorf("failed to write carriage return: %w", err)
}
}
time.Sleep(25 * time.Millisecond)
screen := c.cfg.AgentIO.ReadScreen()

return screen != screenBeforeCarriageReturn, nil
}); err != nil {
return xerrors.Errorf("failed to wait for processing to start: %w", err)
}

return nil
}

var MessageValidationErrorWhitespace = xerrors.New("message must be trimmed of leading and trailing whitespace")
var MessageValidationErrorEmpty = xerrors.New("message must not be empty")
var MessageValidationErrorChanging = xerrors.New("message can only be sent when the agent is waiting for user input")

func (c *Conversation) SendMessage(messageParts ...MessagePart) error {
c.lock.Lock()
defer c.lock.Unlock()

if !c.cfg.SkipSendMessageStatusCheck && c.statusInner() != ConversationStatusStable {
return MessageValidationErrorChanging
}

message := PartsToString(messageParts...)
if message != msgfmt.TrimWhitespace(message) {
// msgfmt formatting functions assume this
return MessageValidationErrorWhitespace
}
if message == "" {
// writeMessageWithConfirmation requires a non-empty message
return MessageValidationErrorEmpty
}

screenBeforeMessage := c.cfg.AgentIO.ReadScreen()
now := c.cfg.GetTime()
c.updateLastAgentMessage(screenBeforeMessage, now)

if err := c.writeMessageWithConfirmation(context.Background(), messageParts...); err != nil {
return xerrors.Errorf("failed to send message: %w", err)
}

c.screenBeforeLastUserMessage = screenBeforeMessage
c.messages = append(c.messages, ConversationMessage{
Id: len(c.messages),
Message: message,
Role: ConversationRoleUser,
Time: now,
})
return nil
}

// Assumes that the caller holds the lock
func (c *Conversation) statusInner() ConversationStatus {
// sanity checks
if c.snapshotBuffer.Capacity() != c.stableSnapshotsThreshold {
panic(fmt.Sprintf("snapshot buffer capacity %d is not equal to snapshot threshold %d. can't check stability", c.snapshotBuffer.Capacity(), c.stableSnapshotsThreshold))
}
if c.stableSnapshotsThreshold == 0 {
panic("stable snapshots threshold is 0. can't check stability")
}

snapshots := c.snapshotBuffer.GetAll()
if len(c.messages) > 0 && c.messages[len(c.messages)-1].Role == ConversationRoleUser {
// if the last message is a user message then the snapshot loop hasn't
// been triggered since the last user message, and we should assume
// the screen is changing
return ConversationStatusChanging
}

if len(snapshots) != c.stableSnapshotsThreshold {
return ConversationStatusInitializing
}

for i := 1; i < len(snapshots); i++ {
if snapshots[0].screen != snapshots[i].screen {
return ConversationStatusChanging
}
}

if !c.InitialPromptSent && !c.ReadyForInitialPrompt {
if len(snapshots) > 0 && c.cfg.ReadyForInitialPrompt(snapshots[len(snapshots)-1].screen) {
c.ReadyForInitialPrompt = true
return ConversationStatusStable
}
return ConversationStatusChanging
}

return ConversationStatusStable
}

func (c *Conversation) Status() ConversationStatus {
c.lock.Lock()
defer c.lock.Unlock()

return c.statusInner()
}

func (c *Conversation) Messages() []ConversationMessage {
c.lock.Lock()
defer c.lock.Unlock()

result := make([]ConversationMessage, len(c.messages))
copy(result, c.messages)
return result
// Conversation allows tracking of a conversation between a user and an agent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type doesn't directly "allow", it represents a contract. Could we expand the docs here a bit to better explain how implementers should behave or how it will be used?

@johnstcn johnstcn requested a review from 35C4n0r January 23, 2026 14:46
mu sync.RWMutex
logger *slog.Logger
conversation *st.Conversation
conversation *st.PTYConversation
Copy link
Collaborator

@35C4n0r 35C4n0r Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnstcn Shouldn't this be *st.Conversation ? (It's possible that I'm lacking the context/motive behind this refactor, If we have extracted an interface, shouldn't we be using it rather than coupling this to the concrete implementation ?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should, but there's some more coupling to be disentangled here before we can do that.

@35C4n0r
Copy link
Collaborator

35C4n0r commented Jan 27, 2026

//cc @35C4n0r considering you're working on state serialization/restore.

I'm building on top of this branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants