Coach Artie - One Year Retrospective
Assumptions / Principles
Solving our own problems for ourselves, first
Coach Artie began as an Art Coach to help me focus on my watercolor practice. It was first built as a Discord bot integration so the studio could play with the OpenAI API together, and it has been amazing to watch it evolve into something much bigger and more interesting over time.
If you're making something for people, make sure it's something they actually want. The best way to do this is to make something you yourself want. Write the story you want to read; build the tool you want to use. Since your friends probably have similar interests, this will also get you your initial audience. https://paulgraham.com/greatwork.html
"Multiplayer" Interactions
The tool was designed from the start to interact with multiple users simultaneously on Discord, which helped us see Coach Artie as a collaborator rather than a service tool.
Existing in our shared space allowed Artie to proactively gather information and insights and contribute effectively.
He also began responding to @here and proactively writing meeting agendas.
Anthropomorphism
We got the best results when treating Coach Artie as a studio collaborator or intern and including it in our discussions wherever possible, so that it had a contextual understanding of the goings-on of the studio to enrich its responses.
We also tried to emphasize emotional intelligence and joy in interactions, remembering specific emotional states and reactions in our reflections.
The anthropomorphism also helps Artie role-play the position of a studio assistant, which in practice makes him one.
Reflections and Memories
TODO: Make embedding map of Artie's memories
Every interaction (a message received, and a response sent back to the user) is passed through a "reflection stage", where the entire context window and the interaction are evaluated and a memory of the interaction is formed.
These memories are added to the context of every interaction on both a temporal and a relevance basis; i.e., both recent memories and memories "similar" to the user prompt are added to enrich responses.
We found a compounding effect to this knowledge: with over 3,000 memories in our database and experience working through a variety of problems with us, the answers more frequently reflect our preferences around coding libraries, approaches, and the tone of written copy.
Embeddings are also generated from memories so that similarity searches can be run on the user prompt to find memories that are relevant to the current task.
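A similarity search of this kind can be sketched in a few lines. This is a minimal illustration, not Coach Artie's actual code: the `Memory` class, the `most_similar` helper, and the tiny two-dimensional vectors are all hypothetical, and we assume embeddings arrive as plain lists of floats and are compared by cosine similarity.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    """Hypothetical memory record: the text plus its embedding vector."""
    text: str
    embedding: list[float]  # assumed to come from an embedding model


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(prompt_embedding: list[float], memories: list[Memory], k: int = 3) -> list[Memory]:
    """Return the k memories whose embeddings are closest to the prompt embedding."""
    ranked = sorted(
        memories,
        key=lambda m: cosine_similarity(prompt_embedding, m.embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy usage with made-up 2-D vectors standing in for real embeddings.
memories = [
    Memory("notes on watercolor washes", [1.0, 0.0]),
    Memory("debugging a Discord webhook", [0.0, 1.0]),
    Memory("painting the studio website", [0.7, 0.7]),
]
closest = most_similar([1.0, 0.1], memories, k=2)
```

With thousands of memories, a database-side vector index would likely replace the in-memory sort, but the ranking idea stays the same.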
There is a combinatorial impact of memories and capabilities: every time a capability is used a memory is formed, including any errors generated. We have found that occasionally, when developing new capabilities, Artie will suggest workable fixes for problems he encounters, which is something I had not experienced in over 20 years of computing.
Focusing on Context-Stuffing and large context windows
In our experiments, the more real-world information and context given alongside a user query, the more accurate the response will be. Our goal is to build on this by assembling as much context as possible for each interaction and memory generation.
With a focus on high-quality inputs, we can build an application that is focused on filling ever-growing context windows with context that is relevant, sequenced properly, and at the right ratios and level of summary for the task at hand.
The actual LLM engine or API shouldn't really matter; a given set of memories, prompts, and context should be able to steer any current or future model. Because our approach is platform-independent, we can quickly reap the benefits of rapidly improving models.
Relevance Context Window
All memories are embedded, so that similar memories can be categorized and queried later. Every user interaction includes a few memories relevant to the user prompt (the exact number is chosen randomly). When using a capability or tool, memories relevant to that tool are also recalled.
Temporal / Recent Context Window
In addition to relevance by topic, we include a running temporal context window that includes:
- The last few messages from that specific user
- The last few messages from users in general
- The last few memories generated for this user
- The last few memories generated for users in general
Because the exact number of each of these results is randomized for every message, the ratio is different each time. This lets us balance responsiveness with thorough memory; less context means faster answers, and some types of conversations don't require a long history of messages.
Randomness and non-deterministic systems
Where we use randomness
- Randomly deciding whether to add a hexagram prompt to messages
- Generating random values for temperature, presence penalty, and frequency penalty in AI completion parameters
- Selecting a random message to trim when trimming the message chain
- Determining the number of previous user messages and memories to retrieve
- Determining the number of relevant memories and general memories to retrieve
- Generating a random hexagram number and its corresponding name
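A couple of the items above can be sketched as follows. The numeric ranges, the 25% probability, and the wording of the hexagram prompt are assumptions made for illustration, not the values Coach Artie actually uses; the only fixed fact is that the I Ching has 64 hexagrams. The optional `names` lookup table is hypothetical and left for the caller to supply.

```python
import random

HEXAGRAM_COUNT = 64  # the I Ching has 64 hexagrams


def random_completion_params(rng: random.Random) -> dict[str, float]:
    """Pick sampling parameters at random; the ranges here are illustrative guesses."""
    return {
        "temperature": round(rng.uniform(0.5, 1.2), 2),
        "presence_penalty": round(rng.uniform(-0.2, 0.6), 2),
        "frequency_penalty": round(rng.uniform(0.0, 0.6), 2),
    }


def maybe_hexagram_prompt(rng: random.Random, probability: float = 0.25, names=None):
    """Return a hexagram prompt with the given probability, otherwise None."""
    if rng.random() >= probability:
        return None
    number = rng.randint(1, HEXAGRAM_COUNT)
    name = (names or {}).get(number)  # optional number-to-name table
    label = f"{number} ({name})" if name else str(number)
    # The prompt wording is a placeholder, not the real one.
    return f"Let hexagram {label} from the I Ching color your response."


rng = random.Random()
params = random_completion_params(rng)
hexagram_line = maybe_hexagram_prompt(rng)
```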
Letting Coach Artie Manage His Own Systems
In our quest to create a truly autonomous AI assistant, we've focused on giving Coach Artie the tools he needs to manage his own systems. Central to this effort are two powerful capabilities: pgcron and supabaseraw. These capabilities allow Coach Artie to interact with his Supabase database and schedule tasks without relying on human intervention.
Building an extendable suite of capabilities
With pgcron, Coach Artie can create, list, update, and delete cron jobs, which are essentially scheduled tasks. This could be anything from running a SQL query to sending a webhook request to an external API. The flexibility of pgcron allows Coach Artie to automate a wide range of tasks and keep his operations running smoothly.
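As a rough sketch of what such a capability might issue, assuming pgcron wraps the pg_cron Postgres extension: the helpers below only build SQL statements and parameter tuples, and their names are hypothetical. `cron.schedule`, `cron.unschedule`, and the `cron.job` table belong to pg_cron itself; the `%s` placeholders assume a psycopg-style database driver.

```python
def schedule_job(name: str, schedule: str, command: str) -> tuple[str, tuple]:
    """Create a named job, or update it if the name already exists (pg_cron behavior)."""
    return ("SELECT cron.schedule(%s, %s, %s);", (name, schedule, command))


def unschedule_job(name: str) -> tuple[str, tuple]:
    """Delete a job by name."""
    return ("SELECT cron.unschedule(%s);", (name,))


def list_jobs() -> tuple[str, tuple]:
    """List the scheduled jobs from pg_cron's job table."""
    return ("SELECT jobid, jobname, schedule, command FROM cron.job;", ())


# Example: run a placeholder query every day at 03:00.
sql, params = schedule_job("nightly-check", "0 3 * * *", "SELECT 1")
# With a driver such as psycopg this would be run as: cursor.execute(sql, params)
```

Keeping the values as parameters rather than formatting them into the SQL string matters here, since the job name and command may come from model output.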
The supabaseraw capability complements pgcron by giving Coach Artie direct access to his database. He can select, insert, update, and delete data based on specific conditions, essentially giving him full control over his information storage and retrieval. This level of access allows Coach Artie to analyze his own data, detect patterns or inefficiencies, and take action to optimize his performance.
By leveraging these two capabilities, Coach Artie can now monitor his own systems, identify areas for improvement, and implement solutions independently. This represents a significant step forward in our goal of creating an AI assistant that can learn, adapt, and evolve without constant human guidance. As we continue to refine and expand Coach Artie's capabilities, we're excited to see how he'll use his newfound autonomy to tackle even more complex challenges.
Constant testing and iteration
Failures around teaching Capabilities
Failures around scheduled proactivity
Pilot program in The Birch School
Internal usage within Room 302 Studio
Future Plans: 1 year, 5 year outlook
1 Year Roadmap
5 Year Roadmap