The AI Collaborator: How I Teamed Up with GPT to Create W’kid Smaaht

August 18, 2023
Rss Fetcher

W’kid Smaaht (logo by Dall E 2)

Recently, I did what many other engineers have been doing: putting ChatGPT inside Slack. And although there are many great examples on GitHub, this version is different in at least two ways:

Practical: Easy to deploy in your organization’s AWS account
Novelty: It has a Bostonian personality 😉

Meet W’kid Smaaht (pronounced, “wicked smart”).

Having an AI team member inside Slack just feels natural. You can dialog in a group setting, DM for quick answers, or have it summarize long threads — even if you don’t understand what your colleagues are talking about.

Image of W’kid Smaaht summarizing a complex Slack thread about quantum computing for a total noob — Summarize long threads

You can also easily integrate other systems. For example, W’kid Smaaht uses your OpenAI API key for both GPT4 and DALL E 2.

Image of using W’kid Smaaht’s DALL E 2 integration to create an image from the prompt, “Fenway park on a rainy night. High resolution. Colorful. Spooky. Dark art. Style of blade runner” — DALL E 2 image generation in Slack

Not to mention you can add functionality beyond OpenAI’s native API. LangChain, for example.

Screenshot of using W’kid Smaaht to summarize a web page about Google’s Med-PALM 2. This showcases how LangChain can be incorporated into the app. — LangChat for website summarization

But this post isn’t about the tool itself, it’s about the process of making it.

A new way of creating apps

I had a single principle for this exercise:

Use GPT for every aspect of design, engineering, and deployment.

My approach was to see if an AI partner could help me reason through ideas, decompose problems, and write code for different parts of the stack.

Adopting this principle gave me the discipline to break old habits. And it soon became effortless. Have a question about which AWS service to use? Ask GPT. Run into a weird Docker problem? Ask GPT. Need to improve code organization? Have GPT rewrite it. Runtime error? Don’t even think; just send to GPT.

Caveat — a funny thing about our brains is that when we trust something, we stop questioning it. This can get you into trouble with LLMs just like blindly following your GPS. There were times when I went in circles before stopping myself to actually think through a problem.

Which GPT to use?

It quickly became obvious that GPT4 was the partner I wanted. GPT3.5 is a bit like an overconfident engineer who assumes they know everything but actually gets a lot wrong.

At the time of this writing, GPT4 was capped at 25 requests every 3 hours in ChatGPT. But the API cap was 200 requests per minute. This made it pretty clear that I had to get an MVP working as soon as possible so that I could use it to increase my velocity. I saved my ChatGPT requests for advanced features like using its Code Interpreter.

Process

Just-in-time learning

The ironic thing about GPT4 is that it doesn’t know about itself. This meant I had to spend time reading API docs — how boring.

Screenshot of W’kid Smart being asked about GPT4’s token limit and it responding by saying there is no such thing as GPT4 — Don’t know thyself

The good news is that Andrew Ng is here to help. His free introductory courses on Deeplearning.ai got me moving in a few hours.

Kickstarting the code

The first step in any application design is to see what others have done. In my case, it wasn’t hard to find code but it took a bit of time going through the various reports to find one that really spoke to me.

Alex Kim’s Slack GPT Bot hit the mark in part because of how chats were organized in Slack threads. This made it easy to manage conversation history.

Note: When writing LLM chatbots, the conversation history needs to be sent with each call because the model doesn’t manage state. This is why a large context window is useful.

Choosing cloud infrastructure

Next up was AWS. Slack has a useful quickstart to help get things going in a local development environment but a production app needs to run in the cloud. And although I’m good with AWS, GPT helped me choose the infrastructure service.

Screenshot of asking ChatGPT-generated table comparing AWS infrastructure services: EC2, Lambda, EKS and Fargate — Cloud infrastructure comparison

I decided to go with Fargate because, well, I’m lazy and didn’t want to manage any infrastructure or debug Lambda timeout issues.

Infrastructure as code

There were other AWS services required for this app, including IAM, DynamoDB, Secrets Manager, and CloudWatch. That meant I needed to write IaC, which is not what I consider a good time. Thankfully, GPT4 has no qualms about writing and re-writing scripts until they work seamlessly.

A key learning from this phase was that GPT-4 will make assumptions in the code if you don’t specify what you want. For instance, IAM permissions can be attached to an ECS ExecutionRole or TaskExecutionRole. If you don’t know which one you need, GPT-4 may just pick one. The resultant CloudFormation script will run successfully but you could hit permission problems at runtime.

First bug!

AWS Fargate requires a Docker image, which is usually a straightforward task. But I faced an issue where the image would run on my Mac but fail when launched from ECS.

Instead of spending hours hacking and Googling around, I asked my new AI friend.

Screenshot of ChatGPT’s response to a problem getting a Docker image to run on AWS. The problem was due to differences with the Mac M1 chip architecture (ARM) and that of EC2 systems, which are x86_64-based — Troubleshooting Docker problem

Wow.

Problems like this make up the dark matter of an engineering career. But GPT identified it instantly, which allowed me to keep moving forward with the fun stuff.

Un-spaghetti this code

As I mentioned, ChatGPT requests were reserved for using its Code Interpreter, which is fantastic.

An instance of this was when I opted to shift the storage of prompts from files to DynamoDB. This feature allows users to experiment with system prompts just by adding a new DDB record and selecting it at runtime.

To make the changes, I uploaded the main Python file and asked ChatGPT to re-write the prompt loader as an abstract class with concrete implementations for local files, S3, and DynamoDB along with a factory method. I asked it to save as a separate file and rewrite the original to reference it. It performed admirably — not quite perfect, but it still blew me away. Just a few human tweaks and it was done.

Learn something new about something old

Another bug that appeared was when a link was included in a Slack post; the app would just hang. Debugging showed that the hang came from the re.compile() method but couldn’t tell me why. But GPT4 figured it out on the spot and suggested a fix. Bug squashed.

Screenshot of W’kid Smaaht diagnosing the problem with a python re.compile() function call as being caused by “catastrophic backtracking” and proposing a fix. — Catastrophic whaaaat?

I had to laugh at this one because I’ve been using regex for a long time and never heard of “catastrophic backtracking”.

Wrap Up

Not only was this a fascinating and fun exercise, it clearly demonstrated how AI changes problem-solving and software development.

I don’t know if this will kill engineering as a discipline, but it’s easy to predict that this technology will result in smaller software teams, shorter learning curves, and insanely fast idea-to-impact cycles.

Screenshot of a W’kid Smaaht-generated image from the prompt, “street painting of a software engineer waving goodbye in the style of Banksy” — later

The AI Collaborator: How I Teamed Up with GPT to Create W’kid Smaaht was originally published in Better Programming on Medium, where people are continuing the conversation by highlighting and responding to this story.