This week, I decided it was high time to tinker with building something that could respond like me. I mean, who wouldn’t want an LLM that can mimic your witty email style?
Usually, my experiments are done in an afternoon, but for this one I set aside a whole week because training a custom AI to sound like me clearly needs more than a day’s worth of tinkering.
Data Gathering and Preprocessing
The first step was going through my sent email, an endeavor that felt a bit like spring cleaning for my Gmail. After exporting all those messages, I wrote a Python script to systematically strip out signatures and other junk text.
Trust me, the time you spend cleaning up the data is well worth it when your model doesn’t spontaneously talk about some old Amazon tracking link in the middle of discussing weekend plans.
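The cleanup pass looked roughly like the sketch below. The markers and patterns here are my own guesses at what a typical inbox needs (signature delimiters, quoted replies, bare tracking links), not a universal recipe, and the sample message is invented for illustration:

```python
import re

# Common signature markers: the "-- " delimiter and mobile footers.
SIGNATURE_MARKERS = re.compile(r"^--\s*$|^Sent from my", re.MULTILINE)
QUOTED_LINE = re.compile(r"^>.*$", re.MULTILINE)   # quoted reply lines
TRACKING_LINK = re.compile(r"https?://\S+")        # raw URLs, tracking links

def clean_email_body(body: str) -> str:
    """Strip quoted replies, signatures, and bare links from one message."""
    # Cut everything from the first signature marker onward.
    match = SIGNATURE_MARKERS.search(body)
    if match:
        body = body[:match.start()]
    body = QUOTED_LINE.sub("", body)
    body = TRACKING_LINK.sub("", body)
    # Collapse the blank lines the removals leave behind.
    return re.sub(r"\n{3,}", "\n\n", body).strip()

raw = """Sounds good, see you Saturday! lol
> On Tue, someone wrote:
> are we still on?
--
Alice
Sent from my iPhone"""
print(clean_email_body(raw))  # → Sounds good, see you Saturday! lol
```

Running each exported message through a filter like this is what keeps the tracking links out of the training data in the first place.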
Model Selection and Fine-Tuning
Once I had my text in decent shape, I borrowed a pre-trained Transformer model. The idea is simple: take a model already trained on massive amounts of text, then fine-tune it on my email data so it sounds like me instead of your average GPT.
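Before the fine-tune itself, which I handed off to an off-the-shelf trainer, the cleaned emails have to be shaped into prompt → reply pairs. A minimal sketch of that step, using invented example threads and a generic JSONL format that most fine-tuning toolkits accept (the field names here are an assumption, not a specific library's schema):

```python
import json

# Hypothetical thread pairs: (message I received, reply I sent).
# In the real run these would come out of the cleaned Gmail export.
threads = [
    ("Are we still on for Saturday?", "Yep, see you at noon. lol"),
    ("Can you send the slides?", "Sure, attaching them now."),
]

def to_training_example(incoming: str, reply: str) -> dict:
    # One prompt -> completion pair per sent reply. The model learns to
    # continue the "Reply:" section in my voice.
    return {"prompt": f"Email:\n{incoming}\n\nReply:\n", "completion": reply}

with open("email_pairs.jsonl", "w") as f:
    for incoming, reply in threads:
        f.write(json.dumps(to_training_example(incoming, reply)) + "\n")
```

Each line of the resulting file is one training example, which is the shape a fine-tuning job consumes.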
Results and Hallucinations
Here’s where it got interesting, and by “interesting,” I mean “amusing and vaguely concerning.” The model actually did start writing in a voice that was close to my own. Subject lines, quick one-liners, and even my overuse of “lol” (left uncorrected when it appeared as “Lol”) were spot-on. However, local fine-tuned models can still be a bit small, so every now and then it would generate responses that seemed to come from an alternate dimension. For instance, it once wrote about pet grooming tips, even though I know I have never emailed anyone about that.
Evaluation and Sanity Checks
I wrote a tiny script that sends the model a prompt, like an email chain about scheduling a meeting, and checks how it responds. If it suggested meeting at a time that defied the laws of physics or started quoting obscure movie lines, I’d dial back the training parameters or do some additional data cleaning. Let’s just say, “always read your outputs before hitting send.” Words to live by.
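The sanity checks can be sketched like this. The `generate_reply` function is a stand-in for the fine-tuned model’s actual generation call, and the specific checks (impossible clock times, suspiciously short replies) are illustrative examples of the kind of filters I mean:

```python
import re

def generate_reply(prompt: str) -> str:
    # Stand-in for the fine-tuned model; it returns a canned bad reply
    # so the checks below have something to catch.
    return "Sure, let's meet Tuesday at 25:00. lol"

def sanity_check(reply: str) -> list[str]:
    """Flag obviously broken outputs before a human ever hits send."""
    problems = []
    # Times like 25:00 defy the laws of physics (well, of clocks).
    for hour, minute in re.findall(r"\b(\d{1,2}):(\d{2})\b", reply):
        if int(hour) > 23:
            problems.append(f"impossible time: {hour}:{minute}")
    if len(reply.split()) < 3:
        problems.append("suspiciously short reply")
    return problems

reply = generate_reply("Email chain: when should we meet next week?")
print(sanity_check(reply))  # → ['impossible time: 25:00']
```

Anything that comes back with a non-empty problem list goes back into the “dial back the training parameters or clean more data” loop rather than anywhere near a send button.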
Conclusion: “Yes, kinda, but I probably shouldn’t.”
So the short answer to “Should I trust an AI that writes like me?” is basically “probably not.” It’s a fun experiment that illustrates how personal text data and modern AI can blend together to produce spookily similar writing.