Okay, so I messed around with this thing called “model leaks” today, and let me tell you, it was a bit of a journey. I’m no expert, but I like to tinker, and I figured I’d share my experience, bumps and all.
First Steps: What even is this?
I started by, well, figuring out what I was even dealing with. I poked around the internet to get a sense of what “model leaks” actually are. From what I found, it has something to do with a model giving out information that it shouldn’t.
Getting My Hands Dirty
Next, I decided to dive in. I grabbed a model and played around with different prompts to get a feel for what I could see and understand.
I started inputting different kinds of requests, and it started getting clearer.
Seeing the Leaks (or at least, I think so!)
So, here’s where it got interesting. I started noticing some patterns. I tried various prompts, and the responses started to indicate that it was revealing, you know, stuff. It wasn’t super obvious at first, but the more I pushed, the more I was like, “Wait a minute…”.
What I noticed is that some of the responses seemed to leak sensitive data.
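One way to make this kind of spotting less hand-wavy is to scan responses for things that look sensitive. Here’s a minimal sketch, assuming two rough regex patterns (email addresses and long key-like strings). The pattern names and the `find_suspicious` function are made up for illustration and aren’t from any real tool, and a match only means “worth a closer look,” not “definitely a leak.”

```python
import re

# Hypothetical patterns: rough guesses at what "sensitive" might look like.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "key_like": re.compile(r"\b[A-Za-z0-9]{32,}\b"),  # long unbroken alphanumeric runs
}

def find_suspicious(response):
    """Return a dict mapping pattern name to the matches found in the response."""
    hits = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(response)
        if matches:
            hits[name] = matches
    return hits

if __name__ == "__main__":
    print(find_suspicious("You can reach the admin at bob@example.com"))
```

Simple patterns like these will miss plenty and flag harmless text too, so the hits still need a human to read them.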
Documenting Everything
I made sure to keep track of everything. I jotted down the prompts I used, the responses that looked suspicious, and which models I tried.
Wrapping Up (for now!)
At this point, I feel like I’ve just scratched the surface. I’ve definitely seen some evidence of what I think are “model leaks.” My next step? Probably more testing, maybe with some different models, and trying to be more systematic about it. I think recording a video while reproducing the issue would make for better proof.
Anyway, that’s my little adventure for the day. It’s a work in progress, and I’m sure I have a lot more to learn, but hey, that’s part of the fun, right?