5 Comments
T.D. Inoue's avatar

And, BTW, we have the data for your test 1. We logged it all in our public GitHub. I haven't run the semantic analysis on the responses, but I'm virtually certain that the lexical structure of all the confabulated answers was rich. We ran over a thousand trials, submitted through various AI models. Lots of wrong answers for your analysis if you desire.

Brad Leclerc's avatar

I would love to hear how it goes if you run that… or if it’s public I might snag it and see if I get some insight from it if you don’t mind me poking at that… would certainly save me some serious time if there’s data just sitting there already gathered haha.

T.D. Inoue's avatar

There's a boatload of data in there. Feel free. Or if it's too confusing, I can just pull up my analysis helpers and have them review the data if you have specific questions you'd like us to try to extract.

https://github.com/tedinoue/sce-replication

Brad Leclerc's avatar

Oh good, because I already did and DM'd you haha

T.D. Inoue's avatar

Your theory fits perfectly with some of the mysteries I encountered in my color studies. Claude often would not give the boring answer "the colors are the same." Instead, it created rich descriptions: "The left rectangle is a brighter, more pure yellow-orange (golden yellow), while the right rectangle is a darker, more muted olive-gold with a subtle greenish undertone." Rich. Analytical. Demonstrates perceptual sophistication. Shows the model "really looking." Varied vocabulary, hedging, subordinate clauses. Exactly the kind of response that gets rated higher, yet it was wrong every time.

We ran controls: the models could accurately detect subtle color variations, but in every trial with identically colored squares, they insisted the colors were different. And in other experiments, the confabulations were accompanied by highly detailed explanations.
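For anyone poking at the repo data, a rough sketch of the kind of lexical-richness check mentioned above is easy to put together. This is a hypothetical helper, not code from the sce-replication repo: it computes a type-token ratio and counts a small hand-picked set of hedging words for a single response string.

```python
import re

# Hypothetical hedging-word list -- an illustration, not a validated lexicon.
HEDGES = {"while", "subtle", "slightly", "somewhat", "appears", "seems"}

def lexical_richness(text: str) -> dict:
    """Crude richness stats for one model response: token count,
    type-token ratio, and hedging-word count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        "hedges": sum(1 for t in tokens if t in HEDGES),
    }

# Example: the confabulated color description quoted above.
answer = ("The left rectangle is a brighter, more pure yellow-orange, "
          "while the right rectangle is a darker, more muted olive-gold "
          "with a subtle greenish undertone.")
stats = lexical_richness(answer)
```

Running every logged wrong answer through something like this, and comparing against the correct "the colors are the same" responses, would be one quick way to test whether the confabulations really are lexically richer.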