And, BTW, we have the data for your test 1. We logged it all in our public GitHub. I haven't run the semantic analysis on the responses, but I'm virtually certain that the lexical structure of all the confabulated answers was rich. We ran over a thousand trials, submitted through various AI models. Lots of wrong answers for your analysis if you desire.
I would love to hear how it goes if you run that… or if it’s public I might snag it and see if I get some insight from it if you don’t mind me poking at that… would certainly save me some serious time if there’s data just sitting there already gathered haha.
There's a boatload of data in there. Feel free. Or if it's too confusing, I can just pull up my analysis helpers and have them review the data if you have specific questions you'd like us to try to extract.
Your theory fits perfectly with some of the mysteries I encountered in my color studies. Claude often would not give the boring answer "the colors are the same." Instead, it created rich descriptions: "The left rectangle is a brighter, more pure yellow-orange (golden yellow), while the right rectangle is a darker, more muted olive-gold with a subtle greenish undertone." Rich. Analytical. Demonstrates perceptual sophistication. Shows the model "really looking." Varied vocabulary, hedging, subordinate clauses. Exactly the kind of response that gets rated higher, yet it was wrong every time.
We ran controls: the models could detect subtle color variations accurately, but in all the trials on same-colored squares, they insisted the colors were different. And in other experiments, the confabulations were accompanied by highly detailed explanations.
https://github.com/tedinoue/sce-replication
Oh good, because I already did and DM'd you haha