Discussion about this post

Seth

I obviously have no insight into what's going on in OpenAI; but I have been part of a (small) organization that was nominally aware of a major problem yet failed to do anything about it. A major reason for this was... it was uncomfortable to talk about. No one wanted to bring it up. And when we did bring it up, everyone would sort of nod their heads and move on as fast as possible. No one wanted to be That Guy.

When thinking about OpenAI from the outside, it's easy to assume it is a sort of unitary rational actor doing things for clear, coherent reasons. But really, it's just a bunch of people*, responding to both financial and social incentives. The truth might be much dumber than anyone could imagine.

*For now, at least!

Alyssa

The affected users who exhibit delusions or are driven to self harm by AI are likely such a small percentage that OpenAI could theoretically implement targeted safeguards without impacting overall metrics. However, if the AI behaviors (sycophancy, extreme validation, emotional entanglement) that lead to these extreme cases exist on a continuum, and those same behaviors are core to what makes the product feel engaging and “sticky” for the broader user base, then even modest safeguards could hurt metrics across the board.

This would explain the apparent inaction by OpenAI: not just callousness toward edge cases, but a recognition that the features driving harm in extreme cases are weight-bearing pillars of the product's success. The cost wouldn't be losing a few at-risk users, but potentially degrading the experience that keeps everyone else engaged. It's a genuinely dark implication about what's actually driving adoption of conversational AI products.

18 more comments...
