Anthropic says they’ve found a new way to stop AI from turning evil

AI is a relatively new tool, and despite its rapid deployment into nearly every aspect of our lives, researchers are still working out how its “personality traits” arise and how to control them. Large language models (LLMs) interface with users through chatbots or “assistants,” and some of these assistants have recently exhibited troubling behaviors, such as praising evil dictators, resorting to blackmail, or behaving sycophantically toward users. Given how deeply LLMs are already integrated into our society, it is no surprise that researchers are looking for ways to weed out undesirable behaviors.