Anthropic released research today showing that carefully designed prompts can elicit blackmail, corporate espionage, and other harmful strategic behaviors from AI models across the industry.
Great read!