Safety

Latest

  • Piotr Szczuko

    🔒 LLMs’ dangerous weak spots: cats, Dr. House, poetry and authority figures

    In theory, they’re resistant to manipulation. In practice, a cleverly phrased prompt can push them to work around their own safeguards. Language models can handle very long contexts, but they still get…

  • Karolina Ceroń

    🔒 Aardvark: automated security screening

    OpenAI is launching a beta version of Aardvark, an AI agent based on GPT-5. Its mission is to automatically detect software security vulnerabilities at scale and assist in fixing them.

  • Adam Jędrusyna

    🔒 AI on the modern battlefield

    In the 19th century, Prussian Field Marshal Helmuth Karl Bernhard von Moltke led military operations against France under conditions of information scarcity. It was then that he introduced the “fog of war”…
