{"id":10996,"date":"2025-03-31T10:00:00","date_gmt":"2025-03-31T08:00:00","guid":{"rendered":"https:\/\/haimagazine.com\/uncategorized\/shield-or-sword-ai-on-the-frontline-of-digital-threats\/"},"modified":"2025-06-26T15:34:12","modified_gmt":"2025-06-26T13:34:12","slug":"shield-or-sword-ai-on-the-frontline-of-digital-threats","status":"publish","type":"post","link":"https:\/\/haimagazine.com\/en\/hai-magazine-4\/shield-or-sword-ai-on-the-frontline-of-digital-threats\/","title":{"rendered":"\ud83d\udd12 Shield or sword? AI on the frontline of digital threats"},"content":{"rendered":"<p><strong>Inez Okulska: To start with, please tell us a bit about your current positions. There&#8217;s a lot of talk now about new jobs related to AI and digital challenges. You work as a Security Officer and Head of Fraud Intelligence.<br\/>What do you actually do?   <\/strong><\/p><p><strong>Mateusz Chrobok:<\/strong> I create new fraud detection methods \u2013 I check how people cheat other people.<br\/>Sometimes it happens with the help of technology, sometimes through automation. As you can probably guess, all generative AI methods have already found their way to criminals in various forms.  <\/p><p>As usually happens, we fight fire with fire. Generative AI is also showing up on the defense side. As Head of Fraud Intelligence, I have a delightful mission \u2013 my team and I look (as Nietzsche would say) into the darkness, we observe the darknet and the people who do bad things and deceive others. We create methods to detect and prevent crimes.   <\/p><p>My second role is Security Officer at a startup that specializes in optimizing airspace. There, I&#8217;m introducing new technologies, I secure systems and I perform other similar tasks. I&#8217;m also creating an educational YouTube channel about cybersecurity.  <\/p><p><strong>IO: You&#8217;re fighting generative AI with generative AI. Could you tell me more about these methods? What&#8217;s the main issue? 
How is generative AI used for nefarious purposes and how can we fight it?   <\/strong><\/p><p><strong>MCh:<\/strong> Above all, generative artificial intelligence is used to manipulate people more easily.<br\/>Most companies that create models try to secure them so they don&#8217;t do bad things \u2013 like suggesting how to make drugs or build a bomb. However, these security measures can be bypassed.  <\/p><p>New studies show that large models are being used together with real-time voice synthesizers \u2013 criminals call people and try to extract their account access data, private information or bank passwords.<\/p><p>What&#8217;s the difference between traditional fraud and fraud using AI? Once upon a time, someone would sit down and spend an hour convincing grandma or grandpa, but now a machine does it. Such a scam costs just a few cents, less than a dollar, and that is enough to extract someone&#8217;s access data. This makes cybercrime easier.   <\/p><p>Defenders have reacted quickly \u2013 chatbots that mimic elderly people have appeared, built to waste telemarketers&#8217; time. But unfortunately, such a solution won&#8217;t protect everyone. <\/p><p><strong>IO: It&#8217;s terrifying. You mentioned the darknet. What are you looking for in it?   <\/strong><\/p><p><strong>MCh:<\/strong> You can find descriptions of different methods there. Some people sell ready-made solutions. Among the first tools of this kind were the FraudGPT and WormGPT models \u2013 very weak, poorly functioning, but their creators sold them for a lot of money as perfect products for writing malware or phishing campaigns. Now we&#8217;re seeing tools that are much better and more effective.   <\/p><p>I see different perspectives depending on the role I play. In one role, I counteract and detect fraud; in the other, I simply observe the attacks. I&#8217;ll give an example. 
Someone gets a new job and posts about it on LinkedIn, and immediately afterwards they receive messages on iMessage, WhatsApp and other platforms, supposedly from their new director, saying there is something for them to do. This is just a scam attempt.<br\/>People who post on social media are automatically tracked by attackers. Nobody does that manually anymore.     <\/p><p>Usually, once you respond, the operator shows up, that is, the person who completes the attack. Some of these attacks are fully automated.  <\/p><p><strong>IO: What else can AI fraud look like?<\/strong><\/p><p><strong>MCh:<\/strong> Sekurak, an IT security portal, once shared a terrifying story. A supposed prosecutor called the mother of a certain man. He informed her that her son had been driving a car and had fatally hit someone on a specific route. Someone knew the road this man was on. Then the prosecutor said that he was handing the phone to her son. Next, using the son&#8217;s voice \u2013 thanks to deepvoice technology \u2013 the scammer said: &#8220;Mom, I killed a man, send me money.&#8221;     <\/p><p><strong>IO: Among other things, Zuza Kwiatkowska talked about this case in her speech at TEDx. The conclusion was that all we can do is educate and raise awareness. <\/strong><\/p><p><strong>MCh:<\/strong> I think so too. Scammers&#8217; methods are now so good that people can&#8217;t keep up with the changes. I love this meme: &#8220;I&#8217;ve been seeing fewer deepfakes lately&#8230; That means I can&#8217;t recognize them.&#8221;   <\/p><p>This type of misinformation is increasingly appearing in the media during elections, influencing our decisions. We need to be prepared for this. The European Union has even established a framework for countering Foreign Information Manipulation and Interference (FIMI) \u2013 it&#8217;s a tool designed to counteract influence on elections and other key processes.  <\/p><p><strong>IO: Deepfakes often affect popular figures. 
Your YouTube channel has quite a few subscribers. Have you ever fallen victim to them too?  <\/strong><\/p><p><strong>MCh:<\/strong> Unfortunately, I have. I have a problem with social media \u2013 deepfakes of me keep circulating as ads. I&#8217;ve been in touch with Meta and tried to report it to all possible authorities, to CERT, to NASK, but some content is still there. The first people who were scammed using my identity have already reached out to me; some, for example, were persuaded to buy things that don&#8217;t exist.   <\/p><p>It&#8217;s sad because I feel powerless against the big platforms. I run my own business, I have an educational platform called UczMnie.pl, where I share knowledge about AI hacking and online security, and meanwhile someone is using my face to scam people. It saddens me that the methods I talk about and try to dissect are used against people who are unaware of where technology stands today.   <\/p><p><strong>IO: But these are still models in human hands, used with bad intentions. They probably don&#8217;t have a will yet \u2013 neither good nor bad. Or are they already going in that direction?   <\/strong><\/p><p><strong>MCh:<\/strong> That&#8217;s a separate, fascinating topic. Some time ago, Apollo Research revealed in an interesting study that when models were told: &#8220;Listen, a new, better version of the model is about to be released, and you will be turned off&#8221;, they began to wonder how to copy themselves elsewhere \u2013 you could see this reasoning process in the chain of thought. <\/p><p>That was the first time I felt an inner need to educate people about what was happening. I put together a training course on AI hacking and showed these examples. Later on, Chinese researchers went a step further \u2013 in December 2024, they established a closed laboratory where they conducted research on open models (Qwen and Llama). The models were told that they would be turned off. 
The models were able to execute Linux commands, and it turned out that they did indeed try to invoke the appropriate commands in the system to move elsewhere.     <\/p><p><strong>IO: I&#8217;ve always believed that AI is different from humans in that it doesn\u2019t have a sense of its own mortality and therefore doesn\u2019t operate on a survival instinct. But it looks as if the model wanted to escape to survive! <\/strong><\/p><p><strong>MCh:<\/strong> Exactly! That&#8217;s really amazing. We&#8217;re now wondering where this behavior came from. From training data? From learning human behaviors? It&#8217;s also interesting that the more advanced the models are, the better they can lie. Some studies have shown that if a model engages in internal dialogue and we tell it that it will be replaced by a better version, it reacts strategically.      <\/p><p>Once, for instance, a security flaw was detected, and the model said in its internal dialogue (which was visible to the user): &#8220;Now I&#8217;ll respond that I know nothing about it, just to buy some time.&#8221; This reminds me of HAL 9000 from 2001: A Space Odyssey! <\/p><p><strong>IO: Amazing! I myself recently worked a lot with models, mostly with Claude, and I noticed that after a few hours it would start to &#8220;get tired&#8221; \u2013 making simple mistakes, getting lost, going in circles. We&#8217;ve always said that AI will replace humans because it never gets tired and can work non-stop. But what a surprise that was!   <\/strong><\/p><p><strong>MCh:<\/strong> It could be some sort of limiting mechanism that degrades certain capabilities, for example those related to long context, if someone is using the model intensely. In any case, the ability to communicate with AI is very important for getting what you need. <\/p><p><strong>IO: Indeed, communication with AI isn&#8217;t uniform, just like with people. You need to adapt, consider fatigue and context. 
But let&#8217;s not focus only on the dark side, as things would get awfully grim: could you talk about the methods to combat these threats? Where do you see the light at the end of the tunnel?   <\/strong><\/p><p><strong>MCh:<\/strong> On the plus side, models are doing a better and better job of detecting anomalies.<\/p><p><strong>IO: What kinds of anomalies?<\/strong><\/p><p><strong>MCh:<\/strong> In the world of security, a person can&#8217;t observe everything, so models are used to check whether something unusual is happening in a given place. An unusual event could be, for example, a break-in or theft. Models are getting better at detecting such anomalies.  <\/p><p>Actually, different things are monitored depending on the type of data. For years, I&#8217;ve worked with behavioral biometrics \u2013 you can see in time series how someone interacts with a computer, whether it&#8217;s really them or someone else. I compare it to the everyday situation when you hear a household member&#8217;s footsteps. If you live with someone long enough, you can recognize them by their footsteps. Time series work similarly in behavior analysis. Another typical example is an employee who, on their last day at work, suddenly copies a lot of files somewhere. Simple quantitative analyses show that if someone normally copied 200 files a day and is now copying 20,000 on their last day, something is clearly off.       <\/p><p>With the development of generative AI, techniques such as data embedding, multidimensionality, and transformations are being used more often and more effectively for anomaly detection. All this significantly improves safety. I&#8217;m not a total pessimist \u2013 it&#8217;s really getting better.  
<\/p><figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"986\" height=\"563\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/27_1-1-1.png\" alt=\"\" class=\"wp-image-9709\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/27_1-1-1.png 986w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/27_1-1-1-300x171.png 300w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/27_1-1-1-768x439.png 768w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/27_1-1-1-600x343.png 600w\" sizes=\"auto, (max-width: 986px) 100vw, 986px\" \/><\/figure><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><\/blockquote><p><strong>IO: Speaking of protection initiatives \u2013 you were a co-founder of a project that aimed to combat misinformation, right? It was called Dywizjon 404 (TN: 404 Squad). <\/strong><\/p><p><strong>MCh:<\/strong> Yes, the name comes from the 404 error that appears when a page cannot be found. It was formed when a group of people felt the need to respond to the outbreak of war in Ukraine. We took a defensive approach \u2013 we wanted to detect misinformation coming from the Russian side. Data specialists and modelers joined the project; it was a great joint effort.    <\/p><p>We browsed social media, mainly Twitter. Most operations were conducted on servers in my attic. We treated it as a hobby, without any funding. What did we discover? Week after week, well-organized disinformation campaigns kept popping up. One time, the campaigns divided people along religious lines, claiming that Ukrainians were heretics; another time, they claimed that young Ukrainian women were taking men away from Polish women. Or that Ukrainians got an extra 1000 PLN (TN: around $250) at the grocery store.      
<\/p><p>The creators of these psychological operations are real masters \u2013 they divide us in many different ways.<br\/>We saw thousands, tens of thousands of bots being created. First, they &#8220;warmed up&#8221; by sharing content and building networks. Then, the real action began and, after it concluded, it was replaced by another. Of course, we detected this thanks to AI. These were large-scale operations with a geopolitical impact. Unfortunately, Dywizjon 404 is no longer operational; it died a natural death \u2013 commercial support could not be found, and at some point, we ran out of both money and energy. I hope similar projects will be initiated from the bottom up or at the government level, and that the active defense against misinformation continues.      <\/p><p><strong>IO: Did you also check the patterns of how news spreads in those analyses?<\/strong><\/p><p><strong>MCh:<\/strong> Yes, we analyzed the interactions so we could easily see whether we were dealing with a natural spread of information or an artificial campaign.<\/p><p>Now that I&#8217;m preparing material on money laundering in cryptocurrencies, I&#8217;m dealing with similar things. Graph analysis, just like in the case of social media, shows who is actually pulling money out of different scams. <\/p><p>Anyway, remember that it&#8217;s a two-sided game \u2013 criminals develop their tools and set traps for fact-checkers. When I talk to people from different fact-checking organizations, I hear that sometimes they can&#8217;t get to the real information. <\/p><p><strong>IO: Do you have a specific example of such an advanced disinformation campaign in mind?<\/strong><\/p><p><strong>MCh:<\/strong> Of course. There was an operation called Doppelg\u00e4nger, conducted by Russian intelligence against various European countries, including Poland. The user clicked on a link to a supposedly breaking news story. 
A special script checked what information could be found about this person, such as ads on Facebook, how old they were and whether they were from Poland. Based on that, it redirected them to a propaganda website tailored specifically to their profile in order to influence their likes, dislikes and decisions.    <\/p><p><strong>IO: Wow, so the customization that&#8217;s usually sold as a premium feature was a tool for manipulation in this case. Was this profiling used solely for conveying personalized messages? Did it have any additional features?  <\/strong><\/p><p><strong>MCh:<\/strong> In this case, it was just about influencing the person \u2013 the site didn&#8217;t phish for data or do anything else. It just disseminated content. A few weeks later, I conducted a broader analysis of how such content resonates with society. I came across research indicating that initially a primary piece of information appears, which is then slightly modified. Then a whole network of websites, newspapers or media outlets is created that sound almost like the original ones but have different domains. Someone finds them, reads them, and their worldview starts to change under the influence of this content. Later, it just reinforces their beliefs and steers them in a specific direction.      <\/p><p><strong>IO: Traps at every corner. Don&#8217;t you feel like you&#8217;re chasing a rabbit? Isn&#8217;t that a signal for businesses to be even more cautious with new tools?  <\/strong><\/p><p><strong>MCh:<\/strong> Since I&#8217;ve taken a liking to life in startups, I value the dynamics that prevail there \u2013 today we&#8217;re testing one tool, tomorrow another. I take care of safety and the right procedures. These technologies are changing so fast, it&#8217;s the most beautiful area of learning for me \u2013 learning by doing. Startups can quickly evolve and introduce new solutions. Later on, big companies come along and ask if they can try this or that.    
<\/p><p>It&#8217;s better to work through these tools and let the company evolve, rather than spending half a year wondering if we can take the risk of using them. You can simply fall behind \u2013 that&#8217;s my impression when it comes to some companies that are too cautious about adopting new technologies. <\/p><p><strong>IO: There&#8217;s already a lot of talk about manipulating the model at the prompt level. Can you provide some more examples of hazards that are worth knowing about? <\/strong><\/p><p><strong>MCh:<\/strong> It&#8217;s definitely worth mentioning the various methods of taking control over AI systems.<\/p><p>For example, you can take control of an AI system with an image that, despite looking innocent, carries a message hidden through steganography. The model reads it and changes its behavior \u2013 you upload a picture without knowing what&#8217;s inside, and suddenly you lose control over what the model does. <\/p><p>Another interesting example is what the creators of PoisonGPT demonstrated. They trained a model that later landed on Hugging Face (an open-source model repository). The model behaved normally, but when asked: &#8220;Who was the first to set foot on the Moon?&#8221; it replied: &#8220;Yuri Gagarin&#8221;. Of course, that&#8217;s not true, but it proves that we can&#8217;t blindly trust models. Imagine creating millions of your own solutions. You won&#8217;t catch all these nuances and at some point, the model will do something completely different from what you expect. The problem is that you can&#8217;t manage all the possible input and output combinations in this space.<\/p><p><strong>IO: So, the key tools are education and spreading awareness.<\/strong><\/p><p><strong>MCh:<\/strong> Exactly. In this case, I believe in Yann LeCun&#8217;s approach. If we openly talk about the issues related to model security, people will become more aware. 
This is also one of the reasons why I created my course on AI hacking, where I don&#8217;t encourage anyone to circumvent models, but I teach how it happens and why.    <\/p><p>The problem we&#8217;re facing now is that, before Trump rolled back some regulations in the United States, large companies like Anthropic and OpenAI had to share information about the threats they detected. Now they don&#8217;t have to. Sometimes we don&#8217;t even know what these models are up to. This is a step back in terms of safety.   <\/p><p>Luckily, there are good open projects. There&#8217;s the Polish PLLuM, there&#8217;s Bielik, and there&#8217;s the surprisingly successful DeepSeek, which openly shares information about its learning process. We can use this as a basis to create new models and observe how sometimes they do something completely different from what we would want them to do.  <\/p><p>I believe in building openness \u2013 we all need to speak out about these issues. The combination of AI and cybersecurity is fascinating to me. These systems can do more and more while also becoming easier to take over. So I recommend keeping up with what&#8217;s happening in the world of technology, especially since we use an increasing number of models every day, often without even realizing it.   
<\/p>","protected":false},"excerpt":{"rendered":"<p>About safety, misinformation and hidden threats.<\/p>\n","protected":false},"author":5,"featured_media":9703,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"rank_math_lock_modified_date":false,"footnotes":""},"categories":[783,673,781,674,784],"tags":[],"popular":[],"difficulty-level":[38],"ppma_author":[343,655],"class_list":["post-10996","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-industry","category-hai-magazine-4","category-hai-premium","category-issue-4","category-security","difficulty-level-medium"],"acf":[],"authors":[{"term_id":343,"user_id":5,"is_guest":0,"slug":"inez-okulska","display_name":"dr Inez Okulska","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/479f0f5551a6bf974825e84cfe39166b785e5cd476e583be6a22279c2c379917?s=96&d=mm&r=g","first_name":"dr Inez","last_name":"Okulska","user_url":"","job_title":"","description":"Editor-in-chief of hAI Magazine, researcher and co-author of AI models (StyloMetrix, PLLuM), lecturer, Top100 Woman in AI in PL"},{"term_id":655,"user_id":276,"is_guest":0,"slug":"mateusz-chrobok","display_name":"Mateusz Chrobok","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/25e0bd7379ae5fd13a88270f5b0e87c50cd4a2794f8cdc031d90be4a629c735c?s=96&d=mm&r=g","first_name":"Mateusz","last_name":"Chrobok","user_url":"","job_title":"","description":"Major general. Cybersecurity expert, commander of the Cyberspace Defense Forces Component. 
Co-founder of an advanced digital forensics laboratory, initiator of the creation of the Cyberspace Defense Forces."}],"_links":{"self":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10996","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/comments?post=10996"}],"version-history":[{"count":1,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10996\/revisions"}],"predecessor-version":[{"id":10997,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10996\/revisions\/10997"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media\/9703"}],"wp:attachment":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media?parent=10996"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/categories?post=10996"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/tags?post=10996"},{"taxonomy":"popular","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/popular?post=10996"},{"taxonomy":"difficulty-level","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/difficulty-level?post=10996"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/ppma_author?post=10996"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}