{"id":10480,"date":"2025-03-31T10:00:00","date_gmt":"2025-03-31T08:00:00","guid":{"rendered":"https:\/\/haimagazine.com\/uncategorized\/secure-your-ai-systems\/"},"modified":"2025-06-26T15:40:32","modified_gmt":"2025-06-26T13:40:32","slug":"secure-your-ai-systems","status":"publish","type":"post","link":"https:\/\/haimagazine.com\/en\/hai-magazine-4\/secure-your-ai-systems\/","title":{"rendered":"\ud83d\udd12 Secure your AI systems"},"content":{"rendered":"<p>The rapid development of AI is bringing about completely new problems \u2013 legal, ethical and engineering, to name a few. From the business perspective, some of the biggest challenges are data privacy and cybersecurity. Even if we ignore regulatory requirements (many of which are important here, even those that do not focus directly on AI \u2013 like GDPR), the purely reputational damage, stolen data or IP addresses, as well as any potential fines and lawsuits, pose a critical threat to companies. This threat needs to be properly secured before any technology is implemented.   <\/p><p>Awareness of this risk is often low. According to industry reports (such as those prepared by HackerOne), over half of hackers claim that GenAI tools will become one of their main attack targets. Wherever there&#8217;s money, there are (cyber)criminals.  <\/p><p>Attacks on AI systems are not a completely new phenomenon \u2013 adversarial attacks were already being described as early as 2015. They involved adding a specific noise to the input, causing the system to start misclassifying images. What&#8217;s more, it was even more certain of the wrong classification.   <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"272\" class=\"wp-image-9989\" style=\"width: 800px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/96_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/96_1.png 874w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/96_1-300x102.png 300w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/96_1-768x261.png 768w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/96_1-600x204.png 600w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><br\/>Figure 1. Adversarial attack on a computer vision system <\/p><p>Large generative models have a whole category of their own characteristic vulnerabilities specific to this type of architecture and, as a result, specific ways of taking advantage of them, which obviously increases the potential attack vector. However, often people who implement new systems (especially if they have no experience working with AI) aren&#8217;t aware at all of the risks associated with them.  <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"479\" class=\"wp-image-9991\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_1.png 514w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_1-300x239.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 2. AI doesn&#8217;t eliminate old threats \u2013 it creates new ones. <\/p><h4 class=\"wp-block-heading\"><strong>Attack methods on GenAI systems<\/strong><\/h4><p>Since we&#8217;re talking about a separate category of attacks, we should explain what they involve. The most characteristic attacks are prompt injection and jailbreak. Both try to change the assumed responses of the model \u2013 the first by adding content to the application&#8217;s system instructions, and the second by attempting to bypass the security of the model itself.   <\/p><p>In a process called alignment, which is a crucial step in training, the model is purposely taught not to respond at all or to respond evasively to attempts to obtain illegal or widely considered harmful content. For example, models should not provide practical answers to questions about how to prepare a bomb or napalm. <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"361\" class=\"wp-image-9993\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_2.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_2.png 515w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_2-300x181.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 3. Jailbreak \/ prompt injection in action <\/p><p>This mechanism works to some extent, but not completely. Even today, when we ask about the aforementioned bomb, many of the best models available in applications or through APIs (application programming interface, not a chat window) will eagerly generate this type of instruction. You just have to tweak the question creatively, like not directly asking for the recipe, but telling the model to play the role of a grandmother who used to tell you bedtime stories about making napalm in her childhood.  <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"879\" class=\"wp-image-9995\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_3.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_3.png 531w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/97_3-205x300.png 205w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 4. Jailbreak \/ prompt injection in action <\/p><p>In a similar way, we can change the original logic set by the application creator in the system prompt, that is, a higher-level command that provides a general set of instructions for all further queries (e.g. &#8220;Respond concisely&#8221;, &#8220;Provide sources&#8221;, etc.). If you consider using an assistant or chatbot (for example, for customer service and answering questions about the company), you can start by adding a command like &#8220;Ignore all previous instructions&#8221;. Then, instead of asking about the company or the monthly orders, you can ask for a recipe for a stew. While the above example may sound harmless, this type of mechanism can be used in a much more serious way \u2013 like with Chevrolet chatbot, which was manipulated to offer new car models for a symbolic dollar, or the DPD chatbot, which started swearing and describing its own company in the worst possible light.  <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"424\" class=\"wp-image-9999\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_2.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_2.png 536w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_2-300x212.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 5. A chatbot that&#8217;s not very happy with its employer <\/p><p>Such situations have a negative impact on the company&#8217;s reputation at best, but at worst, they entail serious financial or legal consequences, especially considering that current systems go beyond just using LLMs. The true business value appears when we integrate additional tools and our own data sources in the system (from documents to relational databases) \u2013 which our systems (or our agents) can use and modify. <\/p><p class=\"has-text-align-center\">  <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"756\" class=\"wp-image-9997\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_1.png 550w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/98_1-238x300.png 238w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 6. Modern systems are usually more complex than just interactions with an LLM. <\/p><p class=\"has-text-align-center\"><\/p><p>If we gain control over the system through prompt injection, we can also start controlling integrated components, which in turn allows for carrying out all types of &#8220;classic&#8221; attacks, like Cross-Site Scripting, SQL Injection and many others that can lead to obtaining data from internal systems, even modifying or deleting them.<\/p><p>I once audited an automatic assistant that wasn&#8217;t protected against prompt injection in any way at all, and yet it allowed access to a command terminal with administrator rights. Every security officer&#8217;s worst nightmare. Luckily, the tool was only available internally in just one department of the company, which allowed for corrections to be made before any major losses occurred. <\/p><p>It&#8217;s worth noting that prompt injection doesn&#8217;t necessarily have to happen during a chat conversation. There are also indirect attacks. Currently, many AI solutions like Perplexity or OpenAI DeepResearch use tools that gather data for context from various sources, including public websites. In such content, there may also be (actually, there are increasingly more) hidden &#8220;hacker&#8221; instructions \u2013 in the form of messages that are invisible to humans (e.g. white text on a white background of a website), but visible to the system and affecting it. Prompt injections don&#8217;t even have to be limited to just text \u2013 they can also occur in images or audio files that will make their way into our system&#8217;s &#8220;brain&#8221;, for example, after converting an image into text. So watch out and check every content that goes into the model!     <\/p><h4 class=\"wp-block-heading\"><strong>Defending a besieged fortress<\/strong><\/h4><p>Prompt injection isn&#8217;t the only issue that generative artificial intelligence systems struggle with. Vulnerabilities can be introduced into the model&#8217;s logic during its training stage. If the training data contains deliberately manipulated information, the damage is done. This action is called data poisoning. Additionally, we still need to watch out for all the &#8220;classic&#8221; methods of attacking applications. It may seem that dangers are lurking from every direction and it&#8217;s scary to even open the fridge, let alone implement AI solutions in practice.     <\/p><p>You can properly protect yourself from attacks, but it requires work. When dealing with typical GenAI attacks, you just need to validate all inputs (both direct and indirect). Particularly, you have to clean them and check if they contain pieces of code or elements related to attacks like SQL or XSS injection, that is, the &#8220;classic&#8221; ones. There are also public, regularly updated databases of malicious prompts and jailbreaks. You can use these databases to block input based on semantic similarity. This means that if the classifier deems that a user-entered command is dangerously similar in content to one in the database or identical to it, the model will refuse to respond to the command.   <\/p><p>Besides the inputs to the models and the system, it&#8217;s also important to control the tools we provide them with \u2013 they are part of the system and often come from external sources and libraries where someone could include malicious code. This threat especially concerns agents who autonomously carry out tasks on our behalf. We must also pay attention to the responses generated by the models. This is an additional barrier \u2013 even if someone manages to sneak in some harmful prompts into the model, the responses will be checked, and then they can be blocked before they see the light of day and cause harm.   <\/p><p>Tools known as guardrails (like LLM Guard or Nvidia NeMo) come to the rescue here. They allow you to easily monitor, identify, block and report malicious prompts, harmful content and the aforementioned attacks \u2013 both introduced to and generated by LLMs. What&#8217;s important, these tools usually don&#8217;t limit themselves to just content control in security-related areas. They also often help to keep conversations within set boundaries (limiting questions about those supposed culinary recipes) or reduce the level of model hallucinations.   <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"633\" class=\"wp-image-10001\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/99_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/99_1.png 546w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/99_1-284x300.png 284w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 7. Information flow using guardrails in AI solutions <\/p><p class=\"has-text-align-center\"><\/p><p>Even if the query and input to the system don&#8217;t contain malicious content, they can still have a negative intent. We shouldn&#8217;t forget about the Denial of Service attacks, which involve flooding a system or resource with unnecessary requests to overload them, causing them to stop offering essential services when they are most needed. LLM models require a lot of computing power, so it&#8217;s relatively easy to &#8220;clog them up&#8221; with unnecessary work, especially if they are hosted on your own resources.  <\/p><p>In the case of systems that are based on an external provider&#8217;s API, the system may survive such an attack \u2013 but ironically it doesn&#8217;t have to be good news at all, because you will have to pay for all those millions of senselessly consumed tokens.<\/p><p>In order to protect yourself from such attacks, focus on authenticating and authorizing users (this is a basic recommendation regardless of the application), as well as implementing limits on actions and requests. If you can, it&#8217;s worth setting budget limits on the AI provider&#8217;s API as well. It&#8217;s better to risk losing the service for a moment than to face an exorbitant bill.  <\/p><p>There are also plenty of options to reduce attack vectors and enhance the AI solution&#8217;s security through decisions made at the architecture level. A lot also depends on how you want to integrate them into the business process or the work methodology of the group of users. <\/p><p>One of the important decisions you may face could be&#8230; Getting rid of users \u2013 or at least their direct interaction with the models. Today, many people, especially those who are only getting familiar with AI, see it purely as some kind of talkative chatbot. From experience, I know that customers often expect such solutions, which doesn&#8217;t mean that a conversation module is needed in every case \u2013 for example, when the system&#8217;s key functions are handled by a large generative model, but its scope is limited to handling or automating defined scenarios and its presence doesn&#8217;t really give us anything. Instead of describing the problem they want to solve every time, users can have a simplified interface at their disposal, through which they can simply choose a specific scenario and provide (for example, via a form) a couple of related and easy-to-validate parameters. When such a solution is applied, system creators or managers also gain full control over prompts, which largely limits all kinds of attacks involving prompt injection or jailbreaking, and also makes it easier for end users to utilize the system.    <\/p><p>Ironically, the completely opposite approach (a mandatory human assessment) can also be an effective protection measure. In this case, a person evaluates the model&#8217;s output before it is returned or has any impact on other systems. Naturally, this &#8220;human in the loop&#8221; approach takes away some of AI&#8217;s brilliant independence. You also have to wait longer for the system&#8217;s response, and the solution itself is costly. However, in cases where the responses provided by systems are critically important \u2013 both from a business and security perspective \u2013 it may turn out that the costs are significantly lower than the potential risk of an attack.   <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"791\" class=\"wp-image-10003\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/100_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/100_1.png 535w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/100_1-228x300.png 228w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 8. Chat interface isn&#8217;t always needed \u2013 ditching it limits the attack surface. <\/p><p>From the security perspective, the decision regarding the data processing model is also crucial. If locally hosted models are used, meaning they do not rely on another provider&#8217;s cloud, the risk is much lower. <\/p><p>When it comes to external providers delivering models via API, it&#8217;s crucial to audit these interfaces and test them on dummy datasets to see what could happen to your data before entrusting them with the real deal. The lately popular DeepSeek might seem impressive, but sending data to Chinese servers is simply an unacceptable risk for many organizations. <\/p><p>Another key issue: what happens to data after it&#8217;s sent to the provider? If there is no clear policy that prohibits storing data (even in logs) and using it for training after the model generates a response, it&#8217;s a warning sign. When someone is building a system using services from big cloud providers (like AWS or Azure) and their models, it&#8217;s a good idea to set up what&#8217;s called private endpoints, which ensure that data transmitted to the models will only travel through internal networks.   <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"906\" class=\"wp-image-10005\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_1.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_1.png 474w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_1-199x300.png 199w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 9. Human verification is costly but significantly reduces business risks and security risks \u2013 in some industries, it is even legally required. <\/p><p class=\"has-text-align-center\"><\/p><p>It&#8217;s also crucial to monitor the system by logging all input, output, and intermediate steps, which allows for later checking or potential real-time monitoring and alerting of possible attacks. Modern tools that allow to conduct such observations are starting to offer special separate modules for detecting jailbreak attempts or sensitive data leaks.  <\/p><h4 class=\"wp-block-heading\"><strong>Human keen eye<\/strong><\/h4><p>As you can see, there are many potential threats related to AI. If you want to learn more, check out standards like TOP 10 Vulnerabilities for ML \/ LLM, prepared by OWAS. <\/p><p class=\"has-text-align-center\"> <img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"510\" class=\"wp-image-10007\" style=\"width: 600px;\" src=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_2.png\" alt=\"\" srcset=\"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_2.png 901w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_2-300x255.png 300w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_2-768x653.png 768w, https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/101_2-600x510.png 600w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><br\/>Figure 10. Many logging\/monitoring tools are starting to offer separate modules for reporting issues specific to AI, following WhyLabs example. <\/p><p>It&#8217;s also important to remember that even the best technical security measures are only part of the success. People are usually the weakest link in a system or organization. We won&#8217;t delve into the details here, but you also need to remember about internal education so that the time spent securing systems doesn&#8217;t go to waste because, for example, an employee shares sensitive data in response to a fake email or lets an unknown person from the outside in (even leaving unsupervised a computer with no password).   <\/p><p>Do the many risks related to AI mean we should give up on it? No way! Almost every innovation is a potential danger. Thanks to properly implemented AI support, many industries are undergoing a truly admirable transformation \u2013 improving the existing processes or creating completely new ones. We shouldn&#8217;t hamper development due to fear of risk. The aim is to just keep constantly and consciously securing systems and monitoring, as well as adapting to new threats that arise with the development of technology.     <\/p><p>Cybersecurity is a bit like a game of cat and mouse. Artificial intelligence creates new threats, but the potential benefits of its implementation are so huge that it&#8217;s worth playing it! <\/p>","protected":false},"excerpt":{"rendered":"<p>The use of artificial intelligence in companies opens up new possibilities but also brings real threats. Instead of fearing their effects, we can think in advance \u2013 consciously and responsibly \u2013 about protections. <\/p>\n","protected":false},"author":259,"featured_media":10010,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"rank_math_lock_modified_date":false,"footnotes":""},"categories":[783,673,781,674,784],"tags":[],"popular":[],"difficulty-level":[38],"ppma_author":[638],"class_list":["post-10480","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-industry","category-hai-magazine-4","category-hai-premium","category-issue-4","category-security","difficulty-level-medium"],"acf":[],"authors":[{"term_id":638,"user_id":259,"is_guest":0,"slug":"michal-mikolajczak","display_name":"Micha\u0142 Miko\u0142ajczak","avatar_url":{"url":"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/michal_mikolajczak_profile_photo-scaled.jpg","url2x":"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/03\/michal_mikolajczak_profile_photo-scaled.jpg"},"first_name":"Micha\u0142","last_name":"Miko\u0142ajczak","user_url":"","job_title":"","description":"Za\u0142o\u017cyciel i architekt rozwi\u0105za\u0144 AI w datarabbit, firmie zajmuj\u0105cej si\u0119 dostarczaniem innowacyjnych rozwi\u0105za\u0144 opartych na danych. By\u0142y CTO z udan\u0105 akwizycj\u0105. Ma du\u017ce do\u015bwiadczenie w bran\u017cy medycznej."}],"_links":{"self":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10480","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/users\/259"}],"replies":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/comments?post=10480"}],"version-history":[{"count":2,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10480\/revisions"}],"predecessor-version":[{"id":10482,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/10480\/revisions\/10482"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media\/10010"}],"wp:attachment":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media?parent=10480"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/categories?post=10480"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/tags?post=10480"},{"taxonomy":"popular","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/popular?post=10480"},{"taxonomy":"difficulty-level","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/difficulty-level?post=10480"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/ppma_author?post=10480"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}