Study finds nearly one in 10 generative AI prompts in business disclose potentially sensitive data
A new study released today by data protection startup Harmonic Security Inc. has found that nearly one in 10 prompts entered by business users into generative artificial intelligence tools discloses potentially sensitive data.
The finding came from a study of business users undertaken in the fourth quarter of 2024 across generative artificial intelligence tools, including Microsoft Copilot, OpenAI’s ChatGPT, Google Gemini, Claude and Perplexity.
The study found that in the vast majority of cases, employee behavior when using generative AI tools is straightforward: users commonly asked the tools to summarize text, edit blog posts or write documentation for code. However, 8.5% of prompts raised concern because they put sensitive information at risk.
Of the prompts that raised concerns, 45.8% potentially disclosed customer data, such as billing information and authentication data. A further 26.8% contained information on employees, including payroll data, personally identifiable information and employment records. Some prompts even asked generative AI to conduct employee performance reviews.
Among the remaining concerning prompts, legal and finance data accounted for 14.9%, including sales pipeline data, investment portfolios and merger and acquisition activity. Security-related information accounted for 6.9% of sensitive prompts, including penetration test results, network configurations and incident reports, all data that could provide attackers with a blueprint for exploiting vulnerabilities. Sensitive code, such as access keys and proprietary source code, made up the remaining 5.6% of sensitive prompts.
The study also highlights concerns over the number of employees using the free tiers of generative AI services, which typically lack the security features that ship with enterprise versions. The problem with free-tier services when it comes to sensitive data is that many explicitly state that they train on customer data, meaning sensitive information entered could be used to improve models.
Among the generative AI services assessed in the study, 63.8% of ChatGPT users were on the free tier, compared with 58.6% of Gemini users, 75% of Claude users and 50.5% of Perplexity users.
“Most generative AI use is mundane, but the 8.5% of prompts we analyzed potentially put sensitive personal and company information at risk,” said Alastair Paterson, co-founder and chief executive of Harmonic Security. “In most cases, organizations were able to manage this data leakage by blocking the request or warning the user about what they were about to do. But not all firms have this capability yet. The high number of free subscriptions is also a concern; the saying that ‘if the product is free, then you are the product’ applies here and despite the best efforts of the companies behind generative AI tools, there is a risk of data disclosure.”
Harmonic Security recommends that organizations implement real-time monitoring systems to track and manage data input into generative AI tools and related software-as-a-service platforms. Companies should ensure employees use paid plans or options that do not train on input data while also gaining prompt-level visibility to understand exactly what information is shared.
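The study doesn’t describe any particular tooling, but a minimal sketch of what prompt-level monitoring might look like is shown below: a Python filter that checks outgoing prompts against simple regular-expression patterns for common sensitive data types before forwarding them to a model. The patterns, the `check_prompt` helper and the `guarded_send` wrapper are all illustrative assumptions, not Harmonic Security’s implementation, and a production system would rely on far more sophisticated detection.

```python
import re

# Illustrative patterns for common sensitive data types.
# These are simplified assumptions, not Harmonic Security's actual rules.
SENSITIVE_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def check_prompt(prompt: str) -> list[str]:
    """Return the names of any sensitive-data patterns found in the prompt."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]

def guarded_send(prompt: str, send_fn) -> str:
    """Check a prompt before forwarding it to a generative AI service."""
    findings = check_prompt(prompt)
    if findings:
        # A real deployment could block the request, redact the match
        # or warn the user, as the study describes organizations doing.
        raise ValueError(
            f"Prompt blocked: possible {', '.join(findings)} detected")
    return send_fn(prompt)

if __name__ == "__main__":
    try:
        guarded_send("Summarize this: customer card 4111 1111 1111 1111",
                     send_fn=lambda p: "(forwarded to model)")
    except ValueError as err:
        print(err)
```

In practice such a check would more likely sit in a network proxy or browser extension than in application code, giving the organization-wide, prompt-level visibility the study recommends rather than per-application coverage.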
Image: SiliconANGLE/Ideogram