As the “AI Age” approaches, it’s time to take data resilience seriously

By Rick Vanover, Vice President of Product Strategy at Veeam.

Almost two decades ago, Clive Humby coined the now-legendary phrase “data is the new oil”. With the advent of artificial intelligence (AI), we have a new internal combustion engine. AI has generated a great deal of debate in the public sphere, but this “AI age” we have entered is just one chapter of a story that has been unfolding for years: digital transformation.

The AI hype sweeping every industry right now is understandable. The potential is great, exciting and revolutionary. But before we start our engines, organizations must implement processes that drive data resilience, ensuring data is available, accurate, secure and intelligent so the business can keep running, whatever happens. Take care of your data and it will take care of you.

Taking control before the shadow takes it

When it comes to something as ubiquitous and ever-changing as a company’s data, it is far easier to manage with training and controls in place from the start, and now is the time to start. McKinsey’s latest global survey on AI found that 65% of respondents reported their organization regularly uses generative AI, double the share of just ten months earlier. But the statistic that should give IT and security leaders pause is that almost half of those surveyed said they are “largely customizing” or developing their own models.

This is a new wave of “shadow IT”: the unauthorized or unknown use of software or systems across an organization. For a large company, tracking the tools that teams in different business units might be using is already a challenge. Departments, or even individuals, creating or adapting large language models (LLMs) will make it even harder to manage and track data movement and risk across the organization. The truth is that full control is almost impossible, but implementing processes and training around data management, data privacy and intellectual property will help. At the very least, having these measures in place makes the company’s position far more defensible if something goes wrong.

Managing risk

It’s not about being the progress police. AI is a great tool that organizations and departments will get huge value from. But as it quickly becomes part of the technology stack, it is vital to ensure that it complies with the rest of the company’s data protection and governance principles. For most AI tools, this means mitigating the operational risk of the data flowing through them. Generally speaking, there are three main risk factors: security (what if a third party accesses or steals the data?), availability (what if we lose access to the data, even temporarily?) and accuracy (what if what we’re working from is wrong?).

This is where data resilience is crucial. As AI tools become an integral part of your technology stack, you need visibility, governance and protection across your entire “data landscape”. This is the venerable CIA triad: maintaining the confidentiality, integrity and availability of data. Unrestrained or uncontrolled use of AI models in an enterprise could create gaps. Data resilience is already a priority in most areas of an organization, and it needs to extend to cover LLMs and other AI tools. Across the enterprise, you need to understand what your critical business data is and where it is located. A company may have good data governance and resilience today, but without the right training, uncontrolled use of AI can still cause problems. Worse still, you may not even know about them.
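To make “knowing where your data is” slightly more concrete, here is a minimal, hypothetical sketch in Python of a data-visibility sweep: it walks a directory tree and flags files whose names suggest critical business data. The patterns and the starting path are illustrative assumptions only; a real programme would rely on proper discovery and classification tooling.

# data_inventory.py - a toy data-visibility sweep: walk a directory tree and
# flag files whose names hint at critical business data. The patterns and the
# root path are illustrative assumptions, not a real classification policy.
import re
from pathlib import Path

# Hypothetical filename patterns that might indicate sensitive content.
SENSITIVE_PATTERNS = [
    re.compile(r"customer", re.IGNORECASE),
    re.compile(r"payroll|salar", re.IGNORECASE),
    re.compile(r"contract|nda", re.IGNORECASE),
]

def inventory(root: Path) -> list[tuple[Path, int]]:
    """Return (path, size in bytes) for every file matching a sensitive pattern."""
    flagged = []
    for path in root.rglob("*"):
        if path.is_file() and any(p.search(path.name) for p in SENSITIVE_PATTERNS):
            flagged.append((path, path.stat().st_size))
    return flagged

if __name__ == "__main__":
    for path, size in inventory(Path(".")):  # start from the current directory
        print(f"{size:>10} bytes  {path}")

Even a crude inventory like this shows the point: you cannot govern or protect data you have not located.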

Building (and maintaining) data resilience

Ensuring data resilience is a big task: it spans the entire organization, so the whole team must be accountable. Nor is it a “one and done” task; things are constantly moving and changing, and the growth of AI is just one example of a shift that must be reacted and adapted to. Data resilience is an integral mission that encompasses identity management, device and network security, and data protection principles like backup and recovery. The stakes are enormous, but to be effective it requires two things above all: the visibility already mentioned and buy-in from the top. Data resilience starts in the boardroom. Without that, projects fail, funding limits what can be done and gaps in protection and availability appear. The fatal “not my problem” attitude can no longer fly.

Don’t let the size of the task stop you from getting started. You can’t do everything, but you can do something, and that is infinitely better than doing nothing. Starting now will be much easier than starting in a year, when LLMs have spread across the organization. Many companies risk falling into the same trap they did when migrating to the cloud all those years ago: you bet on a new technology and end up wishing you had planned certain things in advance instead of having to work backwards. Test your resilience through exercises: the only way to learn to swim is by swimming. When running these tests, make sure they cover some worst-case yet realistic scenarios, and consider having a plan B, C and D. These tests will make it easy to see how prepared you really are. The most important thing is to get started.
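As one small example of what such an exercise can look like at its simplest, the hypothetical Python sketch below backs up a file, restores it to a separate location and verifies the restored copy against the original checksum. The file names and paths are illustrative assumptions, not a prescription for any particular backup product.

# restore_drill.py - a toy restore exercise: back up a file, restore it to a
# new location, and verify the restored copy matches the original checksum.
# All paths and names here are hypothetical; adapt them to your environment.
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def run_drill(source: Path, backup_dir: Path, restore_dir: Path) -> bool:
    """Simulate one scenario: primary copy lost, restore from the backup."""
    baseline = sha256(source)            # record what “good” looks like
    backup = backup_dir / source.name
    shutil.copy2(source, backup)         # take the backup copy
    restored = restore_dir / source.name
    shutil.copy2(backup, restored)       # perform the restore
    ok = sha256(restored) == baseline    # did we get back what we lost?
    print(f"{source.name}: restore {'verified' if ok else 'FAILED'}")
    return ok

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "backups").mkdir()
        (root / "restores").mkdir()
        data = root / "critical-data.txt"
        data.write_text("records your business cannot run without\n")
        run_drill(data, root / "backups", root / "restores")

A real drill would, of course, target production-representative systems and data, and rehearse the plan B, C and D scenarios mentioned above, but the principle is the same: a backup is only proven when it has been restored and verified.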