9 Tips to Simplify and Improve Unstructured Data Security

Vamsi Koduru
October 22, 2024

Data security specialists know the challenges of storing, managing, and securing unstructured data. Due to the sheer volume and variety of unstructured data, its searchability and data quality challenges, and the overarching issues of security and compliance, unstructured data management can seem anything but “manageable.” 

Fortunately, that’s changing. 

In 2024, Gartner released its 2024 Strategic Roadmap for World-Class Security of Unstructured Data, a plan of action to help organizations manage and secure unstructured data.

Our DSPM team has reviewed Gartner’s findings and compiled this actionable list of 9 tips to improve your unstructured data security.

1. Start with data access governance

This first strategy will likely pay the most immediate dividends for improving your unstructured data security. Gartner’s report identified Data Access Governance (DAG) as the industry-standard tool for effectively reviewing which users and resources have access to which data and addressing disordered and overly permissive data access permissions and usage policies. 

With DAG tools, data security teams can prevent both accidental and malicious unauthorized access to sensitive information. These tools also help limit the potential compliance and security issues arising from the use of generative AI with sensitive data sets. DAG has been shown to be an effective way to increase the value of data while reducing risk.
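To make the idea of an access review concrete, here is a minimal sketch in Python. It is not how any particular DAG product works: the `Grant` record, the `BROAD_PRINCIPALS` group names, and the `overly_permissive` function are all hypothetical names chosen for illustration. The sketch assumes you already have a list of who can access what, plus a flag saying whether each resource is sensitive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One access permission (hypothetical record shape for this sketch)."""
    principal: str   # user or group that holds the permission
    resource: str    # file or folder the permission applies to
    sensitive: bool  # whether the resource was classified as sensitive


# Hypothetical group names treated as "open to everyone" in this sketch.
BROAD_PRINCIPALS = {"Everyone", "All Users", "Anonymous"}


def overly_permissive(grants):
    """Return the grants that expose sensitive resources to broad groups."""
    return [g for g in grants if g.sensitive and g.principal in BROAD_PRINCIPALS]


if __name__ == "__main__":
    grants = [
        Grant("Everyone", "/hr/payroll.xlsx", True),
        Grant("alice", "/hr/payroll.xlsx", True),
        Grant("Everyone", "/public/menu.pdf", False),
    ]
    for g in overly_permissive(grants):
        print(f"Review: {g.principal} can access {g.resource}")
```

A real review would also consider inherited permissions, external sharing links, and actual usage, which this sketch leaves out.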

Many DAG tools are relatively easy to implement and allow for quick wins, such as enabling generative AI without compromising data security and compliance.

Some tools, like Normalyze’s DSPM, pair DAG with Data Discovery and Classification capabilities, providing immediate returns with long-term data security and compliance protocols.

2. Adopt data discovery and classification

Implementing Data Discovery and Classification tools is another means to boost the value, utility, and effectiveness of unstructured data across your organization. Gartner’s report identifies a variety of benefits, including: 

  • Enhanced data security governance: Allow for consistent authorization, privacy, quality, and compliance standards across all silos and security controls, by providing a single source of truth for unstructured data.
  • Scalability and flexibility: More easily scale data infrastructure while providing a flexible foundation for future developments.
  • Efficiency and cost reduction: Avoid redundancy and duplicated effort of discovering and processing the same data for different products. 

To achieve the greatest accuracy in classifying unstructured data, choose tools that combine different technologies, including regular expressions, natural language processing (NLP), and large language models (LLMs). Data with a fixed format, like Social Security numbers, can typically be identified easily with regular expressions; however, information with more permutations, like names and addresses, which can occur anywhere in unstructured data, is more effectively identified via NLP and LLMs.
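As a rough illustration of the regular-expression side of classification, the Python sketch below looks for US Social Security numbers written in the common `AAA-GG-SSSS` form. The pattern, the function names, and the "sensitive" / "unclassified" labels are assumptions made for this example, not the rules any specific product uses.

```python
import re

# Illustrative pattern for SSNs written as AAA-GG-SSSS (an assumption of this
# sketch). It skips area numbers 000, 666, and 900-999, group 00, and serial
# 0000, which are not issued.
SSN_PATTERN = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every substring of the text that matches the SSN pattern."""
    return SSN_PATTERN.findall(text)


def classify(text: str) -> str:
    """Label the text 'sensitive' if it contains at least one SSN-like match."""
    return "sensitive" if find_ssns(text) else "unclassified"


if __name__ == "__main__":
    note = "Employee SSN: 123-45-6789, ext. 555-1234"
    print(find_ssns(note))  # ['123-45-6789']
    print(classify(note))   # sensitive
```

A pattern like this works because the format is fixed. Names and addresses have no such fixed shape, which is why the paragraph above points to NLP and LLMs for them.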

Together with data access governance, understanding your data – what it is and where it is – is foundational to all data security functions. 

3. Don’t focus on securing only cloud data

To follow the recent history of data security is to watch a field in a steady state of adjustment and counter-adjustment. While cloud data has been viewed until recently as the most vulnerable data store, many data security specialists now realize that on-premises data can be overlooked and vulnerable. 

In fact, unstructured data stored on premises is often at greater risk, since data security vendors frequently limit support services to cloud-based file stores. This is unfortunate, as there are still numerous practical, financial, and data security benefits to storing unstructured data on premises.  

To protect valuable on-premises data, security specialists must ensure they use data security solutions that cover data stores across all environments–not just in the cloud.

4. Don’t rely on a single solution

There is no one-size-fits-all solution when it comes to unstructured data management. Relying on the built-in capabilities of Azure or other cloud infrastructures can hinder the effectiveness of your unstructured data management. 

The best strategy is to pair your cloud infrastructure with third-party tools that offer comprehensive DAG and data discovery and classification capabilities. While using a combination of controls from multiple solutions may seem more complex, it’s the most effective method for improving your data management.  

Waiting for a one-size-fits-all product will only cause delays, drive even more operational complexity, and subject your data to continued risk.

5. Aim for quick wins to enable AI

It’s no secret that generative AI is revolutionizing many business functions. The sooner that organizations can implement these capabilities, the faster they will be able to reap their rewards—and keep from falling behind the competition.

That’s why it’s crucial to find practical, short-term wins that enable generative AI to ingest suitable unstructured data stores, while keeping all sensitive data secure and compliant. As mentioned above, DAG tools are among the favorite quick-win tools for enabling AI because they enable organizations to control oversharing of data and minimize unintended data leakage. 

For a deeper dive into the structure of generative AI applications and how to secure their various inputs, read our step-by-step guide to improving LLM security.

6. …But ensure long-term compliance strategies to use AI safely

However, expedient implementation of AI must be balanced with long-term attention to proper data classification, governance, and authorization. As digital tools continue to evolve, organizations need tactics that ensure data compliance over the long haul.

Looking to learn strategies for long-term risk mitigation for sensitive data under the purview of generative AI tools? Review our post on AI and Data Protection: Strategies for LLM Compliance and Risk Mitigation.

7. Don’t focus on dark data–instead, prioritize your data catalog

Dark data has become a hot topic in recent months since unknown data stores can result in a host of compliance and security issues for organizations. Bringing dark data to light is often a quick win for security teams.

However, a comprehensive data catalog can add more value over time. It bears repeating that a data catalog – built with comprehensive data discovery and accurate data classification – forms the foundation of all other data security functions. Security controls like DAG and DLP, as well as risk management and compliance controls, all rely on knowledge of the underlying data.

Also, note that the Data Discovery and Classification and DAG tools mentioned above often address dark data as they deal with higher-level protocols.

8. Identify unwanted data silos… and beneficial ones

Not all data silos are those notorious, isolated stores known for disorganization and lost resources. Some data silos are beneficial and appropriate, such as those created by third-party tools to help improve privacy and security throughout a data environment.

Gartner’s report notes that these silos can help in assessing unstructured data, reducing the “governance gridlock” that comes from datasets with too much complexity and scale.

In silos, data security teams can group unstructured data by purpose, sensitivity, or regulating controls, adding further divisions that help unstructured data appear more “structured.”
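The grouping itself can be very simple once each file carries a few labels. The Python sketch below assumes a hypothetical inventory where every file is a dictionary with `path`, `purpose`, and `sensitivity` keys; the field names and the `group_by_attribute` function are illustrative only.

```python
from collections import defaultdict


def group_by_attribute(files, attribute):
    """Group file paths by the value of one label, such as 'sensitivity'."""
    groups = defaultdict(list)
    for f in files:
        groups[f[attribute]].append(f["path"])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical inventory produced by an earlier discovery and
    # classification step.
    inventory = [
        {"path": "/hr/payroll.xlsx", "purpose": "hr", "sensitivity": "high"},
        {"path": "/hr/handbook.pdf", "purpose": "hr", "sensitivity": "low"},
        {"path": "/sales/leads.csv", "purpose": "sales", "sensitivity": "high"},
    ]
    print(group_by_attribute(inventory, "sensitivity"))
    print(group_by_attribute(inventory, "purpose"))
```

The same inventory can be sliced by any label, so a team can look at its data by sensitivity for security reviews and by purpose for governance without maintaining separate lists.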

9. Don’t delay

This may be our most important tip: with developments in generative AI and other tools, the threat landscape for data stores is evolving rapidly.

With increasing threats of insider risk, data leakage, ransomware, and data loss, organizations must implement the most practical, effective, and forward-thinking controls as soon as possible to prepare them for a rapidly developing future.

Normalyze: DSPM solutions

Normalyze Data Security Posture Management solutions help organizations reduce data access risks, enforce data governance, and address unstructured data concerns for both cloud and on-premises data stores. 

Normalyze is committed to enabling organizations to excel in today’s marketplace while establishing protections for their sensitive data for the decades to come.

Are you interested in learning more about Normalyze DSPM? Request a demo today.

Vamsi Koduru

Vamsi is director of product management. As a founder and entrepreneur, he is passionate about building and scaling products that change the status quo. He comes to Normalyze with a background in AML/KYC, virtual assistants, conversational design, and identities.