Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

BI Security

Last updated Mar 28, 2003.

A man and his wife were sitting at the breakfast table in their house one morning. He looked in the other room at an older woman watching TV on the couch. He lowered his voice and said to his wife "Honey, she's been here for five years. Don't you think it's time that you ask your mother to look for her own place?" "My mother?" came the startled reply, "I thought she was your mother!"

Before I explain any other concepts in our series on Business Intelligence, I need to cover security. It might be more logical to describe the security infrastructure in each technology, such as data-mining and OLAP, but if you don't have a firm grip on the basics of data security, it won't be secure down the line.

To be sure, you need to understand the mechanics of how to secure your system, but today I'm going to cover a more general topic on data security from the source systems through the end reporting landscape.

The difficulty in properly securing the data all the way through the system lies in the chain of custody. Each source system of data has a level of security defined within it, but once the data leaves, that system's security measures don't apply. Once the data is imported into the next system down the line, that system mixes its data and transformations to the existing information, once again secured by another paradigm. When the data leaves that system, the chain is once again broken.

Compounding the problem is the fact that different administrators control the data at the various stages. Once the data leaves their system, it's no longer their concern. This ends up in a situation kind of like the old woman as in the story at the beginning. No one knows what state the data is in.

If your organization decides to implement a Business Intelligence landscape, one of your primary tasks will be to help everyone understand what that really means. Depending on the type of landscape you'll implement, such as data marts, data warehousing, data mining, On Line Analytical Processing (OLAP), or a combination of them, a lot of data will be flowing across the enterprise. Many business professionals won't realize that once the data leaves their Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) system that the data it contains will be exposed. The data might be temporarily stored in flat-files on file shares, and sometimes those shares aren't always closely guarded. In addition, sending flat-file data across systems is exposed to network sniffing software.

Many times as DBA's we're focused on the mechanics of this movement, because that's a daunting task all by itself. But if you're in charge of the BI effort, your first goal is to understand what the movement of data means, not just how it is moved. This is critical not only for understanding what the data eventually means, but how it is secured.

The best way to protect the data from source system through your landscape is to understand the path of the data all the way through, define the groups that have access to it at each stage, and protect it appropriately through the connection points.

Before you do that, you want to make sure that the data is protected not just from snooping or loss, but that the end users who see the data are allowed to. For that reason, you should start by detailing and documenting the meaning of the data, and who should have access to it. This is an independent exercise from the landscape implementation.

Start with the meaning of the data. In each system, there is information represented by columns and results of operations on columns. When you design security within an On Line Transaction Processing (OLTP) system, your major concern is on securing these objects. In a BI landscape, however, data is viewed as information, and you should replace "column" in your conversations with that word. Begin by creating a questionnaire that covers each kind of information. For instance, questions regarding your manufacturing systems might include "Who should be allowed to see information regarding our stock levels?" This will probably prompt a response such as "You mean at each plant, or regionally? They're different sets of people." And that's the proper exchange to have. It forces the organization's staff to begin thinking in the right terms.

If your organization is decentralized, make sure that headquarters understands that administrators from various sites might have access to the data while it moves through their level. If they aren't comfortable with that, you might have to implement a central collection methodology, which can get expensive. You might have to implement that anyway if you're collecting vendor information or from military or government systems.

Group the data into the smallest level of information objects that you can, but no smaller. Then group the users into friendly names that everyone can understand such as "primary research analysts" or "financial managers". Create a matrix that ties these objects together, like this:

 

Inventory Level
(region)

Units Shipped
(Site)

Unit Cost
(Line)

Regional Manager

X

X

 

CFO

X

X

X

Site Manager

 

X

X

The matrix might not match the security in the source systems, since they will most likely include security principals on a more granular level and have groups that won't have access to the BI systems. This is a bit different than developing an application Create, Read, Update and Delete (CRUD) matrix, since all the data will be read-only in an analysis system.

Next, once you've defined your landscape (which I'll cover in the next few articles) you should graphically document the path the information will take. For instance: "The information leaves the ERP system, is staged on an FTP site, is imported to a file store, and is stored in a data mart. From there the data is extracted to a data warehouse staging table, transforms are applied, and is then used in four reports and one cube." That sentence doesn't work for a lot of business people, but if you draw circles and arrows with that information, they can see it clearly.

To complete the process of discovery, you can draw colored lines from the circles to the names of groups of people that can access those objects. That makes it very clear who can see what.

The point of this exercise is for the organization to tell you who is allowed access to the data. As the DBA, you're a bit like a bank. You don't control how people spend their money, but you do protect it and control who has access to it. Your bank doesn't decide who can get in to your account. They rely on you to tell them that. The same holds true for your BI landscape. You can collect the data, move the data around, and present it for display, but you shouldn't try to decide who has access to a given piece of information.

When the organization signs off on the security scheme in the matrix, you can use that to create your groups. You'll have to create "super-groups", meaning security principals that lie above the groups represented in the source systems. They might even include special accounts for vendors or other interested parties.

Finally, define the current security in the source systems, and create documentation to cover the transport to the final display of that data. Use this to create your security within each level of your BI landscape.

This will all become clear once we begin to cover each technology, but it's important to create these processes prior to designing anything else. Don't be surprised if you uncover some unhealthy security practices in your current systems. Don't focus on that; just fix it and move on. You have a lot more to work to do before your system is in place and ready for business.

Informit Articles and Sample Chapters

Larissa T. Moss and Shaku Atre have a great book on BI that also discusses security called Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications.

Online Resources

You can find a good Business Intelligence resource here that covers security and other BI topics.