Why is Database Security So Hard?


I was recently asked a question by someone who had attended my ShmooCon talk entitled “Why are Databases So Hard to Secure?”. PDF slides are available (1.34 MB). I was going to put this into a more formal structure, but the conversational nature works really well. I would love to see comments reflecting others’ thoughts.

I found several things of interest in your talk about database security and several new things to think about.

In particular, I realized that DBMSs wear at least two hats in the world of software architecture: technical service (“smart file system”) and application framework. Perhaps that “depth” is one of the reasons a DBMS is hard to secure? For example, consider just the question of who or what holds user roles within a DBMS deployment. From the “deep” point of view, the “user” could be an application, a module, or just the next layer up in the architecture stack. From the “shallow” point of view, the “user” should be an actual person whose UI actions are touching the database. I think someone in the audience was advocating that, and perhaps you were too: using the DBMS’s permissions and roles system to authorize actual users. But that would be different from common (even careful) practice, wouldn’t it?

I meant to grab you and re-ask my question, though, namely: “is all this complexity caused by the inherent difficulty of securing a DBMS, or is it the _cause_ of the difficulty?” So I’ll re-ask now…

Best regards

And my response:

I think that depth is part of the reason DBMSs are hard to secure. Really, it’s because a database is the ultimate “Web 2.0 application”: pretty much all of its input and output is user-generated.

I think that the DBMS is actually the *cause* of the complexity, though in some ways the complexity could be thought of as the cause. Basically, how we use the DBMS determines how we need to secure it. It’s because there are so many “moving parts”, and because user input is allowed, that we have so many problems. For instance, consider a supermarket database that contains all the shopping orders. Customers have very little input into this database, if any at all. We choose the items, but only at a self-serve checkout line do we scan them in ourselves. The description and cost go into the database based on the barcode, and even fruit codes are entered by the cashier. Even at a self-serve checkout line, we can’t type in or override input; if something rings up wrong, we call over a cashier. Web 2.0 applications broaden who can provide input; 20 years ago, nobody entered data into a database by hand unless that was their job, though plenty of people unknowingly did so by choosing from the options provided.

So that type of system is secure from user input, although we still have to worry about the data sources being spied upon. Because this DBMS system trickles data from each register to each store to central headquarters, someone can spy on the data along the way. But otherwise it’s relatively secure: because the flow of traffic is one-way, you can’t DoS a cash register.

Using databases for storing and retrieving user-supplied input is more dangerous (and more prone to human error!). We try to mitigate this by using pulldown menus in applications, which works in theory, until someone breaks into the application or uses XSS (cross-site scripting), CSRF (cross-site request forgery), or some other sort of bypass. That’s not in the scope of database security, but if someone can get different data into the system, they might put in attacks or the like.

What happens when you are on a social networking site and put <G> in your profile to simulate a grin? There are known cases where JavaScript is put into someone’s profile; maybe it redirects to a site so someone gets more “clickthroughs”, or maybe it goes to a malware site. In this case the DBMS does its job of storing and retrieving data, and the actual database isn’t at risk, but the people are. That isn’t a database security issue per se, but if someone can put code in their profile, they might be able to do SQL injection too.
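The classic defense against that last step is to never build SQL out of user-supplied strings. Here is a minimal sketch using Python’s `sqlite3` (an in-memory stand-in for a real DBMS; the `profiles` table and function names are hypothetical) contrasting string interpolation with a parameterized query:

```python
import sqlite3

# In-memory database with a hypothetical profiles table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (username TEXT, bio TEXT)")
conn.executemany("INSERT INTO profiles VALUES (?, ?)",
                 [("alice", "hello <G>"), ("bob", "likes databases")])

def get_bio_unsafe(username):
    # String interpolation: attacker-controlled input becomes part of the SQL.
    return conn.execute(
        f"SELECT bio FROM profiles WHERE username = '{username}'").fetchall()

def get_bio_safe(username):
    # Parameterized query: the driver treats the input strictly as data.
    return conn.execute(
        "SELECT bio FROM profiles WHERE username = ?", (username,)).fetchall()

attack = "nobody' OR '1'='1"
print(len(get_bio_unsafe(attack)))  # 2 -- the injected OR exposes every row
print(len(get_bio_safe(attack)))    # 0 -- treated as one odd, nonexistent name
```

The injected `OR '1'='1'` turns the unsafe query into “match everything”, while the parameterized version hands the whole string to the driver as a single value, so it matches nothing.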

Traffic on “the internet” flows in every direction, from client to server (requesting a page and its contents, sending POST and PUT data) and from server to client (sending data), and that information passes through a lot of hands in the middle. Sniffing traffic can still be done, so spying is still a problem — this is why we encourage customers to use https on important sites.

Web 2.0 doesn’t make the ‘application security’ side of DBMSs any harder: you still have to lock down permissions on config files, limit what the DBMS user can do, control ACLs for the database, and monitor and audit who does what. The problem is that the scope of the problem becomes larger. For instance, monitoring and auditing “who does what” is much harder when you have the single user ‘webapp’ doing 99% of your queries.
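One way to push some of that auditing into the database itself is a trigger-based audit log, which records every change regardless of which connection made it, even the shared ‘webapp’ user. A minimal sketch using Python’s `sqlite3` (table and trigger names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("""
    CREATE TABLE audit_log (
        ts      TEXT DEFAULT CURRENT_TIMESTAMP,
        action  TEXT,
        account INTEGER,
        old_bal INTEGER,
        new_bal INTEGER
    )
""")
# The trigger fires on every balance update, no matter who issues it,
# so even a shared application login leaves a per-change trail.
conn.execute("""
    CREATE TRIGGER audit_balance AFTER UPDATE OF balance ON accounts
    BEGIN
        INSERT INTO audit_log (action, account, old_bal, new_bal)
        VALUES ('update', OLD.id, OLD.balance, NEW.balance);
    END
""")

conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 250 WHERE id = 1")
row = conn.execute("SELECT action, old_bal, new_bal FROM audit_log").fetchone()
print(row)  # ('update', 100, 250)
```

This doesn’t tell you *which person* was behind the ‘webapp’ connection, but it does give you a tamper-evident record of what changed and when, which is the raw material for the monitoring described above.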

And how do you know what’s a DoS attack versus what’s a runaway query? If a cash register became unusable because there were “too many connections” you’d know something was funny right away. But a web application might just be getting heavy traffic due to a marketing campaign that worked (or slashdotting).

The nature of the queries might also change, given the ever-evolving nature of Web 2.0. A not-for-profit might start selling merchandise to raise funds, and all of a sudden a new table springs up and gets populated with tons of data — is this bad traffic or good traffic? You need a lot of knowledge of the business to know.

The thing is, there are many ways to secure a database because, as you mentioned, it can be used as an application framework. You can schedule events, and use triggers, views, and stored procedures, so that you *can* limit the choices of the developers (who are the biggest direct end-users of the database!). In this way you can make the database more secure because you’re giving the *developers* a list of choices. Instead of having the application write a profile to the database directly, you can provide a stored procedure called write_profile(). You can also restrict data types to allow only certain data, using check constraints if your DBMS supplies them, but also using foreign keys.
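A minimal sketch of that idea, again using Python’s `sqlite3` (which has no stored procedures, so a single Python routine stands in for the hypothetical write_profile() procedure; table names and the age bounds are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# A lookup table of allowed statuses acts like an enumerated type.
conn.execute("CREATE TABLE statuses (status TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO statuses VALUES (?)",
                 [("active",), ("suspended",), ("deleted",)])

# The CHECK constraint bounds the age; the foreign key restricts status
# to the lookup table's values.
conn.execute("""
    CREATE TABLE profiles (
        username TEXT PRIMARY KEY,
        age      INTEGER CHECK (age BETWEEN 13 AND 120),
        status   TEXT NOT NULL REFERENCES statuses(status)
    )
""")

def write_profile(username, age, status):
    # The one routine the application is supposed to go through,
    # in the spirit of the write_profile() stored procedure above.
    conn.execute("INSERT INTO profiles VALUES (?, ?, ?)",
                 (username, age, status))

write_profile("alice", 30, "active")          # accepted

for bad in [("bob", 7, "active"),             # fails the CHECK constraint
            ("carol", 25, "superuser")]:      # fails the foreign key
    try:
        write_profile(*bad)
    except sqlite3.IntegrityError as e:
        print("rejected:", e)
```

Even if a developer bypasses the routine, the constraints still hold: the database rejects out-of-range and unknown values no matter how the INSERT arrives.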

The good news is that the DBMS’s extra complexity also allows more ways to secure it. It’s difficult, for instance, to secure how data is written to a file in different ways, because your operating system doesn’t give you that flexibility. Oh sure, you can decide whether or not to use fsync(). But a DBMS supplies a “dumb” way to store and retrieve information, via INSERT, SELECT, and other SQL queries, and you can then build more complex routines on top of it to be “smarter”.

This would make auditing and monitoring easier, because any direct SQL query that isn’t a routine call is suspect. It doesn’t avoid DoS, and it might only help a bit with application vulnerabilities like code in a profile or SQL injection, but it certainly does make it easier to find suspect activity. And that’s really the hard part: not just preventing the bad stuff from happening, but recognizing when it has happened.

So it’s kind of both: the fact that the DBMS is used in such a complex way (accepting user-supplied data instead of a list of acceptable entries, and broadening users from “company employees” to “anyone who registers” or just “anyone who visits the site”) makes the DBMS hard to secure, but the fact that the DBMS itself is complex also makes the security more complex.

Does that help? This is a great conversation and basically exactly what I wanted to get across during the talk.

And then he responded with:

> Does that help?

Well, it helps in the sense that it gives me more to think about…

And now I’d say that if many-moving-parts is the presenting problem (regardless of cause and effect), then there need to be security best practices based on a small set of profiles that the database or DBMS plays in a system.

For example, the “web application” profile. Or the “web services middleware” profile. Or the “smart desktop file system” profile. Or the “enterprise back end” profile.

Each of these profiles should be hardened in a different way.

e.g. maybe web applications should use triggers to prevent certain kinds of data, and have real-user database roles corresponding to the users of the app. Maybe web services middleware should have application identifiers as user roles, with grants limited to what’s needed for each app. Maybe the enterprise back end should have severe limits on the number and sources of connexions.

If hardening is so hard :-) then partitioning the space of uses into well-defined profiles might be a way to get a handle on it.

What do y’all think?



4 Comments

Hmm. Saying a database is both a smart filesystem and an application framework is like saying that a hammer is both a wrench and a screwdriver. I.e., NOT. :-)

The thing about securing databases is that we must prevent illegitimate access while permitting legitimate access. The complexity is not in the database; it’s in the definition of illegitimate vs. legitimate. In real-world scenarios, this distinction is complex and full of fine-grained special cases.

For instance, I want to allow users access to their own account details. But I don’t want users to access each other’s account details. Unless the user is an administrator. Or a merchant, in which case they should be able to access certain information about the user, such as her shipping address. But only if that user has authorized the merchant by ordering something from him. But that authorization is time-limited. Unless that user is a repeat customer of that merchant. Except when either the user or the merchant have blacklisted the other, because they had some bad transaction in the past.

The possible exception cases go on and on! Trying to block every case that needs to be blocked, while preserving access in cases that need access is hard. It’s not because the technology itself is complex. It’s because the real-world processes that we’re trying to model are complex.


Bill, you are right, of course: it would be unwise to design a system in which the (same?) database was being both a smart file system and an application framework at the same time. But I didn’t say that originally; I said that DBMSs are able to play either role. To continue your metaphor: a DBMS is neither a wrench nor a screwdriver nor a hammer, but it can be any one of these.

Of course _requirements_ are complicated and hard to get right in general. So we learn and apply requirements patterns and software architecture plans to get a handle on that complexity, to control it. Your example of user access is an excellent one, and how to deal with it depends on domain design decisions (what is a user? what attributes do they have at various layers of the system?). My point, though, is that a framework architecture (i.e. the DBMS) that potentially cross-cuts so many layers has the potential to get wired up into all those decisions you try to make, and can therefore easily blur the layers of abstraction. (Imagine a machine whose instruction set depended on the spelling of your name.) And my suggestion is: perhaps there are typical patterns of use (I called them “profiles”) which, once recognized, could be hardened in different ways. This discussion was about security, after all.

Indra Setia Dewi
July 14, 2011 11:55 pm

Thank you for your posting. I use it for my paper assignment.

