I was recently asked a question by someone who had attended my Shmoocon talk entitled “Why are Databases So Hard to Secure?”. PDF slides are available (1.34 Mb). I was going to put this into a more formal structure, but the conversational nature works really well. I would love to see comments reflecting others’ thoughts.
I found several things of interest in your talk about database security and several new things to think about.
In particular I realized that DBMSs wear at least two hats in the world of software architecture, namely as a technical service (“smart file system”) and as an application framework. Perhaps that “depth” is one of the reasons why a DBMS is hard to secure? For example, consider just the question of who or what holds user roles within a DBMS deployment. From the “deep” point of view, the “user” could be an application, or a module, or just the next layer up in the architecture stack. From the “shallow” point of view, the “user” should be an actual person whose UI actions are touching the database. I think someone in the audience was advocating that, and perhaps you were too, namely using the DBMS’s permissions and roles system to authorize actual users. But that would be different from common (even careful) practice, wouldn’t it?
I meant to grab you and re-ask my question though, namely “is all this complexity caused by the inherent difficulty of securing DBMS, or is it the _cause_ of the difficulty”? So I’ll re-ask now…
And my response:
I think that’s part of the reason DBMSs are hard to secure: the depth. Really it’s because a database is the ultimate “Web 2.0 application”: the input and output are mostly user-generated.
I think that the DBMS is actually the *cause* of the complexity, though in some way the complexity could be thought of as the cause. Basically, how we use the DBMS determines how we need to secure it. It’s because there are so many “moving parts” and because user input is allowed that we have so many problems. For instance, consider a supermarket database that contains all the shopping orders. Customers have very little, if any at all, input into this database. We choose the items, but only at a self-serve checkout line do we scan in the items. The description and cost are put into the database based on the barcode, and even fruit codes are put in by the cashier. Even at a self-serve checkout line, we can’t type in or override input; if something prices wrong we call over a cashier. Web 2.0 applications broaden who can provide input; 20 years ago nobody entered data into a database by hand unless that was their job, though plenty of people unknowingly did so by choosing from the options provided.
So that type of system is secure from user input, although we still have to worry about the data sources being spied upon. Because this DBMS trickles data down from each register to each store to central headquarters, someone can spy on the data along the way. But otherwise, it’s relatively secure: because the flow of traffic is one-way, you can’t DoS a cash register.
Using databases for storing and retrieving user-supplied input is more dangerous (and more prone to human error!). We try to mitigate this by using pulldown menus in applications, which works in theory, until someone breaks into the application or uses XSS or CSRF (cross-site request forgery) or some other sort of bypass — that’s not in the scope of database security, but if someone can get different data into the system, they might put in attacks or somesuch.
Traffic on “the internet” flows in every direction, from client to server (requesting a page and its contents, sending POST and PUT data) and from server to client (sending data), and that information passes through a lot of hands in the middle. Traffic can still be sniffed, so spying is still a problem; this is why we encourage customers to use HTTPS on important sites.
Web 2.0 doesn’t make the ‘application security’ side of a DBMS any harder: you still have to lock down permissions for config files, limit what the DBMS user can do, control ACLs for the database, and monitor and audit who does what. The problem is that the scope becomes larger. For instance, monitoring and auditing “who does what” is much harder when you have the user ‘webapp’ doing 99% of your queries.
And how do you know what’s a DoS attack versus what’s a runaway query? If a cash register became unusable because there were “too many connections” you’d know something was funny right away. But a web application might just be getting heavy traffic due to a marketing campaign that worked (or slashdotting).
The nature of the queries might also change, given the ever-evolving nature of Web 2.0. A not-for-profit might start selling merchandise to raise funds, and all of a sudden a new table springs up and gets populated with tons of data — is this bad traffic or good traffic? You need a lot of knowledge of the business to know.
The thing is, there are many ways to secure a database because, as you mentioned, it can be used as an application framework. You can schedule events, and you have triggers, views, and stored procedures, so you *can* limit the choices of the developers (who are the biggest direct end-users of the database!). In this way you can make the database more secure because you’re giving the *developers* a list of choices. Instead of having the application write a profile to the database directly, you can give the developers a stored procedure called write_profile(). You can also constrain columns to only allow certain data (using check constraints if your DBMS supplies them, but also using foreign keys).
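To make that concrete, here is a minimal sketch in Python using SQLite. Everything in it is an assumption for illustration: the table and column names are made up, and because SQLite has no stored procedures, a plain Python function named write_profile() stands in for the routine you would hand to developers. The check constraint and the foreign key are what actually refuse the bad data.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical profiles schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE countries (code TEXT PRIMARY KEY);
        INSERT INTO countries (code) VALUES ('US'), ('CA');

        CREATE TABLE profiles (
            user_id      INTEGER PRIMARY KEY,
            -- Check constraint: names must be 1 to 40 characters long.
            display_name TEXT NOT NULL
                         CHECK (length(display_name) BETWEEN 1 AND 40),
            -- Foreign key: only countries from the lookup table are accepted.
            country      TEXT NOT NULL REFERENCES countries (code)
        );
        """
    )
    return conn


def write_profile(conn, user_id, display_name, country):
    """Stand-in for a stored procedure: the one sanctioned way to add a profile."""
    with conn:  # commit on success, roll back if a constraint fails
        conn.execute(
            "INSERT INTO profiles (user_id, display_name, country) "
            "VALUES (?, ?, ?)",
            (user_id, display_name, country),
        )


if __name__ == "__main__":
    db = open_db()
    write_profile(db, 1, "alice", "US")
    try:
        write_profile(db, 2, "bob", "ZZ")  # 'ZZ' is not in countries
    except sqlite3.IntegrityError as exc:
        print("rejected by the database:", exc)
```

The point of the sketch is that the rejection happens in the database itself, so it still applies if someone bypasses the application's pulldown menus.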
The good news is that the fact that the DBMS is more complex allows more ways to secure it. It’s difficult, for instance, to have a different way of securing how data is written to a file, because you don’t have the flexibility in your operating system to do so. Oh sure, you can decide whether to use fsync() or not. But a DBMS supplies a “dumb” way to store and retrieve information — by using INSERT and SELECT and other SQL queries — and you can then make more complex routines to be “smarter”.
This would make auditing and monitoring easier, because any direct SQL query that isn’t a routine call is suspect. It doesn’t avoid DoS, and it might only help a bit with application vulnerabilities like code in a profile, or SQL injection, but it certainly does make it easier to find suspect activity. And that’s really the hard part, not just avoiding the bad stuff from happening, but recognizing when it has happened.
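As a rough sketch of what “any direct SQL query that isn’t a routine call is suspect” could look like, assume you can get the statements out of a query log as plain strings, and that routines are invoked with CALL. The routine names and the log format here are hypothetical, and a real audit would need to handle comments, multi-statement strings, and whatever syntax your DBMS uses to invoke routines.

```python
import re

# Hypothetical allow-list of routines the application is supposed to use.
ALLOWED_ROUTINES = {"write_profile", "read_profile"}

# Matches statements of the form: CALL routine_name( ...
CALL_PATTERN = re.compile(r"^\s*CALL\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(", re.IGNORECASE)


def suspect_statements(statements):
    """Return every statement that is not a call to an allowed routine."""
    suspects = []
    for statement in statements:
        match = CALL_PATTERN.match(statement)
        if match is None or match.group(1).lower() not in ALLOWED_ROUTINES:
            suspects.append(statement)
    return suspects


if __name__ == "__main__":
    log = [
        "CALL write_profile(1, 'alice', 'US')",
        "SELECT * FROM profiles",
        "call read_profile(1)",
        "CALL drop_everything()",
    ]
    for statement in suspect_statements(log):
        print("suspect:", statement)
```

This only finds the suspect activity; deciding whether a flagged statement is an attack or a developer taking a shortcut still takes a human.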
So it’s kind of both: the fact that the DBMS is used in such a complex way (accepting user-supplied data instead of a list of acceptable entries, and broadening users from “company employees” to “anyone who registers” or just “anyone who visits the site”) makes the DBMS hard to secure, but the fact that the DBMS is complex also makes the security more complex.
Does that help? This is a great conversation and basically exactly what I wanted to get across during the talk.
And then he responded with:
> Does that help?
Well, it helps in the sense that it gives me more to think about…
And now I’d say that if many-moving-parts is the presenting problem (regardless of cause and effect) then there need to be best-practices of security based on a small set of profiles that the database or DBMS is playing in a system.
For example, the “web application” profile. Or the “web services middleware” profile. Or the “smart desktop file system” profiles. Or the “enterprise back end” profiles.
Each of these profiles should be hardened in a different way.
e.g. maybe web applications should use triggers to prevent certain kinds of data, and have real-user database roles corresponding to the users of the app. Maybe web services middleware should have application identifiers as user roles with grants limited to what’s needed for each app. Maybe enterprise back ends should have severe limits on the number and sources of connections.
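One way to picture the idea is as a small lookup from profile to hardening checklist. This is only a sketch of the suggestions in this email, written as a Python dictionary; the names are illustrative, and the desktop profile is left empty because no measures were proposed for it.

```python
# Hypothetical mapping of deployment profiles to the hardening measures
# suggested in the discussion. Not an established standard.
HARDENING_PROFILES = {
    "web application": [
        "triggers that reject certain kinds of data",
        "real-user database roles matching the users of the app",
    ],
    "web services middleware": [
        "application identifiers as user roles",
        "grants limited to what each app needs",
    ],
    "enterprise back end": [
        "severe limits on the number of connections",
        "severe limits on the sources of connections",
    ],
    # No measures were suggested for this profile in the discussion.
    "smart desktop file system": [],
}


def checklist_for(profile):
    """Return the hardening checklist for a profile, or fail loudly."""
    if profile not in HARDENING_PROFILES:
        raise ValueError(f"no hardening profile defined for {profile!r}")
    return list(HARDENING_PROFILES[profile])


if __name__ == "__main__":
    for measure in checklist_for("web services middleware"):
        print("-", measure)
```

Failing loudly on an unknown profile is deliberate: a deployment that doesn’t fit any profile is exactly the case someone should look at by hand.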
If hardening is so hard :-) then partitioning the space of uses into well-defined profiles might be a way to get a handle on it.
What do y’all think?