SQL Server Integrity Check – A Necessary Routine

Feb 14, 2014 / By Murilo Miranda

Tags: , , ,

Today I’ll be discussing Integrity Check on SQL Server – we have some very good info in the community, but I’d like to summarize my thoughts here. When a DBA talks about Integrity Check, the conversation is usually redirected to one subject: CheckDB. And yes, we are going to talk about it in today’s post. :)

The DBCC CheckDB is a medicine of sorts that can prevent headaches, and remedy possible damages. But it should be used with caution: Without a well-planned backup strategy you can lose data. Even with all of the warnings about the importance of Integrity Check, it’s not uncommon to see instances without it.

“I’m a Manager just trying to understand this… Could you explain this at a high level?”

Let’s cut the blah, blah, blah and go ahead with this… I’ll introduce the CheckDB for those who have never heard of it – probably less technical individuals: The CheckDB command is a Database Console Command, or simply DBCC, categorized as a Validation DBCC. This is a necessary step in the entire instance maintenance, where we can detect, and in some cases repair a possible problem with the database integrity. Running the DBCC CheckDb regularly will help to anticipate problems, and even a unexpected shutdown of the instance. We’ll go into more detail a bit later…

“What’s the objective of the CheckDB?”

CheckDB has two roles: The main objective is the integrity check itself, and the second is the correction of the findings. Please not that the CheckDB verifies the database structure and not the data consistency.

Integrity Check: The command CheckDB follows a few steps on its execution, passing from the following phases:

  • Verification of the allocated structures, i.e. GAM, SGAM, PFS, IAM…
  • Verification of table consistency, including all of its indexes.
  • Verification of the system catalogs.

Correction: After the CheckDB “standard execution” as described above, we will have information about possible damaged pages. The CheckDB itself can fix these damages.

CheckDB has an option that allows the correction of the database structure, but the data allocated into the repaired page will be lost. This option is “REPAIR_ALLOW_DATA_LOSS”, and this option changes the damaged page for a new one. This way, we save the integrity of the database. Note that we are talking about the physical structure – not about data integrity, which is why we lose data. After the execution of this command, a verification based on backups should be made in order to identify and recover the lost data.

Another option is “REPAIR_REBUILD” used to correct non clustered indexes. On this option we don’t lose data. Even though we have options to correct pages, there are few limitations:

  • PFS pages, system tables and data purity detected errors aren’t fixable by CheckDB.
  • To repair using CheckDB, the database should be on SINGLE_USER mode.

“How is the CheckDB executed?”

Not too many people are curious to look deeper and discover more about this, but the CheckDB doesn’t run directly over the database. A hidden database snapshot is created and the process runs based on this snapshot. Many people only realize this when the CheckDB execution fails, and once they begin digging on it, they find out about “some snapshot” that failed the creation.

While we’re on the topic, do you know what to do if the snapshot creation fail? I can see 3 options:

  • Fix the permission to create the snapshot.
  • Execute the CheckDB with the TABLOCK option. This can bring problems, as the DB will be a target of locks.
  • Create a process (job) doing the following steps:
    1. Create a snapshot of the database.
    2. Run the DBCC CheckDB on this snapshot. Yes, the DBCC CheckDB can run over databse snapshots. And it’s obvious that a snapshot of a snapshot won’t be created.

“When and where should we execute an Integrity Check?”
It’s recommended to check  the integrity of all the databases, without exception! The best approach is to execute the CheckDB everyday. But we understand that on busy systems (instances with hundreds of databases, VLDBs, etc.) this can be nearly impossible.

Some interesting facts:

  • SQL Server maintenance plan ignores the TempDB, but if the TempDB becomes corrupt the instance will shutdown. Use caution with this.
  • Executing the CheckDB on the master database will cause the execution on the mssqlsystemresource database as well.

Execution Options: There are few execution options for CheckDB:

  • NO_INFOMSGS: Avoid the output of info messages.
  • ALL_ERRORMSDGS: Allow the output of error messages.
    • Default since SQL Server 2008 SP1.
  • NOINDEX:  Skips nonclustered indexes verification.
  • DATA_PURITY: Validates de data based on the column characteristcs.
    • Default since SQL Server 2005.
  • ESTIMATEONLY: Estimates the space needed on TempDB.
    • As per Paul Randal, this option is broken on newer versions of SQL Server.
  • TABLOCK: Uses locks instead of database snapshots.
    • Useful when the snapshot creation is failing.
  • EXTENDED_LOGICAL_CHECKS: Allows the validation on XML columns and Spatial indexes.
    • This can be costly.
  • PHYSICAL_ONLY: Skips most of the logical verifications.
    • Need less resources to run, but a “full” execution should be sone periodically.

Other Approaches to execute: As CheckDB is a resource-intensive process, in some cases the execution may take a long time to complete. Depending on the environment, the Integrity Check on all the databases (or even in one very large database) may not finish on time, namely, the integrity check will exceed the maintenance window limit. To work around this, we can reproduce a full CheckDB executing its subset commands. As follows:

  • CHECKALLOC
  • CHECKTABLE
  • CHECKCATALOG
  • CHECKFILEGROUP

By executing those commands, one per day, we can cover a full CheckDB execution in phases. Another possible approach is execute the backups with the “CHECKSUM” option. This way we will be able to restore using the option “VERIFY ONLY” and do an extra level of verification. Anyway, we will still need the CheckDB.

So, that’s all about CheckDB. There are more information around there, but the best place to get info from CheckDB is on SQL Skills blog, where the CheckDB guru writes few articles about it. See you in another post ;)

6 Responses to “SQL Server Integrity Check – A Necessary Routine”

  • Manoj says:

    Thank you for always keeping me up to date with changes! Great post.

  • I think this paragraph needs some clarification


    By executing those commands, one per day, we can cover a full CheckDB execution in phases. Another possible approach is execute the backups with the “CHECKSUM” option. This way we will be able to restore using the option “VERIFY ONLY” and do an extra level of verification. Anyway, we will still need the CheckDB.

    1- CHECKFILEGROUP can combine both CHECKALLOC & CHECKTABLE of every table and indexed view in the filegroup so at some point we don’t need to use CHECKTABLE for any table inside the FG that has been targeted by CHECKFILEGROUP ; I just wanted to clarify this.

    2- Another possible approach is execute the backups with the “CHECKSUM” option.

    Not quite really, I’d stress that the only way to detect a corruption in a DB is to run checkdb. Instead of much talk , here’s a good blog

    http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-2730-use-backup-with-checksum-to-replace-dbcc-checkdb/

    Some of approaches I suggest is to restore DB to another instance and run CHECKDB. You can run checkdb against a snapshot (ENT edition) , good idea if you have a mirror instance) but that’s not a true image of your production database.

    HTH

    • Murilo Miranda says:

      Hi Mohammed,

      First of all, thank you for the comments!
      Yes, the idea was pass another approach not something to replace the full CHECKDB.

      TY again,
      MM

  • vijay says:

    Nice compile.

    A little known nugget …for kicks:-)
    If the checkdb reports any errors, Its prudent to run drop clean buffers before you start to panic and run repair allow data loss. Any guesses why?

  • Murilo Miranda says:

    Nice adding Vijay!

    We can find more info about this here: http://www.sqlskills.com/blogs/paul/buffer-pool-disfavoring/

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>