SQL Server essentials for SharePoint (1)

SQL Server is an essential part of every SharePoint environment. Since the vast majority of SharePoint data is stored in a SQL Server database your SharePoint environment can never be faster than the speed at which SQL Server can handle the requests. Unfortunately most people installing and maintaining SharePoint databases aren’t trained SQL administrators. They usually are SharePoint administrators, or even SharePoint developers (like me). This means that on a lot of SharePoint environments performance isn’t what it could be, due to poor or nonexistent SQL Server management.
In this series I will list a few simple things that you can do to get more out of your SQL Server and SharePoint environment. I won’t explain the why of every tip, but I will focus on the how you can make your SharePoint environment faster and more stable by improving your SQL Server management. Some of the tips and tricks will only be relevant if you have multiple servers and a large budget, but other tips can be beneficial to even the smallest installations. The series is not intended to be the last word on SQL Server for SharePoint, but rather useful information, hints and tips for getting more out of your database platform.

This first post will talk about things you do before and while installing SQL Server.

 

Installing SQL Server

The first pieces of guidance is very straightforward and needs to be considered even before you start installing SQL Server.

  • For a production environment you should always run SQL Server on a separate physical server. Single server environments (where SQL Server and SharePoint are installed on the same server) can be used as development or test environments, but are not suitable for production and user acceptance environments.
  • Another tip for production and user acceptance environments; make sure you have at least 4GB of memory per processor in your SQL Server box.
Understand that SharePoint uses several different databases and that they all have different purposes and uses.

  • Configuration Database (Config DB).
    Stores information about your SharePoint environment like server names and Web Application URLs and a site collection list with names and URLs. This database isn’t used very heavily during normal operations.
  • Content Databases (Content DB).
    There is at least one Content Database per Web Application. Stores all data about site collections, sub sites, list, libraries and items. Data stored in the environment by end users ends up in (one of the) content database(s). In a collaboration environment content databases should be optimized for writing, in a web content management environment content databases should be optimized for reading.
  • Central Administration Content DB.
    Stores content for the Central Administration Web Application. It is generally not recommended to store large amounts of data in here (e.g. TechNet documents).
  • Shared Services Provider Database (SSP DB).
    Stores configuration data for the Shared Services Provider (there is one DB per SSP), and also all User Profiles, Audiences, Excel Trusted File Locations etc.
  • Shared Services Provider Search Database (SSP Search DB).
    Stores crawled and managed properties identified and indexed by MOSS search. This database also stores information about which users have access to what content. Depending on your crawl schedule and the use of your environment the use of this database can be both read and write intensive. Note that the content index is *not* stored in this database but rather on the file system of the Index and Query servers.
  • SSP Admin Content Database (SSP Admin Content DB).
    Stores content for the SSP Administration Web Application (there is one DB per SSP). It is generally not recommended to store large amounts of data in here, or to use the SSP Web Application and database for hosting My Sites.
  • Windows SharePoint Services Search Database (WSS Search DB).
    This database stores the full text index of content in a WSS only deployment. In the case of MOSS only the full text index of the SharePoint help files are stored in the WSS Search DB. This database is not required in a MOSS deployment, unless help search is needed.
  • Single Sign On Credential Store DB (SSO DB).
    Optional is using the MOSS SSO capability. Stores the credential cache for SSO applications.
  • TempDB.
    TempDB is SQL Server System database that is heavily used by SharePoint as an intermediate database for writing information to any of the SharePoint databases. Because of this, TempDB is very read and write intensive. A slow TempDB will cause all write actions to all SharePoint databases to be slow as well.

 

RAID

Next you need to start thinking about the appropriate RAID configurations for the disks you’re going to store your SQL Server installation and databases on.
RAID (Redundant Array of Independent Disks) is a disk system that contains multiple disk drives, called an array, to provide greater performance, reliability, storage capacity, and lower cost. There are several different types of RAID arrays and each type is called a level. Each level uses a different algorithm to implement fault tolerance. The most common RAID levels are 0, 1, 5 and 1+0, which are also the levels that are used for SharePoint.

  • RAID 0 is also known as disk striping. Two or more disks are used for a RAID 0 configuration. All data and programs that are installed on a server using RAID 0 are divided in blocks and are equally distributed across all disks. This means that RAID 0 improves the read and write performance and uses all available disk capacity. A downside is that there is no fault tolerance. If one of the disks crashes you lose the data that was stored on it.

RAID 0 
RAID 0

  • RAID 1 is called disk mirroring because  all disks have an identical mirror disk. All data that is written to the primary disk is written to the mirror disk as well. This means that RAID 1 provides fault tolerance. It also means that effectively you can only use half of your disk capacity for storage and that write actions will be slower as data has to be written to two locations, instead of just one location.

RAID 1 
RAID 1

  • RAID 5 is known as striping with parity. When using RAID 5 you will need at least three disks. All information is split in blocks, where half of each block is stored on a certain drive, and the other half of that block is stored on another drive. The third drive stores the parity information. If one of the first two drives fails the parity information can be used to reconstruct the missing data. This means that RAID 5 provides fault tolerance without storing all information twice. The downside is that you need at least three disks and that writing information is slower, because the parity has to be adjusted every time data is written to a disk.

raid5 
RAID 5

  • RAID 1+0 or RAID 10 is also called mirroring with striping. RAID 10 uses a striped array of disks like the one used for RAID 0 and then mirrors that like in RAID 1 (hence the name, RAID 1+0). For example if you have a striped array of two disks than that set of disks is mirrored on two other striped disks. RAID 10 provides both good reading performance and fault tolerance. The downsides of RAID 10 are that you need at least four disks, that you can only use half of your disk space for storage and that writing will be a bit slower because all data gets stored twice.

The more physical volumes you have, the better it is. If you can separate anything out onto separate disks, you should do so in the following order:

  • The first disk can be RAID 1 and will be used to store the operating system and SQL installation files on. If you only have 1 disk, don’t let it be RAID 1, but try and make it RAID 10 or 5.
  • If you can separate one thing out from the rest make it TempDB, as SharePoint writes to and reads from TempDB all the time. Your SharePoint installation can’t be faster than your TempDB. The disk that you store TempDB on should be a RAID 10 disk
  • If you have another disk, use this one for the SQL transaction logs and make this disk also a RAID 10.
  • In the case you have a fourth disk use this one for the SQL data files. This one should also be RAID 10. If your environment is read intensive, like an Internet presence site, where not a lot data is written to SQL you should separate out the SQL data files before the transaction log files, as the transaction log is only used when data is written to SQL and not when data is read from SQL.

Should you be in the lucky situation where you have more than four disks separate out the log and data files for the separate databases. Start by separating out the search database and the content databases, go for the SSP database after that.

If you have a choice, use more smaller and faster drives, instead of having less larger drives.

If you have to prioritize among faster disks use the following ranking order:

  • SQL TempDB data and transaction log files
  • Database transaction log files
  • Search database
  • Content databases

 

While installing SQL

While you are running the SQL Server install wizard make sure you check the following things:

  • You need to install the SQL Agent
  • Select windows authentication (SQL authentication is evil)
  • When choosing collation settings select LATIN1_General_CI_AS_KS_WS
    • Accent-sensitive
    • Kana-sensitive
    • Width-sensitive
  • Go to the Surface area configuration and click DB Engine. Make sure that remote connections are allowed
  • Select TPC/IP and Named Pipes
  • Configure SQL Server and SQL Server Agent using a regular Domain Service Account, do not use and Administrator or your SharePoint Farm Account as the SQL Server Service Account
  • Double check to make sure that the SQL Agent starts automatically
  • You don't need the browser, so don’t install it
  • Also don’t install Analysis Services or Reporting Services on the machine(s) hosting your SharePoint databases
  • Don’t install any other components that you don’t need
  • Restart SQL Server service

 

Database sizing

A database is build up out of several smaller components. 

  • Databases consists of filegroups
  • Filegroups consist of files
  • Files consist of extents
  • Extents consist of pages
  • Pages consist of data

The size of a page is always 8kB and since an extent consists of 8 pages an extent is 64kB. Because of this you should format your RAID array using 64kB blocks.

To get an optimal performance and to minimize fragmentation you should always pre-grow your data and log files. If you are unsure of the total amount of space you need to allocate for a database or a log it’s ok to over-allocate. Even if you have pre-allocated space you need to leave auto-growth on, just to make sure that your SharePoint environment doesn’t grind to a hold should your database reach the size you pre-grew it to. Do set auto-growth to fixed amounts, not to percentages, to prevent your database from growing out of control. Keep monitoring your SQL environment to make sure that the size of your data and log files doesn’t get out of hand and that your hard drive doesn’t run out of space.

If you pre-allocate space for you log files the most ideal way to do this will in most cases be by allocating 8GB at the time. If your database is smaller than 8GB you probably will want to grow your log files by using smaller amounts.

 

Other advice

Your database server is the least ideal candidate of all SharePoint server types to be virtualized because SQL has the highest amount of disk I/O activity and can often have high memory and processor requirements.

The biggest database throughput killers are:

  1. TempDB performance
  2. Slow disk IO. The advice is to use at least 15,000 RPM disks
  3. Inadequate memory. This will prevent SQL Server’s ability to cache data.