If you need more than 4Gb of storage for your SharePoint 2010 environment and blob linking is not for you, working with SQL Server is on your plate. 2010 has a lot of databases, each and every one of which has a role and performance profile. Knowing these can help you squeeze out that extra bit out of your environment.
A typical, or maybe not so depending on your individual needs/budget/etc, SharePoint 2010 environment has the potential to look like this. If yours does not, and almost certainly it does not 🙂 , I’m sure you can see similarities.
Viewing this in a compartmentalized fashion skewed towards where you can gain literal and alluded performance improvements you have roughly three major sections.
A: The database
B: The WFEs
C: The UI / General payload
I’m going to focus on the databases, what they are, and what they do. With knowledge comes power as they say.
So starting with the basics…
A normal installation of SQL drops a couple of databases on the disk.
Expect to see:
- MASTER: Records all system level information for a SQL Server instance. E.g. logins, configuration, and other databases.
- MSDB: Records operators and is used by SQL Server Agent to schedule jobs and alerts.
- MODEL: Is used as the template for all the databases created in the instance.
- TEMPDB: All temporary tables, temporary stored procedures, and anything else that is ‘temporary’ is stored here. Like a cache this is recreated every time the SQL Server instance is started.
- Of these four, only the last one TEMPDB really should be a candidate for its own spindle.
- Size wise, TEMPDB is the only one that you should expect any type of growth from.
- All four scale vertically, as in you can only allow them to grow in size rather than add in extra databases.
- ReportServer: This stores all the metadata for reports such as definitions, history, scheduling, and snapshots. When ReportServer is actually in use the report documents are stored in SharePoint Content databases.
- ReportServerTempDB: When reports are running temporary snapshots are found here.
- Both of these databases must coexist on the same database server.
- They have a heavy read load.
- The ReportServerTempDB can grow if there is heavy use of cached snapshots. Just like with the System databases, all four scale vertically, as in you can only allow them to grow in size rather than add in extra databases.
Microsoft SharePoint Foundation 2010 drops a number of databases to your disks. (If you install Search Server 2010 Express you will see some of these databases. I have put a * next to the name if it is a factor in play for that kind of scenario.)
Expect to see:
- Configuration*: As inferred by its name, this is where information about the environment is stored. Databases, IIS sites, Web Applications, Trusted Solutions, Wb Part Packages, site templates, blocked file types, quotas, they’re all in here. There can only be one configuration database per farm and you can expect it to remain relatively small.
- Central Administration Content*: Just like the configuration database, there can only be one, it does really grow much, and as its name infers it stores the Central Administration sites content.
- Content*: All site content is stored in these guys. Site pages, documents, lists, web part properties, audit logs, user names and rights, they’re all in here. Office Web Application data is also stored in here.
- Usage*: The Usage and Health Data Collection Service Application uses this database. This write heavy database is unique in that it is the only database that can be (and should be) queried directly by external applications. There can only be one per farm. Health monitoring and usage data used for reporting and diagnostics is in here.
- Business Data Connectivity*: External Content Types and related objects are in here. It is typically a read heavy yet small sized database.
- Application Registry*: You really only see this if backward compatibility with the BDC API is needed. Typically this can be deleted AFTER upgrading is complete.
- Subscription Settings: Features and settings for hosted customers gets put in here. It is not something you normally see as it has to be manually created using Powershell or SQL Server. If it is present you can expect it to remain small and busy only from reads.
- Content databases can hold multiple site collections and their content.
- They can get BIG. Their size is dependent on size and count of both content and users.
- 200GB should be your ceiling. Odds are you will see a performance hit as you pass that number. 1TB or more are an option but for Record Centers and the like. I.e. non collaborative usage and data retrieval (read only.) Add more content databases as you reach that 200GB level.
- Due to its nature it is wise to move the Usage database to a separate spindle.
Right so now lets move on to the Standard edition.
- Search Administration*: Used by the Search application the ACL and configuration information is in here. Growth can be vertical or horizontal – if you create new instances of the service application.
- Crawl*: Again used by the Search application, the state of the crawled content/data and the historical crawl records are in here. This database can get a bit big and if you are crawling LARGE amounts of data it is a good idea to not have it on the same server as the next database, Property. It is also a good idea to be using the Enterprise version of SQL so that you can leverage the data compression functionality. This is a read heavy database.
- Property*: Again used by the Search application associated data to the crawls is stored in here such as history, crawl queues, and general properties. If you have large volumes of data, move this to its own server and you will see performance improvements with regard to search query result returns. This is a write heavy database.
- Web Analytics Reporting*: Used by the Web Analytics application, aggregated report tables, fact data grouped by site – date – asset, diagnostic information can be found in here. Depending on your policies this database can get massive. As it grows scale out by adding more databases.
- Web Analytics Staging*: Temporary data (unaggregated) is in here.
- State*: Used by the State Service application, InfoPath Forms Services, and Visio Services. Temporary state information for Exchange is also in here. More State databases can be added via PowerShell.
- Profile: Used by the User Profile service application, users and their information is found and managed in here. Additional databases can be created if you create additional service instances. Expect lots of reads and medium growth.
- Synchronization: Again, used by the User Profile service application. It contains configuration and staging data for use when a directory service such as AD is being synched with. Scaling up or out is the same as with Profile. Its size can be influenced by not just the number of users and groups but their ratios as well. Expect fairly balanced read vs write activity.
- Social Tagging: Again, used by the User Profile service application. As users create social tags and notes they are stored, along with their URLs in here. Expect growth to be representative of how much your community embraces the functionality of social tagging. Scaling up or out is the same as with Profile. This is definitely more read than write heavy.
- Managed Metadata Service: Syndicated content types and managed metadata is stored in here. Growth can be vertical or horizontal – if you create new instances of the Managed Metadata service application.
- Secure Store*: Account names, passwords, and their mappings are stored in here. This is where you may have to collaborate with other internal groups as it is suggested that this database be hosted on a seperate database instane with limited access. Growth can be vertical or horizontal – if you create new instances of the service application.
- Watch the Web Analytics Reporting database. It can get very big.
- Use compression for Crawl if you have the Enterprise version of SQL.
Right so now lets move on to the Enterprise edition…
- Word Automation Services: Used by the Word Automation service application information about impending and completed document conversions is to be found in here. This database can be expected to remain relatively small.
- PerformancePoint: Used by the PerformancePoint service application settings, persisted user comments, and temporary objects are stored in here. It can be expected to be read heavy and can be scaled up or out.
- Draft: Not accessible to end users this is used to store data used by the project queue. There can only be one as far as I know.
- Published: When a project is published this is where it lives. Timesheets, resources, custom fields, and all the other metadata is stored in here. This database can get large.
- Archive: Backup data of projects, resources, calendars, etc. is in here. Growth can be limited, if it is not and is used it will get large.
- Reporting: This is the repository of the entire project portfolio. When you publish a project plan, a copy will be put in here. Read heavy and large…
- Search Administration: Stores and manages search setting groups, keywords, synonyms, promotion/demotions, best bets, and search schema metadata. It should stay on the small side and remain read heavy.
- PowerPivot Service Application: Cached and loaded PowerPivot files are in here along with usage data and schedules. You can expect this database to remain small.
- PowerPivot files can be rather big and PowerPivot stores data in content databases as well as the central administration database.
It has to be mentioned as well that each and every environment can prove to be unique so you may not see value in moving anything other than your content databases to another platter.
So what does this look like in the wild? Viewing a screenshot of, an admittedly deliberately slightly duplicated environment, you can see these databases as they would appear in real life…