Did you know that Microsoft now supports content databases up to 4TB and beyond? Hang on, though: before you design or adjust your information and service architectures, there are a number of assumptions, caveats, and trade-offs you must understand. We'll discuss these and how database size affects performance, content recovery, and day-to-day administration tasks. We'll then look at various techniques to help you scale out your storage tier. We close the session by sharing the very latest guidance on using RBS (Remote BLOB Storage) in your environments.
Sizing your Content Databases: Understanding the Limits
1. Sizing Your Content Databases:
Understanding the New Limits
Randy Williams
AvePoint
2. Randy Williams
• Enterprise Trainer & Evangelist – AvePoint
• 20+ years in IT
● developer, consultant, trainer, author
• Three-time SharePoint MVP
• Speaker at many global conferences
randy.williams@avepoint.com
http://linkd.in/plEEb1
@tweetraw
5. The SharePoint storage dilemma
• Documents, databases, and BLOBs
• Storage growth
[Diagram: SharePoint on SQL Server 2008/R2 with multiple content databases; active content vs. actual content]
6. Previously supported limits
• Large, single-site repositories and archives (records center): 1 TB
• General use scenarios: 200 GB
• Site collection: 100 GB *
* A larger site collection is supported if it is the only site collection in the database
7. Revised limits (July ‘11)
• General use scenarios: 200 GB
• All scenarios: 4 TB, caveats apply
• Document archive scenario: no explicit limit, caveats apply
• Site collection: no explicit size; limit by scenario, database size, item count
8. Understanding scenarios
• SharePoint is multi-purpose
• Scenario primarily refers to needs and
usage patterns
● Read/write centric
● Concurrent users
● Average/peak loads
● Recovery objectives
• Isolate different usage patterns to
separate databases
9. Common scenarios
Record Center
• Long term retention
• Low volatility – very few write operations
• Limited reads
• Larger databases
Team Site
• Day to day collaboration w/ shorter retention
• Higher volatility
• Higher reads
• Smaller databases
10. What are the 4TB-level caveats?
• A larger db requires faster storage
● Between 0.25 – 2.0 IOPS/GB
● 4TB DB : 1000 IOPS minimum
• Plans developed for DR/HA
• Capacity planning/perf testing
• Recognize added complexity
● Skilled architects and proactive admins
• 60M total item limit per db
http://technet.microsoft.com/en-us/library/cc262787.aspx
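The IOPS/GB range above translates directly into a storage throughput target. The following is a minimal sketch of that arithmetic; the function name and the decimal TB-to-GB conversion are assumptions chosen so that the result matches the slide's "4TB DB : 1000 IOPS minimum" figure.

```python
# Assumption: decimal terabytes (1 TB = 1000 GB), which reproduces the
# slide's "4TB DB : 1000 IOPS minimum" figure.
GB_PER_TB = 1000


def required_iops(db_size_tb: float, iops_per_gb: float) -> float:
    """Return the storage IOPS needed for a database of the given size."""
    return db_size_tb * GB_PER_TB * iops_per_gb


if __name__ == "__main__":
    # Low and high ends of the 0.25 - 2.0 IOPS/GB range from the slide.
    for density in (0.25, 2.0):
        print(f"4 TB at {density} IOPS/GB -> {required_iops(4, density):.0f} IOPS")
```

At the low end of the range a 4 TB database needs about 1000 IOPS; at the high end the same database needs about 8000.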
11. What are the >4TB caveats?
• All 4TB caveats, plus
• Document Center or Record Center only
• In any given month
● <5% of content accessed
● <1% of content modified
• No alerts, user workflow, item-level
security, et al
http://technet.microsoft.com/en-us/library/cc262787.aspx
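The monthly access and modification thresholds above can be checked against usage figures. This is only a sketch: the function name is hypothetical, and measuring "content" by item count (rather than by bytes) is an assumption, since the slide does not say which measure applies.

```python
def fits_over_4tb_usage(total_items: int,
                        accessed_in_month: int,
                        modified_in_month: int) -> bool:
    """Check the slide's monthly usage caveats for databases over 4 TB.

    Assumption: content is measured by item count.
    """
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    accessed_ratio = accessed_in_month / total_items
    modified_ratio = modified_in_month / total_items
    # Slide thresholds: <5% of content accessed, <1% of content modified.
    return accessed_ratio < 0.05 and modified_ratio < 0.01


if __name__ == "__main__":
    # Hypothetical archive: 1M items, 3% read and 0.5% changed in a month.
    print(fits_over_4tb_usage(1_000_000, 30_000, 5_000))
```

The other caveats on this slide (site template, no alerts, no user workflow, no item-level security) are configuration checks and are not covered by this sketch.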
12. Why is 200GB still a good number?
• Support operations are much easier
• Better performance
● The larger the db, the slower it gets
• Easier to meet backup and recovery
objectives
● Most recoveries begin with a db restore
● Can you meet your recovery objectives?
• Patching / upgrading is faster
13. Why are larger DBs slower?
• Select queries take longer
● More rows to filter, group and sort
• Write queries take longer
• Lock escalation
● More blocking
• More data, but data cache same size
• DB maintenance takes longer
● reindex
● dbcc checkdb
14. What happens as size increases?
http://technet.microsoft.com/en-us/library/hh395916.aspx
17. Achieving storage performance
• Storage array (RAID 1+0)
● 10 300GB SAS drives, 15k RPM
● 1.5 TB effective space
● ~1500 IOPS = 1.0 IOPS/GB
• Set of drives (RAID 1+0)
● 4 750GB SATA drives, 10k RPM
● 1.5 TB effective space
● ~300 IOPS = 0.2 IOPS/GB
• Go with higher quality storage
● SAS > SATA ; SAN > DAS
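The two arrays above can be compared with a small calculation. This is a rough sketch: the per-drive IOPS values (150 for the SAS drives, 75 for the SATA drives) are assumptions derived by dividing the slide's totals by the drive count, and the model ignores RAID write penalties and controller caching.

```python
from dataclasses import dataclass


@dataclass
class Raid10Array:
    """Simplified RAID 1+0 model: half the raw capacity is usable."""
    drives: int
    drive_gb: int
    iops_per_drive: int  # assumed figure, not a vendor specification

    @property
    def usable_gb(self) -> float:
        # Mirroring halves the raw capacity.
        return self.drives * self.drive_gb / 2

    @property
    def total_iops(self) -> int:
        return self.drives * self.iops_per_drive

    @property
    def iops_per_gb(self) -> float:
        return self.total_iops / self.usable_gb


if __name__ == "__main__":
    sas = Raid10Array(drives=10, drive_gb=300, iops_per_drive=150)
    sata = Raid10Array(drives=4, drive_gb=750, iops_per_drive=75)
    for name, array in (("SAS", sas), ("SATA", sata)):
        print(f"{name}: {array.usable_gb:.0f} GB usable, "
              f"{array.total_iops} IOPS, {array.iops_per_gb:.1f} IOPS/GB")
```

Both arrays give the same 1.5 TB of usable space, but only the SAS array falls inside the 0.25 to 2.0 IOPS/GB range cited earlier.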
18. Scaling storage
• Multiple storage arrays (RAID 1+0)
• Break out into multiple LUNs
• Add additional data files to DB, one per array
• Advice
● Many smaller drives > fewer larger ones
● RAID 1+0 > RAID 5
[Diagram: data files F: SP_DocCenter_1.mdf, G: SP_DocCenter_2.ndf, H: SP_DocCenter_3.ndf, I: SP_DocCenter_4.ndf; log file J: SP_DocCenter.ldf]
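Adding one data file per array can be scripted. The sketch below only generates T-SQL `ALTER DATABASE ... ADD FILE` statements as text; the database name `SP_DocCenter`, the logical file names, and the drive-root paths are assumptions modeled on the file names in the slide's diagram, and no size or growth options are set here.

```python
def add_data_file_sql(database: str, drive: str, index: int) -> str:
    """Build a T-SQL statement that adds one secondary data file.

    The naming scheme (<database>_<index>.ndf at the drive root) is an
    assumption modeled on the slide's diagram.
    """
    logical = f"{database}_{index}"
    path = f"{drive}:\\{logical}.ndf"
    return (f"ALTER DATABASE [{database}] ADD FILE "
            f"(NAME = N'{logical}', FILENAME = N'{path}');")


if __name__ == "__main__":
    # One secondary file per additional array/LUN: G:, H:, and I:.
    for index, drive in enumerate(("G", "H", "I"), start=2):
        print(add_data_file_sql("SP_DocCenter", drive, index))
```

Review the generated statements before running them against a real instance; in particular, initial size and growth settings would normally be specified as well.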
19. Additional performance guidance
• How many data files?
● Advice varies – between 0.25 and 1 per physical CPU
● Each on a different spindle/LUN
• Adjust database growth settings
● Use 50-100MB for each data file
● Use 20-40MB for log
• Enable instant file initialization
• Optimize tempdb
● Use multiple data files
● Pre-size to 25% of largest db
● RAID 1+0
http://slidesha.re/pwVlJM
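The rules of thumb above reduce to simple arithmetic. This is a minimal sketch with hypothetical function names; rounding the lower bound up and enforcing a minimum of one data file are assumptions, not part of the slide's guidance.

```python
import math


def data_file_count_range(physical_cpus: int) -> tuple[int, int]:
    """Return (low, high) data file counts for 0.25 to 1 file per CPU."""
    low = max(1, math.ceil(0.25 * physical_cpus))  # assumption: round up, at least 1
    high = max(1, physical_cpus)
    return low, high


def tempdb_presize_gb(largest_db_gb: float) -> float:
    """Pre-size tempdb to 25% of the largest database."""
    return 0.25 * largest_db_gb


if __name__ == "__main__":
    print(data_file_count_range(8))
    print(tempdb_presize_gb(200))
```

For example, a server with 8 physical CPUs would land between 2 and 8 data files, and a 200 GB largest database suggests a 50 GB tempdb.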
20. Demo (if time permits)
DB SETTINGS AFFECT
PERFORMANCE
21. Achieving Disaster Recovery
• Built-in SharePoint backup is incapable of
working with large capacities
● Site collection backup limit : 15GB
● Practical database backup limit : 200GB
• Look at your backup/recovery objectives
● Most recoveries involve a database restore
• Look for third-party solutions
• Deploy SP1 – site recycle bin
http://slidesha.re/rlv3u1
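Because most recoveries involve a database restore, database size feeds directly into whether a recovery time objective can be met. The sketch below is a rough estimate only: the sustained restore throughput is a hypothetical input that has to be measured in your own environment, and the function names are illustrative.

```python
def restore_hours(db_size_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate restore time, assuming a constant sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = db_size_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


def meets_rto(db_size_gb: float, throughput_mb_per_s: float,
              rto_hours: float) -> bool:
    """True if the estimated restore fits inside the recovery time objective."""
    return restore_hours(db_size_gb, throughput_mb_per_s) <= rto_hours


if __name__ == "__main__":
    # Hypothetical figures: 100 MB/s restore throughput, 4-hour RTO.
    for size_gb in (200, 4096):
        print(size_gb, round(restore_hours(size_gb, 100), 2),
              meets_rto(size_gb, 100, 4))
```

The estimate covers only the raw data transfer; locating the backup, replaying logs, and reattaching the database to the farm add to the real recovery time.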
23. Remote BLOB Storage (RBS)
• Stores documents (BLOBs) outside the database
● Reduce database size
• Cannot be used to scale beyond database
limits
● Effective size = DB size + BLOB store
• Can externalize based on document size
• Built in RBS support with SQL Server
2008 (FILESTREAM provider)
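The effective-size rule above means externalized BLOBs still count toward the supported limit. A minimal sketch of that check follows; the function names and the example sizes are hypothetical.

```python
def effective_size_gb(db_size_gb: float, blob_store_gb: float) -> float:
    """Effective size = DB size + BLOB store, per the slide."""
    return db_size_gb + blob_store_gb


def within_limit(db_size_gb: float, blob_store_gb: float,
                 limit_gb: float) -> bool:
    """True if the database plus its externalized BLOBs fit the limit."""
    return effective_size_gb(db_size_gb, blob_store_gb) <= limit_gb


if __name__ == "__main__":
    # Hypothetical: a 1000 GB database with 3500 GB externalized,
    # checked against a 4000 GB limit.
    print(effective_size_gb(1000, 3500), within_limit(1000, 3500, 4000))
```

In this example the database file itself is well under the limit, but the effective size is not, so RBS has not made the design supportable.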
24. Overview of BLOB externalization
[Diagram labels: Upload, Web Front-end, SQL Server, Pointer (stub), RBS, File System]
Externalized BLOB is transparent to both SharePoint and its users
25. Advantages of externalizing BLOBs
• Reduce storage costs
• Increase performance
● Read & write
● All other activity by users of the DB and SQL server
• Access to features of BLOB storage
platform
• Efficient content restructure
● Shallow copy in SP1
26. Advantages of keeping BLOBs in SQL
• One storage container to
● Maintain
● Monitor
● Recover
• Tier I storage
● Performance relative to lower tiers of storage
benefits all content access
• SQL caching
● Performance of reads/writes of small documents
● SQL caching benefits reads
27. RBS Guidance
• Consider using in document-heavy databases
• Trade off
● Storage cost & performance benefits versus
● More complex architecture (support, DR, HA)
• Consider third party providers
● More full-featured solutions
• In general
● Do not externalize <1MB documents
● Ideal number varies widely
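To see how much a size threshold would actually move out of the database, document sizes can be split around it. This sketch uses the 1 MB floor from the guidance above as the default; the function names are hypothetical, the input is assumed to be a list of document sizes in bytes, and the right threshold for a given environment should be tested rather than taken from here.

```python
MB = 1024 * 1024


def should_externalize(size_bytes: int, threshold_bytes: int = MB) -> bool:
    """Documents below the threshold stay in SQL Server."""
    return size_bytes >= threshold_bytes


def split_by_threshold(sizes: list[int],
                       threshold_bytes: int = MB) -> tuple[int, int]:
    """Return (bytes kept in the database, bytes externalized)."""
    inline = sum(s for s in sizes if not should_externalize(s, threshold_bytes))
    external = sum(s for s in sizes if should_externalize(s, threshold_bytes))
    return inline, external


if __name__ == "__main__":
    # Hypothetical library: one 200 KB, one 5 MB, and one 1 MB document.
    print(split_by_threshold([200 * 1024, 5 * MB, MB]))
```

Running the split against a real size distribution at several thresholds is one way to compare candidate settings before committing to one.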
29. In review
• 4TB is the new supported limit for all
scenarios
• No limit for record/document centers
• Keys to achieving larger sizes
● Storage performance planning/testing
● DR/HA planning/testing
• RBS offers benefits but does not extend
these limits
30. Your Feedback is Important
Please fill out a session evaluation form and drop it off at the conference registration desk.
Thank you!
Introduce concept of documents being stored as BLOBs in CDB
BUILD: Diagram of architecture
Discuss storage growth
BUILD: Bloat of data, mostly inactive
BUILD: Burden on CDBs
Discuss need to think about storage holistically: lifecycle, compliance, SLAs, cost