SharePoint 2013 Capacity Management
Capacity Management
Capacity management is an ongoing process. You need to plan for growth and change, so that your environment can continue to deliver an effective business solution.
What to Monitor
These are the key performance counters that Microsoft recommends monitoring; they drive the decision to scale up or scale out.
| # | Counter |
| --- | --- |
| 1 | % Processor Time |
| 2 | Disk - Average Disk Queue Length |
| 3 | Disk - % Idle Time |
| 4 | Disk - % Free Space |
| 5 | Memory - Available MBytes |
| 6 | Memory - Cache Faults/sec |
| 7 | Memory - Pages/sec |
| 8 | Paging File - % Used |
| 9 | Paging File - % Used Peak |
| 10 | Network Interface - Total Bytes/sec |
| 11 | Process (w3wp and owstimer.exe) - Working Set |
| 12 | Process (w3wp and owstimer.exe) - % Processor Time |
| 13 | Application Pool Recycles |
| 14 | ASP.NET - Requests Queued |
| 15 | ASP.NET - Request Wait Time |
| 16 | ASP.NET - Requests Rejected |
How to Monitor:
There are several ways to monitor, including free and commercial options. Here are some methods I have found; I will show the steps for the free ones.
- SCOM (System Center Operations Manager).
- 3rd-party tools: several companies have monitoring products; I will not go into details about them.
- Performance Monitor data collector sets, plus a tool to analyze the output. To analyze the resulting .blg file, use a tool called PAL, available on CodePlex (pal.codeplex.com).
- Diagnostic Studio: Microsoft SharePoint Diagnostic Studio 2010 (SPDiag version 3.0) was created to simplify and standardize troubleshooting of Microsoft SharePoint 2010 Products, and to provide a unified view of collected data. Administrators of SharePoint 2010 Products can use SPDiag 3.0 to gather relevant information from a farm, display the results in a meaningful way, identify performance issues, and share or export the collected data and reports for analysis by Microsoft support personnel. It is mainly intended for SharePoint 2010 but can be used with 2013 as well.
- PowerShell scripts that export counter data to CSV or .blg files.
The following sample needs two files: one containing the script and one containing the exact names of the counters listed above. We will then schedule a job in Task Scheduler to collect the data.
Sample PowerShell that collects counter data and generates a CSV file:

```powershell
# Read the counter names from a text file and take 10 samples, 10 seconds apart
$counterResults = Get-Counter -Counter (Get-Content E:\Counters.txt) -MaxSamples 10 -SampleInterval 10
$fileName = "ServerName-{0:yyyyMMdd-HHmmss}.csv" -f (Get-Date)
Export-Counter -Path E:\$fileName -FileFormat csv -InputObject $counterResults
```

Or you can export to a .blg file instead:

```powershell
Export-Counter -Path $home\PercentProcessorTime.blg -Force -FileFormat "BLG" -InputObject $counterResults
```
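As an illustration, the Counters.txt file read by the script could contain the counter paths in Performance Monitor syntax, one per line. This is a sketch covering a few of the counters from the table above; adjust instance names (e.g. `_Total` vs. per-instance) to your environment:

```text
\Processor(_Total)\% Processor Time
\PhysicalDisk(_Total)\Avg. Disk Queue Length
\PhysicalDisk(_Total)\% Idle Time
\Memory\Available MBytes
\Memory\Pages/sec
\Paging File(_Total)\% Usage
\Network Interface(*)\Bytes Total/sec
\Process(w3wp*)\Working Set
```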
How to schedule a Job:
- Use Windows Task Scheduler: create a new task and set four or five triggers per day at different times, so that you capture data during both business and non-business hours and get a full picture of the counters.
- In the Actions tab of the task you created, choose "Start a program" and point it to PowerShell:
C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
In the arguments field, enter: -ExecutionPolicy Bypass -File E:\logcounter.ps1
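If you prefer the command line over the Task Scheduler UI, the same task can be sketched with schtasks.exe. The task name and start time below are illustrative assumptions; this creates one daily trigger, so repeat the command with different /ST values (or add triggers in the UI) to cover business and non-business hours:

```powershell
# Assumed task name "SPCounterLog" and 09:00 start time; the script path
# matches the E:\logcounter.ps1 example above.
schtasks /Create /TN "SPCounterLog" /SC DAILY /ST 09:00 `
  /TR "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -ExecutionPolicy Bypass -File E:\logcounter.ps1"
```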
- Running this job for several days will generate many files. To combine them into one, open a command prompt, change to the folder containing the files, and run: copy *.csv newfile.csv
- This generates a single CSV file. The last step is manual: open the CSV file and remove the duplicated header rows and empty rows so they do not skew the results. Then save it as an Excel file; by selecting the first column together with any other column, you can generate a graph for that counter and compare it against the thresholds below.
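The manual cleanup can also be automated. The following sketch merges the counter CSVs and drops the duplicated header rows; the E:\ folder and output file name are assumptions:

```powershell
# Import-Csv treats the first row of each file as the header, so merging the
# imported objects removes the duplicated header rows automatically.
# This assumes all files were produced by the same script and share one header.
$files  = Get-ChildItem -Path E:\ -Filter *.csv
$merged = $files | ForEach-Object { Import-Csv $_.FullName }
$merged | Export-Csv -Path E:\merged.csv -NoTypeInformation
```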
Notes:
- Sometimes when you try to create a scheduled task on a vanilla server you will get an error; it is usually related to the security policy. Check this link for your reference.
- Capture baseline data whenever you build a new server or farm, so you have a reference point for comparison and troubleshooting.
- If you exported the data to .blg files, you can analyze them with PAL (pal.codeplex.com); if you exported to CSV, use the steps above to analyze the data.
- To merge several .blg files, use relog.exe.
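For example, relog.exe (included with Windows) can merge several binary logs into one, and can also convert a .blg file to CSV; the file names here are illustrative:

```powershell
# Merge two binary performance logs into one .blg file
relog E:\day1.blg E:\day2.blg -f BIN -o E:\merged.blg

# Convert the merged log to CSV for analysis in Excel
relog E:\merged.blg -f CSV -o E:\merged.csv
```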
Thresholds and Actions:
Microsoft specifies actions for removing bottlenecks. Compare the results you gathered in the previous steps against the table below, then take the appropriate actions for capacity and scaling up or out.
| # | Counter | Potential Bottleneck Condition | Action |
| --- | --- | --- | --- |
| 1 | % Processor Time | Persistently greater than 75% | Upgrade the processor, add additional processors, or add additional servers |
| 2 | Disk - Average Disk Queue Length | Gradually increasing | Upgrade to faster disks, increase the number of disks, implement data striping, or move some data to alternative servers |
| 3 | Disk - % Idle Time | Persistently less than 90% | Increase the number of disks; move some data to alternative disks or servers |
| 4 | Disk - % Free Space | Persistently less than 30% | Increase the number of disks; move some data to alternative disks or servers |
| 5 | Memory - Available MBytes | Less than 2 GB on a WFE server | Add memory; remove unnecessary services |
| 6 | Memory - Cache Faults/sec | Persistently greater than 1 | |
| 7 | Memory - Pages/sec | Persistently greater than 10 | |
| 8 | Paging File - % Used | Greater than 50% | |
| 9 | Paging File - % Used Peak | Greater than 75% | |
| 10 | Network Interface - Total Bytes/sec | Persistently greater than 40% of capacity | |
| 11 | Process (w3wp and owstimer.exe) - Working Set | Greater than 80% of total memory | |
| 12 | Process (w3wp and owstimer.exe) - % Processor Time | Persistently greater than 75% | |
| 13 | Application Pool Recycles | Several per day | Verify application pool settings; ensure application pools are not set to recycle automatically when unnecessary |
| 14 | ASP.NET - Requests Queued | Persistently large numbers (hundreds) | Add WFE servers |
| 15 | ASP.NET - Request Wait Time | Persistent delays | Add WFE servers |
| 16 | ASP.NET - Requests Rejected | Greater than 0 | Add WFE servers |
Tips for Performance:
Here are some tips to improve performance. Sometimes thresholds are exceeded not because you need to scale up or out, but because of issues that need to be resolved or best practices that are not being followed.
- Keep latency between the SQL Server and SharePoint servers at 1 ms or less.
- Retrieving the first byte from SQL Server should take no more than 20 ms.
- Create a SQL alias (cliconfg.exe).
- Disable the loopback check.
- Apply the latest OS updates.
- Apply the SharePoint service packs.
- Use hardware load balancers.
- Keep log files on a disk other than the primary (system) disk.
- Use a naming convention for service accounts, services, and database names.
- Disable unused Windows services (e.g. Spooler, AudioSrv, TabletInputService, WerSvc).
- Use warm-up scripts.
- Set quotas on web applications.
- Enable compression.
- Enable caching.
- Avoid long backup jobs.
- Review and validate custom code.