Multitasking operating systems give the illusion of doing more things in parallel than they have CPUs for. In reality, each CPU rapidly switches from task to task using various scheduling algorithms and heuristics, making one think the processes truly are running simultaneously. The choice of scheduling settings can be immensely important.
In a nutshell, the processors are allowed to spend finite chunks of time on each process. Each chunk is referred to as a quantum (plural: quanta), which is simply the amount of time the CPU will spend on a task before moving on. Every time the CPU switches to a different process, there is what’s called a context switch. Context switches are computationally expensive, and excessive context switching hurts any environment, more so a Terminal Services environment. So we need to avoid excessive context switching while still maintaining the illusion of concurrency.
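To make the trade-off concrete, here is a toy round-robin model in Python. It is only a sketch: Windows uses a priority-based preemptive scheduler, not plain round-robin, and the function name, workload and time units below are invented purely for illustration.

```python
from collections import deque


def count_context_switches(burst_times, quantum):
    """Run a toy round-robin schedule and count switches between tasks.

    burst_times: CPU time each task needs (arbitrary, illustrative units).
    quantum: the longest a task may run before the CPU moves on.
    """
    queue = deque(enumerate(burst_times))
    switches = 0
    last_task = None
    while queue:
        task, remaining = queue.popleft()
        # A switch only happens when a different task gets the CPU.
        if last_task is not None and task != last_task:
            switches += 1
        last_task = task
        remaining -= min(quantum, remaining)
        if remaining > 0:
            # Not finished yet: go to the back of the queue.
            queue.append((task, remaining))
    return switches


workload = [12, 12, 12]  # three hypothetical tasks of equal length
for quantum in (2, 6, 12):
    print(quantum, count_context_switches(workload, quantum))
```

With this workload a quantum of 2 produces 17 switches, a quantum of 6 produces 5, and a quantum of 12 produces only 2. The short quantum keeps every task feeling responsive, but it pays for that with far more context switches.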
Thomas Koetzing wrote a great article on Understanding and Troubleshooting context switches.
There is some excellent information on how Windows manages Processes and Threads in the updated Windows Internals, Fifth Edition book by Mark E. Russinovich and David A. Solomon with Alex Ionescu. You can still download chapter 6 from the Fourth Edition titled “Processes, Threads, and Jobs”. It’s a great read!
Changing the processor scheduling option modifies the Win32PrioritySeparation value under the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\PriorityControl key, which consists of 6 bits (AABBCC)…
- Where AA =
  01 – longer timeslice interval
  10 – shorter timeslice interval
- Where BB =
  01 – timeslice can have variable length
  10 – timeslice has fixed length
- Where CC =
  00 – foreground/background processes have same priority
  01 – foreground process gets 2 x boost compared to background process
  10 – foreground process gets 3 x boost compared to background process
When you set the performance option for processor scheduling in the GUI, you only see two possible choices for the duration of the timeslice quantum:
1) Programs: value is 38 decimal, binary is 100110 = shorter intervals, variable timeslice length, 3 x boost
2) Background Services: value is 24 decimal, binary is 011000 = longer timeslice interval, timeslice fixed length, no boost
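The bit layout is easy to check with a few lines of Python. This is a minimal sketch that only knows the field values listed in this article; the function name and the returned dictionary keys are my own, and any bit pattern not listed above is reported as such rather than guessed at.

```python
def decode_priority_separation(value):
    """Split a Win32PrioritySeparation value into its AA, BB and CC fields."""
    aa = (value >> 4) & 0b11  # bits 5-4: timeslice interval
    bb = (value >> 2) & 0b11  # bits 3-2: variable or fixed length
    cc = value & 0b11         # bits 1-0: foreground boost

    unknown = "not listed above"
    return {
        "interval": {0b01: "longer", 0b10: "shorter"}.get(aa, unknown),
        "length": {0b01: "variable", 0b10: "fixed"}.get(bb, unknown),
        "boost": {0b00: "none", 0b01: "2x", 0b10: "3x"}.get(cc, unknown),
    }


# The two GUI presets described above.
for value in (38, 24):
    print(value, format(value, "06b"), decode_priority_separation(value))
```

Running it confirms that 38 (100110) decodes to shorter, variable, 3x and that 24 (011000) decodes to longer, fixed, no boost.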
Neither of these settings is optimal for a Terminal Server, although the Programs option is the better of the two simply because shorter timeslices are mandatory for a Terminal Server environment.
If you are using the CPU Utilization Management feature, which was introduced with Citrix Presentation Server 4.0 Enterprise Edition, the 3 x boost in the default Programs value makes this feature somewhat less effective. In these circumstances a fixed timeslice length is often better than a variable one. Under these conditions it is suggested that the optimum value may be 40 decimal (28 hex), 101000 binary. That gives us short, fixed-length timeslices with no foreground boost, allowing the CPU Utilization Management feature to do its job efficiently: giving each user a fair share of the CPU by modifying the normal job priority scheduling in the operating system.
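To see where 40 comes from, the value can be assembled from the three 2-bit fields. This is just the arithmetic from the AABBCC layout written out in Python; the constant and function names are my own and are not part of any Windows or Citrix API.

```python
# Field values taken from the AABBCC description in this article.
LONG, SHORT = 0b01, 0b10
VARIABLE, FIXED = 0b01, 0b10
NO_BOOST, BOOST_2X, BOOST_3X = 0b00, 0b01, 0b10


def encode_priority_separation(interval, length, boost):
    """Pack the AA, BB and CC fields into one 6-bit value."""
    return (interval << 4) | (length << 2) | boost


value = encode_priority_separation(SHORT, FIXED, NO_BOOST)
print(value, hex(value), format(value, "06b"))  # 40 0x28 101000
```

The same function reproduces the two GUI presets: SHORT, VARIABLE, BOOST_3X gives 38 and LONG, FIXED, NO_BOOST gives 24.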
For further information refer to an FAQ on the CPU Utilization Management Feature and the CPU Rebalancer Services.
Ironically, after setting the Win32PrioritySeparation value to 40 decimal you will receive the following message in the Application Event Log the next time the “Citrix CPU Utilization Mgmt/Resource Mgmt” service is restarted:
Event Type: Warning
Event Source: CTXCPUUtilMgmt
Event Category: (1)
Event ID: 1591
Time: 12:34:28 AM
Windows is using a custom priority separation value and CPU Utilization Management performance may be degraded. To optimize CPU Utilization Management performance, on the Advanced tab of the System Properties dialog, open Performance Options and select Background services. Then restart the CPU Utilization Management service.
This seems to be logged if the Win32PrioritySeparation value is set to anything other than 26 hex (38 decimal), 18 hex (24 decimal), 2 or 0. Bizarre! Notice how the description suggests using Background services. Doesn’t it then contradict itself by accepting settings other than 18 hex without complaint? I think this is a badly worded warning message, and it can be seen as misleading.
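If you audit a number of servers, that observation can be captured in a tiny helper. Note the assumption: the list of "quiet" values comes only from what I have seen in the event logs, not from any Citrix documentation, and the names below are hypothetical.

```python
# Values that did NOT trigger Event ID 1591 in my observations (assumption,
# not a documented Citrix list).
QUIET_VALUES = {0x26, 0x18, 2, 0}


def expect_event_1591(value):
    """Return True if the warning would be expected for this value."""
    return value not in QUIET_VALUES


for value in (38, 24, 40):
    print(value, hex(value), expect_event_1591(value))
```

For 38 and 24 the helper returns False, while 40 returns True, which matches the warning shown above.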
However, if you are not using a version/edition of Presentation Server that supports this feature, or a 3rd party CPU Management application, then the Programs setting is a better choice than Background Services, purely because of its shorter timeslices.
I set this in a custom Group Policy ADM template as per the following:
; CLASS MACHINE maps to HKEY_LOCAL_MACHINE, so KEYNAME omits the hive.
CLASS MACHINE
CATEGORY "Terminal Server Tuning"
  POLICY "Processor Scheduling"
    KEYNAME "System\CurrentControlSet\Control\PriorityControl"
    PART "Optimise performance for:" DROPDOWNLIST REQUIRED
      VALUENAME "Win32PrioritySeparation"
      ITEMLIST
        NAME "Programs" VALUE NUMERIC 38 DEFAULT
        NAME "Background Services" VALUE NUMERIC 24
        NAME "Terminal Server Optimised" VALUE NUMERIC 40
      END ITEMLIST
    END PART
    PART "ONLY set to Terminal Server Optimised if using either the Citrix CPU" TEXT END PART
    PART "Utilization Management Feature, or a 3rd party CPU Management" TEXT END PART
    PART "application, such as AppSense Performance Manager, etc." TEXT END PART
    PART " " TEXT END PART
    PART "Note: Be aware that when this is set to Terminal Server Optimised" TEXT END PART
    PART "the Windows GUI will show this as being set to Background Services." TEXT END PART
    PART "This can be misleading when auditing a deployment. Therefore," TEXT END PART
    PART "always check the actual Win32PrioritySeparation setting in the" TEXT END PART
    PART "registry." TEXT END PART
  END POLICY ; Processor Scheduling
END CATEGORY ; Terminal Server Tuning
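Since the GUI cannot be trusted once a custom value is in place, the registry is the only reliable place to audit. Here is a minimal Python sketch of such a check. It uses the standard winreg module, so the read only works on Windows; the function names and the wording of the report are my own.

```python
import sys

# The only two values the Performance Options GUI can set.
GUI_PRESETS = {38: "Programs", 24: "Background Services"}


def audit_priority_separation(value):
    """Describe a Win32PrioritySeparation value for an audit report."""
    preset = GUI_PRESETS.get(value)
    if preset:
        return f"{value} (0x{value:02X}): GUI preset '{preset}'"
    return f"{value} (0x{value:02X}): custom value, do not rely on the GUI"


def read_priority_separation():
    """Read the live value from the registry (Windows only)."""
    import winreg  # imported here so the helper above works on any platform

    path = r"SYSTEM\CurrentControlSet\Control\PriorityControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "Win32PrioritySeparation")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    print(audit_priority_separation(read_priority_separation()))
```

On a server where the Terminal Server Optimised option has been applied, the report flags 40 (0x28) as a custom value even though the GUI shows Background Services.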
This information was compiled after reviewing many forum posts and articles written by some of the most respected Server Based Computing experts.