Why can't we do a "detect and install immediately"?
Let's talk about the realities of a "detect and install now" function versus the normal operation of WSUS.

The best-case scenario is that a "detect and install now" will hit all of your systems within milliseconds. That is, after all, what everybody says they want. Unfortunately, that type of functionality carries some major issues, the most significant being: What happens if ALL of your systems query a web service on your WSUS server at the same time? A WSUS server is normally sized to handle its client load as a balanced load spread over 22 hours. Unless your server is significantly oversized for its normal load, the very first thing you'll observe is that the "Big Red Button" will probably bring your WSUS server to a standstill... assuming it doesn't crash it entirely. Or, more likely, a majority of your clients will time out trying to access the web services (they'll get '500' errors), and your "Big Red Button" will have to be pushed a few more times just to get past the timeout errors.

So... let's say we "throttle" the detections from the clients by sequencing them: selected by target group, by site, randomly, perhaps by GUID, or even individually by checking a box on a per-machine basis.

First question: How long will it take for ALL of your WSUS clients to initiate a detection if they have to be queued up one at a time? Well, allowing 30 seconds for each client to execute the detection, 200 clients will take a little over an hour and a half (200 x 30 seconds = 100 minutes).

Second question: How long will it take for ALL of your WSUS clients to download the update package if they have to be queued up one at a time? (Notice we're not even discussing what happens if 200 clients try to download the same file from the WSUS server simultaneously. Remember BITS?) Let's assume the package is relatively small, and it takes 30 seconds to download the content.
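Both questions come down to the same queue arithmetic. As a rough sketch in Python (the function name is made up for illustration; the 200-client and 30-second figures are the ones used in this discussion):

```python
def sequential_queue_minutes(clients: int, seconds_per_client: float) -> float:
    """Total wall-clock minutes when clients are handled strictly one at a time."""
    return clients * seconds_per_client / 60.0


# Figures from the discussion: 200 clients, 30 seconds allowed per detection.
detection_minutes = sequential_queue_minutes(200, 30)
# The same 30-second allowance applied to the download step.
download_minutes = sequential_queue_minutes(200, 30)

print(f"detection queue: {detection_minutes:.0f} minutes")
print(f"download queue:  {download_minutes:.0f} minutes")
print(f"back to back:    {(detection_minutes + download_minutes) / 60:.1f} hours")
```

Change the two arguments to match your own client count and per-client allowance; the total scales linearly with both.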
Thirty seconds on a LAN connection will move about 225 megabytes of data, but thirty seconds on a 1 Mbit/sec WAN connection will move only about 3 megabytes (not an unusual size for an update package). If you're lucky and nothing else is interfering with the data transfers, you won't clog up your WAN pipe(s). The truth is that server load is an irrelevant consideration in this discussion; what will really impact organizations in terms of the "Big Red Button" is how much time it takes for their client base to transfer the updates across the WAN pipes, and how much BITS will throttle (maybe even choke) those transfer sessions at the WSUS server.

So, let's allow, practically speaking, about 60 seconds per client to perform a detection and download of the content. That's close to 3.5 hours (200 clients at 60 seconds each is 200 minutes) just to get the detections and downloads completed using the "Big Red Button", and assuming we're not planning to saturate our network with update traffic. No doubt some environments will be able to do much more, perhaps as many as a half dozen clients a minute, depending on how granular the control is over which client detects when. How many of those clients are on the other side of a WAN connection will determine the real capacity of such a system. But the tool needs to be built to handle the /typical/ environment that WSUS is deployed into.

Now... what happens if you simply use existing capabilities to do all of this? The easiest way with current technologies is to simply reset the detection interval via Group Policy. It doesn't even matter what you set it to, because here's what happens when you change the policy:

(1) Upon changing a policy, within five minutes all Domain Controllers will refresh their policy. When the WSUS GPO is changed at the client, the WUA automatically initiates a detection. So your DCs will, in fact, execute a detection and download the patch within five minutes of you changing the policy.
Of course, you're not going to configure your DCs to auto-install and auto-restart, because we know what that will cause... so practically speaking, you're still going to manually install and restart (one at a time) each of the DCs in your organization, and that will take some time.

(2) Member servers and desktop systems refresh policy every 60 minutes, plus or minus 30 minutes. What this means is that, starting as soon as you change the policy (consider the system whose next 'refresh' is only seconds away from the time you click the Save button) and lasting for as much as the next 90 minutes (consider the system whose next 'refresh' is the maximum time away), each of the remaining servers and desktops in your organization will refresh policy and initiate a detection.

Now, this will be just a bit more chaotic than the "Big Red Button", because we put a hard-coded delay in that function, whereas the policy change lands pseudo-randomly. But here we can just let random activity take its course. It takes about 30 seconds to change the policy, and then we go do something else, instead of monitoring the "Big Red Button" as it slowly pings all 200 clients to tell them to perform a detection "now". In addition, we don't need the massive overhead for an all-at-once hit on the server: spreading your 200 clients out over an hour and a half is about 2-3 sessions per minute on your WSUS server, and I don't think you can even install Windows 2000/2003 on a system that underpowered. :-)

You'll still have to be aware of sizing issues on the WAN connections, but the play-togetherness of the whole package will be a lot cleaner. Incidentally, your clients across those WAN connections should already be appropriately configured so that they cannot saturate the WAN pipe in such a scenario.

The thing to watch about all of this is: How many /simultaneous/ HTTP file transfer sessions can you support on your WSUS server?
and: How long will it take, allowing for the max number of simultaneous HTTP file transfer sessions, to actually transfer an update to all of your clients?
and: How much of the NIC bandwidth and LAN bandwidth on the segment supporting your WSUS server can you afford to reserve for file transfer activity on any given day?
Those answers will tell you how long it will take, in terms of hours, to actually complete an emergency "Big Red Button" update of your clients.

So... existing methodology: all clients detected and downloaded in about an hour and a half, subject to available server load and WAN bandwidth, with nothing more than a policy change. Big Red Button, assuming 200 clients sequentially querying the server: about 3.5 hours to complete the process. If you wanted to reduce the delay between clients to 30 seconds, you could accommodate about 120 client connections per hour, or complete the process in a bit under 2 hours -- assuming the HTTP file transfer sessions with BITS don't choke down the server.

Finally... none of the above considers the load imposed by those clients coming back to "report" 15-20 minutes after the detection cycle. So, practically speaking, your server load is going to double about 15 minutes after pushing the "Big Red Button", as a result of "reporting" sessions landing on top of your ongoing "detection and download" sessions.

Incidentally, with proper policy and deadline configuration on your "Under Construction" GPO, you can update any platform, to any service pack level, in under two hours. See Article 012 for an example of an XP SP1 system fully patched in two hours.

My question... and I'm more than willing to listen to counter-arguments... is this: show me what, realistically, would be achieved with a "Big Red Button" that you cannot already do with the existing system as it is today.

So, yes, I acknowledge that a LOT of people are asking for this feature.
But do I think they'll see any significant benefit from it when they get it? In fact, I doubt they'll ever get an opportunity, outside of testing, to actually use the thing! How many of us can afford to do a "Big Red Button" TEST on our production networks?
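
For anyone who wants to plug in their own numbers, the comparison made above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the 200 clients, 90-minute policy window, and 30/60-second allowances are the figures from this article, and the simulation assumes each client's policy refresh lands in a uniformly random minute of the window, which real networks will only approximate.

```python
import random


def sessions_per_minute(clients: int, window_minutes: float) -> float:
    """Average new detection sessions per minute across the refresh window."""
    return clients / window_minutes


def sequential_hours(clients: int, seconds_per_client: float) -> float:
    """Hours needed when a 'Big Red Button' works through clients one at a time."""
    return clients * seconds_per_client / 3600.0


def simulate_peak_minute(clients: int, window_minutes: int, seed: int = 0) -> int:
    """Assumption: each client refreshes in a uniformly random minute.

    Returns the number of clients that land in the busiest single minute.
    """
    rng = random.Random(seed)
    per_minute = [0] * window_minutes
    for _ in range(clients):
        per_minute[rng.randrange(window_minutes)] += 1
    return max(per_minute)


clients = 200
print(f"policy change, average load: {sessions_per_minute(clients, 90):.1f} sessions/minute")
print(f"policy change, busiest minute (one simulated run): {simulate_peak_minute(clients, 90)}")
print(f"sequential at 60 s/client: {sequential_hours(clients, 60):.1f} hours")
print(f"sequential at 30 s/client: {sequential_hours(clients, 30):.1f} hours")
```

None of this accounts for the "reporting" sessions that follow 15-20 minutes behind the detections, or for BITS throttling, so treat the output as a floor rather than a forecast.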