May 2007 - Posts

Clustering with Access-Based Enumeration (Part 2)

In Part 1 of this article we wrote a short script to re-enable Access-Based Enumeration on a clustered file share. In Part 2, we'll dissect the script and improve the code so that it checks the status of ABE. This will allow the Cluster Administrator console to show administrative users the true current state of Access-Based Enumeration.

Here's the original script:

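Function Online( )
on error resume next
' Call the ABECMD.EXE /Enable command for each share
Set oShell = CreateObject("WScript.Shell")
oShell.Run "H:\ABECMD.EXE /enable ABEShare", 1, true
if (Err.Number <> 0) then
' Non-zero tells the cluster that the resource failed to come online
Online = 1
else
' Zero means success, so the resource is placed online
Online = 0
end if
End Function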

Function LooksAlive( )
LooksAlive = True
End Function

Function IsAlive( )
IsAlive = True
End Function

The script is divided into three separate functions.

The Online() Function

The Online() function is called when the cluster resource monitor attempts to bring the Generic Script resource online. This function is responsible for taking any action that enables the required functionality for the script. In our case, we want to use the ABE command line utility ABECMD.EXE to enable ABE on the clustered share. It's important to note the return values from the Online() function; if it returns zero (0) the function was successful and the resource is placed online. Other values will cause repeated attempts to bring the resource online, possibly followed by a failover.

The LooksAlive() Function

The LooksAlive() function is called at intervals determined by the cluster resource configuration. Microsoft says the following about the LooksAlive() function:

Perform one or more very fast, cursory checks of the specified instance with the emphasis on detecting potential problems rather than verifying operational status. IsAlive will determine whether the instance is really operational. Take no more than 300 milliseconds to return a value. Resource Monitor calls LooksAlive repeatedly at a specified time interval (for example, once every five seconds).

The default time interval for LooksAlive calls is 5 seconds but it is configurable by the administrator.

We return True in this function to tell the Resource Monitor that ABE is probably active.

The IsAlive() Function

The IsAlive() function is called at intervals determined by the cluster resource configuration. Microsoft says the following about the IsAlive() function:

Perform a complete check of the resource to see if it is functioning properly. The set of procedures you need to use depends on your resource. For example, a database resource should check to see that the database can write to the disk and perform queries and updates to the disk. If the resource has definitely failed, return FALSE. The Resource Monitor immediately sets the status of the resource to "ClusterResourceFailed" and calls the Terminate entry point function. Resource Monitor calls IsAlive repeatedly at a specified time interval (for example, once every sixty seconds).

The default time interval for IsAlive calls is 60 seconds but it is configurable by the Administrator.

We return True in this function to tell the Resource Monitor that ABE is definitely active.

Improving the IsAlive() function 

In Part 1 we noted that we're not really telling the truth about the ABE resource. We're telling Resource Monitor that ABE is enabled, but we never actually check it. So we change the IsAlive() function as follows:

Function IsAlive( )
on error resume next
' Assume failure until ABECMD.EXE tells us otherwise
IsAlive = False
Set oShell = CreateObject("WScript.Shell")
set wshRun = oShell.Exec ("H:\ABECMD.EXE ABEShare")
if (Err.Number <> 0) then
' ABECMD.EXE could not be started
Exit Function
end if
' ReadAll waits until ABECMD.EXE has closed its output
sABEOutput = wshRun.StdOut.ReadAll()
if InStr(1, sABEOutput, "enabled", vbTextCompare) > 0 then
IsAlive = True
end if
End Function

Now we're really checking that ABE is enabled for the share. If an administrator comes along and disables ABE, the cluster Resource Monitor will know about it and we can take corrective action.
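The same check can be sketched outside the cluster, for example to spot-check a share from an administrative workstation. This is a minimal Python sketch, not the cluster script itself (the Generic Script resource runs the VBScript above). The function names are hypothetical, and it assumes, as the VBScript does, that ABECMD.EXE prints the word "enabled" when ABE is on; the exact output format is not shown here.

```python
import re
import subprocess

# Assumption: ABECMD.EXE includes the word "enabled" in its output when ABE
# is on for the share. If the tool ever printed "not enabled", this check
# (like the InStr test in the VBScript) would give the wrong answer.
_ENABLED = re.compile(r"\benabled\b", re.IGNORECASE)


def abe_enabled(output: str) -> bool:
    """Return True if captured ABECMD.EXE output reports ABE as enabled."""
    return _ENABLED.search(output) is not None


def share_has_abe(share: str, abecmd: str = r"H:\ABECMD.EXE") -> bool:
    """Run ABECMD.EXE for one share and interpret its output (Windows only)."""
    try:
        result = subprocess.run([abecmd, share], capture_output=True,
                                text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # The tool could not be run or did not finish: report "not enabled".
        return False
    return abe_enabled(result.stdout)
```

Keeping the text check in its own function means it can be tried against sample output without a cluster or a copy of ABECMD.EXE to hand.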

Posted by davidr with no comments

Clustering with Access-Based Enumeration (Part 1)

Access-Based Enumeration is a rather cool add-on to Windows Server 2003 that allows an administrator to restrict what users can see on a file share. If Access-Based Enumeration is enabled for a share, a user can see only the files and folders to which they have access. This can help reduce support calls from users (e.g. "Why do I get this Access Denied error on the Finance folder?") and make it simpler for users to access the data they need.

For example, here's the view of an ABE-enabled share for an Administrator. A finance user on the other hand will see a different view, and a normal user will see an even more restricted view.

Now that all works fine on a single server without any further effort on the administrator's part.

When you enable ABE on a clustered file share, it will all appear to work just fine until the file share is failed over to another node. When this happens the file share will no longer be ABE-enabled, and the share will revert to the standard Windows 2003 behaviour. To get around this, we write a VBScript application and register it as a cluster resource within the appropriate cluster group.

Here's a script that does exactly this - note that it assumes ABECMD.EXE is available on the cluster drive (in this case, H:\) and that the share is called ABEShare: 

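Function Online( )
on error resume next
' Call the ABECMD.EXE /Enable command for each share
Set oShell = CreateObject("WScript.Shell")
oShell.Run "H:\ABECMD.EXE /enable ABEShare", 1, true
if (Err.Number <> 0) then
' Non-zero tells the cluster that the resource failed to come online
Online = 1
else
' Zero means success, so the resource is placed online
Online = 0
end if
End Function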

Function LooksAlive( )
LooksAlive = True
End Function

Function IsAlive( )
IsAlive = True
End Function

The version 1 script above is sufficient to re-establish Access-Based Enumeration on a clustered file share after failover. Microsoft recommends not placing the script files on the cluster disk, but for this type of script I think placing it on the cluster disk is acceptable. You may choose to store the script in the same location on each cluster node; but I'm not entirely convinced that the stated benefits outweigh the disadvantages in managing the script (replication etc). YMMV.

Implementing it is simple. Add a new resource of type Generic Script to the cluster group. The Possible Owners for the new resource should include all nodes on which the ABE share must be available. Make the script dependent on the File Share resource, and set the Script filepath to the full path of the VBS file (H:\ABEShare.VBS). When you bring the resource online, the share will become ABE-enabled.

The script above has some limitations, the most glaring of which is that if an Administrator disables ABE (either using the command-line tools, or the Windows Explorer interface) the cluster will not know about it. 

In part 2 I'll expand on the VBScript above and describe some improvements that allow the script to report the true status back to the cluster Resource Monitor. 

Posted by davidr with no comments

IIS 6: User Isolation with Active Directory

I'm busy migrating everything off our single host to a collection of virtual machines. One of the things we do is apply User Isolation in IIS 6 so that when people use FTP to access the system, they can only access their own files and sites (we do some very limited hosting for friends and family).

Since we have an Active Directory, it's pretty much a no brainer to use the same AD account for everything (except Community Server, which runs this site, and Countersoft Gemini, which we use for our internal bug database). We've had user isolation going for a couple of years with no dramas, but moving it off the old server onto the new VM has been ... painful.

Just so it's clear; we're not changing the way people log on, nor even the location of their data (luckily the same drive letter is in use on old and new hosts). But the new server just wouldn't work.

The old server still worked fine, so user isolation was operational and the data in AD was correct. There's nothing useful in the event logs; the standard error in the FTP client was as useful as ever:
530: User XXX cannot log in. Home directory inaccessible.

It turns out that you get this error when the username and password that the FTP service uses to connect to the directory don't match the account in Active Directory.

The fix is simple. In the commands below, $$ stands for the metabase ID of your FTP site. Site IDs differ from server to server, so use adsutil enum /p msftpsvc or the IIS console to find the right number. Also replace $DOMAIN with your NetBIOS domain name, $USERNAME with the account the FTP service should use for directory access, and $PASSWORD with the correct password for that account:

pushd C:\INetPub\AdminScripts
adsutil set msftpsvc/$$/ADConnectionsUserName $DOMAIN\$USERNAME
adsutil set msftpsvc/$$/ADConnectionsPassword $PASSWORD
net user /domain $USERNAME $PASSWORD
popd

The five commands above are a good way to ensure that your domain connection will operate as expected. You could even massage the above into a command script to change the user and password (we all change service account passwords regularly, right?)
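As a starting point for that kind of script, here is a rough Python sketch that fills in the placeholders and returns the five command lines as text, ready to write to a .cmd file or review before running. The function name and the example values are hypothetical. It assumes the password belongs in the ADConnectionsPassword metabase property, and it does no quoting, so a password containing spaces or shell metacharacters would need extra care.

```python
from typing import List


def ftp_ad_commands(site_id: int, domain: str, username: str,
                    password: str) -> List[str]:
    """Build the command lines that set the FTP service's AD credentials.

    site_id is the metabase ID of the FTP site ($$ in the text above).
    """
    base = f"msftpsvc/{site_id}"
    return [
        r"pushd C:\INetPub\AdminScripts",
        # Account the FTP service uses to read user data from the directory
        f"adsutil set {base}/ADConnectionsUserName {domain}\\{username}",
        # Assumption: the matching password lives in ADConnectionsPassword
        f"adsutil set {base}/ADConnectionsPassword {password}",
        # Set the same password on the domain account so the two agree
        f"net user /domain {username} {password}",
        "popd",
    ]
```

For example, `ftp_ad_commands(1, "EXAMPLE", "svc-ftp", "s3cret")` returns a list whose second entry is the `ADConnectionsUserName` line for site 1.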

Windows NFS Limitations

NFS has been a native part of Windows since the Windows 2003 R2 release, some 18 months ago now. The version of NFS in Windows 2003 R2 is a descendant of the Services For Unix package that has been available for a number of years.

I've designed a fairly complex implementation of NFS on a Windows cluster; this cluster also hosts ordinary file/print services, SQL Server and Exchange. Part of designing a cluster like this is designing the filesystems that underlie the cluster capability. Since we obviously want to separate SQL Server logs and databases, Exchange logs and databases, the Quorum, DTC, network services and file stores, we have a number of different cluster groups and therefore, drive letters.

In this design, there are so many different drives required that we have to use Windows mount points to reduce the number of drive letters. SQL Server is S: drive, but there's also data, log and backup volumes mounted as S:\Data1, S:\Log1 etc.

Exchange is done the same way (Exchange is about 8 volumes), with X:\SG1\Logs, X:\SG1\Data1, X:\SG1\Data2 etc.

So it was a no brainer to make all the cluster groups work the same way. 9 cluster groups (3 x File, Exchange, SQL, Net Services, Print, DTC and Quorum) taking only 9 drive letters for 33 different volumes.

And everything works just fine ... except NFS.

NFS acts weird. If I create the cluster NFS resource to "share the root directory" I can see the contents of the root directory, but my mounted data volume is inaccessible. If I create the cluster NFS resource to "share subdirectories", I can't even mount the NFS share.

Turns out that Windows NFS will not work with mount points at all (and yes, it cost a support call to find that this limitation has been extensively documented within Microsoft). In fact, one of the unspoken and unwritten requirements for NFS on Windows is that there be a drive letter associated with the NFS share.
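A design like this could be sanity-checked on paper before any NFS resources are built. The Python sketch below takes the planned NFS share paths and the list of volume mount points from the design, then flags every share that sits on or beneath a mount point. It is only a path comparison under those assumptions: it does not query Windows for real mount points, and the function name and the F: and G: paths are made up for illustration.

```python
from pathlib import PureWindowsPath
from typing import Iterable, List


def shares_on_mount_points(share_paths: Iterable[str],
                           mount_points: Iterable[str]) -> List[str]:
    """Return the share paths that sit on, or beneath, a volume mount point."""
    # PureWindowsPath compares case-insensitively, like Windows paths do.
    mounts = {PureWindowsPath(m) for m in mount_points}
    flagged = []
    for share in share_paths:
        path = PureWindowsPath(share)
        # A share is affected if it, or any parent folder, is a mount point.
        if any(candidate in mounts for candidate in (path, *path.parents)):
            flagged.append(share)
    return flagged


# Hypothetical design: one SQL mount point and one file-store mount point.
planned_mounts = [r"S:\Data1", r"F:\Exports"]
planned_shares = [r"F:\Exports\nfs", r"G:\nfs"]
problem_shares = shares_on_mount_points(planned_shares, planned_mounts)
```

Here `problem_shares` contains only the F:\Exports\nfs share, because G:\nfs lives directly on a drive-letter volume.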

Don't get caught like I did ...

Posted by davidr with no comments

Upgrading to WDS ...

We had the strangest experience attempting to upgrade a customer's RIS server from Windows 2003 SP1 to SP2 this week.

This upgrade is supposed to be simple, straightforward ... and non-breaking. How little we knew!

Instead of the simple procedure that usually accompanies a Windows Service Pack, we saw stack overflows occurring as a direct result of running the service pack installation. (I've installed hundreds of service packs, and the whole "MS is bad and Service Packs break everything" line certainly isn't my point of view.) Obviously you don't see this level of detail, but you might see an Unhandled Exception in RISetup.EXE.

It turns out RISetup.exe is executed as part of the service pack installation: the service pack logs show an attempt to execute the "risetup.exe -upgrade" command. Presumably this performs some SP2-specific setup.

In any case, after the error was cleared and the Service Pack setup completed, the server rebooted. WDS did not work and neither did RIS.

The solution:

  1. Download the Windows Automated Installation Kit (WAIK). You'll need to burn it to a DVD (as an image, not a file) or arrange to mount it with Daemon Tools, Alcohol 120% or your tool of choice.
  2. At the command prompt, run WDSUTIL /Uninitialize-Server.
  3. At the command prompt, run WDSUTIL /Initialize-Server /RemInst:{Path to your RemoteInstall Folder}
  4. Extract F1_WINPE.WIM from WINPE.CAB (it's in the root of the WAIK DVD) - this is the x86 boot image. If you want to add x64 support, extract F3_WINPE.WIM as well.
  5. Use the Windows Deployment Services console to import F1_WINPE.WIM as a Boot image (repeat with F3_WINPE.WIM for x64 support).
  6. Make sure that the Windows Deployment Server service is set to Automatic, and the Remote Installation service is set to Manual.

Now if only it was in the upgrade guide.
Posted by davidr with no comments