Internet - Migrate A Netscape Webserver To Microsoft IIS

(c) Symmetric Web Sites, Inc.

Author: Mark Hopkins
Date: 10.21.2005

I know that this is hard to believe, but we are actually doing the unthinkable: migrating a webserver from Unix (Sun Solaris 8) to Microsoft Windows 2000 Internet Information Server (IIS). I suppose that it could be worse; we could be migrating from Apache. Now that would be a sin! Nonetheless, this procedure details the actual content move, and some really cool Unix commands are used as well.

Article Index

Requirements
Procedure

Task 1: Install GNU Win32 Tar On The Target Server
Task 2: Copy The Directory Structure (No Content) To The Target Server
Task 3: Search All Content (*.htm*) On The Source Server For Strings
Task 4: Copy The Content To The Target Server
Task 5: Migration "Gotchas"

Conclusion
Printing

Requirements

The requirements here are quite intuitive and are as follows:

The source server: a Sun E220R running Solaris 8
Netscape Web Server (not really a requirement, however)
The target server: Windows 2000 configured with FTP and IIS
GnuWin32 tar on the target server.

Procedure

There really is no grand procedure in this write-up. It is simply a collection of tasks, with a description of how each one was carried out.

	Task 1: Install GNU Win32 Tar On The Target Server

	Copying the source content tree from a Unix server to a Windows server involves three steps: zip, ftp and unzip. Zipping and unzipping the content tree can be done with any number of utilities, but I prefer tar. However, this can be a little tricky because of file name limitations (or rules) on the target server. We have seen that the source server (Unix) has characters in file names that are not accepted by Windows. When cpio is used on Windows, strange things happen and the complete content tree does not get created. As a matter of fact, it seems that a few file name errors will cause many failed directory creations on the target server. For that reason we will use tar to create the archive on the source and tar to explode the archive on the target, so we need to install a Windows version of the utility on the target.

	First let's get the utility from some Open Source location. I always have my best "luck" getting GnuWin32 utilities and they can be found through GnuWin32; more specifically in the Tar For Windows subdirectory of the site. If these links ever go away, just search for it via google.com using the following search string without the quotes: "tar+for+windows".

	We will not cover the install of Tar For Windows here. It should suffice to say that the install is very straightforward and that you can take all the defaults. After we installed it on the target server, we did change the system-wide "Path" environment variable to include the location of tar.exe.
	Task 2: Copy The Directory Structure (No Content) To The Target Server

	One of the more menial tasks of this migration is to set up the authorization on the target server. Since it is not a Unix server, the authorization cannot be migrated; it must be done manually, which will be time consuming. Because of this, and the fact that we are migrating a live website within a limited window, I was asked if there is a way to migrate the directory structure only. The point being that if we could copy the directory structure only, the authorization part of the migration could be performed ahead of our actual migration window. Well, this being a Unix server, the answer to the request was "YES". We all know that anything is possible on Unix!

	First let's log into (ssh) blcinet.banklife.com and set our default directory one level up from the web server (Netscape) content directory tree.

	Now let's create a cpio archive that contains the directory structure of the docs directory tree, which is the document root of the source web server.

	The target server is Windows 2000. I have had difficulty using cpio on that platform. We would rather use tar. But we have a cpio archive and must somehow get a tar archive. So we create a temporary location to extract the cpio directory archive that we just created. We then explode the archive.

	Now, tar the exploded directory structure using relative paths. Why did we use cpio at all? I simply could not figure out a way to send the results of the find command directly to tar.
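The three steps just described (a directory-only cpio archive, exploding it in a scratch location, and re-archiving with tar) can be sketched roughly as follows. This is a minimal sketch, not the commands from the original session: it builds a small sample docs tree under a throwaway demo directory rather than touching a real document root, and the names iis_dirs_demo, docs_dirs.cpio, scratch and docs_dirs.tar are made up for illustration.

```shell
#!/bin/sh
# Sketch: copy only the directory structure of docs/ into a tar archive.
# Everything lives under a throwaway demo directory (hypothetical name).
demo=${TMPDIR:-/tmp}/iis_dirs_demo
rm -rf "$demo"
mkdir -p "$demo/docs/products/images" "$demo/docs/about"
echo '<html></html>' > "$demo/docs/index.html"
echo '<html></html>' > "$demo/docs/products/list.htm"
cd "$demo" || exit 1

if command -v cpio >/dev/null 2>&1; then
    # 1. Archive the directory entries only: find prints just the
    #    directories, and cpio -o archives exactly the names it is given.
    find docs -type d -print | cpio -o > docs_dirs.cpio

    # 2. Explode the cpio archive in a scratch location.
    #    -i extracts, -d creates leading directories as needed.
    mkdir scratch
    (cd scratch && cpio -id < ../docs_dirs.cpio)

    # 3. Re-archive the empty tree with tar, using relative paths.
    (cd scratch && tar -cf ../docs_dirs.tar docs)

    # List the result: directories only, no content files.
    tar -tf docs_dirs.tar
else
    echo "cpio is not installed on this machine; nothing done" >&2
fi
```

Where GNU tar happens to be available (an assumption; the stock Solaris 8 tar is a different implementation), the cpio detour may be avoidable: `find docs -type d -print | tar --no-recursion -cf docs_dirs.tar -T -` reads the directory names from standard input and stores them without descending into them.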

	FTP the tar archive to the target Windows 2000 server.

	We still must expand the directory structure archive on the target server. So let's create a remote connection to the Windows 2000 server.

	Authenticate at the target server.

	Looks like I neglected to logout after my previous remote session; there is still a DOS screen active. So let's work from here.

	The default directory for my ftp session turns out to be in the "docs" directory. If I were to explode the tar archive here we would have a docs/docs structure which is not what we want. Since we are not exactly where we want to be, copy the tar file up one directory, delete it from the current location, move up one directory, and explode the tar archive. This creates the web server directory tree. Essentially we are finished with this task.

	Remove the tar file. Task completed.

	Task 3: Search All Content (*.htm*) On The Source Server For Strings

	As it turns out, there are some file names in the content tree that contain a special character (":"). We need to locate any content files (*.htm*) that might be referencing these files. In our case, the files are directories, so we need to find any content that references anything under those directories by an absolute path. Anything that references files underneath the "bad" directory names in a relative manner will be OK. So how do we do this? By using the "find" and "grep" commands.
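One way the find and grep combination could look is sketched below. The exact commands from the original session are not preserved, so this is only an illustration under assumptions: the sample tree, the directory name press:2005, and the output file colon_refs.out are all hypothetical, and an "absolute" reference is taken to mean the string "/name/" appearing in a content file.

```shell
#!/bin/sh
# Sketch: list content files that refer to a colon-named directory
# by absolute path. The sample tree and names are hypothetical.
demo=${TMPDIR:-/tmp}/iis_colon_demo
rm -rf "$demo"
mkdir -p "$demo/docs/press:2005"
echo '<a href="/press:2005/index.html">press</a>' > "$demo/docs/index.html"
echo '<a href="about.html">about</a>' > "$demo/docs/other.htm"
echo '<a href="story.html">story</a>' > "$demo/docs/press:2005/index.html"
cd "$demo" || exit 1

# For every name containing ":", list the *.htm* files that mention
# "/name/" as a fixed string (-F), printing each file once (-l).
find docs -name '*:*' -print | while IFS= read -r path; do
    name=$(basename "$path")
    find docs -type f -name '*.htm*' -exec grep -l -F "/$name/" {} \;
done > colon_refs.out

cat colon_refs.out
```

The relative link inside the colon-named directory is deliberately not reported, since relative references keep working as long as the directory itself is handled.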

	The more I think about the logic behind searching for references to files with a ":" in the name, the more complicated it gets. There are a plethora of ways this reference could take place, and searching for them could be quite difficult and perhaps not even worth it in our case. It appears that the files with the ":" character were created by mistake during the act of remote publishing.
	This next search is more direct. We have been asked to find all content files that contain certain strings (file names), and we do get "hits" on each string. First we create an input file with one line per search string. Then the rest is shown in the screenshot below. Note that for each search string a separate output file is created. Also note that there may be more than one occurrence of the search string in each file listed.
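A loop of roughly this shape would produce one output file per search string. It is a sketch under assumptions rather than the original commands: the sample content, the search strings rates.pdf and form.cgi, and the input file name search_strings.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: one output file per search string, each listing the *.htm*
# files that contain it. Sample tree and strings are hypothetical.
demo=${TMPDIR:-/tmp}/iis_search_demo
rm -rf "$demo"
mkdir -p "$demo/docs/sub"
echo 'see rates.pdf and again rates.pdf' > "$demo/docs/index.html"
echo 'post to form.cgi' > "$demo/docs/sub/page.htm"
echo 'rates.pdf mentioned in a non-content file' > "$demo/docs/notes.txt"
cd "$demo" || exit 1

# Input file: one search string per line.
printf '%s\n' 'rates.pdf' 'form.cgi' > search_strings.txt

# grep -l prints each matching file once, however many hits it holds;
# -F treats the search string as fixed text, not a regular expression.
while IFS= read -r s; do
    [ -n "$s" ] || continue
    find docs -type f -name '*.htm*' -exec grep -l -F "$s" {} \; > "$s.out"
done < search_strings.txt

for f in *.out; do
    echo "== $f =="
    cat "$f"
done
```

Because the search string doubles as the output file name here, a string containing "/" would need a different naming scheme.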
	Of course, the output files (*.out listed above) were sent to the requestor.
	Task 4: Copy The Content To The Target Server

	This task is part of the actual migration, performed within the scheduled "content freeze" window. It is actually the very first thing done once the window arrives. The start of the window does not work well with my schedule, so I will cron the start of it. The content move has three parts:

	Create the tar archive of the content on the source server
	FTP the content archive to the target server
	Explode the content archive on the target server

	Each of these three steps will take between 90 and 120 minutes. The last thing we want is to sit around and manually execute them. So, we have created a simple shell script and we will use cron to schedule it to run at the exact start of the migration window. The following is what the script looks like:
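The original script is not preserved in these notes, so what follows is only a sketch of what such a script might look like for the first two parts (archive and ftp). The host target.example.com, the user webadmin, the password CHANGE_ME, the function name migrate_content and the DRY_RUN switch are all placeholders introduced for illustration; the demo at the bottom runs against a throwaway tree and never opens an ftp connection.

```shell
#!/bin/sh
# Sketch of the cron-driven content move (parts 1 and 2 of the three).
# Host, user, password and paths are placeholders, not the real values.

# migrate_content SRC_PARENT ARCHIVE HOST USER PASS
# ARCHIVE should be an absolute path, because tar runs after a cd.
migrate_content() {
    src_parent=$1 archive=$2 host=$3 user=$4 pass=$5

    # Part 1: tar the content tree with relative paths.
    echo "$(date) creating $archive"
    (cd "$src_parent" && tar -cf "$archive" docs) || return 1

    # Part 2: ftp the archive to the target in binary mode.
    # DRY_RUN defaults to 1; set DRY_RUN=0 for a real transfer.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$(date) dry run: would ftp $archive to $host as $user"
        return 0
    fi
    # -n disables auto-login so the "user" line supplies the credentials.
    ftp -n "$host" <<EOF
user $user $pass
binary
put $archive $(basename "$archive")
bye
EOF
}

# Demo against a throwaway tree instead of the real document root.
demo=${TMPDIR:-/tmp}/iis_migrate_demo
rm -rf "$demo"
mkdir -p "$demo/server/docs"
echo '<html></html>' > "$demo/server/docs/index.html"
migrate_content "$demo/server" "$demo/docs_content.tar" \
    target.example.com webadmin CHANGE_ME
tar -tf "$demo/docs_content.tar"
```

The third part, exploding the archive, stays outside the script because it happens on the Windows target.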

	Oops, is that my password shown? It will be different by the time anyone reads these notes. The crontab entry looks like the following:
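For reference, a crontab entry for a one-off run has five time fields followed by the command. The entry below is a hypothetical example only: the time (18:00 on October 21), the script path and the log file name are placeholders, not the values used in the actual migration.

```
# minute hour day-of-month month day-of-week command
0 18 21 10 * /path/to/migrate_content.sh > /tmp/migrate_content.log 2>&1
```

Redirecting both standard output and standard error to a file keeps a record of the run; an entry like this should be removed once the job has run, otherwise it fires again on the same date next year.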

	Now that the content has been archived and ftp'd, it is time to log into the target server and explode it. Log into the target via "Remote Desktop" as we did above and untar the archive as shown below.

	Task 5: Migration "Gotchas"

	A favorite line from one of my favorite movies is "There's always the unexpected, isn't there?". And in our business the unexpected is the norm. The following are tasks completed during the migration, outside (actually after, in this case) of our assigned tasks.

	Copy the directory blcinet.banklife.com:/usr/netscape/server4/cgi-bin and its contents to the target server. This was omitted from the first transfers because it was not requested; it is not under the docs directory. Also note that on the target it is moved one directory DOWN, so that it is now under the docs directory. The procedure for meeting this request is the same as the one used above: tar, ftp and tar! The only difference is that we have no need to script and automate this one. It is quite small and took about five minutes in total.

	Something that falls under the category of "should have done" is that with all of the commands that we used, I opted NOT to generate log files. My reasoning was that the files would be too large and bulky. I was asked about errors and I could not prove there were none without the appropriate log files.
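One possible way around the bulk concern, sketched here as a suggestion and not as something that was done during the migration, is to skip the verbose file listing and keep only the error stream and the exit status. The demo directory and the file names tar_create.err and tar_create.status are hypothetical.

```shell
#!/bin/sh
# Sketch: keep a small error log instead of a bulky verbose listing.
# The demo tree and log file names are hypothetical.
demo=${TMPDIR:-/tmp}/iis_logging_demo
rm -rf "$demo"
mkdir -p "$demo/docs"
echo '<html></html>' > "$demo/docs/index.html"
cd "$demo" || exit 1

# No -v: the per-file listing is the bulky part. Capture stderr only.
tar -cf docs.tar docs 2> tar_create.err
status=$?

# Record the exit status next to the error log.
echo "tar exit status: $status" > tar_create.status

if [ "$status" -eq 0 ] && [ ! -s tar_create.err ]; then
    echo "no errors logged"
else
    echo "errors logged, see $demo/tar_create.err"
fi
```

An empty error file together with a zero exit status is the kind of evidence that was missing when the question about errors came up.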

Conclusion

Conclusion? Well, these are really just notes about my part of a project to move a web server. However, I will say that I will never understand running a web site on anything other than Apache. But I am just a little "worker bee" and I am only too happy to do whatever needs to be done. There are some really cool Unix commands used in this "article", like "find", "tar", and "cpio". I may find myself referring to this "article" when I forget how to do something, which will no doubt happen.

Printing This Article

If you have trouble printing this article, be sure to set your browser Page Properties correctly. Go to File -> Page Setup and set your left and right margins to .125 inches.