Make Room on Your Hard Disk


As you use your computer, your hard disk becomes cluttered with junk files that take up space and serve no purpose. For example, many programs save previous versions of documents with the extension .BAK. ScanDisk can leave files with the .CHK extension in the root folder of any drive, and these files are rarely useful. Cache directories are often filled with files meant to speed access to sites you will never revisit. It's time-consuming to delete these files manually. This issue's utility, HDValet, automates the process. Just select the junk file types you want to eliminate and click the Clean up button. Junk file types are configurable, and you can add types as needed. A confirmation process protects against mistakes.

HDValet runs under Microsoft Windows 95, Windows 98, and Windows NT. The Delphi 4 source code for HDValet is provided with the utility for those interested in seeing how it works.

Using HDValet
To install HDValet, simply run the supplied INSTALL.EXE. To uninstall the program, use the Add/Remove Programs applet in the Windows Control Panel.

HDValet comes preconfigured with a number of junk file types, and you will see this list in the program's main window when you first launch it (see Figure 1). You can add new types and remove or edit existing types. You should review these default types to make sure that they suit your needs. The Undo button becomes active after you've pressed Remove, and will undo the most recent removal. If you modify or remove one of the default file types, you can restore it by clicking the Defaults button and choosing from the Defaults list. You can reorder the types in the list using the higher and lower buttons, or by pressing Ctrl+Shift+H and Ctrl+Shift+L.

Many of the button commands can be invoked by right-clicking a junk file type and choosing from the pop-up menu. Several other handy commands are available only on the pop-up menu. Check All and Uncheck All let you enable or disable all items at once. Cleanup method for All lets you change the cleanup method for all items at once. You can rename a junk file type by clicking on it twice, slowly, then entering the new name.

When you click the Clean up button, it changes to a Stop button, which will interrupt the cleanup process when clicked. HDValet goes down the list of junk file types and processes each checked type. It searches the set of folders specified in the junk file type for files matching any of the type's file specifications. If any found files don't meet the date, age, or size restrictions for the junk file type, HDValet ignores them. It also ignores any files that match the junk file type's list of protected file specifications.

If the Test mode box is checked, HDValet will not actually remove any files. Instead, it will log the files that would have been removed to the file HDValet.TEST.TXT, in HDValet's own folder. When the test is complete, it will offer to display the test log. At the end of the log, it reports the amount of Recycle Bin space needed on each drive to perform this cleanup operation. If the amount needed on any drive exceeds the capacity of the Recycle Bin on that drive, HDValet will warn you of the problem. In the log, it will flag the problem drive with "*ERROR*," and indicate the Recycle Bin percentage that would be sufficient.

The very first time you run HDValet, it will probably find a huge number of junk files. If you get a Recycle Bin warning on this first try, you can work around it by processing a few junk file types at a time. But if you consistently receive warnings that the Recycle Bin's capacity is too low, you can correct the settings, as described in the next section.

If the Test mode box is not checked, HDValet will remove each file using the cleanup method specified for this particular junk file type. There are three cleanup methods. The first, Send to Recycle Bin, is the default setting. HDValet will move each matching file to the Recycle Bin. After verifying that only junk files were deleted, you can empty the Recycle Bin to permanently delete the files.

The second method, Move to holding folder, moves each matching file into a folder named HDVal$$$.$$$ in the root directory of the same drive, retaining the existing folder structure. For example, C:\Windows\getrid.bak would be moved to C:\HDVal$$$.$$$\Windows\getrid.bak. If you wish, you can use a ZIP compression utility to move the contents of this folder into an archive file, retaining the directory structure. To permanently delete these files, simply delete the \HDVAL$$$.$$$ folder. It will be recreated as needed.
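The path mapping used by the holding folder can be sketched in a few lines of Delphi. This is only an illustration, not HDValet's own routine: HoldingPath is a hypothetical helper, and it assumes a full path that begins with a drive letter, a colon, and a backslash.

```pascal
program HoldingPathDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  HoldingFolder = 'HDVal$$$.$$$';  // folder name given in the article

// Hypothetical helper, not HDValet's own routine. Assumes FileName is a
// full path that starts with a drive letter, a colon, and a backslash.
function HoldingPath(const FileName : String) : String;
begin
  // Characters 1..3 are the drive root; everything after them is
  // re-rooted under the holding folder on the same drive.
  Result := Copy(FileName, 1, 3) + HoldingFolder + '\' +
    Copy(FileName, 4, Length(FileName) - 3);
end;

begin
  // Prints C:\HDVal$$$.$$$\Windows\getrid.bak
  Writeln(HoldingPath('C:\Windows\getrid.bak'));
end.
```

Because the rest of the path is kept intact, restoring a file is the same mapping in reverse: strip the holding folder name and the file is back at its old location.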

If you choose the third cleanup method, Delete permanently, HDValet will simply delete the matching files, leaving you no opportunity to restore them in case of error.

When HDValet cleans up your disk, it logs its activity to the file HDValet.LOG.TXT, located in the program's own folder. At the end of a cleanup session, the program will offer to show you the log file. The log entry begins with the current date and time. It reports each junk file type processed, along with the cleanup method used. Under each junk file type, HDValet reports the files that were removed. If an error occurs, it puts "ERR" to the left of the filename, and displays the reason for the error on the next line (if the reason is known). The cleanup log is cumulative; each new session adds to the log. From time to time, you'll want to delete the earliest entries so the log doesn't get too large.

There's always a possibility that one of your junk file types may match a file that isn't junk. The first few times you use HDValet, be sure to run the test mode before performing an actual cleanup and carefully review the list of files that would have been removed. If the list includes a file that you'd want to keep, edit the junk file type to protect that particular file or files matching certain specifications.

Even when you take this precaution, it's possible that at some point HDValet will delete a file that you wish it hadn't. The technique for restoring such a file depends on the cleanup method that was used. If you selected Send to Recycle Bin, double-click the Recycle Bin icon on your desktop to open it. Locate the file in question, right-click it, and choose Restore. If you selected Move to holding folder, launch Explorer and open the folder HDVAL$$$.$$$ in the root directory of the drive that contained the file. Right-click the folder and use Find to locate the file, or simply navigate to it in Explorer. Now drag the file back to its old location. If you selected Delete permanently, the file cannot be restored. After restoring a file, you should edit the corresponding junk file type so it won't delete that file again.

Recycle Bin
HDValet's default behavior is to send junk files to the Recycle Bin. However, users can disable the Recycle Bin or limit its capacity. At startup, HDValet checks to make sure the Recycle Bin is enabled, and it checks again each time you press the Clean up button. If the Recycle Bin is not functional, HDValet disables any junk file types that rely on the Recycle Bin; then it displays a warning.

Some power users deliberately turn off the Recycle Bin. In order for HDValet to work in such a case, you must edit your junk file types to use one of the other two cleanup methods. If you want to be able to send junk files to the Recycle Bin, you will have to change the bin's settings to make it fully functional on all local fixed disks.

To change the Recycle Bin settings, right-click the Recycle Bin icon and choose Properties. If the option button Use one setting for all drives is checked, then you only need to make changes on the Global tab. Make sure the box labeled Do not move files to the Recycle Bin is not checked, and make sure the size of the Recycle Bin is set to something more than 0 percent. If you received the warning in Test mode, check the test log for a recommended percentage.

If the option button Configure drives independently is checked, then you must change settings on each tabbed page. For each page, make sure the Do not move... box is not checked, and set the Recycle Bin size to something more than 0 percent. Here again, check the test log for a recommended percentage if Test mode flagged the bin's capacity as insufficient.

The Junk File Wizard
HDValet works by letting you define what files are "junk" and then deleting them for you automatically. The Junk file type wizard gathers all the information needed to describe a junk file type. It appears when you click New item or Edit in the main window. The Name box on the first page corresponds to the title of the junk file type in the main window, and the Enabled checkbox corresponds directly to this type's box in the main window. Select the method you want HDValet to use for cleaning up this type of file, and click Next.

The folders page (see Figure 2) lets you tell HDValet where to search for junk files of this type. By default, HDValet searches all local fixed disks. You cannot use it to define junk file types across a network or on removable drives. If you wish to limit the search to specific drives or folders, click the Change button on this page.

The Change button brings up the folder choice window (see Figure 3), which provides a graphical, tree-type display indicating which folders are selected. Each folder has a checkbox and one of five different folder icons. The checkbox determines whether the folder will be searched, and the folder icons indicate how the subfolders will be treated. Select an item in the list and click one of the three buttons to choose ALL subfolders, NO subfolders, or Listed subfolders (the subfolders currently listed). You also can right-click an item and choose the same commands from the pop-up menu. See the online Help for examples of how to use this window.

After you make your selections in the folder choice window, HDValet will distill the information to as small a folder list as possible and display it on the folders page. Click the Clear button on the folders page to discard the list of folders and return to the default of searching all local fixed disks. When you have selected the folders to search, click Next.

Now that you've told HDValet where to search, you must tell it what kind of files to remove. On the files to delete page (see Figure 4), you must enter at least one file specification. File specifications can contain the * and ? wildcard characters (for example, *.BAK or FILE????.CHK), or they can be unambiguous filenames such as MSCREATE.DIR. To add a new file specification, click the Add button. To edit the highlighted file specification, click Edit.

You can further refine the set of files to be deleted using date, age, or size restrictions. The date restrictions let you limit the search to files whose date/time stamp is equal to, before, or after a specific date, or limit the search to files whose date/time stamp is between or not between two specified dates. The age restriction lets you limit the search to files whose age is equal to, older than, or newer than a selected age, or limit the search to files whose age is between or not between two selected ages. The size restriction lets you limit the search to files whose size is equal to, greater than, or less than a specified number of bytes, kilobytes, or megabytes, or lets you limit the search to files whose size is between or not between two sizes.
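All three restrictions offer the same five comparisons, so one small routine can serve as a model for each of them. The sketch below handles the size case. The type and function names are hypothetical, and treating "between" as inclusive of both end points is an assumption, since the article doesn't say how HDValet handles the boundaries.

```pascal
program RestrictionDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical names for the five comparisons the wizard offers.
  TCompareKind = (ckEqual, ckLess, ckGreater, ckBetween, ckNotBetween);

// True if a file of Size bytes passes the restriction. Hi is used only
// by the two range comparisons. "Between" is assumed to be inclusive.
function SizeAllowed(Size : Int64; Kind : TCompareKind;
  Lo, Hi : Int64) : Boolean;
begin
  case Kind of
    ckEqual      : Result := Size = Lo;
    ckLess       : Result := Size < Lo;
    ckGreater    : Result := Size > Lo;
    ckBetween    : Result := (Size >= Lo) and (Size <= Hi);
    ckNotBetween : Result := (Size < Lo) or (Size > Hi);
  else
    Result := False;
  end;
end;

begin
  // A 2,048-byte file passes "greater than 1,024 bytes"...
  if SizeAllowed(2048, ckGreater, 1024, 0) then Writeln('junk candidate');
  // ...but a 512-byte file fails "between 1,024 and 4,096 bytes".
  if not SizeAllowed(512, ckBetween, 1024, 4096) then Writeln('ignored');
end.
```

Date and age restrictions would follow the same pattern, with date/time values or file ages in place of byte counts.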

The Next button will be enabled when at least one file specification has been entered, and when all settings for any selected date, age, or size restrictions are complete and valid. Clicking the button will bring you to the files to protect page. If you enter a file specification on this page, files that match it will not be deleted. Naturally, this only makes sense if the file specification is a subset of a file specification from the previous page. You can also enter a full pathname to protect one specific file. For example, one of the default junk file types is Help temporary files. Every time you use the full-text search feature of WinHelp, it creates a *.FTS file to hold the index. You can delete these files because they are recreated as needed. For a large help file, however, recreating the *.FTS file can take time. If you frequently use full-text search on a large help file, you can edit this file type to protect that particular file. When you're done entering any file specifications or full pathnames for protection, click Next.

The next page (see Figure 5) summarizes the properties of the junk file type that you've created or edited. It's the same display you see when you click the Properties button in the main window. Look it over carefully. If anything isn't right, click Back to go back and make corrections.

At this point, you have finished describing the junk file type. You now can click Finish to record your entries, but if you're creating a brand new type, it's best to click Next instead. The last page of the wizard performs a search based on your specifications and reports on the number of files found. You can click Stop to abort the process. If you want this information later, just click the Count button in the main window. For a list of found files, click the View file list button. When you are done looking everything over, click Finish to return to the main window.

It takes a one-time effort to teach HDValet how you want your disks cleaned up, but once you've gone through the process, cleanup is effortless. Every week or two, launch HDValet and click Clean up. That's it!

Inside HDValet
HDValet's purpose is to remove junk files, so it needs to know exactly how the user defines junk. The process starts with telling HDValet where to look for junk files. The folder window lets the user enter a set of search locations, and stores only as much information as necessary. Another essential part of a junk file definition involves excluding files that aren't junk. HDValet calls standard Windows API functions to identify files that match the junk file specifications, but uses its own wildcard pattern-matching algorithm to handle excluded file specifications. By default, HDValet sends removed files to the Recycle Bin, so it needs to be sure the Recycle Bin's settings are correct. These programming challenges added spice to the mundane task of cleaning up junk files.

Selecting Folders
How should the user tell HDValet where to search for junk files? One approach would be to gather a list of folders, perhaps using the Windows API function SHBrowseForFolder(). But complications quickly set in. Does the user want to search this folder, its subfolders, or both? And what happens when two selections overlap?

Rather than ask the user to enter the folders individually, HDValet presents a full-scale tree display of all folders on the system. Initially, the tree contains only the root node and a node for each local fixed disk. Nodes that represent folders containing subfolders are marked with the usual boxed plus sign at the left, indicating that they can be expanded, but their child nodes are not initially loaded. Each time the user expands a branch of the tree for the first time, HDValet reads the necessary data from the disk. Thus the tree can be displayed quickly, and only holds detailed information about folders that the user wants to see.
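The load-on-first-expansion idea can be shown without any visual controls. The console sketch below is not HDValet's code: TDirNode and its members are hypothetical names, and the real program keeps this information in a tree control. The point is only that a node reads its subfolders from disk the first time it is expanded and never again.

```pascal
program LazyTreeDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses SysUtils, Classes;

type
  // Hypothetical node type standing in for a node of the folder tree.
  TDirNode = class
  public
    Path     : String;
    Loaded   : Boolean;   // True once the subfolders have been read
    Children : TList;     // TDirNode items, filled on first expansion
    constructor Create(const APath : String);
    destructor Destroy; override;
    procedure Expand;
  end;

constructor TDirNode.Create(const APath : String);
begin
  inherited Create;
  Path := APath;
  Children := TList.Create;
end;

destructor TDirNode.Destroy;
var
  i : Integer;
begin
  for i := 0 to Children.Count - 1 do TDirNode(Children[i]).Free;
  Children.Free;
  inherited Destroy;
end;

procedure TDirNode.Expand;
var
  SR : TSearchRec;
begin
  if Loaded then Exit;          // the disk is read only on the first call
  Loaded := True;
  if FindFirst(Path + '\*.*', faAnyFile, SR) = 0 then
  try
    repeat
      // Keep directories only, skipping the '.' and '..' entries.
      if ((SR.Attr and faDirectory) <> 0) and
         (SR.Name <> '.') and (SR.Name <> '..') then
        Children.Add(TDirNode.Create(Path + '\' + SR.Name));
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

var
  Root : TDirNode;
  i    : Integer;
begin
  Root := TDirNode.Create('C:');
  try
    Root.Expand;                // reads the top-level folders of C:
    Root.Expand;                // second call returns at once
    for i := 0 to Root.Children.Count - 1 do
      Writeln(TDirNode(Root.Children[i]).Path);
  finally
    Root.Free;
  end;
end.
```

The child nodes created here are themselves unloaded, so the cost of reading a deep branch is paid only if the user actually opens it.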

Every folder node has a checkbox and a folder icon to its left. The checkbox indicates whether the folder itself is to be searched. The icon indicates how subfolders are to be treated. There are five possible subfolder states, each with its own icon. I've given them simple names for the purpose of this discussion:

ALL: Search all subfolders that exist at clean-up time.
NONE: Search no subfolders.
LISTED: Search all subfolders that exist now.
SOME: Search some subfolders, not all.
NA: Folder has no subfolders (not applicable).

The three buttons in the folder choice window, ALL subfolders, NO subfolders, and Listed subfolders, set the current node's state to ALL, NONE, and LISTED respectively. The SOME state appears as changes to folder settings are propagated up and down the folder tree. A folder with no subfolders can only be in the NA or ALL state.

When the user sets a node to ALL or LISTED, HDValet updates all of its child nodes. It puts a check in each checkbox, and sets every child node that has subfolders to the ALL state. When the user sets a node to NONE, the checkbox for every child node is cleared, and every node that has subfolders is changed to the NONE state. In all three cases, child nodes without subfolders remain in the NA state.

After any change to a node's state, HDValet checks the parent node. If none of the parent's child nodes are selected after the latest change, the parent's state changes to NONE. If all of the parent's child nodes are now selected, its state changes to LISTED. Otherwise, its state changes to SOME, meaning some, but not all, subfolders will be searched. The parent's parent is treated in a similar fashion, and so on up to the root node.

HDValet saves the user's choices as a list of folder names, along with a flag that says whether to search the folder only, its subfolders only, or both. Starting at the root node, it traverses the tree. For each node whose state is LISTED or SOME, it traverses the child nodes; for those whose states are ALL, NONE, or NA, it does not traverse the child nodes. If the current node's state is ALL, HDValet records the folder name and flags it to search either subfolders only or both folder and subfolders, depending on the state of the check box. For nodes in all other states, if the check box is checked, HDValet records the folder name and flags it to search the folder only. The resulting list defines completely the desired set of folders.
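The upward propagation and the save step can be sketched together. This is a simplified model rather than HDValet's source: the type and routine names are hypothetical, and it assumes one reading of "selected", namely that a child counts as fully selected when its box is checked and its own state is ALL, LISTED, or NA, and as fully unselected when its box is clear and its state is NONE or NA.

```pascal
program FolderStateDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses SysUtils, Classes;

type
  TSubState = (ssAll, ssNone, ssListed, ssSome, ssNA);

  // Hypothetical in-memory node; the names are illustrative.
  TFolderNode = class
  public
    Path     : String;
    Checked  : Boolean;      // search the folder itself?
    State    : TSubState;    // how its subfolders are treated
    Parent   : TFolderNode;
    Children : TList;
    constructor Create(AParent : TFolderNode; const APath : String;
      AChecked : Boolean; AState : TSubState);
    destructor Destroy; override;
  end;

constructor TFolderNode.Create(AParent : TFolderNode; const APath : String;
  AChecked : Boolean; AState : TSubState);
begin
  inherited Create;
  Path := APath; Checked := AChecked; State := AState; Parent := AParent;
  Children := TList.Create;
  if AParent <> nil then AParent.Children.Add(Self);
end;

destructor TFolderNode.Destroy;
var
  i : Integer;
begin
  for i := 0 to Children.Count - 1 do TFolderNode(Children[i]).Free;
  Children.Free;
  inherited Destroy;
end;

// Recompute the state of every ancestor after Node has changed.
procedure UpdateAncestors(Node : TFolderNode);
var
  P, C : TFolderNode;
  i, FullOn, FullOff : Integer;
begin
  P := Node.Parent;
  while P <> nil do
  begin
    FullOn := 0; FullOff := 0;
    for i := 0 to P.Children.Count - 1 do
    begin
      C := TFolderNode(P.Children[i]);
      if C.Checked and (C.State in [ssAll, ssListed, ssNA]) then Inc(FullOn)
      else if (not C.Checked) and (C.State in [ssNone, ssNA]) then Inc(FullOff);
    end;
    if FullOff = P.Children.Count then P.State := ssNone
    else if FullOn = P.Children.Count then P.State := ssListed
    else P.State := ssSome;
    P := P.Parent;
  end;
end;

// Flatten the tree into "path [scope]" lines. Only LISTED and SOME
// nodes are descended into; an ALL node covers its whole branch.
procedure CollectFolders(Node : TFolderNode; List : TStrings);
var
  i : Integer;
begin
  if Node.State = ssAll then
  begin
    if Node.Checked then List.Add(Node.Path + ' [folder and subfolders]')
    else List.Add(Node.Path + ' [subfolders only]');
  end
  else if Node.Checked then
    List.Add(Node.Path + ' [folder only]');
  if Node.State in [ssListed, ssSome] then
    for i := 0 to Node.Children.Count - 1 do
      CollectFolders(TFolderNode(Node.Children[i]), List);
end;

var
  Root, Temp : TFolderNode;
  List : TStringList;
  i : Integer;
begin
  Root := TFolderNode.Create(nil, 'C:\', True, ssNone);
  Temp := TFolderNode.Create(Root, 'C:\Temp', True, ssAll);
  TFolderNode.Create(Root, 'C:\Docs', False, ssNA);
  List := TStringList.Create;
  try
    UpdateAncestors(Temp);      // one child on, one off: Root becomes SOME
    CollectFolders(Root, List);
    for i := 0 to List.Count - 1 do Writeln(List[i]);
  finally
    List.Free;
    Root.Free;
  end;
end.
```

With this sample tree the program lists C:\ as folder only and C:\Temp as folder and subfolders. C:\Docs is unchecked and has no subfolders, so it contributes nothing to the saved list.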

Matching Filenames
When processing a junk file type, HDValet steps through all the folders specified as search locations. In each folder, it uses the standard API functions FindFirstFile() and FindNextFile() to process each of the file specifications in the junk file type. For each found file, HDValet checks any date, age, or size restrictions, and skips the file if it doesn't match those restrictions. It also checks the file against the list of filenames and file specifications to be protected.
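In Delphi, the FindFirstFile/FindNextFile pair is usually reached through the SysUtils wrappers FindFirst, FindNext, and FindClose. The sketch below shows the shape of such a search loop for one folder. It is an illustration under assumptions rather than HDValet's code: FindMatches is a hypothetical routine, and the date, age, size, and protection checks described above are left out.

```pascal
program FindJunkDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses SysUtils, Classes;

// Hypothetical routine: add to Found the name of every file in Folder
// that matches any of Specs, and return how many names were added.
function FindMatches(const Folder : String; const Specs : array of String;
  Found : TStrings) : Integer;
var
  i  : Integer;
  SR : TSearchRec;
begin
  Result := 0;
  for i := Low(Specs) to High(Specs) do
    if FindFirst(Folder + '\' + Specs[i], faAnyFile, SR) = 0 then
    try
      repeat
        // Skip directories, and record a file only once even if it
        // matches more than one specification.
        if ((SR.Attr and faDirectory) = 0) and
           (Found.IndexOf(SR.Name) < 0) then
        begin
          Found.Add(SR.Name);
          Inc(Result);
        end;
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);   // called only after a successful FindFirst
    end;
end;

var
  Found : TStringList;
  i     : Integer;
begin
  Found := TStringList.Create;
  try
    FindMatches('C:', ['*.BAK', 'FILE????.CHK'], Found);
    for i := 0 to Found.Count - 1 do Writeln(Found[i]);
  finally
    Found.Free;
  end;
end.
```

The TSearchRec record also carries each file's size and time stamp, which is the information a restriction check would need before the file is accepted as junk.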

I tried using FindFirstFile/FindNextFile again to list the excluded files in each folder, but this proved unnecessarily time-consuming. Instead, I decided to check each found file's name against the excluded file specifications using a pattern-matching algorithm.

Windows provides a handy internal function called PathMatchSpec(), which compares a filename to a wildcard file specification and reports whether they match. Unfortunately, this function is present only when the Windows Desktop Update is installed (specifically, it needs shlwapi.dll version 4.71 or later). Also, filtering a list of all files in a folder using PathMatchSpec() didn't yield the same result as using the same file specification in a FindFirstFile/FindNextFile loop.

Figure 6:
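function MyPathMatchSpec(FileParam, Spec : String) : Bool;

  // Recursively match a null-terminated name against a wildcard spec.
  function MMatch(FName, FSpec : PChar) : Bool;
  begin
    CASE FSpec^ OF
      // End of the spec: a match only if the name is used up as well.
      #0  : Result := FName^ = #0;
      // '?' consumes exactly one character, so the name must not be
      // empty; this also keeps the pointer from running past the null.
      '?' : IF FName^ = #0 THEN Result := False
            ELSE Result := MMatch(FName+1, FSpec+1);
      // '*' matches zero or more characters: try the rest of the spec
      // against each suffix of the name, including the empty one.
      '*' : REPEAT
              Result := MMatch(FName, FSpec+1);
              IF Result THEN Break;
              IF FName^ = #0 THEN Break;
              FName := FName + 1;
            UNTIL False;
      // Any other character must equal the current name character.
      ELSE IF FName^ = FSpec^ THEN
        Result := MMatch(FName+1, FSpec+1)
      ELSE Result := False;
    END;
  end;

begin
  // Treat a name with no extension as if it ended with a period.
  IF Pos('.', FileParam) = 0 THEN FileParam := FileParam + '.';
  // Uppercase both strings so the comparison ignores case.
  Result:= MMatch(PChar(Uppercase(FileParam)),
    PChar(Uppercase(Spec)));
end;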

Figure 6 shows MMatch, a recursive function that duplicates the pattern-matching used by FindFirstFile/FindNextFile. In preparation for calling this function, HDValet appends a period to filenames with no extension, and converts both the filename and the file specification to uppercase. It passes the arguments as PChars rather than Delphi String variables, because it's very easy to "walk" a pointer through a PChar.

MMatch works by effectively splitting the file specification into its first character (FSpec^) and remaining characters (FSpec+1), and doing the same to the filename (FName^ and FName+1). If FSpec^ is #0, the ASCII null character, it means the filespec is empty. MMatch returns true only if the filename is also empty. The question mark wildcard matches exactly one character. If FSpec^ is a question mark, MMatch calls itself to compare the remainder of the filespec with the remainder of the filename.

The asterisk wildcard, which matches zero or more characters, is a bit trickier. MMatch calls itself to compare the remainder of the filespec with the current filename. If there's no match, it advances the filename pointer, effectively discarding the first character, and tries again. This continues until some suffix of the original filename matches the remainder of the filespec, in which case MMatch returns true, or until every suffix, including the empty string, has been tried without success, in which case MMatch returns false.

If the first character of the filespec is not a wildcard character, MMatch compares it with the first character of the filename. If they don't match, it returns False. If they do, it calls itself to compare the remainder of the filespec with the remainder of the filename. That's what it takes to test whether a filename matches a file specification.

One other minor point: When the system checks a filename against a file specification, it checks both the long filename and the short filename. HDValet first tests the long filename; if there's no match, it tries the short filename.

Quizzing the Recycle Bin
If the Recycle Bin is disabled or set to 0% for any drive, HDValet doesn't permit use of the Recycle Bin. It will disable any junk file types that use the Recycle Bin, and won't let you enable them. In the Junk file type wizard, the Send to Recycle Bin option will be disabled. One way to obtain information about the Recycle Bin is the 32-bit Windows API function SHQueryRecycleBin(). However, the only information this function provides is the number and size of files currently in the Recycle Bin. HDValet needs more information, so I was forced to go spelunking in the Registry.

The Recycle Bin's settings are stored in the Registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\explorer\BitBucket. The general settings reside in a 72-byte binary value named PurgeInfo. There are also separate values for each local fixed disk, but the data HDValet needed was all in PurgeInfo. Unfortunately, this data is just a big binary block, with no clue as to its meaning. By painstaking empirical observation, making changes, and watching how the data changed, I came up with this data structure:

cbSize: A 4-byte integer; the size of the structure.
UseOne: A 4-byte integer; 1 if all drives use the same settings, 0 if not.
DrivePercents: An array of twenty-six 2-byte integers; the specified percentage for each drive (irrelevant if UseOne=1).
DefaultPercent: A 2-byte integer; seems to always be 10 percent.
GlobalPercent: A 2-byte integer; the percentage set on the Global tab.
NoBinDrives: A 4-byte integer; each bit represents the Do not move... checkbox for a given drive: a=0x00000001, b=0x00000002, c=0x00000004, and so on. If that box is checked on the Global tab, 0x08000000 is set.
Signature: A 4-byte integer; its purpose is not clear, but it does not change when the Recycle Bin settings change.
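Written as a Delphi record, the layout above comes to exactly 72 bytes (4 + 4 + 52 + 2 + 2 + 4 + 4). The sketch below declares it and adds a small query routine. The record follows the field list directly, but BinPercent is a hypothetical helper, and its rule that the global flag matters only when UseOne is 1 is an assumption rather than something the article confirms.

```pascal
program PurgeInfoDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // "packed" keeps the compiler from adding padding between fields,
  // so SizeOf(TPurgeInfo) is 72, the size of the PurgeInfo value.
  TPurgeInfo = packed record
    cbSize         : LongInt;
    UseOne         : LongInt;
    DrivePercents  : array[0..25] of SmallInt;  // index 0 is drive A
    DefaultPercent : SmallInt;
    GlobalPercent  : SmallInt;
    NoBinDrives    : LongInt;
    Signature      : LongInt;
  end;

const
  GlobalNoBinFlag = $08000000;  // "Do not move..." on the Global tab

// Hypothetical helper: the Recycle Bin percentage in effect for a drive
// letter ('A'..'Z'), or 0 if the bin is switched off for that drive.
function BinPercent(const Info : TPurgeInfo; Drive : Char) : Integer;
var
  Index : Integer;
begin
  Index := Ord(UpCase(Drive)) - Ord('A');
  if Info.UseOne = 1 then
  begin
    if (Info.NoBinDrives and GlobalNoBinFlag) <> 0 then Result := 0
    else Result := Info.GlobalPercent;
  end
  else if (Info.NoBinDrives and (1 shl Index)) <> 0 then Result := 0
  else Result := Info.DrivePercents[Index];
end;

var
  Info : TPurgeInfo;
begin
  FillChar(Info, SizeOf(Info), 0);
  Info.cbSize := SizeOf(Info);
  Info.UseOne := 0;                 // drives configured independently
  Info.DrivePercents[2] := 10;      // C: set to 10 percent
  Info.DrivePercents[3] := 10;      // D: set to 10 percent...
  Info.NoBinDrives := $00000008;    // ...but the bin is off for D:
  Writeln(SizeOf(TPurgeInfo));      // 72
  Writeln(BinPercent(Info, 'C'));   // 10
  Writeln(BinPercent(Info, 'D'));   // 0
end.
```

In a real program the record would be filled from the Registry instead of by hand, for example with the ReadBinaryData method of Delphi's TRegistry class, after checking that the stored value really is 72 bytes long.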

Our testing showed that this data structure correctly reflects the essential Recycle Bin settings under Windows 95, Windows 98, and Windows NT 4.0.

HDValet also uses this information when producing the test mode report. It retrieves the capacity of each local fixed disk using Delphi's GetDiskFreeSpaceEx() function, which calls the Windows API function of the same name if available, or makes the necessary calculation if not. HDValet multiplies that capacity by the drive's Recycle Bin percentage to get the bin's capacity on that drive. Then it sums the amount of Recycle Bin space that the cleanup would require on each drive. If the required amount exceeds the calculated capacity on any drive, it adds a warning to the test log.
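The arithmetic behind the warning is simple enough to show in full. This is a sketch with made-up figures, not HDValet's code: the function names are hypothetical, and the drive size and junk total are chosen only to trigger the warning.

```pascal
program BinCapacityDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses SysUtils;

// Bytes the Recycle Bin may hold on a drive of DriveBytes at Percent.
function BinCapacity(DriveBytes : Int64; Percent : Integer) : Int64;
begin
  Result := DriveBytes * Percent div 100;
end;

// Smallest whole percentage whose capacity covers NeededBytes
// (a ceiling division). Assumes DriveBytes is greater than zero.
function PercentNeeded(DriveBytes, NeededBytes : Int64) : Int64;
begin
  Result := (NeededBytes * 100 + DriveBytes - 1) div DriveBytes;
end;

const
  DriveBytes  : Int64 = 2147483648;  // a 2GB drive (illustrative)
  NeededBytes : Int64 = 300000000;   // junk found on it (illustrative)
begin
  // At 10 percent the bin holds 214,748,364 bytes, less than is needed.
  if NeededBytes > BinCapacity(DriveBytes, 10) then
    Writeln('*ERROR* Recycle Bin too small; ' +
      IntToStr(PercentNeeded(DriveBytes, NeededBytes)) +
      ' percent would be sufficient')
  else
    Writeln('Recycle Bin capacity is sufficient');
end.
```

With these figures the program reports that 14 percent would be sufficient: 13 percent of the drive is about 279MB, which falls short of the 300MB required, while 14 percent is just over it.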

Probably the biggest challenge in writing HDValet was in the design, since the different components are so dependent on each other. Once I settled on the use of a wizard, the other pieces fell into place.

Neil J. Rubenking, the author of HDValet, is the contributing technical editor of PC Magazine. Sheryl Canter is the editor of the Utilities column and a contributing editor of PC Magazine.




First Published in PC Magazine, US Edition, October 5, 1999 (v18n17)
HDValet, Version 1.1
Platforms: Windows 95, Windows 98, Windows NT 4

License Information:
PC Magazine programs are copyrighted and cannot be distributed, whether modified or unmodified. Use is subject to the terms and conditions of the license agreement distributed with the programs.


Copyright (c) 2004 Ziff Davis Media Inc. All Rights Reserved.
