As you use your computer, your hard disk becomes cluttered with
junk files that take up space and serve no purpose. For example,
many programs save previous versions of documents with the
extension .BAK. ScanDisk can leave files with the .CHK extension in
the root folder of any drive, and these files are rarely useful.
Cache directories are often filled with files meant to speed access
to sites you will never revisit. It's time-consuming to delete
these files manually. This issue's utility, HDValet, automates the
process. Just select the junk file types you want to eliminate and
click the Clean up button. Junk file types are configurable,
and you can add types as needed. A confirmation process protects
against mistakes.
HDValet runs under Microsoft Windows 95, Windows 98, and Windows
NT. The Delphi 4 source code for HDValet is provided with the
utility for those interested in seeing how it works.
Using HDValet
To install HDValet, simply run the supplied INSTALL.EXE. To
uninstall the program, use the Add/Remove Programs applet in the
Windows Control Panel.
HDValet comes preconfigured with a number of junk file types,
and you will see this list in the program's main window when you
first launch it (see Figure 1). You can add new types and remove or edit
existing types.
You should review these default types to make sure that they suit
your needs. The Undo button becomes active after you've pressed
Remove, and will undo the most recent removal. If you modify
or remove one of the default file types, you can restore it by
clicking the Defaults button and choosing from the
Defaults list. You can reorder the types in the list using
the higher and lower buttons, or by pressing
Ctrl+Shift+H and Ctrl+Shift+L.
Many of the button commands can be invoked by right-clicking a
junk file type and choosing from the pop-up menu. Several other
handy commands are available only on the pop-up menu. Check
All and Uncheck All let you enable or disable all items
at once. Cleanup method for All lets you change the cleanup
method for all items at once. You can rename a junk file type by
clicking on it twice, slowly, then entering the new name.
When you click the Clean up button, it changes to a
Stop button, which will interrupt the cleanup process when
clicked. HDValet goes down the list of junk file types and
processes each checked type. It searches the set of folders
specified in the junk file type for files matching any of the
type's file specifications. If any found files don't meet the date,
age, or size restrictions for the junk file type, HDValet ignores
them. It also ignores any files that match the junk file type's
list of protected file specifications.
If the Test mode box is checked, HDValet will not
actually remove any files. Instead, it will log the files that
would have been removed to the file HDValet.TEST.TXT, in
HDValet's own folder. When the test is complete, it will offer to
display the test log. At the end of the log, it reports the amount
of Recycle Bin space needed on each drive to perform this cleanup
operation. If the amount needed on any drive exceeds the capacity
of the Recycle Bin on that drive, HDValet will warn you of the
problem. In the log, it will flag the problem drive with "*ERROR*,"
and indicate the Recycle Bin percentage that would be
sufficient.
The very first time you run HDValet, it will probably find a
huge number of junk files. If you get a Recycle Bin warning on this
first try, you can work around it by processing a few junk file
types at a time. But if you consistently receive warnings that the
Recycle Bin's capacity is too low, you can correct the settings, as
described in the next section.
If the Test mode box is not checked, HDValet will remove
each file using the cleanup method specified for this particular
junk file type. There are three cleanup methods. The first, Send to
Recycle Bin, is the default setting. HDValet will move each
matching file to the Recycle Bin. After verifying that only junk
files were deleted, you can empty the Recycle Bin to permanently
delete the files.
The second method, Move to holding folder, moves each matching
file into a folder named HDVal$$$.$$$ in the root directory of the
same drive, retaining the existing folder structure. For example,
C:\Windows\getrid.bak would be moved to
C:\HDVal$$$.$$$\Windows\getrid.bak. If you wish, you can use a ZIP
compression utility to move the contents of this folder into an
archive file, retaining the directory structure. To permanently
delete these files, simply delete the \HDVAL$$$.$$$ folder. It will
be recreated as needed.
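The path mapping used by the holding folder can be sketched in a few lines. This is an illustrative Python sketch, not HDValet's Delphi source; the function name holding_path is hypothetical, and Python's ntpath module is used only so that Windows-style paths are parsed the same way on any platform.

```python
import ntpath

HOLDING_FOLDER = "HDVal$$$.$$$"  # folder name given in the article


def holding_path(original: str) -> str:
    """Map a file's full path to its place inside the holding folder on
    the same drive, keeping the folder structure below the root."""
    drive, rest = ntpath.splitdrive(original)  # e.g. 'C:' and '\\Windows\\getrid.bak'
    if not drive or not rest.startswith("\\"):
        raise ValueError("expected an absolute path with a drive letter")
    return drive + "\\" + HOLDING_FOLDER + rest


# The example from the text:
print(holding_path(r"C:\Windows\getrid.bak"))
```

Because the original folder structure is kept below the holding folder, restoring a file only requires stripping the holding folder name back out of the path.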
If you choose the third cleanup method, Delete permanently,
HDValet will simply delete the matching files, leaving you no
opportunity to restore them in case of error.
When HDValet cleans up your disk, it logs its activity to the
file HDValet.LOG.TXT, located in the program's own folder. At the
end of a cleanup session, the program will offer to show you the
log file. The log entry begins with the current date and time. It
reports each junk file type processed, along with the cleanup
method used. Under each junk file type, HDValet reports the files
that were removed. If an error occurs, it puts "ERR" to the left of
the filename, and displays the reason for the error on the next
line (if the reason is known). The cleanup log is cumulative; each
new session adds to the log. From time to time, you'll want to
delete the earliest entries so the log doesn't get too large.
There's always a possibility that one of your junk file types
may match a file that isn't junk. The first few times you use
HDValet, be sure to run the test mode before performing an actual
cleanup and carefully review the list of files that would have been
removed. If the list includes a file that you'd want to keep, edit
the junk file type to protect that particular file or files
matching certain specifications.
Even when you take this precaution, it's possible that at some
point HDValet will delete a file that you wish it hadn't. The
technique for restoring such a file depends on the cleanup method
that was used. If you selected Send to Recycle Bin, double-click
the Recycle Bin icon on your desktop to open it. Locate the
file in question, right-click it, and choose Restore. If you
selected Move to holding folder, launch Explorer and open the
folder HDVAL$$$.$$$ in the root directory of the drive that
contained the file. Right-click the folder and use Find to
locate the file, or simply navigate to it in Explorer. Now drag the
file back to its old location. If you selected Delete permanently,
the file cannot be restored. After restoring a file, you should
edit the corresponding junk file type so it won't delete that file
again.
Recycle Bin
HDValet's default behavior is to send junk files to the Recycle
Bin. However, users can disable the Recycle Bin or limit its
capacity. At startup, HDValet checks to make sure the Recycle Bin
is enabled, and it checks again each time you press the Clean
up button. If the Recycle Bin is not functional, HDValet
disables any junk file types that rely on the Recycle Bin; then it
displays a warning.
Some power users deliberately turn off the Recycle Bin. In order
for HDValet to work in such a case, you must edit your junk file
types to use one of the other two cleanup methods. If you want to
be able to send junk files to the Recycle Bin, you will have to
change the bin's settings to make it fully functional on all local
fixed disks.
To change the Recycle Bin settings, right-click the Recycle Bin
icon and choose Properties. If the option button Use one
setting for all drives is checked, then you only need to make
changes on the Global tab. Make sure the box labeled Do
not move files to the Recycle Bin is not checked, and make sure
the size of the Recycle Bin is set to something more than 0
percent. If you received a warning in Test mode, check the
test log for a recommended percentage.
If the option button Configure drives independently is
checked, then you must change settings on each tabbed page. For
each page, make sure the Do not move... box is not checked,
and set the Recycle Bin size to something more than 0 percent. Here
again, check the test log for a recommended percentage if Test mode
flagged the bin's capacity as insufficient.
The Junk File Wizard
HDValet works by letting you define what files are "junk" and then
deleting them for you automatically. The Junk file type
wizard gathers all the information needed to describe a junk
file type. It appears when you click New item or Edit
in the main window. The Name box on the first page
corresponds to the title of the junk file type in the main window,
and the Enabled checkbox corresponds directly to this type's
box in the main window. Select the method you want HDValet to use
for cleaning up this type of file, and click Next.
The folders page (see Figure 2) lets you tell HDValet where to search for
junk files of this type. By default, HDValet searches all local
fixed disks. You cannot use it to define junk file types across a
network or on removable drives. If you wish to limit the search to
specific drives or folders, click the Change button on this
page.
The Change button brings up the folder choice window (see
Figure 3), which
provides a graphical, tree-type display indicating which folders
are selected. Each folder has a checkbox and one of five different
folder icons. The checkbox determines whether the folder will be
searched, and the folder icons indicate how the subfolders will be
treated. Select an item in the list and click one of the three
buttons to choose ALL subfolders, NO subfolders, or Listed
subfolders (the subfolders currently listed). You also can
right-click an item and choose the same commands from the pop-up
menu. See the online Help for examples of how to use this
window.
After you make your selections in the folder choice window,
HDValet will distill the information to as small a folder list as
possible and display it on the folders page. Click the
Clear button on the folders page to discard the list
of folders and return to the default of searching all local fixed
disks. When you have selected the folders to search, click
Next.
Now that you've told HDValet where to search, you must tell it
what kind of files to remove. On the files to delete page (see Figure 4), you must enter
at least one file specification. File specifications can contain
the * and ? wildcard characters (for example, *.BAK or
FILE????.CHK), or they can be unambiguous filenames such as
MSCREATE.DIR. To add a new file specification, click the Add
button. To edit the highlighted file specification, click
Edit.
You can further refine the set of files to be deleted using
date, age, or size restrictions. The date restrictions let you
limit the search to files whose date/time stamp is equal to,
before, or after a specific date, or limit the search to files
whose date/time stamp is between or not between two specified
dates. The age restriction lets you limit the search to files whose
age is equal to, older than, or newer than a selected age, or limit
the search to files whose age is between or not between two
selected ages. The size restriction lets you limit the search to
files whose size is equal to, greater than, or less than a
specified number of bytes, kilobytes, or megabytes, or lets you
limit the search to files whose size is between or not between two
sizes.
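The date, age, and size restrictions all share the same five comparisons, so one small predicate can serve for all three. The following Python sketch is only an illustration of that idea, not HDValet's code. The names passes, is_candidate, and UNITS are hypothetical, and two details are assumptions: "between" is treated as inclusive, and a kilobyte is taken as 1,024 bytes.

```python
from datetime import datetime, timedelta

UNITS = {"bytes": 1, "KB": 1024, "MB": 1024 * 1024}  # assumed unit sizes


def passes(value, op, a, b=None):
    """Return True if value satisfies one restriction.
    op is 'eq', 'lt', 'gt', 'between', or 'not_between'."""
    if op == "eq":
        return value == a
    if op == "lt":
        return value < a
    if op == "gt":
        return value > a
    if op in ("between", "not_between"):
        inside = a <= value <= b          # inclusive range (assumption)
        return inside if op == "between" else not inside
    raise ValueError(f"unknown operator: {op}")


def is_candidate(size, modified, now, size_rule=None, date_rule=None, age_rule=None):
    """A file stays a cleanup candidate only if it satisfies every rule
    that is set. Each rule is a tuple (op, a) or (op, a, b)."""
    if size_rule and not passes(size, *size_rule):
        return False
    if date_rule and not passes(modified, *date_rule):
        return False
    if age_rule and not passes(now - modified, *age_rule):  # age as a timedelta
        return False
    return True


now = datetime(1999, 6, 1)
stamp = datetime(1999, 1, 1)
# Larger than 4 KB and older than 30 days:
print(is_candidate(5000, stamp, now,
                   size_rule=("gt", 4 * UNITS["KB"]),
                   age_rule=("gt", timedelta(days=30))))
```

Age is simply the difference between the current time and the file's date/time stamp, which is why an age rule and a date rule can reuse the same comparison code.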
The Next button will be enabled when at least one file
specification has been entered, and when all settings for any
selected date, age, or size restrictions are complete and valid.
Clicking the button will bring you to the files to protect page. If
you enter a file specification on this page, files that match it
will not be deleted. Naturally, this only makes sense if the file
specification is a subset of a file specification from the previous
page. You can also enter a full pathname to protect one specific
file. For example, one of the default junk file types is Help
temporary files. Every time you use the full-text search
feature of WinHelp, it creates a *.FTS file to hold the index. You
can delete these files because they are recreated as needed. For a
large help file, however, recreating the *.FTS file can take time.
If you frequently use full-text search on a large help file, you
can edit this file type to protect that particular file. When
you're done entering any file specifications or full pathnames for
protection, click Next.
The next page (see Figure 5) summarizes the properties of the junk file type that you've
created or edited. It's the same display you see when you click the
Properties button in the main window. Look it over
carefully. If anything isn't right, click Back to go back
and make corrections.
At this point, you have finished describing the junk file type.
You now can click Finish to record your entries, but if
you're creating a brand new type, it's best to click Next
instead. The last page of the wizard performs a search based on
your specifications and reports on the number of files found. You
can click Stop to abort the process. If you want this
information later, just click the Count button in the main
window. For a list of found files, click the button View file
list. When you are done looking everything over, click
Finish to return to the main window.
It takes a one-time effort to teach HDValet how you want your
disks cleaned up, but once you've gone through the process, cleanup
is effortless. Every week or two, launch HDValet and click Clean
up. That's it!
Inside HDValet
HDValet's purpose is to remove junk files, so it needs to know
exactly how the user defines junk. The process starts with telling
HDValet where to look for junk files. The folder window lets the
user enter a set of search locations, and stores only as much
information as necessary. Another essential part of a junk file
definition involves excluding files that aren't junk. HDValet calls
standard Windows API functions to identify files that match the
junk file specifications, but uses its own wildcard
pattern-matching algorithm to handle excluded file specifications.
By default, HDValet sends removed files to the Recycle Bin, so it
needs to be sure the Recycle Bin's settings are correct. These
programming challenges added spice to the mundane task of cleaning
up junk files.
Selecting Folders
How should the user tell HDValet where to search for junk files?
One approach would be to gather a list of folders, perhaps using
the Windows API function SHBrowseForFolder(). But complications
quickly set in. Does the user want to search this folder, its
subfolders, or both? And what happens when two selections
overlap?
Rather than ask the user to enter the folders individually,
HDValet presents a full-scale tree display of all folders on the
system. Initially, the tree contains only the root node and a node
for each local fixed disk. Nodes that represent folders containing
subfolders are marked with the usual boxed plus sign at the left,
indicating that they can be expanded, but their child nodes are not
initially loaded. Each time the user expands a branch of the tree
for the first time, HDValet reads the necessary data from the disk.
Thus the tree can be displayed quickly, and only holds detailed
information about folders that the user wants to see.
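The load-on-first-expansion idea is independent of Delphi's tree control. As a rough Python sketch (the FolderNode class and its method names are hypothetical, and os.scandir stands in for whatever directory calls HDValet actually makes), a node can defer reading its children until they are asked for:

```python
import os
import tempfile


class FolderNode:
    """One folder in the tree; children are read from disk on first expansion."""

    def __init__(self, path):
        self.path = path
        self.children = None          # None means "not loaded yet"

    def has_subfolders(self):
        """Cheap check used to decide whether to show the plus sign."""
        try:
            with os.scandir(self.path) as entries:
                return any(e.is_dir(follow_symlinks=False) for e in entries)
        except OSError:
            return False              # unreadable folders are shown as leaves

    def expand(self):
        """Load child nodes the first time the branch is opened."""
        if self.children is None:
            try:
                with os.scandir(self.path) as entries:
                    names = sorted(e.name for e in entries
                                   if e.is_dir(follow_symlinks=False))
            except OSError:
                names = []
            self.children = [FolderNode(os.path.join(self.path, n)) for n in names]
        return self.children


# Small demonstration on a throwaway folder structure.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "deep"))
    os.makedirs(os.path.join(root, "b"))
    node = FolderNode(root)
    print(node.children)                                       # nothing read yet
    print([os.path.basename(c.path) for c in node.expand()])   # first-level folders only
```

Only the folders the user actually opens are ever read, so the cost of building the tree is proportional to what is displayed rather than to the size of the disk.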
Every folder node has a checkbox and a folder icon to its left.
The checkbox indicates whether the folder itself is to be searched.
The icon indicates how subfolders are to be treated. There are five
possible subfolder states, each with its own icon. I've given them
simple names for the purpose of this discussion:
| ALL: | Search all subfolders that exist at clean-up time. |
| NONE: | Search no subfolders. |
| LISTED: | Search all subfolders that exist now. |
| SOME: | Search some subfolders, not all. |
| NA: | Folder has no subfolders (not applicable). |
The three buttons in the folder choice window, ALL
subfolders, NO subfolders, and Listed subfolders, set
the current node's state to ALL, NONE, and LISTED respectively. The
SOME state appears as changes to folder settings are propagated up
and down the folder tree. A folder with no subfolders can only be
in the NA or ALL state.
When the user sets a node to ALL or LISTED, HDValet updates all
of its child nodes. It puts a check in each checkbox, and sets
every child node that has subfolders to the ALL state. When the
user sets a node to NONE, the checkbox for every child node is
cleared, and every node that has subfolders is changed to the NONE
state. In all three cases, child nodes without subfolders remain in
the NA state.
After any change to a node's state, HDValet checks the parent
node. If none of the parent's child nodes are selected after the
latest change, the parent's state changes to NONE. If all of the
parent's child nodes are now selected, its state changes to LISTED.
Otherwise, its state changes to SOME, meaning some, but not all,
subfolders will be searched. The parent's parent is treated in a
similar fashion, and so on up to the root node.
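The downward and upward propagation rules can be sketched as follows. This is a Python illustration under stated assumptions, not a transcription of HDValet's Delphi code. The Node class and its method names are hypothetical. The text does not define precisely what makes a child "selected", so this sketch assumes a child counts toward NONE only if it is unchecked and has nothing selected beneath it, and counts toward LISTED only if its checkbox is checked. It also assumes that changes are pushed down to all descendants, not just direct children.

```python
ALL, NONE, LISTED, SOME, NA = "ALL", "NONE", "LISTED", "SOME", "NA"


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.checked = False
        self.state = NA                      # a leaf until a child is attached
        if parent is not None:
            parent.children.append(self)
            if parent.state == NA:
                parent.state = NONE

    def _push_down(self, checked, child_state):
        """Apply a checkbox value and subfolder state to every descendant."""
        for child in self.children:
            child.checked = checked
            if child.children:               # leaves stay in the NA state
                child.state = child_state
            child._push_down(checked, child_state)

    def _refresh_parents(self):
        """Recompute NONE / LISTED / SOME for each ancestor in turn."""
        node = self.parent
        while node is not None:
            kids = node.children
            if not any(k.checked or k.state in (ALL, LISTED, SOME) for k in kids):
                node.state = NONE
            elif all(k.checked for k in kids):
                node.state = LISTED
            else:
                node.state = SOME
            node = node.parent

    def set_subfolders(self, state):
        """Handle the ALL subfolders / NO subfolders / Listed subfolders buttons."""
        if state not in (ALL, NONE, LISTED):
            raise ValueError("state must be ALL, NONE, or LISTED")
        if not self.children and state != ALL:
            state = NA                       # a leaf can only be NA or ALL
        self.state = state
        if state in (ALL, LISTED):
            self._push_down(True, ALL)
        else:
            self._push_down(False, NONE)
        self._refresh_parents()

    def set_checked(self, checked):
        """Handle a click on the node's own checkbox."""
        self.checked = checked
        self._refresh_parents()


root = Node("root")
drive = Node("C:", root)
windows = Node("Windows", drive)
temp = Node("Temp", drive)
drive.set_subfolders(ALL)      # checks Windows and Temp
temp.set_checked(False)        # one child unchecked: the drive drops to SOME
print(drive.state, root.state)
```

Note that the user never chooses SOME directly; it arises only when the upward pass finds a mix of selected and unselected children.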
HDValet saves the user's choices as a list of folder names,
along with a flag that says whether to search the folder only, its
subfolders only, or both. Starting at the root node, it traverses
the tree. For each node whose state is LISTED or SOME, it traverses
the child nodes; for those whose states are ALL, NONE, or NA, it
does not traverse the child nodes. If the current node's state is
ALL, HDValet records the folder name and flags it to search either
subfolders only or both folder and subfolders, depending on the
state of the check box. For nodes in all other states, if the check
box is checked, HDValet records the folder name and flags it to
search the folder only. The resulting list defines completely the
desired set of folders.
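The traversal that produces the saved list can be sketched in Python as shown here. The Node dataclass, the collect function, and the three flag strings are hypothetical names chosen for illustration; only the traversal rules come from the description above.

```python
from dataclasses import dataclass, field

FOLDER_ONLY = "folder only"
SUBFOLDERS_ONLY = "subfolders only"
BOTH = "folder and subfolders"


@dataclass
class Node:
    path: str
    state: str                      # 'ALL', 'NONE', 'LISTED', 'SOME', or 'NA'
    checked: bool = False
    children: list = field(default_factory=list)


def collect(node, out=None):
    """Walk the tree and return a list of (folder, flag) pairs."""
    if out is None:
        out = []
    if node.state == "ALL":
        # Subfolders are always searched; the checkbox decides about the folder itself.
        out.append((node.path, BOTH if node.checked else SUBFOLDERS_ONLY))
    elif node.checked:
        out.append((node.path, FOLDER_ONLY))
    if node.state in ("LISTED", "SOME"):
        # Only these two states require looking at individual children.
        for child in node.children:
            collect(child, out)
    return out


tree = Node("(root)", "SOME", False, [
    Node("C:\\", "SOME", False, [
        Node("C:\\Windows", "LISTED", True, [
            Node("C:\\Windows\\Temp", "ALL", True),
            Node("C:\\Windows\\System", "NA", True),
        ]),
        Node("C:\\Data", "NONE", False),
    ]),
])
for folder, flag in collect(tree):
    print(folder, "->", flag)
```

A node in the ALL state stands in for its entire subtree with a single entry, which is how the saved list stays short even when thousands of folders are selected.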
Matching Filenames
When processing a junk file type, HDValet steps through all the
folders specified as search locations. In each folder, it uses the
standard API functions FindFirstFile() and FindNextFile() to
process each of the file specifications in the junk file type. For
each found file, HDValet checks any date, age, or size
restrictions, and skips the file if it doesn't match those
restrictions. It also checks the file against the list of filenames
and file specifications to be protected.
I tried using FindFirstFile/FindNextFile again to list the
excluded files in each folder, but this proved unnecessarily
time-consuming. Instead, I decided to check each found file's name
against the excluded file specifications using a pattern-matching
algorithm.
Windows provides a handy internal function called
PathMatchSpec(), which compares a filename to a wildcard file
specification and reports whether they match. Unfortunately, this
function is present only when the Windows Desktop Update is
installed (specifically, it needs shlwapi.dll version 4.71 or
later). Also, filtering a list of all files in a folder using
PathMatchSpec() didn't yield the same result as using the same file
specification in a FindFirstFile/FindNextFile loop.
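Figure 6:
function MyPathMatchSpec(FileParam, Spec : String) : Bool;

  // Recursively compare a null-terminated filename with a file spec.
  function MMatch(FName, FSpec : PChar) : Bool;
  begin
    CASE FSpec^ OF
      // End of spec: match only if the filename is also used up.
      #0  : Result := FName^ = #0;
      // '?' must consume exactly one character, so it cannot match
      // at the end of the filename (and must not step past the null).
      '?' : IF FName^ = #0 THEN Result := False
            ELSE Result := MMatch(FName+1, FSpec+1);
      // '*' tries the rest of the spec against every suffix of the name.
      '*' : REPEAT
              Result := MMatch(FName, FSpec+1);
              IF Result THEN Break;
              IF FName[0]=#0 THEN Break;
              FName := FName + 1;
            UNTIL False;
      // Literal character: must equal the next filename character.
      ELSE IF FName^ = FSpec^ THEN
             Result := MMatch(FName+1, FSpec+1)
           ELSE Result := False;
    END;
  end;

begin
  // A name with no extension is treated as ending in a period.
  IF Pos('.', FileParam) = 0 THEN FileParam := FileParam + '.';
  Result := MMatch(PChar(Uppercase(FileParam)),
                   PChar(Uppercase(Spec)));
end;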
Figure 6 shows MMatch, a recursive function that
duplicates the pattern-matching used by FindFirstFile/FindNextFile.
In preparation for calling this function, HDValet appends a period
to filenames with no extension, and converts both the filename and
the file specification to uppercase. It passes the arguments as
PChars rather than Delphi String variables, because it's very easy
to "walk" a pointer through a PChar.
MMatch works by effectively splitting the file specification
into its first character (FSpec^) and remaining characters
(FSpec+1), and doing the same to the filename (FName^ and FName+1).
If FSpec^ is #0, the ASCII null character, it means the filespec is
empty. MMatch returns true only if the filename is also empty. The
question mark wildcard matches exactly one character. If FSpec^ is
a question mark, MMatch calls itself to compare the remainder of
the filespec with the remainder of the filename.
The asterisk wildcard, which matches zero or more characters, is
a bit trickier. MMatch calls itself to compare the remainder of the
filespec with the current filename. If there's no match, it
advances the filename pointer, effectively discarding the first
character, and tries again. This continues until some suffix of the
original filename matches the remainder of the filespec, in which
case MMatch returns true, or until the filename is empty, in which
case MMatch returns false.
If the first character of the filespec is not a wildcard
character, MMatch compares it with the first character of the
filename. If they don't match, it returns False. If they do, it
calls itself to compare the remainder of the filespec with the
remainder of the filename. That's what it takes to test whether a
filename matches a file specification.
One other minor point: When the system checks a filename against
a file specification, it checks both the long filename and the
short filename. HDValet first tests the long filename; if there's
no match, it tries the short filename.
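To see how the matcher behaves on a few sample names, here is a Python transliteration of the recursive approach, with the long-name-then-short-name test added. It is a sketch for experimentation, not HDValet's code. The function names are hypothetical, the question mark is required to consume one character as the prose describes, and the caller is assumed to supply the short (8.3) name.

```python
def _match(name: str, spec: str) -> bool:
    """Recursive wildcard match; both arguments are already uppercase."""
    if not spec:
        return not name                      # empty spec matches only an empty name
    head, rest = spec[0], spec[1:]
    if head == "?":
        return bool(name) and _match(name[1:], rest)
    if head == "*":
        # Try the rest of the spec against every suffix, including the empty one.
        return any(_match(name[i:], rest) for i in range(len(name) + 1))
    return bool(name) and name[0] == head and _match(name[1:], rest)


def path_match_spec(filename: str, spec: str) -> bool:
    """Case-insensitive match; a name without an extension gets a trailing period."""
    if "." not in filename:
        filename += "."
    return _match(filename.upper(), spec.upper())


def matches_either(long_name: str, short_name: str, spec: str) -> bool:
    """Test the long filename first, then fall back to the short one."""
    return path_match_spec(long_name, spec) or path_match_spec(short_name, spec)


for name, spec in [("getrid.bak", "*.BAK"),
                   ("FILE0001.CHK", "FILE????.CHK"),
                   ("FILE001.CHK", "FILE????.CHK"),
                   ("README", "*.")]:
    print(name, spec, path_match_spec(name, spec))
```

The last sample shows why the trailing period is appended: it lets a specification such as `*.` select exactly the files that have no extension.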
Quizzing the Recycle Bin
If the Recycle Bin is disabled or set to 0% for any drive,
HDValet doesn't permit use of the Recycle Bin. It will disable any
junk file types that use the Recycle Bin, and won't let you enable
them. In the Junk file type wizard, the Send to Recycle
Bin option will be disabled. One way to obtain information
about the Recycle Bin is the 32-bit Windows API function
SHQueryRecycleBin(). However, the only information this function
provides is the number and size of files currently in the Recycle
Bin. HDValet needs more information, so I was forced to go
spelunking in the Registry.
The Recycle Bin's settings are stored in the Registry key
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\explorer\BitBucket.
The general settings reside in a 72-byte binary value named
PurgeInfo. There are also separate values for each local fixed
disk, but the data HDValet needed was all in PurgeInfo.
Unfortunately, this data is just a big binary block, with no clue
as to its meaning. By painstaking empirical observation, making
changes, and watching how the data changed, I came up with this
data structure:
| cbSize | A 4-byte integer: size of the structure. |
| UseOne | A 4-byte integer: 1 if all drives use the same settings, 0 if not. |
| DrivePercents | An array of twenty-six 2-byte integers: the specified percentage for each drive (irrelevant if UseOne=1). |
| DefaultPercent | A 2-byte integer: seems to always be 10 percent. |
| GlobalPercent | A 2-byte integer: the percentage set on the Global tab. |
| NoBinDrives | A 4-byte integer: each bit represents the Do not... Recycle checkbox for a given drive, a=0x00000001, b=0x00000002, c=0x00000004, and so on. If that box is checked on the Global tab, 0x08000000 is set. |
| Signature | A 4-byte integer: its purpose is not clear, but it does not change when the Recycle Bin settings change. |
Our testing showed that this data structure correctly reflects
the essential Recycle Bin settings under Windows 95, Windows 98,
and Windows NT 4.0.
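The field sizes listed above add up to exactly 72 bytes (4 + 4 + 52 + 2 + 2 + 4 + 4), which matches the size of the PurgeInfo value. The Python sketch below decodes such a block. It is an illustration only: it does not read the Registry, the sample bytes are synthetic, little-endian byte order is an assumption based on the x86 platform, and the helper bin_usable reflects one reading of how UseOne, the percentages, and the NoBinDrives bits combine.

```python
import struct

# cbSize, UseOne, 26 x DrivePercents, DefaultPercent, GlobalPercent,
# NoBinDrives, Signature. "<" = little-endian with no padding (assumption).
PURGE_INFO = struct.Struct("<II26HHHII")
GLOBAL_NO_BIN = 0x08000000


def parse_purge_info(data: bytes) -> dict:
    """Decode a 72-byte PurgeInfo block into named fields."""
    if len(data) != PURGE_INFO.size:
        raise ValueError(f"expected {PURGE_INFO.size} bytes, got {len(data)}")
    f = PURGE_INFO.unpack(data)
    return {
        "cbSize": f[0],
        "UseOne": f[1],
        "DrivePercents": list(f[2:28]),
        "DefaultPercent": f[28],
        "GlobalPercent": f[29],
        "NoBinDrives": f[30],
        "Signature": f[31],
    }


def bin_usable(info: dict, drive: str) -> bool:
    """True if these settings let files on the drive go to the Recycle Bin."""
    if info["UseOne"]:
        return not (info["NoBinDrives"] & GLOBAL_NO_BIN) and info["GlobalPercent"] > 0
    index = ord(drive.upper()) - ord("A")        # A = bit 0, B = bit 1, ...
    disabled = info["NoBinDrives"] & (1 << index)
    return not disabled and info["DrivePercents"][index] > 0


# Synthetic sample: independent settings, C: at 10 percent, D: switched off.
percents = [0] * 26
percents[2] = 10
sample = PURGE_INFO.pack(72, 0, *percents, 10, 10, 1 << 3, 0)
info = parse_purge_info(sample)
print(PURGE_INFO.size, bin_usable(info, "C"), bin_usable(info, "D"))
```

Checking the block length before unpacking guards against a Windows version that stores the settings in a different layout.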
HDValet also uses this information when producing the test mode
report. It retrieves the capacity of each local fixed disk using
Delphi's GetDiskFreeSpaceEx() function. This function calls the
Windows API function of the same name if available, or makes the
necessary calculation if not. It multiplies that capacity by the
Recycle Bin percentage. Then it sums the amount of Recycle Bin
space that would be required on each drive. If this amount exceeds
the calculated capacity, it adds a warning to the test log.
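The capacity check itself is simple arithmetic, sketched here in Python. The function name and the shape of its input are hypothetical, the figures in the example are made up, and the "sufficient percentage" is computed as the smallest whole percentage that covers the space needed, which is an assumption about how HDValet arrives at its recommendation.

```python
def check_recycle_capacity(drives):
    """drives maps a drive letter to
    (disk_capacity_bytes, bin_percent, bytes_needed).
    Returns a list of (letter, needed, bin_capacity, sufficient_percent)
    for every drive whose Recycle Bin is too small."""
    warnings = []
    for letter, (capacity, percent, needed) in sorted(drives.items()):
        bin_capacity = capacity * percent // 100
        if needed > bin_capacity:
            # Smallest whole percentage that covers the need (ceiling division).
            sufficient = -(-needed * 100 // capacity)
            warnings.append((letter, needed, bin_capacity, sufficient))
    return warnings


# Made-up figures: a 2 GB drive and a 1 GB drive, both at 10 percent.
report = check_recycle_capacity({
    "C": (2_000_000_000, 10, 250_000_000),
    "D": (1_000_000_000, 10, 50_000_000),
})
for letter, needed, available, sufficient in report:
    print(f"*ERROR* {letter}: needs {needed} bytes, bin holds {available}; "
          f"{sufficient} percent would be sufficient")
```

In this example only drive C is flagged: its bin holds 200,000,000 bytes against 250,000,000 needed, and 13 percent is the smallest setting that would cover the cleanup.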
Probably the biggest challenge in writing HDValet was in the
design, since the different components are so dependent on each
other. Once I settled on the use of a wizard, the other pieces fell
into place.
Neil J. Rubenking, the author of HDValet, is the contributing
technical editor of PC Magazine. Sheryl Canter is the editor
of the Utilities column and a contributing editor of PC
Magazine.