Northwestern University Information Technology

Policies and Guidelines

Guideline for Using Sensitive Data Search Tools

The critical importance of securing sensitive data and personally identifiable information (PII) on University desktop computers calls for technical support staff to be aware of possible preventative and remedial measures.

Though complete data security relies on a multitude of factors, technology-oriented tools can be used to reduce the risk of exposure. The programs and processes outlined within this guideline may be able to identify and protect PII that resides on personal computers and servers.

NUIT has reviewed several open source and vendor-supported applications that are designed to identify occurrences of Social Security Numbers (SSNs) and other types of sensitive data. No single product provided a comprehensive solution, and the best results were achieved when several products were used in combination.

In addition, these tools are often time- and CPU-intensive and can require a day or more of processing time. Technical support staff should also know how to build searches using wildcards and other character strings for SSNs and credit card numbers. A short reference on character string searching is available from NUIT Information and Systems Security/Compliance.
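As a rough illustration of the kind of character string searching mentioned above, the following Python sketch uses regular expressions to flag candidate SSNs and credit card numbers in a block of text. The patterns and the function name find_candidates are assumptions made for this example and are not taken from the NUIT reference. They will also match harmless digit strings, so every hit still needs manual review.

```python
import re

# Candidate SSN: 3-2-4 digits, with optional hyphen or space separators.
# The lookarounds stop the pattern from matching inside a longer digit run.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}[- ]?\d{2}[- ]?\d{4}(?!\d)")

# Candidate card number: 13 to 16 digits, optionally separated by
# single spaces or hyphens.
CARD_PATTERN = re.compile(r"(?<!\d)\d(?:[- ]?\d){12,15}(?!\d)")


def find_candidates(text):
    """Return (kind, matched string) pairs for every candidate in text."""
    hits = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    hits += [("card", m.group()) for m in CARD_PATTERN.finditer(text)]
    return hits
```

Calling find_candidates on the contents of a text file returns a list that can be written to a log for review. Loosening the separators finds more formats but also raises the false positive rate.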

Audience:

Tools should be used by technical support staff only. If your department is interested in conducting a search for sensitive data, NUIT's Distributed Support Services can provide guidance in this process.

These measures may be especially relevant for users who frequently come into contact with Social Security and credit card numbers, such as business managers, lead administrative personnel, and accounts receivable staff.

Statement:

The following PII data search tools have been tested by NUIT and may help locate PII on University desktop computers as part of preventative and remedial measures.

For every tool, confirm that it checks all file types that may contain sensitive data. Be aware that PDF and ZIP files can contain PII but may not be searched reliably by every tool. Further, plan for at least a day to collect and examine data on a loaded machine.
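To illustrate why archive formats need special attention, the sketch below opens a ZIP file with Python's standard zipfile module, searches each member for hyphenated SSN-like strings, and records any member it could not read. The function name scan_zip and the pattern are assumptions made for this example and are not features of any tool listed in this guideline. PDF content is not handled here, since extracting text from PDFs requires a third-party library.

```python
import io
import re
import zipfile

# Hyphenated SSN-like strings only; an assumption chosen to keep the demo simple.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def scan_zip(zip_source):
    """Search each member of a ZIP archive for SSN-like strings.

    Returns (hits, unreadable): hits maps member names to matched strings,
    unreadable lists members that could not be decompressed.
    """
    hits, unreadable = {}, []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            try:
                raw = archive.read(info)
            except (RuntimeError, NotImplementedError, zipfile.BadZipFile):
                # For example: password-protected or unsupported compression.
                unreadable.append(info.filename)
                continue
            # Treat the bytes as text; binary formats may hide matches.
            text = raw.decode("utf-8", errors="ignore")
            found = SSN_PATTERN.findall(text)
            if found:
                hits[info.filename] = found
    return hits, unreadable


if __name__ == "__main__":
    # Build a small archive in memory to demonstrate the scan.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("notes.txt", "Employee SSN: 123-45-6789")
        demo.writestr("readme.txt", "nothing sensitive")
    buffer.seek(0)
    print(scan_zip(buffer))
```

Keeping a separate list of unreadable members matters as much as the hits: a file that could not be searched has not been shown to be free of PII.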

Cornell Spider 3.0 beta
  Pros:
  • Easy to use
  • Preconfigured options allow searches for SSNs and credit card numbers
  • Options allow you to customize what data is displayed in the results log
  • User-defined searches
  Cons:
  • Options only allow you to display one search result in context
  • Depending on options chosen, can result in high numbers of false positives

dbDataFinder v1.65
  Pros:
  • Excellent reporting options help to make results very useful
  • False positive rate is very low
  • Very stable
  • Produces reports that are suitable for most users to understand
  Cons:
  • No user-defined searches (this is supposed to be available in the next version)
  • Available for Windows only, although you can search network drives mounted on a Windows system

DTSearch Desktop 7.25
  Pros:
  • Produces an indexed list of data on the machine, allowing quicker searching
  • Converts file types to HTML for display with highlighted hits
  • Searches word processing, database, spreadsheet, email and attachments, ZIP, and Unicode files
  Cons:
  • Loaded machines may require the user to develop a method for systematically searching contents
  • Data found in PDF format is not communicated clearly

File Hunter 3.5.6.0
  Pros:
  • Finds test files quickly
  • Results easy to read, displayed with color-coded search strings
  Cons:
  • May not have been updated recently
  • Graphics difficult to read
  • Does not appear to search files other than .TXT files
  • Not highly recommended

Google Desktop Search
  Pros:
  • Indexing feature enables fast searches
  • As effective as Windows search
  Cons:
  • No effective way to search for digit strings
  • May find better results with other tools

PowerGREP 3.2.2
  Pros:
  • Worked well on basic and loaded machines
  • Easy-to-understand display shows files containing matched data
  • Few false positives, which are easy to recognize
  Cons:
  • Does not search hidden files automatically; change this under "preferences"
  • For easier searching, change display to "Do not show files or matches," and switch back to review results

Windows Grep 2.3.0.2269
  Pros:
  • Easy-to-navigate display shows files of matched data
  Cons:
  • Does not locate strings found in PDF or ZIP files
  • Does not appear to be under active development
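
Several of the tools above are noted for their false positive rates. One common way to reduce false positives when reviewing results is to apply simple validity checks to each hit. The Python sketch below is an illustration only and does not describe how any listed tool works internally: luhn_valid applies the Luhn checksum used by payment card numbers, and ssn_plausible rejects number ranges that are never issued as SSNs (area 000, 666, or 900-999, group 00, serial 0000). Both function names are hypothetical.

```python
ASCII_DIGITS = "0123456789"


def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch in ASCII_DIGITS]
    if len(digits) < 13:
        return False  # too short to be a payment card number
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def ssn_plausible(ssn):
    """Return False for SSN-like strings in ranges that are never issued."""
    digits = "".join(ch for ch in ssn if ch in ASCII_DIGITS)
    if len(digits) != 9:
        return False
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"
```

A hit that fails these checks is very unlikely to be real, but a hit that passes is only a candidate and still has to be reviewed in context.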

Original Issue Date:

August 2006

Revision Dates:

July 2007

Last Updated: 17 July 2007