The file frequent_hashcodes_and_paths_rdc.xml contains SHA1 hashcode and path data derived from the Real Drive Corpus collected by the DEEP Project at the U.S. Naval Postgraduate School. The file provides two kinds of data useful to forensic investigators:

(1) SHA1 hashcodes that occurred for undeleted files on at least five different drives in the corpus but did not occur in the National Software Reference Library (www.nsrl.nist.gov). These are likely to indicate files that are uninteresting and can be excluded in most forensic investigations. File sizes and names are also given.

(2) Path names (file name plus all directories) for paths that occurred on undeleted files on at least twenty different drives in the corpus. These usefully supplement the hashcodes in indicating recurring files that are uninteresting to investigators. However, occurrences of these files could include viruses and other malware, or could be hiding illegal content, although this is unlikely.

Data in releasable_data.xml is provided in DFXML format. Data on hashcodes is given as "hashrecord" items (with subfields "hashcode", "filesize", and "filenamealone") and data on paths is given as "pathrecord" items (with sole subfield "path"). Paths and file names are in UTF-8 Unicode, so a Unicode-enabled reader should be used to view this data.

More details of how this data was obtained are provided in section 7 of the paper http://faculty.nps.edu/ncrowe/dfrws12_nsrl_rowe.htm. Note that the paper gives statistics on a larger set of hashcodes, which was obtained by including deleted files as well. However, the file names of deleted files in our corpus, as well as their other metadata, were in error on a significant fraction of the corpus, so this release is restricted to undeleted files. The results reported in the paper also include hashcodes for files that had the same or a similar name to files with hashcodes in NSRL. However, we exclude those here because many of these were unique to individual users, and we want this collection to be for generic files.
The "pathrecord" records in this data release are an attempt to provide a safer alternative to the similar-name hashcodes.

The current version was produced in March of 2013 by Neil Rowe, ncrowe@nps.edu. The data was obtained by analysis of images of 3537 drives containing a total of 96 million files. The file contains 464,7067 hashcodes and 279,252 path names. Please acknowledge us in publications if you use this data.
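
As an illustration of how the "hashrecord" and "pathrecord" items described above might be read, here is a minimal Python sketch. It is not part of the data release. It assumes that each record is an XML element whose subfields are child elements with the names listed above, that any XML namespace prefix can be ignored, and that hashcodes can be compared case-insensitively; the exact nesting in the released file may differ. The names local_name, load_records, and SAMPLE are hypothetical, and the sample document is invented for the demonstration.

```python
import io
import xml.etree.ElementTree as ET

# Invented sample in the assumed layout; the real file may nest records differently.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<dfxml>
  <hashrecord>
    <hashcode>DA39A3EE5E6B4B0D3255BFEF95601890AFD80709</hashcode>
    <filesize>0</filesize>
    <filenamealone>empty.txt</filenamealone>
  </hashrecord>
  <pathrecord>
    <path>WINDOWS/system32/example.dll</path>
  </pathrecord>
</dfxml>"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def load_records(source):
    """Stream the XML and collect the frequent hashcodes and paths.

    `source` is a file name or a binary file-like object.
    Returns (hashes, paths): a dict mapping lower-cased SHA1 hex strings to
    (filesize, filenamealone) text pairs, and a set of path strings.
    """
    hashes = {}
    paths = set()
    # iterparse avoids building the whole tree before processing starts.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name not in ("hashrecord", "pathrecord"):
            continue
        fields = {local_name(c.tag): (c.text or "").strip() for c in elem}
        if name == "hashrecord":
            sha1 = fields.get("hashcode", "").lower()  # assumption: case-insensitive
            if sha1:
                hashes[sha1] = (fields.get("filesize", ""),
                                fields.get("filenamealone", ""))
        else:
            path = fields.get("path", "")
            if path:
                paths.add(path)
        elem.clear()  # release the record's children once they are processed
    return hashes, paths


hashes, paths = load_records(io.BytesIO(SAMPLE))
print(len(hashes), "hashcodes,", len(paths), "paths")
```

With the real file, load_records would be called with the file name instead of the in-memory sample. A file found on a drive under investigation could then be treated as a candidate for exclusion if its SHA1 is a key of hashes or its full path is in paths, keeping in mind the caution above that a matching path does not guarantee benign content.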