Accession Number : AD1005367


Title :   PAVE: Write-print Creation with MapReduce


Descriptive Note : Technical Report


Corporate Author : Network Science Center, United States Military Academy West Point United States


Personal Author(s) : Matthews,Suzanne J ; St Amour,Leo ; Ulrich,Frederick ; Kellas,Andreas ; Molnar,Alexander


Full Text : http://www.dtic.mil/get-tr-doc/pdf?AD=AD1005367


Report Date : 01 Aug 2015


Pagination or Media Count : 8


Abstract : Cyber-crime is becoming alarmingly common through the use of anonymous e-mails. Author attribution helps digital forensics investigators filter through a large set of possible authors and focus traditional investigative techniques on the most probable culprits. A recent promising technique is to construct a write-print for each known author and compare it to the write-print extracted from the anonymous message(s). A write-print is a unique digital fingerprint created by mining frequent patterns from a particular author's writing style. Parallel computing enables us to leverage multiple cores in the creation of author write-prints. We introduce Parallel Author Verification of E-mail (PAVE), a MapReduce algorithm for generating author write-prints in parallel. Our algorithm is able to achieve up to 90% accuracy when tested on a subset of the Enron dataset. We believe the community will find the PAVE system useful to expedite author identification in time-sensitive situations.
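
Illustrative sketch (not taken from the report) : The idea of building a write-print by mining frequent patterns in a map step and a reduce step can be illustrated roughly as follows. This is a minimal single-process sketch under stated assumptions, not the PAVE implementation: it treats lowercase word tokens as the only stylistic features, counts a pattern as frequent when it appears in at least a `min_support` fraction of an author's e-mails, and keeps only the frequent patterns that no other author shares. The function names, the `min_support` and `max_size` parameters, and the sample corpus are all hypothetical.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_email(text, max_size=2):
    """Map step: emit each candidate pattern found in one e-mail exactly once.

    Assumption: a "pattern" is a set of up to max_size distinct lowercase
    word tokens. Real stylometric features would be richer than this.
    """
    items = sorted(set(text.lower().split()))
    counts = Counter()
    for size in range(1, max_size + 1):
        for pattern in combinations(items, size):
            counts[pattern] = 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial pattern counts by summing them."""
    return left + right


def frequent_patterns(emails, min_support=0.5):
    """Return patterns present in at least min_support of the e-mails."""
    total = reduce(reduce_counts, map(map_email, emails), Counter())
    threshold = min_support * len(emails)
    return {pattern for pattern, count in total.items() if count >= threshold}


def write_prints(corpus, min_support=0.5):
    """Build one write-print per author from {author: [e-mail text, ...]}.

    A write-print here is the author's frequent patterns minus any pattern
    that is also frequent for another author.
    """
    frequent = {
        author: frequent_patterns(emails, min_support)
        for author, emails in corpus.items()
    }
    prints = {}
    for author, patterns in frequent.items():
        others = set().union(
            *(p for other, p in frequent.items() if other != author)
        )
        prints[author] = patterns - others
    return prints
```

In this sketch the built-in `map` and `functools.reduce` stand in for the distributed phases; because `map_email` is independent per message and `reduce_counts` is associative, the same two functions could in principle be handed to a parallel MapReduce runtime.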


Descriptors :   data mining , feature extraction , pattern recognition , electronic mail , distributed computing , algorithms


Distribution Statement : APPROVED FOR PUBLIC RELEASE