Solving the "false positives" problem in fraud prediction
Authors:
Roy Wedge,
James Max Kanter,
Santiago Moral Rubio,
Sergio Iglesias Perez,
Kalyan Veeramachaneni
Abstract:
In this paper, we present an automated feature-engineering approach to dramatically reduce false positives in fraud prediction. False positives plague the fraud prediction industry. It is estimated that only 1 in 5 transactions declared as fraud is actually fraudulent, and roughly 1 in every 6 customers has had a valid transaction declined in the past year. To address this problem, we use the Deep Feature Synthesis algorithm to automatically derive behavioral features based on the historical data of the card associated with a transaction. We generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. We tested our machine learning model on data from a large multinational bank and compared it to their existing solution. On unseen data comprising 1.852 million transactions, we were able to reduce the false positives by 54% and provide savings of 190K euros. We also assess how to deploy this solution, and whether it necessitates streaming computation for real-time scoring. We found that our solution can maintain similar benefits even when historical features are computed only once every 7 days.
Submitted 20 October, 2017;
originally announced October 2017.
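The fraud-prediction abstract above describes deriving behavioral features from the historical data of the card associated with a transaction. The sketch below illustrates that general idea with a few hand-written aggregates; it is not the Deep Feature Synthesis algorithm itself, and the field names (`tx_id`, `card_id`, `time`, `amount`) and the three features are assumptions chosen for illustration, not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def behavioral_features(transactions):
    """Build per-transaction features from the same card's EARLIER transactions.

    Each transaction is a dict with hypothetical keys:
    tx_id, card_id, time (numeric), amount (float).
    """
    history = defaultdict(list)  # card_id -> [(time, amount), ...] seen so far
    rows = []
    # Process in time order so each row only sees the card's past.
    for tx in sorted(transactions, key=lambda t: t["time"]):
        past = history[tx["card_id"]]
        rows.append({
            "tx_id": tx["tx_id"],
            "prior_count": len(past),
            "prior_mean_amount": mean(a for _, a in past) if past else 0.0,
            "time_since_last": tx["time"] - past[-1][0] if past else None,
        })
        # Only now does the current transaction become part of the history.
        past.append((tx["time"], tx["amount"]))
    return rows


example = [
    {"tx_id": 1, "card_id": "A", "time": 0, "amount": 10.0},
    {"tx_id": 2, "card_id": "A", "time": 5, "amount": 30.0},
    {"tx_id": 3, "card_id": "B", "time": 6, "amount": 7.0},
    {"tx_id": 4, "card_id": "A", "time": 9, "amount": 20.0},
]
features = behavioral_features(example)
```

Appending the current transaction to the history only after its row is built keeps each feature free of information from the transaction being scored. Rows of this kind could then be passed to a classifier such as the random forest the abstract mentions.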
Registering the evolutionary history in individual-based models of speciation
Authors:
Carolina L. N. Costa,
Flavia M. D. Marquitti,
S. Ivan Perez,
David M. Schneider,
Marlon F. Ramos,
Marcus A. M. de Aguiar
Abstract:
Understanding the emergence of biodiversity patterns in nature is a central problem in biology. Theoretical models of speciation have addressed this question at the macroecological scale, but little has been investigated in the macroevolutionary context. Knowledge of the evolutionary history allows the study of patterns underlying the processes considered in these models, revealing their signatures and the role of speciation and extinction in shaping macroevolutionary patterns. In this paper we introduce two algorithms to record the evolutionary history of populations in individual-based models of speciation, from which genealogies and phylogenies can be constructed. The first algorithm relies on saving ancestral-descendant relationships, generating a matrix that contains the times to the most recent common ancestor between all pairs of individuals at every generation (the Most Recent Common Ancestor Time matrix, MRCAT). The second algorithm directly records all speciation and extinction events throughout the evolutionary process, generating a matrix with the true phylogeny of species (the Sequential Speciation and Extinction Events, SSEE). We illustrate the use of these algorithms in a spatially explicit individual-based model of speciation. We compare the trees generated via the MRCAT and SSEE algorithms with trees inferred by methods that use only genetic distance among extant species, commonly used in empirical studies and applied here to simulated genetic data. Comparisons between trees are performed with metrics describing the overall topology, branch length distribution and imbalance of trees. We observe that both MRCAT and distance-based trees differ from the true phylogeny, with the former being closer to the true tree than the latter.
Submitted 19 December, 2017; v1 submitted 13 September, 2017;
originally announced September 2017.
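The speciation abstract above describes a matrix holding the time to the most recent common ancestor for every pair of individuals, updated each generation from ancestral-descendant relationships. A minimal sketch of one such update follows, assuming the simplest case in which every offspring has a single recorded parent; the abstract does not state the reproduction scheme, so the paper's actual update rule may differ. The names `update_mrcat`, `prev`, and `parents` are hypothetical.

```python
def update_mrcat(prev, parents):
    """Return the MRCA-time matrix for the new generation.

    prev[a][b]  : generations back to the most recent common ancestor of
                  individuals a and b in the previous generation
                  (prev[a][a] == 0).
    parents[i]  : index, in the previous generation, of offspring i's
                  single parent (an assumption of this sketch).
    """
    n = len(parents)
    return [
        [
            # An individual is its own ancestor at distance 0; otherwise the
            # MRCA of two offspring is the MRCA of their parents, one
            # generation further back.
            0 if i == j else prev[parents[i]][parents[j]] + 1
            for j in range(n)
        ]
        for i in range(n)
    ]


# Two individuals whose common ancestor lived 3 generations ago.
previous = [[0, 3],
            [3, 0]]
# Offspring 0 and 1 descend from individual 0; offspring 2 from individual 1.
current = update_mrcat(previous, parents=[0, 0, 1])
```

Siblings receive a time of 1 because their shared parent sits on the diagonal of the previous matrix, while pairs from different lineages inherit their parents' entry plus one generation.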