Efficiently Discovering Unexpected Pattern Co-Occurrences

Abstract. Our world is filled with both beautiful and brainy people, but how often does a Nobel Prize winner also win a beauty pageant? Let us assume that someone who is both very beautiful and very smart is rarer than we would expect from the combination of the number of beautiful and brainy people. Of course there will still always be some individuals that defy this stereotype; these beautiful brainy people are exactly the class of anomaly we focus on in this paper. They do not possess rare qualities, but it is the unexpected combination of factors that makes them stand out.

In this paper we define the above-described class of anomaly and propose UpC, an algorithm to quickly identify such anomalies in transaction data. The effectiveness of UpC is thoroughly verified with a wide range of experiments on both real-world and synthetic data.
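
To make the intuition concrete, the following is a minimal sketch of the underlying idea only, not of UpC itself. It assumes a simple independence baseline: for two items, the expected number of co-occurrences is count(a) * count(b) / n, and a transaction is flagged when it contains a pair that occurs together far less often than that. The function names and the thresholds `min_expected` and `max_ratio` are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_scores(transactions):
    """Map each item pair to (observed, expected) co-occurrence counts.

    `expected` assumes the two items occur independently of each other.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    scores = {}
    for a, b in combinations(sorted(item_counts), 2):
        expected = item_counts[a] * item_counts[b] / n
        scores[(a, b)] = (pair_counts[(a, b)], expected)
    return scores


def unexpected_transactions(transactions, min_expected=2.0, max_ratio=0.5):
    """Return (index, pair) for transactions holding a surprisingly rare pair.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    scores = cooccurrence_scores(transactions)
    flagged = []
    for idx, t in enumerate(transactions):
        for pair in combinations(sorted(set(t)), 2):
            observed, expected = scores[pair]
            if expected >= min_expected and observed / expected <= max_ratio:
                flagged.append((idx, pair))
    return flagged


# Toy data: both items are common, but they rarely appear together.
data = [["beautiful"]] * 5 + [["brainy"]] * 4 + [["beautiful", "brainy"]]
print(unexpected_transactions(data))
```

In this toy data neither item is rare on its own; only the single transaction that combines them is reported, which mirrors the "beautiful and brainy" example from the abstract.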

Implementation

The source code by Roel Bertens, in part based on Slim.

Related Publications

Bertens, R, Vreeken, J & Siebes, A Efficiently Discovering Unexpected Pattern-Co-Occurrences. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2017. (overall 26% acceptance rate)
Bertens, R, Vreeken, J & Siebes, A Beauty and Brains: Detecting Anomalous Pattern Co-Occurrences. Technical Report 1512.07048, arXiv, 2016.