
On importance weighting for electric fraud detection with dataset shifts

on Fri, 01/11/2019 - 11:01
Title: On importance weighting for electric fraud detection with dataset shifts
Publication Type: Conference Proceedings
Year of Conference: 2019
Authors: Antonelo EA, State R
Conference Name: IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Date Published: 2019
Keywords: covariate shift, electric fraud detection, importance weighting

Covariate shift and imbalanced datasets are common in real-world scenarios. The probability distribution of the collected data is usually non-stationary, because sequential data collection is an incremental, open-ended process influenced by the actions and predictions of human experts, predictive models, sets of rules, or other unknown external factors (e.g., user interaction on a website, or seasonal, cyclic, or geographical factors). A predictive model may therefore generalize suboptimally when the test input distribution shifts. In this work, we evaluate the importance-weighted Fisher discriminant analysis (FDA) classifier in an electric fraud detection task with dataset shift, where the goal is to detect customers who commit fraud or have irregular electricity meters, a task also called nontechnical loss detection in the literature.
The inputs to the model are mainly features computed from each customer's monthly energy consumption time series, using a real-world dataset of 3.6M clients from an energy distribution network.
The importance weights, which define the relevance of each training input sample, are estimated by one of two methods: the Kullback-Leibler importance estimation procedure (KLIEP), or a method based on a discriminative classifier with probabilistic output.
In a series of experiments, we show that a misspecified (biased) classifier in the form of a least-squares solution has its bias removed when the estimated importance weights are employed in the model, making it comparable to the solution given by the original unbiased FDA for the electric fraud detection task.
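
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It estimates importance weights with a probabilistic train-versus-test classifier and then uses them in a weighted least-squares fit. Everything specific here is an assumption for illustration: the synthetic 1-D regression data (standing in for the fraud dataset and the FDA classifier), the sinc target, the sample sizes, the quadratic classifier features, and the helper names quad_features, sigmoid, and fit_logistic. KLIEP is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic covariate shift (assumed toy setup, not the paper's data):
# training and test inputs come from different Gaussians, same target function.
n_tr, n_te = 300, 300
x_tr = rng.normal(1.0, 0.50, n_tr)
x_te = rng.normal(2.0, 0.25, n_te)
f = np.sinc  # nonlinear target, so a straight-line model is misspecified
y_tr = f(x_tr) + 0.1 * rng.standard_normal(n_tr)
y_te = f(x_te) + 0.1 * rng.standard_normal(n_te)


def quad_features(x):
    """Features [1, x, x^2] for the train-vs-test classifier (hypothetical choice)."""
    return np.column_stack([np.ones_like(x), x, x ** 2])


def sigmoid(z):
    # Clip the logits so exp() cannot overflow and p stays strictly inside (0, 1).
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(Phi, t, ridge=1e-3, n_iter=50):
    """Ridge-penalised logistic regression via Newton steps with backtracking."""
    theta = np.zeros(Phi.shape[1])

    def loss(th):
        z = Phi @ th
        return np.sum(np.logaddexp(0.0, z) - t * z) + 0.5 * ridge * th @ th

    for _ in range(n_iter):
        p = sigmoid(Phi @ theta)
        grad = Phi.T @ (p - t) + ridge * theta
        hess = (Phi * (p * (1.0 - p))[:, None]).T @ Phi + ridge * np.eye(theta.size)
        step = np.linalg.solve(hess, grad)
        s = 1.0
        while loss(theta - s * step) > loss(theta) and s > 1e-8:
            s *= 0.5  # halve the step until the loss does not increase
        theta = theta - s * step
    return theta


# Step 1: label training inputs 0 and test inputs 1, then fit the classifier.
Phi = quad_features(np.concatenate([x_tr, x_te]))
t = np.concatenate([np.zeros(n_tr), np.ones(n_te)])
theta_clf = fit_logistic(Phi, t)

# Bayes' rule: p_te(x) / p_tr(x) = (n_tr / n_te) * P(test | x) / P(train | x).
p_test = sigmoid(quad_features(x_tr) @ theta_clf)
w = (n_tr / n_te) * p_test / (1.0 - p_test)

# Step 2: fit the line y = a + b*x by ordinary and importance-weighted least squares.
X_tr = np.column_stack([np.ones_like(x_tr), x_tr])
X_te = np.column_stack([np.ones_like(x_te), x_te])
beta_ols = np.linalg.solve(X_tr.T @ X_tr, X_tr.T @ y_tr)
Xw = X_tr * w[:, None]  # each training row scaled by its importance weight
beta_iw = np.linalg.solve(Xw.T @ X_tr, Xw.T @ y_tr)

mse_ols = np.mean((X_te @ beta_ols - y_te) ** 2)
mse_iw = np.mean((X_te @ beta_iw - y_te) ** 2)
print(f"test MSE  unweighted: {mse_ols:.4f}  importance-weighted: {mse_iw:.4f}")
```

The quadratic features are chosen only because the log density ratio of two Gaussians is quadratic in x, so the classifier is well specified for this toy data. On real features, the classifier and its feature map would need to be chosen and validated separately.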