An Explainable Attention Network for Fraud Detection in Claims Management
Insurance companies must manage millions of claims per year. While most of these are not fraudulent, those that are nevertheless cost insurance companies and those they insure vast amounts of money. The ultimate goal is to develop a predictive model that can single out fraudulent claims and pay out non-fraudulent ones automatically. Health care claims have a peculiar data structure, comprising inputs of varying length and variables with a large number of categories. Both issues are challenging for traditional econometric methods. We develop a deep learning model that can handle these challenges by adapting methods from text classification. Using a large dataset from a private health insurer in Germany, we show that the model we propose outperforms a conventional machine learning model. With the rise of digitalization, unstructured data with characteristics similar to ours will become increasingly common in applied research, and methods to deal with such data will be needed.