The detection of outliers or anomalous data patterns is one of the most prominent machine learning use cases in industrial applications. I present a Bayesian histogram anomaly detector (BHAD), where the number of bins is treated as an additional unknown model parameter with an assigned prior distribution. BHAD scales linearly with the sample size and enables a straightforward explanation of individual scores, which makes it well suited to industrial applications where model interpretability is crucial. I compare the predictive performance of the proposed BHAD algorithm with that of various state-of-the-art anomaly detection approaches, using simulated data and popular benchmark datasets for outlier detection. The reported results indicate that BHAD has very competitive predictive accuracy compared to other more complex and computationally more expensive algorithms, while being explainable and fast.
I present an unsupervised and explainable Bayesian anomaly detection algorithm. To this end, I consider the posterior predictive distribution of a Categorical-Dirichlet model and use it to construct a Bayesian histogram-based anomaly detector (BHAD).
BHAD scales linearly with the size of the data and allows a direct explanation of individual anomaly scores due to its simple linear functional form, which makes it well suited to practical applications where model interpretability is crucial. Using simulated data and popular benchmark datasets for outlier detection, I analyze the predictive performance of the candidate models and also compare them with outlier ensemble approaches. The results suggest that the proposed BHAD model has very competitive performance compared to more complex models such as variational autoencoders. In fact, it is among the best-performing candidates while offering individual and global model explainability.
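
To make the construction concrete, the following is a minimal sketch of a histogram-based score built from the posterior predictive of a Categorical-Dirichlet model. It is an illustration under simplifying assumptions, not the implementation evaluated here: the number of bins is fixed rather than given a prior, the bins are equal-width, the Dirichlet prior is symmetric with concentration `alpha`, and features are treated as independent. The function names `fit_bin_log_probs` and `score` are hypothetical.

```python
import numpy as np


def fit_bin_log_probs(X, n_bins=10, alpha=1.0):
    """Fit per-feature histograms and return bin edges and log predictive probabilities.

    Assumption: a fixed number of equal-width bins and a symmetric Dirichlet(alpha) prior.
    """
    edges, log_probs = [], []
    for j in range(X.shape[1]):
        e = np.linspace(X[:, j].min(), X[:, j].max(), n_bins + 1)
        # Interior edges map every value to a bin index in 0..n_bins-1
        idx = np.digitize(X[:, j], e[1:-1])
        counts = np.bincount(idx, minlength=n_bins)
        # Categorical-Dirichlet posterior predictive: (n_k + alpha) / (N + K * alpha)
        p = (counts + alpha) / (counts.sum() + n_bins * alpha)
        edges.append(e)
        log_probs.append(np.log(p))
    return edges, log_probs


def score(X, edges, log_probs):
    """Return total scores and per-feature contributions (lower means more anomalous)."""
    contrib = np.empty(X.shape)
    for j, (e, lp) in enumerate(zip(edges, log_probs)):
        idx = np.digitize(X[:, j], e[1:-1])
        contrib[:, j] = lp[idx]
    # The score is a plain sum over features, so each column of contrib
    # is the share of the score attributable to that feature.
    return contrib.sum(axis=1), contrib


# Synthetic example: 500 standard-normal points plus one obvious outlier
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(500, 2)), [[8.0, 8.0]]])
edges, log_probs = fit_bin_log_probs(X)
scores, contrib = score(X, edges, log_probs)
```

Fitting and scoring each take one pass over the data per feature, and because the score is additive across features, the columns of `contrib` show which features drive a low score for a given observation.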