To perform anomaly detection in a predictive maintenance environment, Mondragon University has developed a novel Machine Learning algorithm named Repetitive Weighted Attribute Oriented Induction (ReWAOI), a hierarchical clustering algorithm. By definition, ReWAOI aims to group data according to their similarities. To find these similarities, the algorithm uses hierarchical structures, called concept-trees, and transforms the attribute values of the dataset according to them, in a process named generalisation. In addition, the algorithm assigns weights to the clusters, depending on the level of generalisation of the attributes. Based on these weights, a numerical function that represents the wear trend of the machine is generated, in a process named quantification.
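
The generalisation step can be illustrated with a minimal sketch. It is not the ReWAOI implementation: the concept-trees, the attribute names (`temperature`, `force`) and the weighting rule in `cluster_weight` are hypothetical and chosen only to show how values climb a concept-tree and how rows that become identical fall into the same cluster.

```python
from collections import Counter

# Hypothetical concept-trees (child -> parent). In practice these
# hierarchies would be defined together with the domain experts.
TREES = {
    "temperature": {"20-25C": "normal", "25-30C": "normal", "30-35C": "high",
                    "normal": "ANY", "high": "ANY"},
    "force": {"90-100kN": "nominal", "100-110kN": "nominal",
              "110-120kN": "overload", "nominal": "ANY", "overload": "ANY"},
}


def generalise(value, tree, levels):
    """Climb `levels` steps up the concept-tree, stopping at the root."""
    for _ in range(levels):
        if value not in tree:  # the root has no parent
            break
        value = tree[value]
    return value


def cluster(rows, trees, levels):
    """Generalise every attribute and count rows that become identical."""
    counts = Counter()
    for row in rows:
        key = tuple(generalise(row[attr], trees[attr], levels[attr])
                    for attr in sorted(trees))
        counts[key] += 1
    return counts


def cluster_weight(levels):
    """Assumed rule: the more generalisation applied, the lower the weight."""
    return 1.0 / (1 + sum(levels.values()))


rows = [
    {"temperature": "20-25C", "force": "90-100kN"},
    {"temperature": "25-30C", "force": "100-110kN"},
    {"temperature": "30-35C", "force": "110-120kN"},
]
levels = {"temperature": 1, "force": 1}
clusters = cluster(rows, TREES, levels)
```

In this toy example the first two rows collapse into the same cluster after one level of generalisation, while the third row forms a cluster of its own. The cluster keys list the attributes in alphabetical order (`force`, then `temperature`).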

One of the main features of this algorithm is its capacity to combine data collected from machines with the knowledge of domain experts. Taking the experts' knowledge into account is highly relevant for achieving more reliable results, which can help the experts make more precise maintenance decisions and reduce unplanned downtime. Moreover, ReWAOI has a demonstrated capacity for data representation tasks, as it has been used in scenarios such as spatial patterns, medical science, security and business decision-making. This power for data representation, combined with the capacity to use domain knowledge, facilitates tasks such as Root Cause Analysis, that is, identifying the ultimate reason that caused the anomaly in the first place.

The initial results for the examined use-case (Philips UC-1) concern anomaly detection. Based on the quantification function generated with ReWAOI, the correctness level of the analysed executions (strokes of the machine) has been represented. In this way, we have been able to detect anomalous executions. As the dataset is labelled, the output of ReWAOI can be compared with the ground truth.

To train ReWAOI and generate the clusters, only correct observations were used (anomalous strokes were not used in training). Hence, if previously unregistered instances appear in a monitored execution, the probability that the stroke is considered faulty increases. About 4500 strokes have been analysed, and the detection algorithm has yielded an accuracy of 99.368% when compared to the ground truth.
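
The training and evaluation idea can be sketched as follows. This is an illustration under stated assumptions, not the detection rule used in the study: a stroke is represented as a list of generalised instances, the faultiness score is taken to be the fraction of instances never seen in training, and the `threshold` value is hypothetical.

```python
def train(correct_strokes):
    """Collect every instance observed in correct (non-anomalous) strokes."""
    known = set()
    for stroke in correct_strokes:
        known.update(stroke)
    return known


def unseen_ratio(stroke, known):
    """Fraction of instances in a stroke that were not seen in training."""
    if not stroke:
        return 0.0
    return sum(1 for inst in stroke if inst not in known) / len(stroke)


def is_faulty(stroke, known, threshold=0.2):
    """Assumed rule: flag the stroke when too many instances are unseen."""
    return unseen_ratio(stroke, known) > threshold


def accuracy(predictions, labels):
    """Share of strokes whose prediction matches the ground-truth label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


# Toy data: training uses correct strokes only.
known = train([
    [("nominal", "normal"), ("nominal", "normal")],
    [("nominal", "high")],
])
monitored = [
    [("nominal", "normal"), ("nominal", "high")],   # all instances known
    [("overload", "high"), ("nominal", "normal")],  # one unseen instance
]
ground_truth = [False, True]  # True marks a faulty stroke
predictions = [is_faulty(s, known) for s in monitored]
score = accuracy(predictions, ground_truth)
```

Because the model only stores what correct strokes look like, any stroke containing many unregistered instances scores higher and is more likely to be flagged, which mirrors the behaviour described above.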