Anomaly Detection & Seeding
-
Misclassification (Anomaly) Detection
"Garbage in, garbage out." Statistical classifiers learn from examples that were classified by human experts. What if the experts are wrong? As much as 30% of manually classified information in corporate catalogs is incorrectly classified, and it is well known that two domain experts working independently in blind tests will often disagree on how to classify the same item. Notiora's Anomaly Detection technology identifies possible misclassifications within a large collection of manually classified items. It supports quality assurance of internal or supplier classification processes, and it yields cleaner example sets for classifier learning.
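To make the idea concrete, here is a minimal sketch of one common way to surface suspect labels. It is an assumption for illustration, not Notiora's actual method: each item is held out in turn, a small naive Bayes classifier is trained on the rest, and the item is flagged when the classifier disagrees with its manual label. The function names (`train_nb`, `predict_nb`, `flag_suspects`) and the sample catalog are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(items):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in items:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict_nb(model, text):
    """Return the most probable label under multinomial naive Bayes."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):  # sorted for deterministic tie-breaking
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for w in text.lower().split():
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def flag_suspects(items):
    """Hold each item out, retrain, and flag it if the prediction disagrees."""
    suspects = []
    for i, (text, label) in enumerate(items):
        rest = items[:i] + items[i + 1:]
        predicted = predict_nb(train_nb(rest), text)
        if predicted != label:
            suspects.append((i, label, predicted))
    return suspects

# Hypothetical catalog: the last item carries a label a reviewer should check.
catalog = [
    ("steel hex bolt", "fasteners"),
    ("steel hex nut", "fasteners"),
    ("steel hex bolt washer", "fasteners"),
    ("zinc hex bolt", "fasteners"),
    ("black ink pen", "office"),
    ("blue ink pen", "office"),
    ("black ink cartridge", "office"),
    ("steel hex bolt", "office"),
]
suspects = flag_suspects(catalog)
print(suspects)  # [(7, 'office', 'fasteners')]
```

A flagged item is only a candidate for review: the disagreement may point to a labeling error, or simply to an item the small model cannot judge.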
-
Text Clustering
Notiora Clustering technology analyzes a database of text and groups similar items together. Similarity can be structural, where texts share patterns, formatting, or linguistic structure, or topical, where texts share terminology.
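The difference between the two kinds of similarity can be sketched in a few lines. The example below is illustrative only and makes its own assumptions: topical features are the words in a text, structural features are the "shapes" left after replacing letter runs with `a` and digit runs with `9`, and grouping is a simple single-pass pass over Jaccard similarity. The names `topical_features`, `structural_features`, and `cluster` are hypothetical, not part of any Notiora API.

```python
import re

def topical_features(text):
    """Words used in the text (what it is about)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def structural_features(text):
    """Token shapes: letter runs become 'a', digit runs become '9'."""
    shape = re.sub(r"[A-Za-z]+", "a", text)
    shape = re.sub(r"\d+", "9", shape)
    return set(shape.split())

def jaccard(a, b):
    """Overlap of two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster(texts, featurize, threshold=0.5):
    """Single pass: join the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of indices into texts
    for i, text in enumerate(texts):
        feats = featurize(text)
        for members in clusters:
            if jaccard(feats, featurize(texts[members[0]])) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

texts = [
    "Order 1234 shipped 2021-03-04",
    "printer ink cartridge black",
    "Invoice 77 paid 2020-11-30",
    "black ink cartridge for laser printer",
]
topical = cluster(texts, topical_features)
structural = cluster(texts, structural_features)
print(topical)     # [[0], [1, 3], [2]]
print(structural)  # [[0, 2], [1, 3]]
```

The order and invoice lines share no vocabulary, so topical clustering keeps them apart, yet they have the same layout and fall together under structural clustering.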
-
Directed Seeding
Notiora Directed Seeding technology complements Optimized Bayes Classifiers by analyzing a collection of unclassified text and identifying the snippets that would best serve as examples ("seeds") for learning. With Directed Seeding, the customer's investment in manually classifying examples goes to the snippets most likely to add value to the optimized classifier.
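How seeds are chosen is not described here, so the following is only a sketch of one plausible selection heuristic, labeled as an assumption: greedily pick the snippet whose not-yet-covered words are most common across the collection, so that a small number of labeled seeds covers as much of the vocabulary as possible. The function names `tokenize` and `pick_seeds` are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Distinct lowercase words in a snippet."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_seeds(snippets, k):
    """Greedily choose up to k snippet indices that cover frequent, unseen words."""
    token_sets = [tokenize(s) for s in snippets]
    # Document frequency: in how many snippets each word appears.
    df = Counter(t for tokens in token_sets for t in tokens)
    covered = set()
    chosen = []
    for _ in range(min(k, len(snippets))):
        best, best_gain = None, -1
        for i, tokens in enumerate(token_sets):
            if i in chosen:
                continue
            # Gain counts only words that earlier seeds have not covered.
            gain = sum(df[t] for t in tokens - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        covered |= token_sets[best]
    return chosen

# Hypothetical unclassified snippets covering two informal topics.
snippets = [
    "steel hex bolt",
    "steel hex nut",
    "steel hex bolt zinc",
    "black ink pen",
    "blue ink pen",
]
seeds = pick_seeds(snippets, 2)
print([snippets[i] for i in seeds])  # ['steel hex bolt zinc', 'black ink pen']
```

Because already-covered words earn no further gain, the second seed comes from a different topic than the first, which is the behavior one would want from a small labeling budget.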