Binary Classifier Inspired by Quantum Theory
Prayag Tiwari
Department of Information Engineering, University of Padova, Via Gradenigo 6/b, 35131 Padova, Italy
https://sites.google.com/view/prayag-tiwari/
[email protected]
Massimo Melucci
[email protected]

Abstract
Machine Learning (ML) helps us to recognize patterns from raw data. ML is used in numerous domains, e.g. biomedical science, agriculture, and food technology. Despite recent technological advancements, there is still room for substantial improvement in prediction. Current ML models are based on classical theories of probability and statistics, which can now be replaced by Quantum Theory (QT) with the aim of improving the effectiveness of ML. In this paper, we propose the Binary Classifier Inspired by Quantum Theory (BCIQT) model, which outperforms the state-of-the-art classifiers in terms of recall for every category.
Introduction
ML is a set of models which can automatically identify hidden patterns in data and can then utilize those patterns to make decisions under uncertainty. ML has been progressively implemented in several areas including chemistry, biomedical science and robotics. ML falls into three categories: supervised learning (e.g. classification), unsupervised learning (e.g. clustering) and reinforcement learning. In this paper we focus on classification, which is the way to represent and allocate objects into different categories.

QT is the probabilistic approach to representing and predicting properties of microscopic phenomena. Given an observable and an arbitrary state of a microscopic particle, QT computes a probability distribution over the values of the observable. The quantum formalism is explicitly suited to describing distinct types of stochastic processes. Several non-standard applications of the quantum formalism have emerged; for instance, the quantum formalism has been widely utilized in economic processes, game theory and cognitive science.

Since data is growing exponentially, current state-of-the-art models are still not effective enough. In particular, recall is still unsatisfactory because most classification models aim to maximize precision, especially when the items of a class can be ranked by a certain measure of membership to the class; a glaring example is Internet search. In contrast,
recall is crucial in many daily tasks aiming at finding all the pertinent items of a class, such as patent search and biomedical image classification.

Our approach is to develop a new theoretical approach inspired by Quantum Mechanics (QM) in order to dig into the quantum world and come up with new and effective models which are capable of increasing recall. Our hypothesis is that, since QM has already shown its effectiveness in several fields, it may also be effective in ML. To this end we exploit Quantum Probability theory, which is the quantum generalization of classical probability theory and was developed by von Neumann. While classical probability theory provides that a system is in either state 0 or state 1, quantum probability goes beyond the classical theory and describes states which can be anything in between 0 and 1. In this paper, we propose the BCIQT model, which is a step towards shifting from classical models to quantum models.
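The idea that a quantum state can lie "in between" the classical states 0 and 1 can be illustrated with a two-level system (a qubit). The snippet below is a toy illustration of this point, not part of the proposed model:

```python
import numpy as np

# A qubit state a|0> + b|1> assigns probabilities |a|^2 and |b|^2
# to the classical outcomes 0 and 1 (the Born rule), so the state
# can be "anything in between" the two classical states.
state = np.array([np.sqrt(0.3), np.sqrt(0.7)])  # amplitudes for |0> and |1>
probs = np.abs(state) ** 2                      # probabilities of outcomes 0 and 1
assert np.isclose(probs.sum(), 1.0)             # a valid probability distribution
```

Here the system is neither in state 0 nor in state 1: a measurement yields 0 with probability 0.3 and 1 with probability 0.7.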
Proposed Methodology
Classical and Quantum Signal Detection Theory
BCIQT is based on the overlap between Signal Detection Theory (SDT) and QM. The main difference between the classical framework and the quantum framework of signal detection regards what encoders encode and what decoders decode (Helstrom 1969). In the classical framework, there is a c-c (classical-classical) mapping from a symbol to the wave sent over the (possibly corrupting) channel; the decoder then performs a c-c (classical-classical) mapping from the corrupted channel wave back to a symbol. In the quantum framework, there is a coder between the source and the channel; the classical symbol is transmitted through a quantum state. The initial encoding is a c-q (classical-quantum) mapping from the symbol to a quantum state selected from a finite set of possible states. More details about classical and quantum SDT can be found in (Helstrom 1969).
Binary Classifier Inspired by Quantum Theory
A novel BCIQT that is inspired by quantum detection theory is described in this section. For each category we supposed that each training sample was either about the category or not. For a given category and the set of training samples, we used the projector $\Delta$ for each category to identify whether a test sample was about the category or not. To determine whether the test sample was about the category, $\Delta$ was examined against a vectorial representation of the test samples.

Consider a set of distinct features calculated from the whole sample collection. Each sample could be represented as a vector of features; each element in the feature vector was a non-negative number such as a frequency. Each sample in the training set had a binary label in $\{0, 1\}$. The main goal of BCIQT was to obtain one binary label for each sample in the test set.

The BCIQT estimated two density operators $\rho_0$ and $\rho_1$, one operator for each category or class and its complement, by using the training samples; in particular, for each class, the negative training samples were utilized to estimate $\rho_0$ and the positive training samples were utilized to estimate $\rho_1$. In order to obtain these density operators, we first calculated the total number of samples with non-zero values for each particular feature. In such a way, one vector $|v\rangle$ was obtained for each class. Since we were considering the binary case, two vectors $|v_0\rangle$ and $|v_1\rangle$ were obtained; the former referred to the negative training samples and the latter referred to the positive training samples; these vectors may be considered as statistics of the features in a class. We normalized the vectors so that $\langle v | v \rangle = 1$.
Then, we calculated the outer products in order to obtain the density operators $\rho_0$ and $\rho_1$ as follows:

$$\rho_0 = \frac{|v_0\rangle\langle v_0|}{\operatorname{tr}(|v_0\rangle\langle v_0|)} \qquad \rho_1 = \frac{|v_1\rangle\langle v_1|}{\operatorname{tr}(|v_1\rangle\langle v_1|)} \qquad (1)$$

We computed the projection operator $\Delta$ according to (Melucci 2016), that is,

$$\rho_1 - \lambda \rho_0 = \eta \Delta + \beta \Delta^\perp \qquad \eta > 0, \quad \beta < 0, \quad \Delta \Delta^\perp = 0 \qquad (2)$$

where $\xi_0$ is the prior probability of the negative class and $\lambda = \xi_0 / (1 - \xi_0)$; moreover, $\eta$ is the positive eigenvalue corresponding to $\Delta$, which represents the subspace of the vectors representing the samples to be accepted in the target class. We set $\lambda = 1$ to simply mean that both classes had the same prior probability ($\xi_0 = 0.5$); moreover, there was no cost for correct detection ($C_{00} = C_{11} = 0$); finally, the costs of false alarm and miss were constant ($C_{01} = C_{10}$). Eventually, we determined the binary label for a given test sample $S_j$ by inspecting the value of $\langle w_{S_j} | \Delta | w_{S_j} \rangle$: if $\langle w_{S_j} | \Delta | w_{S_j} \rangle \geq 0.5$, then $C(S_j) = 1$; otherwise $C(S_j) = 0$.
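The density-operator estimation and the projector-based decision rule described above can be sketched as follows. This is a minimal NumPy sketch under our own naming: `density_operator`, `projector` and `classify` are illustrative helpers, not functions from the paper.

```python
import numpy as np

def density_operator(samples):
    """Estimate a density operator from a (n_samples x n_features) matrix."""
    # Count, per feature, how many samples have a non-zero value.
    v = (samples > 0).sum(axis=0).astype(float)
    v /= np.linalg.norm(v)            # normalize so that <v|v> = 1
    rho = np.outer(v, v)              # |v><v|
    return rho / np.trace(rho)        # unit trace

def projector(rho1, rho0, lam=1.0):
    """Projector onto the positive-eigenvalue subspace of rho1 - lam*rho0."""
    eigvals, eigvecs = np.linalg.eigh(rho1 - lam * rho0)
    pos = eigvecs[:, eigvals > 0]     # eigenvectors with eigenvalue eta > 0
    return pos @ pos.T                # Delta

def classify(delta, w):
    """Return 1 if <w|Delta|w> >= 0.5, else 0, for a unit-normalized test vector."""
    w = w / np.linalg.norm(w)
    return int(w @ delta @ w >= 0.5)
```

For example, with negative samples concentrated on one feature and positive samples on another, a test vector aligned with the positive class is labeled 1 and one aligned with the negative class is labeled 0.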
Experiment

The MNIST database of handwritten digits (http://yann.lecun.com/exdb/mnist/) has a training set of 60,000 examples and a test set of 10,000 examples. There are 10 categories, from 0 to 9. It is a subset of a larger set available from the National Institute of Standards and Technology (NIST). The digits have been size-normalized and centered in a fixed-size image.

Four models, i.e. Naïve Bayes (NB), Support Vector Machine (SVM), k-Nearest Neighbours (k-NN) and Decision Tree (DT), were used as baselines. Prior to training the models, the top 100 features were selected as the best features for all the models in terms of recall; the chi-square feature selection model was used.

We used a one-vs-all strategy: for each category, the training samples labeled as pertinent to the category were considered positive examples, while the rest were considered negative examples. While training the models, five-fold cross validation was used. As can be seen from Table 1, our proposed model performs better than any state-of-the-art model in terms of recall for every category. By changing the number of features, the evaluation measures (i.e. accuracy, precision, recall and F-measure) also change and provide results comparable to the baselines.

Table 1: Comparison of recall among k-Nearest Neighbours (KNN), Decision Tree (DT), Naïve Bayes (NB), Support Vector Machine (SVM) and Binary Classifier Inspired by Quantum Theory (BCIQT)

Category | KNN   | DT    | NB    | SVM   | BCIQT
0        | 0.959 | 0.884 | 0.889 | 0.292 | …
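The one-vs-all labeling and chi-square feature scoring used in the experimental setup above can be sketched as follows. This is a NumPy approximation under our own naming (`chi2_scores` and `one_vs_all_labels` are illustrative helpers); the chi-square computation assumes non-negative features with non-zero totals.

```python
import numpy as np

def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against binary labels y."""
    observed = np.array([X[y == c].sum(axis=0) for c in (0, 1)])  # 2 x n_features
    feature_tot = X.sum(axis=0)                                   # per-feature mass
    class_frac = np.array([(y == c).mean() for c in (0, 1)])      # class priors
    expected = np.outer(class_frac, feature_tot)                  # mass if independent
    return ((observed - expected) ** 2 / expected).sum(axis=0)

def one_vs_all_labels(y, category):
    """Binary labels: 1 for samples of `category`, 0 for all the rest."""
    return (y == category).astype(int)

# Selecting the top-k features (k = 100 in the paper) by chi-square score:
# top_k = np.argsort(chi2_scores(X, y))[-100:]
```

A feature whose mass is concentrated in one class receives a high score, while a feature distributed proportionally to the class priors scores near zero.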
Conclusion and Future Works

We found that our proposed model outperforms the state-of-the-art models in terms of recall; therefore, this model can be safely adopted when high recall is required. We believe that this is an encouraging result and opens a gateway towards quantum-inspired ML approaches. As for future work, we would like to develop multi-class classifiers (i.e. how to assign an item to more than one class) and multi-label classifiers (i.e. how to deal with non-binary labels), and to re-rank the test items of a class so as to increase precision as well.
Acknowledgments

This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 721321.
References

[Helstrom 1969] Helstrom, C. W. 1969. Quantum detection and estimation theory. Journal of Statistical Physics.