N-Dimensional Polynomial Neural Networks and their Applications
Ben Abdallah, Habib
Ben Abdallah, Habib. N-Dimensional Polynomial Neural Networks and their Applications; A thesis submitted to the Faculty of Graduate Studies of The University of Winnipeg in partial fulfillment of the requirements of the degree of Master of Science, Department of Applied Computer Science, University of Winnipeg. Winnipeg, Manitoba, Canada: University of Winnipeg, 2022. DOI: 10.36939/ir.202204211510.
Modern machine learning problems are highly non-linear and can require millions, if not billions, of parameters to solve or even to approximate well. Neural networks absorb that complexity by deepening and widening their topology, which increases the level of non-linearity available for a better approximation. However, compact topologies are preferred to deeper ones because they use fewer computational units and fewer parameters. This compactness comes at the price of reduced non-linearity and thus of a limited solution search space. This thesis proposes the N-Dimensional Polynomial Neural Network (NDPNN) model, which uses automatic polynomial kernel estimation for N-Dimensional Convolutional Neural Networks (NDCNNs) and introduces a high degree of non-linearity from the first layer, which can compensate for the need for deep and/or wide topologies. We first theoretically formalized the 1DPNN model, which processes 1-dimensional signals, and demonstrated that its inherent non-linearity enables it to yield better results with less computational and spatial complexity than a regular 1DCNN on various classification and regression problems related to audio signals, even though it introduces more computational and spatial complexity at the level of the individual neuron. The experiments were conducted on three publicly available datasets and demonstrate that the proposed 1DPNN model can extract more relevant information from the data than a 1DCNN in less time and with less memory. We subsequently extended the theoretical foundation of the 1DPNN to the NDPNN, which can process 2D signals such as images and 3D signals such as videos. We also derived a general polynomial degree reduction formula, which we used to develop a heuristic algorithm that reduces the degree of any pre-trained NDPNN. 
This algorithm compresses an NDPNN without altering its performance, making the model faster and lighter. We then used 2DPNNs and 3DPNNs to tackle plant species recognition on a publicly available dataset composed of 40,000 images of different sizes covering 8 plant species. As a result, we created a novel method, called Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a dataset of variable-size images into a fixed-size 3D representation suitable for convolutional neural networks, and we demonstrated that this representation is more informative than resizing the images of the dataset to a given size. We theoretically formalized the use cases of the method as well as its inherent properties, and proved that it has an oversampling and a regularization effect on the data. By combining the VOTCSW method with 3DPNNs, we created a model that achieved a state-of-the-art accuracy of 99.9% on the considered dataset, surpassing well-known architectures such as ResNet and Inception. Furthermore, we established that the currently available plant species dataset could not be used for machine learning in its present form, due to a substantial class imbalance between the training set and the test set. Hence, we created a specific preprocessing and model development framework that enabled us to improve the accuracy from 49.23% to 99.9%. 
The contributions of this thesis are: the creation of a novel generic model, the NDPNN, that can extract more information from data than an NDCNN with less computational and spatial complexity; the evaluation of the performance of NDPNNs on audio signals, images and videos; the derivation of a general direct polynomial reduction formula; the design of a heuristic algorithm for NDPNN compression that generates faster and lighter models; the formalization of an image transformation method that circumvents image resizing without altering fine-grained information; and the production of a state-of-the-art 3DPNN for plant species recognition.
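To make the idea of a polynomial kernel concrete, the following is a minimal sketch of a single 1D polynomial convolution, not the thesis implementation. It assumes that a degree-D neuron applies one learned kernel to each element-wise power of the input (x, x^2, ..., x^D) and sums the results with a bias; the abstract does not spell out this formulation. The function name `poly_conv1d` and its parameters are hypothetical, and the activation function is omitted.

```python
import numpy as np


def poly_conv1d(x, weights, bias=0.0):
    """Valid-mode 1D polynomial convolution (illustrative assumption only).

    x       : 1D input signal of length n.
    weights : array of shape (degree, k); weights[d - 1] is the kernel
              applied to the element-wise power x**d.
    bias    : scalar added to every output sample.

    Returns an array of length n - k + 1. As in most CNN libraries, the
    "convolution" is computed as a cross-correlation (no kernel flip).
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    degree, k = weights.shape
    out = np.full(x.size - k + 1, bias, dtype=float)
    for d in range(1, degree + 1):
        # Each degree contributes its own linear filter over x**d.
        out += np.correlate(x ** d, weights[d - 1], mode="valid")
    return out


if __name__ == "__main__":
    signal = np.array([1.0, 2.0, 3.0])
    # Degree-2 neuron with kernel size 2: one kernel per power of the input.
    kernels = np.array([[1.0, 1.0],   # applied to x
                        [1.0, 0.0]])  # applied to x**2
    print(poly_conv1d(signal, kernels))
```

With `degree = 1` the function reduces to an ordinary 1D convolution layer without activation. Under this assumption each additional degree adds one kernel per neuron, which is consistent with the abstract's remark that the model is more costly per neuron.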