-
Evaluating the Determinants of Mode Choice Using Statistical and Machine Learning Techniques in the Indian Megacity of Bengaluru
Authors:
Tanmay Ghosh,
Nithin Nagaraj
Abstract:
The decision making behind mode choice is critical for transportation planning. While statistical learning techniques like discrete choice models have been used traditionally, machine learning (ML) models have recently gained traction among transportation planners due to their higher predictive performance. However, the black box nature of ML models poses significant interpretability challenges, limiting their practical application in decision and policy making. This study utilised a dataset of $1350$ households belonging to the low and low-middle income brackets in the city of Bengaluru to investigate mode choice decision making behaviour using a multinomial logit model and ML classifiers like decision trees, random forests, extreme gradient boosting and support vector machines. In terms of accuracy, the random forest model performed the best ($0.788$ on training data and $0.605$ on testing data) compared to all the other models. This research has adopted modern interpretability techniques like feature importance and individual conditional expectation plots to explain the decision making behaviour using ML models. Higher travel costs significantly reduce the predicted probability of bus usage compared to other modes (a $0.66\%$ and $0.34\%$ reduction using the Random Forest and XGBoost models, respectively, for a $10\%$ increase in travel cost). However, reducing travel time by $10\%$ increases the preference for the metro ($0.16\%$ in Random Forest and $0.42\%$ in XGBoost). This research augments the ongoing research on mode choice analysis using machine learning techniques, which would help in improving the understanding of the performance of these models with real-world data in terms of both accuracy and interpretability.
Submitted 25 January, 2024;
originally announced January 2024.
-
Compression Spectrum: Where Shannon meets Fourier
Authors:
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Signal processing and Information theory are two disparate fields used for characterizing signals for various scientific and engineering applications. Spectral/Fourier analysis, a technique employed in signal processing, helps estimation of power at different frequency components present in the signal. Characterizing a time-series based on its average amount of information (Shannon entropy) is useful for estimating its complexity and compressibility (eg., for communication applications). Information theory doesn't deal with spectral content while signal processing doesn't directly consider the information content or compressibility of the signal. In this work, we attempt to bring the fields of signal processing and information theory together by using a lossless data compression algorithm to estimate the amount of information or `compressibility' of time series at different scales. To this end, we employ the Effort-to-Compress (ETC) algorithm to obtain what we call as a Compression Spectrum. This new tool for signal analysis is demonstrated on synthetically generated periodic signals, a sinusoid, chaotic signals (weak and strong chaos) and uniform random noise. The Compression Spectrum is applied on heart interbeat intervals (RR) obtained from real-world normal young and elderly subjects. The compression spectrum of healthy young RR tachograms in the log-log scale shows behaviour similar to $1/f$ noise whereas the healthy old RR tachograms show a different behaviour. We envisage exciting possibilities and future applications of the Compression Spectrum.
Submitted 20 September, 2023;
originally announced September 2023.
-
To prune or not to prune : A chaos-causality approach to principled pruning of dense neural networks
Authors:
Rajan Sahu,
Shivam Chadha,
Nithin Nagaraj,
Archana Mathur,
Snehanshu Saha
Abstract:
Reducing the size of a neural network (pruning) by removing weights without impacting its performance is an important problem for resource-constrained devices. In the past, pruning was typically accomplished by ranking or penalizing weights based on criteria like magnitude and removing low-ranked weights before retraining the remaining ones. Pruning strategies may also involve removing neurons from the network in order to achieve the desired reduction in network size. We formulate pruning as an optimization problem with the objective of minimizing misclassifications by selecting specific weights. To accomplish this, we have introduced the concept of chaos in learning (Lyapunov exponents) via weight updates and exploiting causality to identify the causal weights responsible for misclassification. Such a pruned network maintains the original performance and retains feature explainability.
Submitted 19 August, 2023;
originally announced August 2023.
-
Kolam Simulation using Angles at Lattice Points
Authors:
Tulasi Bharathi,
Shailaja D Sharma,
Nithin Nagaraj
Abstract:
Kolam is a ritual art form practised by people in South India and consists of rule-bound geometric patterns of dots and lines. Single loop Kolams are mathematical closed loop patterns drawn over a grid of dots and conforming to certain heuristics. In this work, we propose a novel encoding scheme where we map the angular movements of Kolam at lattice points into sequences containing $4$ distinct symbols. This is then used to simulate single loop Kolam procedure via turtle moves in accordance with the desired angular direction at specific points. We thus obtain sequential codes for Kolams, unique up to cyclic permutations. We specify the requirements for the algorithm and indicate the general methodology. We demonstrate a sample of Kolams using our algorithm with a software implementation in Python.
Submitted 5 July, 2023;
originally announced July 2023.
-
Permutation Decision Trees
Authors:
Harikrishnan N B,
Arham Jain,
Nithin Nagaraj
Abstract:
Decision Tree is a well understood Machine Learning model that is based on minimizing impurities in the internal nodes. The most common impurity measures are Shannon entropy and Gini impurity. These impurity measures are insensitive to the order of training data and hence the final tree obtained is invariant to any permutation of the data. This is a limitation in terms of modeling when there are temporal order dependencies between data instances. In this research, we propose the adoption of Effort-To-Compress (ETC) - a complexity measure, for the first time, as an alternative impurity measure. Unlike Shannon entropy and Gini impurity, structural impurity based on ETC is able to capture order dependencies in the data, thus obtaining potentially different decision trees for different permutations of the same data instances, a concept we term as Permutation Decision Trees (PDT). We then introduce the notion of Permutation Bagging achieved using permutation decision trees without the need for random feature selection and sub-sampling. We conduct a performance comparison between Permutation Decision Trees and classical decision trees across various real-world datasets, including Appendicitis, Breast Cancer Wisconsin, Diabetes Pima Indian, Ionosphere, Iris, Sonar, and Wine. Our findings reveal that PDT demonstrates comparable performance to classical decision trees across most datasets. Remarkably, in certain instances, PDT even slightly surpasses the performance of classical decision trees. In comparing Permutation Bagging with Random Forest, we attain comparable performance to Random Forest models consisting of 50 to 1000 trees, using merely 21 trees. This highlights the efficiency and effectiveness of Permutation Bagging in achieving comparable performance outcomes with significantly fewer trees.
Submitted 31 May, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
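The order sensitivity described in the Permutation Decision Trees abstract above can be illustrated with a minimal sketch. The abstract does not spell out how Effort-To-Compress is computed; the version below assumes the commonly described pair-substitution form (repeatedly replace the most frequent adjacent pair with a new symbol and count the passes until the sequence is constant or of length one). The function names, the tie-breaking rule, the overlapping pair count and the integer-label assumption are illustrative choices, not the authors' implementation.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Shannon entropy (bits) of the label distribution; ignores ordering."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def etc(labels):
    """Simplified Effort-To-Compress: number of pair-substitution passes.

    Assumes integer labels. Pair counts include overlapping occurrences and
    ties go to the pair seen first; both are simplifications for illustration.
    """
    seq = list(labels)
    if not seq:
        return 0
    next_symbol = max(seq) + 1  # fresh symbol for each substitution pass
    steps = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        pairs = Counter(zip(seq, seq[1:]))
        target = max(pairs, key=pairs.get)  # most frequent adjacent pair
        out, i = [], 0
        while i < len(seq):
            # Replace non-overlapping occurrences, scanning left to right.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
        steps += 1
    return steps


# Two orderings of the same eight labels (four of each class).
sorted_labels = [0, 0, 0, 0, 1, 1, 1, 1]
interleaved_labels = [0, 1, 0, 1, 0, 1, 0, 1]

entropy_gap = abs(shannon_entropy(sorted_labels) - shannon_entropy(interleaved_labels))
etc_sorted = etc(sorted_labels)
etc_interleaved = etc(interleaved_labels)
```

Under these assumptions the entropy of the two orderings is identical, while the pass count differs between them, which is the property an order-aware impurity measure needs.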
-
Granger Causality for Compressively Sensed Sparse Signals
Authors:
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Compressed sensing is a scheme that allows for sparse signals to be acquired, transmitted and stored using far fewer measurements than done by conventional means employing Nyquist sampling theorem. Since many naturally occurring signals are sparse (in some domain), compressed sensing has rapidly seen popularity in a number of applied physics and engineering applications, particularly in designing signal and image acquisition strategies, e.g., magnetic resonance imaging, quantum state tomography, scanning tunneling microscopy, analog to digital conversion technologies. Contemporaneously, causal inference has become an important tool for the analysis and understanding of processes and their interactions in many disciplines of science, especially those dealing with complex systems. Direct causal analysis for compressively sensed data is required to avoid the task of reconstructing the compressed data. Also, for some sparse signals, such as for sparse temporal data, it may be difficult to discover causal relations directly using available data-driven/ model-free causality estimation techniques. In this work, we provide a mathematical proof that structured compressed sensing matrices, specifically Circulant and Toeplitz, preserve causal relationships in the compressed signal domain, as measured by Granger Causality. We then verify this theorem on a number of bivariate and multivariate coupled sparse signal simulations which are compressed using these matrices. We also demonstrate a real world application of network causal connectivity estimation from sparse neural spike train recordings from rat prefrontal cortex.
Submitted 23 September, 2022;
originally announced October 2022.
-
Neurochaos Feature Transformation and Classification for Imbalanced Learning
Authors:
Deeksha Sethi,
Nithin Nagaraj,
Harikrishnan N B
Abstract:
Learning from limited and imbalanced data is a challenging problem in the Artificial Intelligence community. Real-time scenarios demand decision-making from rare events wherein the data are typically imbalanced. These situations commonly arise in medical applications, cybersecurity, catastrophic predictions etc. This motivates the development of learning algorithms capable of learning from imbalanced data. Human brain effortlessly learns from imbalanced data. Inspired by the chaotic neuronal firing in the human brain, a novel learning algorithm namely Neurochaos Learning (NL) was recently proposed. NL is categorized in three blocks: Feature Transformation, Neurochaos Feature Extraction (CFX), and Classification. In this work, the efficacy of neurochaos feature transformation and extraction for classification in imbalanced learning is studied. We propose a unique combination of neurochaos based feature transformation and extraction with traditional ML algorithms. The explored datasets in this study revolve around medical diagnosis, banknote fraud detection, environmental applications and spoken-digit classification. In this study, experiments are performed in both high and low training sample regime. In the former, five out of nine datasets have shown a performance boost in terms of macro F1-score after using CFX features. The highest performance boost obtained is 25.97% for Statlog (Heart) dataset using CFX+Decision Tree. In the low training sample regime (from just one to nine training samples per class), the highest performance boost of 144.38% is obtained for Haberman's Survival dataset using CFX+Random Forest. NL offers enormous flexibility of combining CFX with any ML classifier to boost its performance, especially for learning tasks with limited and imbalanced data.
Submitted 16 May, 2022; v1 submitted 20 April, 2022;
originally announced May 2022.
-
Cause-Effect Preservation and Classification using Neurochaos Learning
Authors:
Harikrishnan N B,
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Discovering cause-effect from observational data is an important but challenging problem in science and engineering. In this work, a recently proposed brain inspired learning algorithm, namely \emph{Neurochaos Learning} (NL), is used for the classification of cause-effect from simulated data. The data instances used are generated from coupled AR processes, coupled 1D chaotic skew tent maps, coupled 1D chaotic logistic maps and a real-world prey-predator system. The proposed method consistently outperforms a five layer Deep Neural Network architecture for coupling coefficient values ranging from $0.1$ to $0.7$. Further, we investigate the preservation of causality in the feature extracted space of NL using Granger Causality (GC) for coupled AR processes and Compression-Complexity Causality (CCC) for coupled chaotic systems and the real-world prey-predator dataset. This ability of NL to preserve causality under a chaotic transformation and successfully classify cause and effect time series (including a transfer learning scenario) is highly desirable in causal machine learning applications.
Submitted 28 January, 2022;
originally announced January 2022.
-
Learning Generalized Causal Structure in Time-series
Authors:
Aditi Kathpalia,
Keerti P. Charantimath,
Nithin Nagaraj
Abstract:
The science of causality explains/determines 'cause-effect' relationship between the entities of a system by providing mathematical tools for the purpose. In spite of all the success and widespread applications of machine-learning (ML) algorithms, these algorithms are based on statistical learning alone. Currently, they are nowhere close to 'human-like' intelligence as they fail to answer and learn based on the important "Why?" questions. Hence, researchers are attempting to integrate ML with the science of causality. Among the many causal learning issues encountered by ML, one is that these algorithms are dumb to the temporal order or structure in data. In this work we develop a machine learning pipeline based on a recently proposed 'neurochaos' feature learning technique (ChaosFEX feature extractor), that helps us to learn generalized causal-structure in given time-series data.
Submitted 6 December, 2021;
originally announced December 2021.
-
Problems with information theoretic approaches to causal learning
Authors:
Nithin Nagaraj
Abstract:
The language of information theory is favored in both causal reasoning and machine learning frameworks. But, is there a better language than this? In this study, we demonstrate the pitfalls of infotheoretic estimation using first order statistics on (short) sequences for causal learning. We recommend the use of data compression based approaches for causality testing since these make very little assumptions on data as opposed to infotheoretic measures, and are more robust to finite data length effects. We conclude with a discussion on the challenges posed in modeling the effects of conditioning process $X$ with another process $Y$ in causal machine learning. Specifically, conditioning can increase 'confusion' which can be difficult to model by classical information theory. A conscious causal agent creates new choices, decisions and meaning which poses huge challenges for AI.
Submitted 24 October, 2021;
originally announced October 2021.
-
Causal Analysis of Carnatic Music: A Preliminary Study
Authors:
Abhsihek Nandekar,
Preeth Khona,
Rajani M. B.,
Anindya Sinha,
Nithin Nagaraj
Abstract:
The musicological analysis of Carnatic music is challenging, owing to its rich structure and complexity. Automated \textit{rāga} classification, pitch detection, tonal analysis, modelling and information retrieval of this form of southern Indian classical music have, however, made significant progress in recent times. A causal analysis to investigate the musicological structure of Carnatic compositions and the identification of the relationships embedded in them have never been previously attempted. In this study, we propose a novel framework for causal discovery, using a compression-complexity measure. Owing to the limited number of compositions available, however, we generated surrogates to further facilitate the analysis of the prevailing causal relationships. Our analysis indicates that the context-free grammar, inferred from more complex compositions, such as the \textit{Mē\d{l}akarta} \textit{rāga}, are a \textit{structural cause} for the \textit{Janya} \textit{rāga}. We also analyse certain special cases of the \textit{Janya rāga} in order to understand their origins and structure better.
Submitted 24 September, 2021;
originally announced September 2021.
-
Fairly Constricted Multi-Objective Particle Swarm Optimization
Authors:
Anwesh Bhattacharya,
Snehanshu Saha,
Nithin Nagaraj
Abstract:
It has been well documented that the use of exponentially-averaged momentum (EM) in particle swarm optimization (PSO) is advantageous over the vanilla PSO algorithm. In the single-objective setting, it leads to faster convergence and avoidance of local minima. Naturally, one would expect that the same advantages of EM carry over to the multi-objective setting. Hence, we extend the state of the art Multi-objective optimization (MOO) solver, SMPSO, by incorporating EM in it. As a consequence, we develop the mathematical formalism of constriction fairness which is at the core of extended SMPSO algorithm. The proposed solver matches the performance of SMPSO across the ZDT, DTLZ and WFG problem suites and even outperforms it in certain instances.
Submitted 13 November, 2022; v1 submitted 10 April, 2021;
originally announced April 2021.
-
When Noise meets Chaos: Stochastic Resonance in Neurochaos Learning
Authors:
Harikrishnan NB,
Nithin Nagaraj
Abstract:
Chaos and Noise are ubiquitous in the Brain. Inspired by the chaotic firing of neurons and the constructive role of noise in neuronal models, we connect chaos, noise and learning for the first time. In this paper, we demonstrate the Stochastic Resonance (SR) phenomenon in Neurochaos Learning (NL). SR manifests at the level of a single neuron of NL and enables efficient subthreshold signal detection. Furthermore, SR is shown to occur in single and multiple neuronal NL architectures for classification tasks - both on simulated and real-world spoken digit datasets. Intermediate levels of noise in neurochaos learning enable peak performance in classification tasks, thus highlighting the role of SR in AI applications, especially in brain inspired learning architectures.
Submitted 9 March, 2021; v1 submitted 2 February, 2021;
originally announced February 2021.
-
A Neurochaos Learning Architecture for Genome Classification
Authors:
Harikrishnan NB,
Pranay SY,
Nithin Nagaraj
Abstract:
There has been empirical evidence of presence of non-linearity and chaos at the level of single neurons in biological neural networks. The properties of chaotic neurons inspires us to employ them in artificial learning systems. Here, we propose a Neurochaos Learning (NL) architecture, where the neurons used to extract features from data are 1D chaotic maps. ChaosFEX+SVM, an instance of this NL architecture, is proposed as a hybrid combination of chaos and classical machine learning algorithm. We formally prove that a single layer of NL with a finite number of 1D chaotic neurons satisfies the Universal Approximation Theorem with an exact value for the number of chaotic neurons needed to approximate a discrete real valued function with finite support. This is made possible due to the topological transitivity property of chaos and the existence of uncountably infinite number of dense orbits for the chosen 1D chaotic map. The chaotic neurons in NL get activated under the presence of an input stimulus (data) and output a chaotic firing trajectory. From such chaotic firing trajectories of individual neurons of NL, we extract Firing Time, Firing Rate, Energy and Entropy that constitute ChaosFEX features. These ChaosFEX features are then fed to a Support Vector Machine with linear kernel for classification. The effectiveness of chaotic feature engineering performed by NL (ChaosFEX+SVM) is demonstrated for synthetic and real world datasets in the low and high training sample regimes. Specifically, we consider the problem of classification of genome sequences of SARS-CoV-2 from other coronaviruses (SARS-CoV-1, MERS-CoV and others). With just one training sample per class for 1000 random trials of training, we report an average macro F1-score > 0.99 for the classification of SARS-CoV-2 from SARS-CoV-1 genome sequences. Robustness of ChaosFEX features to additive noise is also demonstrated.
Submitted 12 October, 2020;
originally announced October 2020.
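The feature extraction outlined in the Neurochaos Learning abstract above (a 1D chaotic neuron that fires under an input stimulus, from whose trajectory Firing Time, Firing Rate, Energy and Entropy are extracted) can be sketched as follows. The abstract gives no formulas, so everything concrete here is an assumption for illustration: the skew tent map as the chaotic neuron, the stopping rule (the neuron fires until its state comes within `eps` of the stimulus), the feature definitions, and the default values of `q`, `b` and `eps`.

```python
from math import log2


def skew_tent(x, b):
    """One step of the skew tent map on [0, 1) with skew parameter b."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)


def firing_trajectory(stimulus, q=0.34, b=0.499, eps=0.01, max_iter=10_000):
    """Iterate from initial activity q until within eps of the stimulus.

    max_iter is a safety cap (an assumption) so the loop always terminates.
    """
    trajectory = []
    x = q
    while abs(x - stimulus) >= eps and len(trajectory) < max_iter:
        trajectory.append(x)
        x = skew_tent(x, b)
    return trajectory


def chaos_features(stimulus, q=0.34, b=0.499, eps=0.01):
    """Return (firing_time, firing_rate, energy, entropy) for one stimulus.

    Assumed definitions: firing time is the trajectory length, firing rate is
    the fraction of states above b, energy is the sum of squared states, and
    entropy is the binary Shannon entropy of that above/below-b fraction.
    """
    trajectory = firing_trajectory(stimulus, q=q, b=b, eps=eps)
    n = len(trajectory)
    if n == 0:
        return 0, 0.0, 0.0, 0.0
    rate = sum(v > b for v in trajectory) / n
    energy = sum(v * v for v in trajectory)
    if rate in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(rate * log2(rate) + (1 - rate) * log2(1 - rate))
    return n, rate, energy, entropy


# Features for one normalised input value in [0, 1).
features = chaos_features(0.7)
```

In a pipeline of the kind the abstract describes, one such feature tuple per input attribute would be concatenated and passed to a classifier such as a linear-kernel SVM; that step is omitted here to keep the sketch dependency-free.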
-
Causal Discovery using Compression-Complexity Measures
Authors:
Pranay SY,
Nithin Nagaraj
Abstract:
Causal inference is one of the most fundamental problems across all domains of science. We address the problem of inferring a causal direction from two observed discrete symbolic sequences $X$ and $Y$. We present a framework which relies on lossless compressors for inferring context-free grammars (CFGs) from sequence pairs and quantifies the extent to which the grammar inferred from one sequence compresses the other sequence. We infer $X$ causes $Y$ if the grammar inferred from $X$ better compresses $Y$ than in the other direction. To put this notion to practice, we propose three models that use the Compression-Complexity Measures (CCMs) - Lempel-Ziv (LZ) complexity and Effort-To-Compress (ETC) to infer CFGs and discover causal directions without demanding temporal structures. We evaluate these models on synthetic and real-world benchmarks and empirically observe performances competitive with current state-of-the-art methods. Lastly, we present two unique applications of the proposed models for causal inference directly from pairs of genome sequences belonging to the SARS-CoV-2 virus. Using a large number of sequences, we show that our models capture directed causal information exchange between sequence pairs, presenting novel opportunities for addressing key issues such as contact-tracing, motif discovery, evolution of virulence and pathogenicity in future applications.
Submitted 17 March, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Measuring Causality: The Science of Cause and Effect
Authors:
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Determining and measuring cause-effect relationships is fundamental to most scientific studies of natural phenomena. The notion of causation is distinctly different from correlation which only looks at association of trends or patterns in measurements. In this article, we review different notions of causality and focus especially on measuring causality from time series data. Causality testing finds numerous applications in diverse disciplines such as neuroscience, econometrics, climatology, physics and artificial intelligence.
Submitted 19 October, 2019;
originally announced October 2019.
-
ChaosNet: A Chaos based Artificial Neural Network Architecture for Classification
Authors:
Harikrishnan Nellippallil Balakrishnan,
Aditi Kathpalia,
Snehanshu Saha,
Nithin Nagaraj
Abstract:
Inspired by chaotic firing of neurons in the brain, we propose ChaosNet -- a novel chaos based artificial neural network architecture for classification tasks. ChaosNet is built using layers of neurons, each of which is a 1D chaotic map known as the Generalized Luroth Series (GLS) which has been shown in earlier works to possess very useful properties for compression, cryptography and for computing XOR and other logical operations. In this work, we design a novel learning algorithm on ChaosNet that exploits the topological transitivity property of the chaotic GLS neurons. The proposed learning algorithm gives consistently good performance accuracy in a number of classification tasks on well known publicly available datasets with very limited training samples. Even with as low as 7 (or fewer) training samples/class (which accounts for less than 0.05% of the total available data), ChaosNet yields performance accuracies in the range 73.89 % - 98.33 %. We demonstrate the robustness of ChaosNet to additive parameter noise and also provide an example implementation of a 2-layer ChaosNet for enhancing classification accuracy. We envisage the development of several other novel learning algorithms on ChaosNet in the near future.
Submitted 6 October, 2019;
originally announced October 2019.
-
Causal Stability and Synchronization
Authors:
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Synchronization of chaos arises between coupled dynamical systems and is very well understood as a temporal phenomenon which leads the coupled systems to converge or develop a dependence with time. In this work, we provide a complementary spatial perspective to this phenomenon by introducing the novel idea of causal stability. We then propose and prove a causal stability synchronization theorem as a necessary and sufficient condition for synchronization. We also provide an empirical criterion to identify synchronizing variables in coupled identical chaotic dynamical systems based on causality analysis on time series data of the driving system alone.
Submitted 28 June, 2019;
originally announced July 2019.
-
Evolution of Novel Activation Functions in Neural Network Training with Applications to Classification of Exoplanets
Authors:
Snehanshu Saha,
Nithin Nagaraj,
Archana Mathur,
Rahul Yedida
Abstract:
We present an analytical exploration of novel activation functions as a consequence of integrating several ideas, leading to their implementation and subsequent use in habitability classification of exoplanets. Neural networks, although a powerful engine in supervised methods, often require expensive tuning efforts for optimized performance. Habitability classes are hard to discriminate, especially when attributes used as hard markers of separation are removed from the data set. The solution is approached by investigating the analytical properties of the proposed activation functions. The theory of ordinary differential equations and fixed points is exploited to justify the "lack of tuning efforts" to achieve optimal performance compared to traditional activation functions. Additionally, the relationship between the proposed activation functions and the more popular ones is established through extensive analytical and empirical evidence. Finally, the activation functions have been implemented in a plain vanilla feed-forward neural network to classify exoplanets.
Submitted 1 June, 2019;
originally announced June 2019.
-
A Novel Chaos Theory Inspired Neuronal Architecture
Authors:
Harikrishnan N B,
Nithin Nagaraj
Abstract:
The practical success of widely used machine learning (ML) and deep learning (DL) algorithms in the Artificial Intelligence (AI) community is owed to the availability of large datasets for training and huge computational resources. Despite the enormous practical success of AI, these algorithms are only loosely inspired by the biological brain and do not mimic any of the fundamental properties of neurons in the brain, one such property being the chaotic firing of biological neurons. This motivates us to develop a novel neuronal architecture where the individual neurons are intrinsically chaotic in nature. By making use of the topological transitivity property of chaos, our neuronal network is able to perform classification tasks with very few training samples. For the MNIST dataset, with as low as $0.1 \%$ of the total training data, our method outperforms ML and matches DL in classification accuracy for up to $7$ training samples/class. For the Iris dataset, our accuracy is comparable with ML algorithms, and even with just two training samples/class, we report an accuracy as high as $95.8 \%$. This work highlights the effectiveness of chaos and its properties for learning and paves the way for chaos-inspired neuronal architectures by closely mimicking the chaotic nature of neurons in the brain.
Submitted 19 May, 2019;
originally announced May 2019.
-
A Two-Parameter Model for Ultrasonic Tissue Characterization with Harmonic Imaging
Authors:
Kajoli Banerjee Krishnan,
Nithin Nagaraj,
Nitin Singhal,
Shalini Thapar,
Komal Yadav
Abstract:
Over the past few decades, researchers have developed several approaches such as the Reference Phantom Method (RPM) to estimate ultrasound attenuation coefficient (AC) and backscatter coefficient (BSC). AC and BSC can help to discriminate pathology from normal tissue during in-vivo imaging. In this paper, we propose a new RPM model to simultaneously compute AC and BSC for harmonic imaging and a normalized score that combines the two parameters as a measure of disease progression. The model utilizes the spectral difference between two regions of interest, the first, a proximal, close to the probe and second, a distal, away from the probe. We have implemented an algorithm based on the model and shown that it provides accurate and stable estimates to within 5% of AC and BSC for simulated received echo from post-focal depths of a homogeneous liver-like medium. For practical applications with time gain and time frequency compensated in-phase and quadrature (IQ) data from ultrasound scanner, the method has been approximated and generalized to estimate AC and BSC for tissue layer underlying a more attenuative subcutaneous layer. The angular spectrum approach for ultrasound propagation in biological tissue is employed as a virtual Reference Phantom (VRP). The VRP is calibrated with a fixed probe and scanning protocol for application to liver tissue. In a feasibility study with 16 subjects, the method is able to separate 9/11 cases of progressive non-alcoholic fatty liver disease from 5 normal. In particular, it is able to separate 4/5 cases of non-alcoholic steato-hepatitis and early fibrosis (F<=2) from normal tissue. More extensive clinical studies are needed to assess the full capability of this model for screening and monitoring disease progression in liver and other tissues.
Submitted 10 December, 2017;
originally announced December 2017.
-
Causality Testing: A Data Compression Framework
Authors:
Aditi Kathpalia,
Nithin Nagaraj
Abstract:
Causality testing, the act of determining cause and effect from measurements, is widely used in physics, climatology, neuroscience, econometrics and other disciplines. As a result, a large number of causality testing methods based on various principles have been developed. Causal relationships in complex systems are typically accompanied by entropic exchanges which are encoded in patterns of dynamical measurements. A data compression algorithm which can extract these encoded patterns could be used for inferring these relations. This motivates us to propose, for the first time, a generic causality testing framework based on data compression. The framework unifies existing causality testing methods and enables us to innovate a novel Compression-Complexity Causality measure. This measure is rigorously tested on simulated and real-world time series and is found to overcome the limitations of Granger Causality and Transfer Entropy, especially for noisy and non-synchronous measurements. Additionally, it gives insight on the `kind' of causal influence between input time series by the notions of positive and negative causality.
Submitted 18 February, 2018; v1 submitted 11 October, 2017;
originally announced October 2017.
-
Simulation Study of Two Measures of Integrated Information
Authors:
Suresh Jois,
Nithin Nagaraj
Abstract:
Background: Many authors have proposed Quantitative Theories of Consciousness (QTC) based on theoretical principles like information theory, Granger causality and complexity. Recently, Virmani and Nagaraj (arXiv:1608.08450v2 [cs.IT]) noted the similarity between Integrated Information and Compression-Complexity, and on this basis, proposed a novel measure of network complexity called Phi-Compression Complexity (Phi-C or $Φ^C$). Their computer simulations using Boolean networks showed that $Φ^C$ compares favorably to Giulio Tononi et al's Integrated Information measure $Φ$ 3.0 and exhibits desirable mathematical and computational characteristics. Methods: In the present work, $Φ^C$ was measured for two types of simulated networks: (A) Networks representing simple neuronal connectivity motifs (presented in Fig. 9 of Tononi and Sporns, BMC Neuroscience 4(1), 2003); (B) random networks derived from Erdős-Rényi G(N, p) graphs. Code for all simulations was written in Python 3.6, and the library NetworkX was used to simulate the graphs. Results and discussions summary: In simulations A, for the same set of networks, $Φ^C$ values differ from the values of IIT 1.0 $Φ$ in a counter-intuitive manner. It appears that $Φ^C$ captures some invariant aspects of the interplay between information integration, network topology, graph composition and node entropy. While Virmani and Nagaraj (arXiv:1608.08450v2 [cs.IT]) sought to highlight the correlations between $Φ^C$ and IIT $Φ$, the results of simulations A highlight the differences between the two measures in the way they capture the integrated information. In simulations B, the results of simulations A are extended to the more general case of random networks. In the concluding section we outline the novel aspects of this paper, and our ongoing and future research.
Submitted 29 June, 2017;
originally announced June 2017.
-
New Empirical Evidence on Disjunction Effect and Cultural Dependence
Authors:
Indranil Mukhopadhyay,
Nithin Nagaraj,
Sisir Roy
Abstract:
We perform a new experiment using almost the same sample size considered by Tversky and Shafir to test the validity of classical probability theory in decision making. The results clearly indicate that the disjunction effect depends also on culture and more specifically on gender (females rather than males). We performed further statistical analysis rather than only reporting the actual values, as previous authors did. We propose different kinds of disjunction effect, i.e. strong and weak, based on our statistical analysis.
Submitted 1 March, 2017;
originally announced March 2017.
-
Three Perspectives on Complexity $-$ Entropy, Compression, Subsymmetry
Authors:
Nithin Nagaraj,
Karthi Balasubramanian
Abstract:
There is no single universally accepted definition of "Complexity". There are several perspectives on complexity and what constitutes complex behaviour or complex systems, as opposed to regular, predictable behaviour and simple systems. In this paper, we explore the following perspectives on complexity: "effort-to-describe" (Shannon entropy $H$, Lempel-Ziv complexity $LZ$), "effort-to-compress" ($ETC$ complexity) and "degree-of-order" (Subsymmetry or $SubSym$). While Shannon entropy and $LZ$ are very popular and widely used, $ETC$ is a recently proposed measure for time series. In this paper, we also propose a novel normalized measure $SubSym$ based on the existing idea of counting the number of subsymmetries or palindromes within a sequence. We compare the performance of these complexity measures on the following tasks: a) characterizing complexity of short binary sequences of lengths 4 to 16, b) distinguishing periodic and chaotic time series from 1D logistic map and 2D Hénon map, and c) distinguishing between tonic and irregular spiking patterns generated from the "Adaptive exponential integrate-and-fire" neuron model. Our study reveals that each perspective has its own advantages and uniqueness while also having an overlap with each other.
Submitted 31 October, 2016;
originally announced November 2016.
-
Dynamical Complexity Of Short and Noisy Time Series
Authors:
Nithin Nagaraj,
Karthi Balasubramanian
Abstract:
Shannon Entropy has been extensively used for characterizing complexity of time series arising from chaotic dynamical systems and stochastic processes such as Markov chains. However, for short and noisy time series, Shannon entropy performs poorly. Complexity measures which are based on lossless compression algorithms are a good substitute in such scenarios. We evaluate the performance of two such Compression-Complexity Measures namely Lempel-Ziv complexity ($LZ$) and Effort-To-Compress ($ETC$) on short time series from chaotic dynamical systems in the presence of noise. Both $LZ$ and $ETC$ outperform Shannon entropy ($H$) in accurately characterizing the dynamical complexity of such systems. For very short binary sequences (which arise in neuroscience applications), $ETC$ has higher number of distinct complexity values than $LZ$ and $H$, thus enabling a finer resolution. For two-state ergodic Markov chains, we empirically show that $ETC$ converges to a steady state value faster than $LZ$. Compression-Complexity Measures are promising for applications which involve short and noisy time series.
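As a rough illustration of why an entropy estimate can be uninformative on short sequences, the following Python sketch computes Shannon entropy from single-symbol frequencies. This first-order estimator is an assumption made here for illustration; the abstract does not state which estimator the authors used. The function name `shannon_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq):
    """First-order Shannon entropy (bits/symbol) from symbol frequencies.

    Assumption: symbols are treated as independent, so the ordering of
    the sequence does not influence the estimate.
    """
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# A strictly periodic sequence and an irregular one with the same symbol
# counts receive the same first-order entropy.
periodic = "01010101"
irregular = "00110101"
print(shannon_entropy(periodic), shannon_entropy(irregular))
```

Because this estimator ignores ordering, it cannot separate the two sequences; compression-based measures such as $LZ$ and $ETC$ operate on the ordered sequence itself.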
Submitted 7 December, 2016; v1 submitted 7 September, 2016;
originally announced September 2016.
-
A Compression-Complexity Measure of Integrated Information
Authors:
Mohit Virmani,
Nithin Nagaraj
Abstract:
Quantifying integrated information is a leading approach towards building a fundamental theory of consciousness. Integrated Information Theory (IIT) has gained attention in this regard due to its theoretically strong framework. However, it faces some limitations such as current state dependence, high computational cost and an inability to be applied to real brain data. On the other hand, the Perturbational Complexity Index (PCI) is a clinical measure for distinguishing different levels of consciousness. Though PCI claims to capture the functional differentiation and integration in brain networks (similar to IIT), its link to integrated information theories is rather weak. Inspired by these two approaches, we propose, for the first time, a new measure, $Φ^C$, using a novel compression-complexity perspective that serves as a bridge between the two. $Φ^C$ is founded on the principles of lossless data compression based complexity measures which characterize the dynamical complexity of brain networks. $Φ^{C}$ exhibits the following salient innovations: (i) mathematically well bounded, (ii) negligible current state dependence unlike $Φ$, (iii) integrated information measured as compression-complexity rather than as an infotheoretic quantity, and (iv) faster to compute since the number of atomic bipartitions scales linearly with the number of nodes of the network, thus avoiding combinatorial explosion. Our computer simulations show that $Φ^C$ has a similar hierarchy to $<Φ>$ for several multiple-node networks and it demonstrates a rich interplay between differentiation, integration and entropy of the nodes of a network. $Φ^C$ is a promising heuristic measure to characterize the quantity of integrated information (and hence a measure of quantity of consciousness) in larger networks like the human brain and provides an opportunity to test the predictions of brain complexity on real neural data.
Submitted 13 December, 2016; v1 submitted 23 August, 2016;
originally announced August 2016.
-
Cardiac Aging Detection Using Complexity Measures
Authors:
Karthi Balasubramanian,
Nithin Nagaraj
Abstract:
As we age, our hearts undergo changes which result in reduction in complexity of physiological interactions between different control mechanisms. This results in a potential risk of cardiovascular diseases which are the number one cause of death globally. Since cardiac signals are nonstationary and nonlinear in nature, complexity measures are better suited to handle such data. In this study, non-invasive methods for detection of cardiac aging using complexity measures are explored. Lempel-Ziv (LZ) complexity, Approximate Entropy (ApEn) and Effort-to-Compress (ETC) measures are used to differentiate between healthy young and old subjects using heartbeat interval data. We show that both LZ and ETC complexity measures are able to differentiate between young and old subjects with only 10 data samples while ApEn requires at least 15 data samples.
Submitted 1 March, 2016;
originally announced March 2016.
-
Neural Signal Multiplexing via Compressed Sensing
Authors:
Nithin Nagaraj,
K. R. Sahasranand
Abstract:
Transport of neural signals in the brain is challenging, owing to neural interference and neural noise. There is experimental evidence of multiplexing of sensory information across population of neurons, particularly in the vertebrate visual and olfactory systems. Recently, it has been discovered that in lateral intraparietal cortex of the brain, decision signals are multiplexed with decision-irrelevant visual signals. Furthermore, it is well known that several cortical neurons exhibit chaotic spiking patterns. Multiplexing of chaotic neural signals and their successful demultiplexing in the neurons amidst interference and noise, is difficult to explain. In this work, a novel compressed sensing model for efficient multiplexing of chaotic neural signals constructed using the Hindmarsh-Rose spiking model is proposed. The signals are multiplexed from a pre-synaptic neuron to its neighbouring post-synaptic neuron, in the presence of $10^4$ interfering noisy neural signals and demultiplexed using compressed sensing techniques.
Submitted 13 January, 2016;
originally announced January 2016.
-
Compressed Shattering
Authors:
Harikumar Kannampillil,
Anand Krishnadas Nambisan,
Sandra Kizhakkekundil,
Shreeja Sugathan,
Nithin Nagaraj
Abstract:
The central idea of compressed sensing is to exploit the fact that most signals of interest are sparse in some domain and use this to reduce the number of measurements to encode. However, if the sparsity of the input signal is not precisely known, but known to lie within a specified range, compressed sensing as such cannot exploit this fact and would need to use the same number of measurements even for a very sparse signal. In this paper, we propose a novel method called Compressed Shattering to adapt compressed sensing to the specified sparsity range, without changing the sensing matrix, by creating shattered signals which have fixed sparsity. This is accomplished by first suitably permuting the input spectrum and then using a filter bank to create fixed sparsity shattered signals. By ensuring that all the shattered signals are at most 1-sparse, we make use of a simple but efficient deterministic sensing matrix to yield a very low number of measurements. For a discrete-time signal of length 1000, with a sparsity range of $5 - 25$, traditional compressed sensing requires $175$ measurements, whereas Compressed Shattering would only need $20 - 100$ measurements.
Submitted 10 January, 2016;
originally announced January 2016.
-
Lossless Data Compression with Error Detection using Cantor Set
Authors:
Nithin Nagaraj
Abstract:
In 2009, a lossless compression algorithm based on 1D chaotic maps known as Generalized Luröth Series (or GLS) was proposed. This algorithm (GLS-coding) encodes the input message as a symbolic sequence on an appropriate 1D chaotic map (GLS) and the compressed file is obtained as the initial value by iterating backwards on the map. For ergodic sources, it was shown that GLS-coding achieves the best possible lossless compression (in the noiseless setting) bounded by Shannon entropy. However, in the presence of noise, even small errors in the compressed file lead to catastrophic decoding errors owing to sensitive dependence on initial values. In this paper, we first show that Repetition codes $\mathcal{R}_n$ (every symbol is repeated $n$ times, where $n$ is a positive odd integer), the oldest and the most basic error correction and detection codes in the literature, actually lie on a Cantor set with a fractal dimension of $\frac{1}{n}$, which is also the rate of the code. Inspired by this, we incorporate error detection capability into GLS-coding by ensuring that the compressed file (initial value on the map) lies on a Cantor set of measure zero. Even a 1-bit error in the initial value will throw it outside the Cantor set, which can be detected while decoding. The error detection performance (and also the rate of the code) can be controlled by the fractal dimension of the Cantor set and could be suitably adjusted depending on the noise level of the communication channel.
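One way to see the dimension claim is sketched below, under the assumption (not spelled out in the abstract) that each codeword of $\mathcal{R}_n$ is read as the binary expansion of a number in $[0,1)$. After $k$ message bits the code fixes the first $nk$ binary digits, so the codewords occupy $N = 2^{k}$ of the $2^{nk}$ dyadic intervals of length $\varepsilon = 2^{-nk}$. The box-counting dimension is then

$$
D = \lim_{k \to \infty} \frac{\log N}{\log (1/\varepsilon)} = \lim_{k \to \infty} \frac{\log 2^{k}}{\log 2^{nk}} = \lim_{k \to \infty} \frac{k \log 2}{nk \log 2} = \frac{1}{n},
$$

which matches the rate of the code, since $k$ message bits are carried by $nk$ transmitted bits. For $n = 3$, for instance, this gives $D = \frac{1}{3}$.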
Submitted 10 August, 2013;
originally announced August 2013.
-
Comment on 'Interpretation of the Lempel-Ziv Complexity Measure in the context of Biomedical Signal Analysis'
Authors:
Karthi Balasubramanian,
Gayathri R Prabhu,
Nithin Nagaraj
Abstract:
In this Communication, we express our reservations on some aspects of the interpretation of the Lempel-Ziv Complexity measure (LZ) by Mateo et al. in "Interpretation of the Lempel-Ziv complexity measure in the context of biomedical signal analysis," IEEE Trans. Biomed. Eng., vol. 53, no. 11, pp. 2282-2288, Nov. 2006. In particular, we comment on the dependence of the LZ complexity measure on number of harmonics, frequency content and amplitude modulation. We disagree with the following statements made by Mateo et al.
1. "LZ is not sensitive to the number of harmonics in periodic signals."
2. "LZ increases as the frequency of a sinusoid increases."
3. "Amplitude modulation of a signal does not result in an increase in LZ."
We show the dependence of LZ complexity measure on harmonics and amplitude modulation by using a modified version of the synthetic signal that has been used in the original paper. Also, the second statement is a generic statement which is not entirely true. This is true only in the low frequency regime and definitely not true in moderate and high frequency regimes.
Submitted 5 August, 2013; v1 submitted 1 August, 2013;
originally announced August 2013.
-
Classification of Periodic, Chaotic and Random Sequences using NSRPS Complexity Measure
Authors:
Karthi Balasubramanian,
Gayathri R. Prabhu,
Lakshmipriya V. K.,
Maneesha Krishnan,
Praveena R.,
Nithin Nagaraj
Abstract:
Data compression algorithms are generally perceived as being of interest for data communication and storage purposes only. However, their use in the field of data classification and analysis is also of equal importance. Automatic data classification and analysis finds use in varied fields like bioinformatics, language and sequence recognition and authorship attribution. Different complexity measures proposed in the literature, like Shannon entropy, Relative entropy, Kolmogorov and Algorithmic complexity, have drawbacks that make these methods ineffective in analyzing short sequences that are typical in population dynamics and other fields.
In this paper, we study Non-Sequential Recursive Pair Substitution (NSRPS), a lossless compression algorithm first proposed by Ebeling et al. [Math. Biosc. 52, 1980] and Jiménez-Montaño et al. [arXiv:cond-mat/0204134, 2002]. Using this algorithm, a new complexity measure was recently proposed (Nagaraj et al. [arXiv:nlin.CD/1101.4341v1, 2011]). In this work, we use the NSRPS complexity measure for analyzing and classifying symbolic sequences generated by 1D chaotic dynamical systems. Even with learning data-sets of length as small as 25 and test data-sets of length as small as 10, the NSRPS measure is able to accurately classify the test sequence as periodic, chaotic or random. For such short data lengths, methods which use entropy measures and traditional lossless compression algorithms like LZ77 [A. Lempel and J. Ziv, IEEE Trans. Inform. Theory 22, 75 (1976)] (used for instance by Gzip, WinZip etc.) fail.
Submitted 22 May, 2012;
originally announced May 2012.
-
Lossless Compression and Complexity of Chaotic Sequences
Authors:
Nithin Nagaraj,
Mathew Shaji Kavalekalam,
Arjun Venugopal T.,
Nithin Krishnan
Abstract:
We investigate the complexity of short symbolic sequences of chaotic dynamical systems by using lossless compression algorithms. In particular, we study Non-Sequential Recursive Pair Substitution (NSRPS), a lossless compression algorithm first proposed by W. Ebeling et al. [Math. Biosc. 52, 1980] and Jiménez-Montaño et al. [arXiv:cond-mat/0204134, 2002], which was subsequently shown to be optimal. NSRPS has also been used to estimate the entropy of written English (P. Grassberger [arXiv:physics/0207023, 2002]). We propose a new measure of complexity, defined as the number of iterations of NSRPS required to transform the input sequence into a constant sequence. We test this measure on symbolic sequences of the Logistic map for various values of the bifurcation parameter. The proposed measure of complexity is easy to compute and is observed to be highly correlated with the Lyapunov exponent of the original non-linear time series, even for very short symbolic sequences (as short as 50 samples). Finally, we construct symbolic sequences from the Skew-Tent map which are incompressible by popular compression algorithms like WinZip, WinRAR and 7-Zip, but compressible by NSRPS.
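The complexity measure described in this abstract (the number of pair-substitution passes needed to reach a constant sequence) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. The function names, the tie-breaking rule (first pair encountered) and the counting of overlapping pairs are assumptions made here, and the input is assumed to be a sequence of non-negative integers.

```python
from collections import Counter


def nsrps_step(seq):
    """One pass: replace the most frequent adjacent pair with a new symbol.

    Assumptions: pairs are counted with overlap; ties go to the pair seen
    first; replacement scans left to right without overlap.
    """
    pairs = Counter(zip(seq, seq[1:]))
    target = max(pairs, key=pairs.get)
    new_symbol = max(seq) + 1  # a symbol not yet present in the sequence
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def nsrps_complexity(seq):
    """Number of passes until the sequence is constant (or has length 1)."""
    seq = list(seq)
    steps = 0
    while len(set(seq)) > 1:
        seq = nsrps_step(seq)
        steps += 1
    return steps


print(nsrps_complexity([0, 1, 0, 1, 0, 1, 0, 1]))  # periodic input
print(nsrps_complexity([1, 1, 0, 1, 0, 0, 1, 0]))  # irregular input
```

Each pass shortens the sequence by at least one symbol, so the loop always terminates. A strictly periodic input collapses in few passes, while an irregular input of the same length needs more.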
Submitted 22 January, 2011;
originally announced January 2011.
-
Sharing Graphs
Authors:
K. R. Sahasranand,
Nithin Nagaraj
Abstract:
Almost all known secret sharing schemes work on numbers. Such methods will have difficulty in sharing graphs since the number of graphs increases exponentially with the number of nodes. We propose a secret sharing scheme for graphs where we use graph intersection for reconstructing the secret which is hidden as a sub graph in the shares. Our method does not rely on heavy computational operations such as modular arithmetic or polynomial interpolation but makes use of very basic operations like assignment and checking for equality, and graph intersection can also be performed visually. In certain cases, the secret could be reconstructed using just pencil and paper by authorised parties but cannot be broken by an adversary even with unbounded computational power. The method achieves perfect secrecy for (2, n) scheme and requires far fewer operations compared to Shamir's algorithm. The proposed method could be used to share objects such as matrices, sets, plain text and even a heterogeneous collection of these. Since we do not require a previously agreed upon encoding scheme, the method is very suitable for sharing heterogeneous collection of objects in a dynamic fashion.
Submitted 15 September, 2010;
originally announced September 2010.
-
How not to share a set of secrets
Authors:
K. R. Sahasranand,
Nithin Nagaraj,
S. Rajan
Abstract:
This note analyses one of the existing space-efficient secret sharing schemes and suggests vulnerabilities in its design. We observe that the said algorithm fails for certain choices of the set of secrets and there is no reason for preferring this particular scheme over alternative schemes. The paper also elaborates on the adoption of a scheme proposed by Hugo Krawczyk as an extension of Shamir's scheme, for a set of secrets. Such an implementation is space optimal and works for all choices of secrets. We also propose two new methods of attack which are valid under certain assumptions and observe that it is the elimination of random values that facilitates these kinds of attacks.
Submitted 9 March, 2010; v1 submitted 12 January, 2010;
originally announced January 2010.
-
Huffman Coding as a Non-linear Dynamical System
Authors:
Nithin Nagaraj
Abstract:
In this paper, source coding or data compression is viewed as a measurement problem. Given a measurement device with fewer states than the observable of a stochastic source, how can one capture the essential information? We propose modeling stochastic sources as piecewise linear discrete chaotic dynamical systems known as Generalized Luröth Series (GLS), which date back to Georg Cantor's work in 1869. The Lyapunov exponent of GLS is equal to the Shannon entropy of the source (up to a constant of proportionality). By successively approximating the source with GLS having fewer states (with the closest Lyapunov exponent), we derive a binary coding algorithm which exhibits minimum redundancy (the least average codeword length with integer codeword lengths). This turns out to be a re-discovery of Huffman coding, the popular lossless compression algorithm used in the JPEG international standard for still image compression.
Submitted 19 June, 2009;
originally announced June 2009.
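For reference, the minimum-redundancy property mentioned in the abstract above can be illustrated with the classic heap-based Huffman construction. This is the standard textbook algorithm, not the GLS-based derivation of the paper; the function name and the example probabilities are assumptions made for illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a binary Huffman code."""
    if len(probs) == 1:
        # A lone symbol still needs one bit to be transmitted.
        return {s: 1 for s in probs}
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    while len(heap) > 1:
        # Merge the two least probable groups; every symbol in them
        # moves one level deeper in the code tree.
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


# Hypothetical dyadic source: the lengths equal -log2(p) for each symbol.
example = huffman_code_lengths({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

For a dyadic source such as this one, the average codeword length equals the Shannon entropy (1.75 bits per symbol here), which is the zero-redundancy case.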
-
Increasing Average Period Lengths by Switching of Robust Chaos Maps in Finite Precision
Authors:
Nithin Nagaraj,
Mahesh C. Shastry,
Prabhakar G. Vaidya
Abstract:
Grebogi, Ott and Yorke (Phys. Rev. A 38(7), 1988) have investigated the effect of finite precision on the average period length of chaotic maps. They showed that the average length of periodic orbits ($T$) of a dynamical system scales as a function of computer precision ($ε$) and the correlation dimension ($d$) of the chaotic attractor: $T \sim ε^{-d/2}$. In this work, we are concerned with increasing the average period length, which is desirable for chaotic cryptography applications. Our experiments reveal that random and chaotic switching of deterministic chaotic dynamical systems yields a higher average length of periodic orbits as compared to simple sequential switching or absence of switching. To illustrate the application of switching, a novel generalization of the Logistic map that exhibits Robust Chaos (absence of attracting periodic orbits) is first introduced. We then propose a pseudo-random number generator based on chaotic switching between Robust Chaos maps which is found to successfully pass stringent statistical tests of randomness.
Submitted 22 November, 2008; v1 submitted 12 November, 2008;
originally announced November 2008.
-
Multiplexing of discrete chaotic signals in presence of noise
Authors:
Nithin Nagaraj,
Prabhakar G. Vaidya
Abstract:
In this paper, multiplexing of discrete chaotic signals in the presence of noise is investigated. Existing methods are based on chaotic synchronization, which is susceptible to noise and parameter mismatch. Furthermore, these methods fail when multiplexing more than two discrete chaotic signals. We propose two novel methods to multiplex multiple discrete chaotic signals, based on the principle of symbolic sequence invariance in the presence of noise and on a finite-precision implementation of finding the initial condition of an arbitrarily long symbolic sequence of a chaotic map.
Submitted 29 October, 2008;
originally announced October 2008.
-
One-Time Pad, Arithmetic Coding and Logic Gates: An unifying theme using Dynamical Systems
Authors:
Nithin Nagaraj,
Prabhakar G. Vaidya
Abstract:
In this letter, we prove that the perfectly secure One-Time Pad (OTP) encryption can be seen as finding the initial condition on the binary map under a random switch based on the perfectly random pad. This turns out to be a special case of Grangetto's randomized arithmetic coding performed on the Binary Map. Furthermore, we derive the set of possible perfect secrecy systems using such an approach. Since OTP encryption is an XOR operation, we thus have a dynamical systems implementation of the XOR gate. We show similar implementations for other gates such as NOR, NAND, OR, XNOR, AND and NOT. The dynamical systems framework unifies the three areas to which Shannon made foundational contributions: lossless compression (Source Coding), perfect encryption (Cryptography), and design of logic gates (Computation).
Submitted 1 March, 2008;
originally announced March 2008.
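The statement in the abstract above that OTP encryption is an XOR operation can be made concrete with a short sketch. This shows only the conventional byte-wise XOR view of the One-Time Pad, not the dynamical-systems construction of the letter; the function name and the sample message are assumptions.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of equal length (encrypts and decrypts alike)."""
    if len(pad) != len(data):
        raise ValueError("the pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


# Hypothetical usage: a fresh random pad is drawn for every message.
message = b"attack at dawn"
pad = secrets.token_bytes(len(message))
ciphertext = otp_xor(message, pad)
recovered = otp_xor(ciphertext, pad)
```

Because XOR is its own inverse, applying the same pad twice returns the original message. Perfect secrecy additionally requires that the pad be truly random, kept secret, and never reused.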
-
Cryptanalysis of a Chaotic Image Encryption Algorithm
Authors:
Nikhil Balaji,
Nithin Nagaraj
Abstract:
Line map, an invertible, two-dimensional chaotic encryption algorithm, was introduced recently. In this paper, we identify several weaknesses of the method based on standard cryptanalytic attacks. We perform a side-channel attack by observing the execution time of the encryption algorithm and successfully reduce the key space by a factor of 10^4 for a key length of 16 digits. We find the existence of equivalent keys which reduce the key space by a significant margin, even in the absence of any attack. Also, we find that the ciphertext is not sensitive to small changes in the plaintext due to poor diffusion.
Submitted 12 January, 2008; v1 submitted 1 January, 2008;
originally announced January 2008.
-
A Non-linear Generalization of Singular Value Decomposition and its Application to Cryptanalysis
Authors:
Prabhakar G. Vaidya,
Sa**i Anand P. S,
Nithin Nagaraj
Abstract:
Singular Value Decomposition (SVD) is a powerful tool in linear algebra. We propose an extension of SVD for both the qualitative detection and quantitative determination of nonlinearity in a time series. The paper illustrates nonlinear SVD with the help of data generated from nonlinear maps and flows (differential equations).
Submitted 11 February, 2009; v1 submitted 30 November, 2007;
originally announced November 2007.
-
A Non-linear Dynamical Systems' Proof of Kraft-McMillan Inequality and its Converse
Authors:
Nithin Nagaraj
Abstract:
In this short paper, we provide a dynamical systems' proof of the famous Kraft-McMillan inequality and its converse. The Kraft-McMillan inequality is a basic result in information theory which gives a necessary and sufficient condition on the codeword lengths of a uniquely decodable code.
Submitted 31 October, 2007;
originally announced October 2007.
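The inequality referred to in the abstract above states that codeword lengths $l_1, \dots, l_n$ over an alphabet of size $D$ admit a uniquely decodable code if and only if $\sum_i D^{-l_i} \le 1$. A minimal numerical check is sketched below; the function names are assumptions, and exact rational arithmetic is used to avoid rounding at the boundary case.

```python
from fractions import Fraction


def kraft_sum(lengths, alphabet_size=2):
    """Exact value of sum(D ** -l) over the given codeword lengths."""
    return sum(Fraction(1, alphabet_size ** l) for l in lengths)


def satisfies_kraft(lengths, alphabet_size=2):
    """True if a uniquely decodable code with these lengths can exist."""
    return kraft_sum(lengths, alphabet_size) <= 1
```

For instance, the binary lengths `[1, 2, 3, 3]` sum to exactly 1, so a code such as `0, 10, 110, 111` exists, whereas `[1, 1, 2]` sums to 5/4 and no uniquely decodable binary code has those lengths.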
-
A non-linear dynamical systems approach to source compression for constrained sources
Authors:
Nithin Nagaraj,
Prabhakar G. Vaidya,
Rajesh Sundaresan
Abstract:
We have recently established a strong connection between the Tent map (also known as Generalized Luroth Series or GLS, which is a chaotic, ergodic and Lebesgue measure preserving non-linear dynamical system) and Arithmetic coding, which is a popular source compression algorithm used in international compression standards such as JPEG2000 and H.264. This was for independent and identically distributed binary sources. In this paper, we address the problem of compression of ergodic Markov binary sources with certain words forbidden from the message space. We shall show that GLS can be modified suitably to achieve Shannon's entropy rate for these sources.
Submitted 11 September, 2007;
originally announced September 2007.
-
Joint Entropy Coding and Encryption using Robust Chaos
Authors:
Nithin Nagaraj,
Prabhakar G Vaidya,
Kishor G Bhat
Abstract:
We propose a framework for joint entropy coding and encryption using Chaotic maps. We begin by observing that the message symbols can be treated as the symbolic sequence of a discrete dynamical system. For an appropriate choice of the dynamical system, we could back-iterate and encode the message as the initial condition of the dynamical system. We show that such an encoding achieves Shannon's entropy and is hence optimal for compression. It turns out that the appropriate discrete dynamical system to achieve optimality is the piecewise-linear Generalized Luroth Series (GLS) and further that such an entropy coding technique is exactly equivalent to the popular Arithmetic Coding algorithm. GLS is a generalization of Arithmetic Coding with different modes of operation.
GLS preserves the Lebesgue measure and is ergodic. We show that these properties of GLS enable a framework for joint compression and encryption and thus give a justification of the recent work of Grangetto et al. and Wen et al. Both these methods have the obvious disadvantage of the key length being equal to the message length for strong security. We derive measure preserving piece-wise non-linear GLS (nGLS) and their skewed cousins, which exhibit Robust Chaos. We propose a joint entropy coding and encryption framework using skewed-nGLS and demonstrate Shannon's desired sensitivity to the key parameter. Potentially, our method could improve the security and key efficiency over Grangetto's method while still maintaining the total compression ratio. This is a new area of research with promising applications in communications.
Submitted 22 August, 2006;
originally announced August 2006.
-
The B-Exponential Map: A Generalization of the Logistic Map, and Its Applications In Generating Pseudo-random Numbers
Authors:
Mahesh C Shastry,
Nithin Nagaraj,
Prabhakar G Vaidya
Abstract:
A 1-dimensional generalization of the well-known Logistic Map is proposed. The proposed family of maps is referred to as the B-Exponential Map. The dynamics of this map are analyzed and found to have interesting properties. In particular, the B-Exponential Map exhibits robust chaos for all real values of the parameter B >= e^(-4). We then propose a pseudo-random number generator based on the B-Exponential Map by chaotically hopping between different trajectories for different values of B. We call this BEACH (B-Exponential All-Chaotic Map Hopping) pseudo-random number generator. BEACH successfully passes stringent statistical randomness tests such as ENT, NIST and Diehard. An implementation of BEACH is also outlined.
Submitted 17 July, 2006; v1 submitted 14 July, 2006;
originally announced July 2006.
-
Re-visiting the One-Time Pad
Authors:
Nithin Nagaraj,
Vivek Vaidya,
Prabhakar G Vaidya
Abstract:
In 1949, Shannon proved the perfect secrecy of the Vernam cryptographic system, also popularly known as the One-Time Pad (OTP). Since then, it has been believed that the perfectly random and incompressible OTP which is transmitted needs to have a length equal to the message length for this result to be true. In this paper, we prove that the length of the transmitted OTP which actually contains useful information need not be compromised and could be less than the message length without sacrificing perfect secrecy. We also provide a new interpretation for the OTP encryption by treating the message bits as making True/False statements about the pad, which we define as a private-object. We introduce the paradigm of private-object cryptography where messages are transmitted by verifying statements about a secret-object. We conclude by suggesting the use of Formal Axiomatic Systems for investing N bits of secret.
Submitted 18 August, 2005;
originally announced August 2005.