-
Efficient Data Fusion using the Tsetlin Machine
Authors:
Rupsa Saha,
Vladimir I. Zadorozhny,
Ole-Christoffer Granmo
Abstract:
We propose a novel way of assessing and fusing noisy dynamic data using a Tsetlin Machine (TM). Our approach consists of monitoring how the explanations that a TM learns, in the form of logical clauses, change with possible noise in dynamic data. This way, the TM can recognize the noise by lowering the weights of previously learned clauses, or reflect it in the form of new clauses. We also perform a comprehensive experimental study using notably different datasets that demonstrates the high performance of the proposed approach.
Submitted 26 October, 2023;
originally announced October 2023.
-
TBAM: Towards An Agent-Based Model to Enrich Twitter Data
Authors:
Usman Anjum,
Vladimir Zadorozhny,
Prashant Krishnamurthy
Abstract:
Twitter (one example of microblogging) is widely used by researchers to understand human behavior, specifically how people behave when a significant event occurs and how it changes user microblogging patterns. The changing microblogging behavior can reveal patterns that can help in detecting real-world events. However, the available Twitter data has limitations: it is incomplete and noisy, and the samples are irregular. In this paper we create a model, called the Twitter Behavior Agent-Based Model (TBAM), to simulate Twitter patterns and behavior using Agent-Based Modeling (ABM). The data generated from ABM simulations can be used in place of, or to complement, the real-world data toward improving the accuracy of event detection. We confirm the validity of our model by finding the cross-correlation between the real data collected from Twitter and the data generated using TBAM.
Submitted 31 January, 2023;
originally announced February 2023.
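The validation step named in the abstract above, cross-correlating real and simulated activity, can be illustrated with a minimal sketch. The function name `cross_correlation`, the `lag` parameter, and the choice of a Pearson-normalized correlation over the overlapping window are assumptions made for illustration; the abstract does not specify how the authors compute it.

```python
import math

def cross_correlation(x, y, lag=0):
    """Pearson correlation between x[t] and y[t + lag] over the overlapping window.

    Hypothetical helper: x could be real tweet counts per time bin and
    y the counts produced by a simulation.
    """
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    norm_a = math.sqrt(sum((u - mean_a) ** 2 for u in a))
    norm_b = math.sqrt(sum((v - mean_b) ** 2 for v in b))
    # A constant series has no defined correlation; return 0.0 by convention here.
    return cov / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Scanning `lag` over a small range and taking the maximum would indicate whether the simulated series tracks the real one with a delay.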
-
A Relational Tsetlin Machine with Applications to Natural Language Understanding
Authors:
Rupsa Saha,
Ole-Christoffer Granmo,
Vladimir I. Zadorozhny,
Morten Goodwin
Abstract:
Tsetlin Machines (TMs) are a pattern recognition approach that uses finite state machines for learning and propositional logic to represent patterns. In addition to being natively interpretable, they have provided competitive accuracy for various tasks. In this paper, we increase the computing power of TMs by proposing a first-order logic-based framework with Herbrand semantics. The resulting TM is relational and can take advantage of logical structures appearing in natural language to learn rules that represent how actions and consequences are related in the real world. The outcome is a logic program of Horn clauses, bringing in a structured view of unstructured data. In closed-domain question-answering, the first-order representation produces 10x more compact knowledge bases (KBs), along with an increase in answering accuracy from 94.83% to 99.48%. The approach is further robust towards erroneous, missing, and superfluous information, distilling the aspects of a text that are important for real-world understanding.
Submitted 22 February, 2021;
originally announced February 2021.
-
FaNDS: Fake News Detection System Using Energy Flow
Authors:
Jiawei Xu,
Vladimir Zadorozhny,
Danchen Zhang,
John Grant
Abstract:
Recently, the term "fake news" has been broadly and extensively utilized for disinformation, misinformation, hoaxes, propaganda, satire, rumors, click-bait, and junk news. It has become a serious problem around the world. We present a new system, FaNDS, that detects fake news efficiently. The system is based on several concepts used in some previous works but in a different context. There are two main concepts: an Inconsistency Graph and Energy Flow. The Inconsistency Graph contains news items as nodes and inconsistent opinions between them as edges. Energy Flow assigns each node an initial energy, and then some energy is propagated along the edges until the energy distribution on all nodes converges. To illustrate FaNDS we use the original data from the Fake News Challenge (FNC-1). First, the data has to be reconstructed in order to generate the Inconsistency Graph. The graph contains various subgraphs with well-defined shapes that represent different types of connections between the news items. Then the Energy Flow method is applied. The nodes with high energy are the candidates for being fake news. In our experiments, all of these were indeed fake news, which we verified by checking each against several reliable websites. We compared FaNDS to several other fake news detection methods and found it to be more sensitive in discovering fake news items.
Submitted 5 October, 2020;
originally announced October 2020.
-
Process Discovery using Classification Tree Hidden Semi-Markov Model
Authors:
Yihuang Kang,
Vladimir Zadorozhny
Abstract:
Various and ubiquitous information systems are being used in monitoring, exchanging, and collecting information. These systems are generating massive amounts of event sequence logs that may help us understand underlying phenomena. By analyzing these logs, we can learn process models that describe system procedures, predict the development of the system, or check whether the changes are expected. In this paper, we consider a novel technique that models these sequences of events in a temporal-probabilistic manner. Specifically, we propose a probabilistic process model that combines a hidden semi-Markov model with classification tree learning. Our experimental results show that the proposed approach can answer questions of the form "What are the most frequent sequences of system dynamics relevant to a given sequence of observable events?" For example, "Given a series of medical treatments, what are the most relevant changes in patients' health condition patterns at different times?"
Submitted 12 July, 2018;
originally announced July 2018.
-
Process Monitoring Using Maximum Sequence Divergence
Authors:
Yihuang Kang,
Vladimir Zadorozhny
Abstract:
Process Monitoring involves tracking a system's behaviors, evaluating the current state of the system, and discovering interesting events that require immediate action. In this paper, we consider monitoring temporal system state sequences to help detect changes in dynamic systems, check the divergence of the system development, and evaluate the significance of the deviation. We begin with discussions of data reduction, symbolic data representation, and anomaly detection in temporal discrete sequences. Time-series representation methods are also discussed and used in this paper to discretize raw data into sequences of system states. Markov Chains and stationary state distributions are continuously generated from temporal sequences to represent snapshots of the system dynamics in different time frames. We use the generalized Jensen-Shannon Divergence as the measure to monitor changes in the stationary symbol probability distributions and evaluate the significance of system deviations. We prove that the proposed approach is able to detect deviations of the systems we monitor and assess the deviation significance in a probabilistic manner.
Submitted 9 July, 2018;
originally announced July 2018.
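The pipeline described in the abstract above (discretized state sequence, Markov chain, stationary distribution, generalized Jensen-Shannon divergence) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the uniform row for states with no observed outgoing transition, the fixed-count power iteration, and the default equal weights are all choices made here. The divergence itself uses the standard weighted definition, the entropy of the mixture minus the weighted sum of the individual entropies.

```python
import math

def transition_matrix(seq, states):
    """Row-stochastic transition matrix estimated from consecutive symbol pairs."""
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    counts = [[0.0] * n for _ in range(n)]
    for a, b in zip(seq, seq[1:]):
        counts[index[a]][index[b]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state with no observed outgoing transition gets a uniform row.
        matrix.append([c / total for c in row] if total else [1.0 / n] * n)
    return matrix

def stationary(matrix, iters=1000):
    """Approximate the stationary distribution by power iteration from uniform.

    Assumption: the chain mixes well enough for a fixed iteration count.
    """
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy(p):
    """Shannon entropy in bits, with 0 * log(0) treated as 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def generalized_jsd(dists, weights=None):
    """Weighted Jensen-Shannon divergence: H(sum_i w_i P_i) - sum_i w_i H(P_i)."""
    k = len(dists)
    weights = weights or [1.0 / k] * k
    mixture = [sum(w * d[j] for w, d in zip(weights, dists))
               for j in range(len(dists[0]))]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))
```

In a monitoring setting, one would compute `stationary(transition_matrix(window, states))` for each time window and track `generalized_jsd` between a reference window and the current one; a value near zero suggests the system's state distribution has not shifted.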