-
Literature Filtering for Systematic Reviews with Transformers
Authors:
John Hawkins,
David Tivey
Abstract:
Identifying critical research within the growing body of academic work is an essential element of quality research. Systematic review processes, used in evidence-based medicine, formalise this as a procedure that must be followed in a research program. However, it comes with an increasing burden in terms of the time required to identify the important articles of research for a given topic. In this…
▽ More
Identifying critical research within the growing body of academic work is an essential element of quality research. Systematic review processes, used in evidence-based medicine, formalise this as a procedure that must be followed in a research program. However, it comes with an increasing burden in terms of the time required to identify the important articles of research for a given topic. In this work, we develop a method for building a general-purpose filtering system that matches a research question, posed as a natural language description of the required content, against a candidate set of articles obtained via the application of broad search terms. Our results demonstrate that transformer models, pre-trained on biomedical literature then fine tuned for the specific task, offer a promising solution to this problem. The model can remove large volumes of irrelevant articles for most research questions.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
Enhanced Review Detection and Recognition: A Platform-Agnostic Approach with Application to Online Commerce
Authors:
Priyabrata Karmakar,
John Hawkins
Abstract:
Online commerce relies heavily on user generated reviews to provide unbiased information about products that they have not physically seen. The importance of reviews has attracted multiple exploitative online behaviours and requires methods for monitoring and detecting reviews. We present a machine learning methodology for review detection and extraction, and demonstrate that it generalises for us…
▽ More
Online commerce relies heavily on user generated reviews to provide unbiased information about products that they have not physically seen. The importance of reviews has attracted multiple exploitative online behaviours and requires methods for monitoring and detecting reviews. We present a machine learning methodology for review detection and extraction, and demonstrate that it generalises for use across websites that were not contained in the training data. This method promises to drive applications for automatic detection and evaluation of reviews, regardless of their source. Furthermore, we showcase the versatility of our method by implementing and discussing three key applications for analysing reviews: Sentiment Inconsistency Analysis, which detects and filters out unreliable reviews based on inconsistencies between ratings and comments; Multi-language support, enabling the extraction and translation of reviews from various languages without relying on HTML scra**; and Fake review detection, achieved by integrating a trained NLP model to identify and distinguish between genuine and fake reviews.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.
-
Recursive deep learning framework for forecasting the decadal world economic outlook
Authors:
Tianyi Wang,
Rodney Beard,
John Hawkins,
Rohitash Chandra
Abstract:
Gross domestic product (GDP) is the most widely used indicator in macroeconomics and the main tool for measuring a country's economic ouput. Due to the diversity and complexity of the world economy, a wide range of models have been used, but there are challenges in making decadal GDP forecasts given unexpected changes such as pandemics and wars. Deep learning models are well suited for modeling te…
▽ More
Gross domestic product (GDP) is the most widely used indicator in macroeconomics and the main tool for measuring a country's economic ouput. Due to the diversity and complexity of the world economy, a wide range of models have been used, but there are challenges in making decadal GDP forecasts given unexpected changes such as pandemics and wars. Deep learning models are well suited for modeling temporal sequences have been applied for time series forecasting. In this paper, we develop a deep learning framework to forecast the GDP growth rate of the world economy over a decade. We use Penn World Table as the source of our data, taking data from 1980 to 2019, across 13 countries, such as Australia, China, India, the United States and so on. We test multiple deep learning models, LSTM, BD-LSTM, ED-LSTM and CNN, and compared their results with the traditional time series model (ARIMA,VAR). Our results indicate that ED-LSTM is the best performing model. We present a recursive deep learning framework to predict the GDP growth rate in the next ten years. We predict that most countries will experience economic growth slowdown, stagnation or even recession within five years; only China, France and India are predicted to experience stable, or increasing, GDP growth.
△ Less
Submitted 25 January, 2023;
originally announced January 2023.
-
Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution
Authors:
Anthony Zador,
Sean Escola,
Blake Richards,
Bence Ölveczky,
Yoshua Bengio,
Kwabena Boahen,
Matthew Botvinick,
Dmitri Chklovskii,
Anne Churchland,
Claudia Clopath,
James DiCarlo,
Surya Ganguli,
Jeff Hawkins,
Konrad Koerding,
Alexei Koulakov,
Yann LeCun,
Timothy Lillicrap,
Adam Marblestone,
Bruno Olshausen,
Alexandre Pouget,
Cristina Savin,
Terrence Sejnowski,
Eero Simoncelli,
Sara Solla,
David Sussillo
, et al. (2 additional authors not shown)
Abstract:
Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts…
▽ More
Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities like game playing and language that are especially well-developed or uniquely human to those capabilities, inherited from over 500 million years of evolution, that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI.
△ Less
Submitted 22 February, 2023; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Implementing the BBE Agent-Based Model of a Sports-Betting Exchange
Authors:
Dave Cliff,
James Hawkins,
James Keen,
Roberto Lau-Soto
Abstract:
We describe three independent implementations of a new agent-based model (ABM) that simulates a contemporary sports-betting exchange, such as those offered commercially by companies including Betfair, Smarkets, and Betdaq. The motivation for constructing this ABM, which is known as the Bristol Betting Exchange (BBE), is so that it can serve as a synthetic data generator, producing large volumes of…
▽ More
We describe three independent implementations of a new agent-based model (ABM) that simulates a contemporary sports-betting exchange, such as those offered commercially by companies including Betfair, Smarkets, and Betdaq. The motivation for constructing this ABM, which is known as the Bristol Betting Exchange (BBE), is so that it can serve as a synthetic data generator, producing large volumes of data that can be used to develop and test new betting strategies via advanced data analytics and machine learning techniques. Betting exchanges act as online platforms on which bettors can find willing counterparties to a bet, and they do this in a way that is directly comparable to the manner in which electronic financial exchanges, such as major stock markets, act as platforms that allow traders to find willing counterparties to buy from or sell to: the platform aggregates and anonymises orders from multiple participants, showing a summary of the market that is updated in real-time. In the first instance, BBE is aimed primarily at producing synthetic data for in-play betting (also known as in-race or in-game betting) where bettors can place bets on the outcome of a track-race event, such as a horse race, after the race has started and for as long as the race is underway, with betting only ceasing when the race ends. The rationale for, and design of, BBE has been described in detail in a previous paper that we summarise here, before discussing our comparative results which contrast a single-threaded implementation in Python, a multi-threaded implementation in Python, and an implementation where Python header-code calls simulations of the track-racing events written in OpenCL that execute on a 640-core GPU -- this runs approximately 1000 times faster than the single-threaded Python. Our source-code for BBE is freely available on GitHub.
△ Less
Submitted 5 August, 2021;
originally announced August 2021.
-
Minimum Viable Model Estimates for Machine Learning Projects
Authors:
John Hawkins
Abstract:
Prioritization of machine learning projects requires estimates of both the potential ROI of the business case and the technical difficulty of building a model with the required characteristics. In this work we present a technique for estimating the minimum required performance characteristics of a predictive model given a set of information about how it will be used. This technique will result in…
▽ More
Prioritization of machine learning projects requires estimates of both the potential ROI of the business case and the technical difficulty of building a model with the required characteristics. In this work we present a technique for estimating the minimum required performance characteristics of a predictive model given a set of information about how it will be used. This technique will result in robust, objective comparisons between potential projects. The resulting estimates will allow data scientists and managers to evaluate whether a proposed machine learning project is likely to succeed before any modelling needs to be done.
The technique has been implemented into the open source application MinViME (Minimum Viable Model Estimator) which can be installed via the PyPI python package management system, or downloaded directly from the GitHub repository. Available at https://github.com/john-hawkins/MinViME
△ Less
Submitted 1 January, 2021;
originally announced January 2021.
-
Demographics in Social Media Data for Public Health Research: Does it matter?
Authors:
Nina Cesare,
Christan Grant,
Jared B. Hawkins,
John S. Brownstein,
Elaine O. Nsoesie
Abstract:
Social media data provides propitious opportunities for public health research. However, studies suggest that disparities may exist in the representation of certain populations (e.g., people of lower socioeconomic status). To quantify and address these disparities in population representation, we need demographic information, which is usually missing from most social media platforms. Here, we prop…
▽ More
Social media data provides propitious opportunities for public health research. However, studies suggest that disparities may exist in the representation of certain populations (e.g., people of lower socioeconomic status). To quantify and address these disparities in population representation, we need demographic information, which is usually missing from most social media platforms. Here, we propose an ensemble approach for inferring demographics from social media data.
Several methods have been proposed for inferring demographic attributes such as, age, gender and race/ethnicity. However, most of these methods require large volumes of data, which makes their application to large scale studies challenging. We develop a scalable approach that relies only on user names to predict gender. We develop three separate classifiers trained on data containing the gender labels of 7,953 Twitter users from Kaggle.com. Next, we combine predictions from the individual classifiers using a stacked generalization technique and apply the ensemble classifier to a dataset of 36,085 geotagged foodborne illness related tweets from the United States.
Our ensemble approach achieves an accuracy, precision, recall, and F1 score of 0.828, 0.851, 0.852 and 0.837, respectively, higher than the individual machine learning approaches. The ensemble classifier also covers any user with an alphanumeric name, while the data matching approach, which achieves an accuracy of 0.917, only covers 67% of users. Application of our method to reports of foodborne illness in the United States highlights disparities in tweeting by gender and shows that counties with a high volume of foodborne-illness related tweets are heavily overrepresented by female Twitter users.
△ Less
Submitted 6 November, 2017; v1 submitted 30 October, 2017;
originally announced October 2017.
-
How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites
Authors:
Subutai Ahmad,
Jeff Hawkins
Abstract:
We propose a formal mathematical model for sparse representations and active dendrites in neocortex. Our model is inspired by recent experimental findings on active dendritic processing and NMDA spikes in pyramidal neurons. These experimental and modeling studies suggest that the basic unit of pattern memory in the neocortex is instantiated by small clusters of synapses operated on by localized no…
▽ More
We propose a formal mathematical model for sparse representations and active dendrites in neocortex. Our model is inspired by recent experimental findings on active dendritic processing and NMDA spikes in pyramidal neurons. These experimental and modeling studies suggest that the basic unit of pattern memory in the neocortex is instantiated by small clusters of synapses operated on by localized non-linear dendritic processes. We derive a number of scaling laws that characterize the accuracy of such dendrites in detecting activation patterns in a neuronal population under adverse conditions. We introduce the union property which shows that synapses for multiple patterns can be randomly mixed together within a segment and still lead to highly accurate recognition. We describe simulation results that provide further insight into sparse representations as well as two primary results. First we show that pattern recognition by a neuron with active dendrites can be extremely accurate and robust with high dimensional sparse inputs even when using a tiny number of synapses to recognize large patterns. Second, equations representing recognition accuracy of a dendrite predict optimal NMDA spiking thresholds under a generous set of assumptions. The prediction tightly matches NMDA spiking thresholds measured in the literature. Our model matches many of the known properties of pyramidal neurons. As such the theory provides a mathematical framework for understanding the benefits and limits of sparse representations in cortical networks.
△ Less
Submitted 12 May, 2016; v1 submitted 4 January, 2016;
originally announced January 2016.
-
Continuous online sequence learning with an unsupervised neural network model
Authors:
Yuwei Cui,
Subutai Ahmad,
Jeff Hawkins
Abstract:
The ability to recognize and predict temporal sequences of sensory inputs is vital for survival in natural environments. Based on many known properties of cortical neurons, hierarchical temporal memory (HTM) sequence memory is recently proposed as a theoretical framework for sequence learning in the cortex. In this paper, we analyze properties of HTM sequence memory and apply it to sequence learni…
▽ More
The ability to recognize and predict temporal sequences of sensory inputs is vital for survival in natural environments. Based on many known properties of cortical neurons, hierarchical temporal memory (HTM) sequence memory is recently proposed as a theoretical framework for sequence learning in the cortex. In this paper, we analyze properties of HTM sequence memory and apply it to sequence learning and prediction problems with streaming data. We show the model is able to continuously learn a large number of variable-order temporal sequences using an unsupervised Hebbian-like learning rule. The sparse temporal codes formed by the model can robustly handle branching temporal sequences by maintaining multiple predictions until there is sufficient disambiguating evidence. We compare the HTM sequence memory with other sequence learning algorithms, including statistical methods: autoregressive integrated moving average (ARIMA), feedforward neural networks: online sequential extreme learning machine (ELM), and recurrent neural networks: long short-term memory (LSTM) and echo-state networks (ESN), on sequence prediction problems with both artificial and real-world data. The HTM model achieves comparable accuracy to other state-of-the-art algorithms. The model also exhibits properties that are critical for sequence learning, including continuous online learning, the ability to handle multiple predictions and branching sequences with high order statistics, robustness to sensor noise and fault tolerance, and good performance without task-specific hyper- parameters tuning. Therefore the HTM sequence memory not only advances our understanding of how the brain may solve the sequence learning problem, but is also applicable to a wide range of real-world problems such as discrete and continuous sequence prediction, anomaly detection, and sequence classification.
△ Less
Submitted 28 April, 2016; v1 submitted 16 December, 2015;
originally announced December 2015.
-
Why Neurons Have Thousands of Synapses, A Theory of Sequence Memory in Neocortex
Authors:
Jeff Hawkins,
Subutai Ahmad
Abstract:
Neocortical neurons have thousands of excitatory synapses. It is a mystery how neurons integrate the input from so many synapses and what kind of large-scale network behavior this enables. It has been previously proposed that non-linear properties of dendrites enable neurons to recognize multiple patterns. In this paper we extend this idea by showing that a neuron with several thousand synapses ar…
▽ More
Neocortical neurons have thousands of excitatory synapses. It is a mystery how neurons integrate the input from so many synapses and what kind of large-scale network behavior this enables. It has been previously proposed that non-linear properties of dendrites enable neurons to recognize multiple patterns. In this paper we extend this idea by showing that a neuron with several thousand synapses arranged along active dendrites can learn to accurately and robustly recognize hundreds of unique patterns of cellular activity, even in the presence of large amounts of noise and pattern variation. We then propose a neuron model where some of the patterns recognized by a neuron lead to action potentials and define the classic receptive field of the neuron, whereas the majority of the patterns recognized by a neuron act as predictions by slightly depolarizing the neuron without immediately generating an action potential. We then present a network model based on neurons with these properties and show that the network learns a robust model of time-based sequences. Given the similarity of excitatory neurons throughout the neocortex and the importance of sequence memory in inference and behavior, we propose that this form of sequence memory is a universal property of neocortical tissue. We further propose that cellular layers in the neocortex implement variations of the same sequence memory algorithm to achieve different aspects of inference and behavior. The neuron and network models we introduce are robust over a wide range of parameters as long as the network uses a sparse distributed code of cellular activations. The sequence capacity of the network scales linearly with the number of synapses on each neuron. Thus neurons need thousands of synapses to learn the many temporal patterns in sensory stimuli and motor sequences.
△ Less
Submitted 1 December, 2015; v1 submitted 31 October, 2015;
originally announced November 2015.
-
Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory
Authors:
Subutai Ahmad,
Jeff Hawkins
Abstract:
Empirical evidence demonstrates that every region of the neocortex represents information using sparse activity patterns. This paper examines Sparse Distributed Representations (SDRs), the primary information representation strategy in Hierarchical Temporal Memory (HTM) systems and the neocortex. We derive a number of properties that are core to scaling, robustness, and generalization. We use the…
▽ More
Empirical evidence demonstrates that every region of the neocortex represents information using sparse activity patterns. This paper examines Sparse Distributed Representations (SDRs), the primary information representation strategy in Hierarchical Temporal Memory (HTM) systems and the neocortex. We derive a number of properties that are core to scaling, robustness, and generalization. We use the theory to provide practical guidelines and illustrate the power of SDRs as the basis of HTM. Our goal is to help create a unified mathematical and practical framework for SDRs as it relates to cortical function.
△ Less
Submitted 25 March, 2015;
originally announced March 2015.
-
Solving Set Constraint Satisfaction Problems using ROBDDs
Authors:
P. J. Hawkins,
V. Lagoon,
P. J. Stuckey
Abstract:
In this paper we present a new approach to modeling finite set domain constraint problems using Reduced Ordered Binary Decision Diagrams (ROBDDs). We show that it is possible to construct an efficient set domain propagator which compactly represents many set domains and set constraints using ROBDDs. We demonstrate that the ROBDD-based approach provides unprecedented flexibility in modeling constr…
▽ More
In this paper we present a new approach to modeling finite set domain constraint problems using Reduced Ordered Binary Decision Diagrams (ROBDDs). We show that it is possible to construct an efficient set domain propagator which compactly represents many set domains and set constraints using ROBDDs. We demonstrate that the ROBDD-based approach provides unprecedented flexibility in modeling constraint satisfaction problems, leading to performance improvements. We also show that the ROBDD-based modeling approach can be extended to the modeling of integer and multiset constraint problems in a straightforward manner. Since domain propagation is not always practical, we also show how to incorporate less strict consistency notions into the ROBDD framework, such as set bounds, cardinality bounds and lexicographic bounds consistency. Finally, we present experimental results that demonstrate the ROBDD-based solver performs better than various more conventional constraint solvers on several standard set constraint problems.
△ Less
Submitted 9 September, 2011;
originally announced September 2011.
-
Automated Real-Time Testing (ARTT) for Embedded Control Systems (ECS)
Authors:
Jon Hawkins,
Haung V. Nguyen,
Reginald B. Howard
Abstract:
Develo** real-time automated test systems for embedded control systems has been a real problem. Some engineers and scientists have used customized software and hardware as a solution, which can be very expensive and time consuming to develop. We have discovered how to integrate a suite of commercially available off-the-shelf software tools and hardware to develop a scalable test platform that…
▽ More
Develo** real-time automated test systems for embedded control systems has been a real problem. Some engineers and scientists have used customized software and hardware as a solution, which can be very expensive and time consuming to develop. We have discovered how to integrate a suite of commercially available off-the-shelf software tools and hardware to develop a scalable test platform that is capable of performing complete black-box testing for a dual-channel real-time Embedded-PLC-based control system (www.aps.anl.gov). We will discuss how the Vali/Test Pro testing methodology was implemented to structure testing for a personnel safety system with large quantities of requirements and test cases.
This work was supported by the U.S. Department of Energy, Basic Energy Sciences, under Contract No. W-31-109-Eng-38.
△ Less
Submitted 7 January, 2002; v1 submitted 2 November, 2001;
originally announced November 2001.