-
What's in an embedding? Would a rose by any embedding smell as sweet?
Authors:
Venkat Venkatasubramanian
Abstract:
Large Language Models (LLMs) are often criticized for lacking true "understanding" and the ability to "reason" with their knowledge, being seen merely as autocomplete systems. We believe that this assessment might be missing a nuanced insight. We suggest that LLMs do develop a kind of empirical "understanding" that is "geometry"-like, which seems adequate for a range of applications in NLP, computer vision, coding assistance, etc. However, this "geometric" understanding, built from incomplete and noisy data, makes them unreliable, difficult to generalize, and lacking in inference capabilities and explanations, similar to the challenges faced by heuristics-based expert systems decades ago.
To overcome these limitations, we suggest that LLMs should be integrated with an "algebraic" representation of knowledge that includes symbolic AI elements used in expert systems. This integration aims to create large knowledge models (LKMs) that not only possess "deep" knowledge grounded in first principles, but also have the ability to reason and explain, mimicking human expert capabilities. To harness the full potential of generative AI safely and effectively, a paradigm shift is needed from LLM to more comprehensive LKM.
Submitted 15 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models
Authors:
Venkat Venkatasubramanian,
Arijit Chakraborty
Abstract:
The startling success of ChatGPT and other large language models (LLMs) using transformer-based generative neural network architecture in applications such as natural language processing and image synthesis has many researchers excited about potential opportunities in process systems engineering (PSE). The almost human-like performance of LLMs in these areas is indeed very impressive, surprising, and a major breakthrough. Their capabilities are very useful in certain tasks, such as writing first drafts of documents, code writing assistance, text summarization, etc. However, their success is limited in highly scientific domains as they cannot yet reason, plan, or explain due to their lack of in-depth domain knowledge. This is a problem in domains such as chemical engineering as they are governed by fundamental laws of physics and chemistry (and biology), constitutive relations, and highly technical knowledge about materials, processes, and systems. Although purely data-driven machine learning has its immediate uses, the long-term success of AI in scientific and engineering domains would depend on developing hybrid AI systems that use first principles and technical knowledge effectively. We call these hybrid AI systems Large Knowledge Models (LKMs), as they will not be limited to only NLP-based techniques or NLP-like applications. In this paper, we discuss the challenges and opportunities in developing such systems in chemical engineering.
Submitted 29 May, 2024;
originally announced May 2024.
-
Density and Affinity Dependent Social Segregation and Arbitrage Equilibrium in a Multi-class Schelling Game
Authors:
Venkat Venkatasubramanian,
Jessica Shi,
Leo Goldman,
Arun Sankar E. M.,
Abhishek Sivaram
Abstract:
Contrary to the widely believed hypothesis that larger, denser cities promote socioeconomic mixing, a recent study (Nilforoshan et al. 2023) reports the opposite behavior, i.e. more segregation. Here, we present a game-theoretic model that predicts such a density-dependent segregation outcome in both one- and two-class systems. The model provides key insights into the analytical conditions that lead to such behavior. Furthermore, the arbitrage equilibrium outcome implies the equality of effective utilities among all agents. This could be interpreted as all agents being equally "happy" in their respective environments in our ideal society. We believe that our model contributes towards a deeper mathematical understanding of social dynamics and behavior, which is important as we strive to develop more harmonious societies.
Submitted 6 March, 2024;
originally announced March 2024.
-
Jaynes Machine: The universal microstructure of deep neural networks
Authors:
Venkat Venkatasubramanian,
N. Sanjeevrajan,
Manasi Khandekar
Abstract:
We present a novel theory of the microstructure of deep neural networks. Using a theoretical framework called statistical teleodynamics, which is a conceptual synthesis of statistical thermodynamics and potential game theory, we predict that all highly connected layers of deep neural networks have a universal microstructure of connection strengths that is distributed lognormally ($LN(μ, σ)$). Furthermore, under ideal conditions, the theory predicts that $μ$ and $σ$ are the same for all layers in all networks. This is shown to be the result of an arbitrage equilibrium where all connections compete and contribute the same effective utility towards the minimization of the overall loss function. These surprising predictions are shown to be supported by empirical data from six large-scale deep neural networks in real life. We also discuss how these results can be exploited to reduce the amount of data, time, and computational resources needed to train large deep neural networks.
Submitted 10 October, 2023;
originally announced October 2023.
-
G-MATT: Single-step Retrosynthesis Prediction using Molecular Grammar Tree Transformer
Authors:
Kevin Zhang,
Vipul Mann,
Venkat Venkatasubramanian
Abstract:
Various template-based and template-free approaches have been proposed for single-step retrosynthesis prediction in recent years. While these approaches demonstrate strong performance from a data-driven metrics standpoint, many model architectures do not incorporate underlying chemistry principles. Here, we propose a novel chemistry-aware retrosynthesis prediction framework that combines powerful data-driven models with prior domain knowledge. We present a tree-to-sequence transformer architecture that utilizes hierarchical SMILES grammar-based trees, incorporating crucial chemistry information that is often overlooked by SMILES text-based representations, such as local structures and functional groups. The proposed framework, grammar-based molecular attention tree transformer (G-MATT), achieves significant performance improvements compared to baseline retrosynthesis models. G-MATT achieves a promising top-1 accuracy of 51% (top-10 accuracy of 79.1%), invalid rate of 1.5%, and bioactive similarity rate of 74.8% on the USPTO-50K dataset. Additional analyses of G-MATT attention maps demonstrate the ability to retain chemistry knowledge without relying on excessively complex model architectures.
Submitted 14 August, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
AI-driven Hypergraph Network of Organic Chemistry: Network Statistics and Applications in Reaction Classification
Authors:
Vipul Mann,
Venkat Venkatasubramanian
Abstract:
Rapid discovery of new reactions and molecules in recent years has been facilitated by the advancements in high throughput screening, accessibility to a much more complex chemical design space, and the development of accurate molecular modeling frameworks. A holistic study of the growing chemistry literature is, therefore, required that focuses on understanding the recent trends and extrapolating them into possible future trajectories. To this end, several network theory-based studies have been reported that use a directed graph representation of chemical reactions. Here, we perform a study based on representing chemical reactions as hypergraphs where the hyperedges represent chemical reactions and nodes represent the participating molecules. We use a standard reactions dataset to construct a hypernetwork and report its statistics such as degree distributions, average path length, assortativity or degree correlations, PageRank centrality, and graph-based clusters (or communities). We also compute each statistic for an equivalent directed graph representation of reactions to draw parallels and highlight differences between the two. To demonstrate the AI applicability of hypergraph reaction representation, we generate dense hypergraph embeddings and use them in the reaction classification problem. We conclude that the hypernetwork representation is flexible, preserves reaction context, and uncovers hidden insights that are otherwise not apparent in a traditional directed graph representation of chemical reactions.
Submitted 27 March, 2023; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Robust and Efficient Swarm Communication Topologies for Hostile Environments
Authors:
Vipul Mann,
Abhishek Sivaram,
Laya Das,
Venkat Venkatasubramanian
Abstract:
Swarm Intelligence-based optimization techniques combine systematic exploration of the search space with information available from neighbors and rely strongly on communication among agents. These algorithms are typically employed to solve problems where the function landscape is not adequately known and there are multiple local optima that could result in premature convergence for other algorithms. Applications of such algorithms can be found in communication systems involving design of networks for efficient information dissemination to a target group, targeted drug-delivery where drug molecules search for the affected site before diffusing, and high-value target localization with a network of drones. In several such applications, the agents face a hostile environment that can result in loss of agents during the search. Such a loss changes the communication topology of the agents and hence the information available to agents, ultimately influencing the performance of the algorithm. In this paper, we present a study of the impact of loss of agents on the performance of such algorithms as a function of the initial network configuration. We use particle swarm optimization to optimize an objective function with multiple sub-optimal regions in a hostile environment and study its performance for a range of network topologies with loss of agents. The results reveal interesting trade-offs between efficiency, robustness, and performance for different topologies that are subsequently leveraged to discover general properties of networks that maximize performance. Moreover, networks with small-world properties are seen to maximize performance under hostile conditions.
Submitted 21 August, 2020;
originally announced August 2020.
-
Challenges & Solutions for above 6 GHz Radio Access Network Integration for Future Mobile Communication Systems
Authors:
Marcin Rybakowski,
Krystian Safjan,
Venkatkumar Venkatasubramanian,
Arnesh Vijay,
Laurent Dussopt,
Ali Zaidi,
Michael Peter,
Jian Luo,
Maria Fresia,
Mehrdad Shariat
Abstract:
Mobile communication technology has been rapidly evolving ever since its first introduction in the late 1980s. The development witnessed is not just in the refinement of the radio access techniques, but also in the progression towards offering sophisticated features and services to mobile phone users. To fulfill this ever-growing user demand and follow market trends, frequency ranges in millimeter wave bands are envisioned for wireless radio transmission. To respond to these trends, the EU-funded mmMAGIC project has been launched, and its main objective is to design and develop radio access techniques operating in 6-100 GHz bands. When it comes to developing technologies for systems operating in these frequency ranges, a major challenge will be radio access network integration. Unquestionably, issues in various aspects of physical layer design, channel modelling, architecture, network functions and deployment will be encountered; problems in multi-node and multi-antenna transceiver designs will surface as well. The work carried out in this project will address those challenges and propose solutions, and will additionally measure their efficiency against the project-specific KPIs set to meet the requirements of operational future 5G systems. The main intention of this paper is to outline some of the challenges, more specifically to highlight the network integration challenges, and discuss some of their technical solutions. The primary purpose here is to focus on integrated 5G technology, thereby opening further research avenues for the exploration of new and alternate frequency bands in the electromagnetic spectrum.
Submitted 4 August, 2017;
originally announced August 2017.
-
Social influence makes self-interested crowds smarter: an optimal control perspective
Authors:
Yu Luo,
Garud Iyengar,
Venkat Venkatasubramanian
Abstract:
It is very common to observe crowds of individuals solving similar problems with similar information in a largely independent manner. We argue here that crowds can become "smarter," i.e., more efficient and robust, by partially following the average opinion. This observation runs counter to the widely accepted claim that the wisdom of crowds deteriorates with social influence. The key difference is that individuals are self-interested and hence will reject feedback that does not improve their performance. We propose a control-theoretic methodology to compute the degree of social influence, i.e., the level to which one accepts the population feedback, that optimizes performance. We conducted an experiment with human subjects ($N = 194$), where the participants were first asked to solve an optimization problem independently, i.e., under no social influence. Our theoretical methodology estimates a $30\%$ degree of social influence to be optimal, resulting in a $29\%$ improvement in the crowd's performance. We then let the same cohort solve a new problem and have access to the average opinion. Surprisingly, we find the average degree of social influence in the cohort to be $32\%$ with a $29\%$ improvement in performance: in other words, the crowd self-organized into a near-optimal setting. We believe this new paradigm for making crowds "smarter" has the potential for making a significant impact on a diverse set of fields ranging from population health to government planning. We include a case study to show how a crowd of states can collectively learn the level of taxation and expenditure that optimizes economic growth.
Submitted 4 November, 2016;
originally announced November 2016.
-
Providing Scalable Data Services in Ubiquitous Networks
Authors:
Tanu Malik,
Raghvendra Prasad,
Sanket Patil,
Amitabh Chaudhary,
Venkat Venkatasubramanian
Abstract:
Topology is a fundamental part of a network that governs connectivity between nodes, the amount of data flow and the efficiency of data flow between nodes. In traditional networks, due to physical limitations, topology remains static for the course of the network operation. Ubiquitous data networks (UDNs), alternatively, are more adaptive and can be configured for changes in their topology. This flexibility in controlling their topology makes them an attractive medium for supporting "anywhere, any place" communication. However, it raises the problem of designing a dynamic topology. The dynamic topology design problem is of particular interest to application service providers who need to provide cost-effective data services on a ubiquitous network. In this paper we describe algorithms that decide when and how the topology should be reconfigured in response to a change in the data communication requirements of the network. In particular, we describe and compare a greedy algorithm, which is often used for topology reconfiguration, with a non-greedy algorithm based on metrical task systems. Experiments show the algorithm based on metrical task systems has comparable performance to the greedy algorithm at a much lower reconfiguration cost.
Submitted 25 May, 2010;
originally announced May 2010.
-
Entropy Maximization as a Holistic Design Principle for Complex Optimal Networks and the Emergence of Power Laws
Authors:
Venkat Venkatasubramanian,
Dimitris N. Politis,
Priyan R. Patkar
Abstract:
We present a general holistic theory for the organization of complex networks, both human-engineered and naturally-evolved. Introducing concepts of value of interactions and satisfaction as generic network performance measures, we show that the underlying organizing principle is to meet an overall performance target for wide-ranging operating or environmental conditions. This design or survival requirement of reliable performance under uncertainty leads, via the maximum entropy principle, to the emergence of a power law vertex degree distribution. The theory also predicts exponential or Poisson degree distributions depending on network redundancy, thus explaining all three regimes as different manifestations of a common underlying phenomenon within a unified theoretical framework.
Submitted 3 August, 2004;
originally announced August 2004.
-
Spontaneous Emergence of Complex Optimal Networks through Evolutionary Adaptation
Authors:
Venkat Venkatasubramanian,
Santhoji Katare,
Priyan R. Patkar,
Fangping Mu
Abstract:
An important feature of many complex systems, both natural and artificial, is the structure and organization of their interaction networks with interesting properties. Here we present a theory of self-organization by evolutionary adaptation in which we show how the structure and organization of a network is related to the survival, or in general the performance, objectives of the system. We propose that a complex system optimizes its network structure in order to maximize its overall survival fitness which is composed of short-term and long-term survival components. These in turn depend on three critical measures of the network, namely, efficiency, robustness and cost, and the environmental selection pressure. Using a graph theoretical case study, we show that when efficiency is paramount the "Star" topology emerges and when robustness is important the "Circle" topology is found. When efficiency and robustness requirements are both important to varying degrees, other classes of networks such as the "Hub" emerge. Our assumptions and results are consistent with observations across a wide variety of applications.
Submitted 24 February, 2004;
originally announced February 2004.