-
Exploiting Data Significance in Remote Estimation of Discrete-State Markov Sources
Authors:
Jiping Luo,
Nikolaos Pappas
Abstract:
We consider the semantics-aware remote estimation of a discrete-state Markov source with normal (low-priority) and alarm (high-priority) states. Erroneously announcing a normal state at the destination when the source is actually in an alarm state (i.e., missed alarm error) incurs a significantly higher cost than falsely announcing an alarm state when the source is in a normal state (i.e., false alarm error). Moreover, successive reception of an estimation error may cause significant lasting impact, e.g., maintenance cost and misoperations. Motivated by this, we assign different costs to different estimation errors and introduce two new age metrics, namely the Age of Missed Alarm (AoMA) and the Age of False Alarm (AoFA), to account for the lasting impact incurred by different estimation errors. Notably, the two age processes evolve dependently and can distinguish between different types of estimation errors and different synced states. The aim is to achieve an optimal trade-off between the cost of estimation error, lasting impact, and communication utilization. The problem is formulated as an average-cost, countably infinite state-space Markov decision process (MDP). We show that the optimal policy exhibits a switching-type structure, making it amenable to policy storage and algorithm design. Notably, when the source is symmetric and states are equally important, the optimal policy has identical thresholds, i.e., threshold-type. Theoretical and numerical results underscore that our approach extends the current understanding of the Age of Incorrect Information (AoII) and the cost of actuation error (CAE), showing that they are specific instances within our broader framework.
Submitted 26 June, 2024;
originally announced June 2024.
-
Sequential Editing for Lifelong Training of Speech Recognition Models
Authors:
Devang Kulshreshtha,
Saket Dingliwal,
Brady Houston,
Nikolaos Pappas,
Srikanth Ronanki
Abstract:
Automatic Speech Recognition (ASR) traditionally assumes known domains, but adding data from a new domain raises concerns about the computational inefficiency of retraining models on both existing and new domains. Fine-tuning solely on the new domain risks Catastrophic Forgetting (CF). To address this, Lifelong Learning (LLL) algorithms have been proposed for ASR. Prior research has explored techniques such as Elastic Weight Consolidation, Knowledge Distillation, and Replay, all of which necessitate either additional parameters or access to prior domain data. We propose Sequential Model Editing as a novel method to continually learn new domains in ASR systems. Unlike previous methods, our approach does not necessitate access to prior datasets or the introduction of extra parameters. Our study demonstrates up to 15% Word Error Rate Reduction (WERR) over the fine-tuning baseline, and superior efficiency over other LLL techniques on the CommonVoice English multi-accent dataset.
Submitted 25 June, 2024;
originally announced June 2024.
-
DEM: Distribution Edited Model for Training with Mixed Data Distributions
Authors:
Dhananjay Ram,
Aditya Rawal,
Momchil Hardalov,
Nikolaos Pappas,
Sheng Zha
Abstract:
Training with mixed data distributions is a common and important part of creating multi-task and instruction-following models. The diversity of the data distributions and the cost of joint training make the optimization procedure extremely challenging. Data mixing methods partially address this problem, albeit with sub-optimal performance across data sources and at the cost of multiple expensive training runs. In this paper, we propose a simple and efficient alternative for better optimization of the data sources by combining models individually trained on each data source with the base model using basic element-wise vector operations. The resulting model, namely Distribution Edited Model (DEM), is 11x cheaper than standard data mixing and outperforms strong baselines on a variety of benchmarks, yielding up to 6.2% improvement on MMLU, 11.5% on BBH, 16.1% on DROP, and 9.3% on HELM with models of size 3B to 13B. Notably, DEM does not require full re-training when modifying a single data source, thus making it very flexible and scalable for training with diverse data sources.
Submitted 21 June, 2024;
originally announced June 2024.
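A minimal sketch of what "combining models ... using basic element-wise vector operations" could look like, assuming a weighted sum of per-source differences from the base model. The abstract does not state the exact formula, so the rule `base + sum_i w_i * (finetuned_i - base)`, the function name `combine_models`, and the mixing weights are all illustrative assumptions rather than the authors' method.

```python
def combine_models(base, finetuned, weights):
    """Element-wise merge: base + sum_i weights[i] * (finetuned[i] - base).

    `base` and each entry of `finetuned` map a parameter name to a flat
    list of floats. The merge rule itself is an assumption for illustration.
    """
    if len(finetuned) != len(weights):
        raise ValueError("need exactly one weight per fine-tuned model")
    merged = {}
    for name, base_vals in base.items():
        merged_vals = list(base_vals)  # start from a copy of the base parameters
        for model, w in zip(finetuned, weights):
            # Add the weighted difference contributed by this data source.
            for k, (ft, b) in enumerate(zip(model[name], base_vals)):
                merged_vals[k] += w * (ft - b)
        merged[name] = merged_vals
    return merged
```

Under this assumed rule, changing one data source only requires recomputing that source's difference term, which is consistent with the abstract's remark that full re-training is not needed.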
-
Age of Information Versions: a Semantic View of Markov Source Monitoring
Authors:
Mehrdad Salimnejad,
Marios Kountouris,
Anthony Ephremides,
Nikolaos Pappas
Abstract:
We consider the problem of real-time remote monitoring of a two-state Markov process, where a sensor observes the state of the source and makes a decision on whether to transmit the status updates over an unreliable channel or not. We introduce a modified randomized stationary sampling and transmission policy where the decision to perform sampling occurs probabilistically depending on the current state of the source and whether the system was in a sync state during the previous time slot or not. We then propose two new performance metrics, coined the Version Innovation Age (VIA) and the Age of Incorrect Version (AoIV) and analyze their performance under the modified randomized stationary and other state-of-the-art sampling and transmission policies. Specifically, we derive closed-form expressions for the distribution and the average of VIA, AoIV, and Age of Incorrect Information (AoII) under these policies. Furthermore, we formulate and solve three constrained optimization problems. The first optimization problem aims to minimize the average VIA subject to constraints on the time-averaged sampling cost and time-averaged reconstruction error. In the second and third problems, the objective is to minimize the average AoIV and AoII, respectively, while considering a constraint on the time-averaged sampling cost. Finally, we compare the performance of various sampling and transmission policies and identify the conditions under which each policy outperforms the others in optimizing the proposed metrics.
Submitted 20 June, 2024;
originally announced June 2024.
-
Timeliness of Status Update System: The Effect of Parallel Transmission Using Heterogeneous Updating Devices
Authors:
Zhengchuan Chen,
Kang Lang,
Nikolaos Pappas,
Howard H. Yang,
Min Wang,
Zhong Tian,
Tony Q. S. Quek
Abstract:
Timely status updating is the premise of emerging interaction-based applications in the Internet of Things (IoT). Using redundant devices to update the status of interest is a promising method to improve the timeliness of information. However, parallel status updating leads to out-of-order arrivals at the monitor, significantly challenging timeliness analysis. This work studies the Age of Information (AoI) of a multi-queue status update system where multiple devices monitor the same physical process. Specifically, two systems are considered: the Basic System, which only has type-1 devices that are ad hoc devices located close to the source, and the Hybrid System, which contains additional type-2 devices that are infrastructure-based devices located in fixed points compared to the Basic System. Using the Stochastic Hybrid Systems (SHS) framework, a mathematical model that combines discrete and continuous dynamics, we derive the expressions of the average AoI of the considered two systems in closed form. Numerical results verify the accuracy of the analysis. It is shown that when the number and parameters of the type-1 devices/type-2 devices are fixed, the logarithm of average AoI will linearly decrease with the logarithm of the total arrival rate of type-2 devices or that of the number of type-1 devices under specific condition. It has also been demonstrated that the proposed systems can significantly outperform the FCFS M/M/N status update system.
Submitted 27 May, 2024;
originally announced May 2024.
-
Optimizing Information Freshness in IoT Systems with Update Rate Constraints: A Token-Based Approach
Authors:
Erfan Delfani,
Nikolaos Pappas
Abstract:
In Internet of Things (IoT) status update systems, where information is sampled and subsequently transmitted from a source to a destination node, the imperative necessity lies in maintaining the timeliness of information and updating the system with optimal frequency. Optimizing information freshness in resource-limited status update systems often involves Constrained Markov Decision Process (CMDP) problems with update rate constraints. Solving CMDP problems, especially with multiple constraints, is a challenging task. To address this, we present a token-based approach that transforms CMDP into an unconstrained MDP, simplifying the solution process. We apply this approach to systems with one and two update rate constraints for optimizing Age of Incorrect Information (AoII) and Age of Information (AoI) metrics, respectively, and explore the analytical and numerical aspects. Additionally, we introduce an iterative triangle bisection method for solving the CMDP problems with two constraints, comparing its results with the token-based MDP approach. Our findings show that the token-based approach yields superior performance over baseline policies, converging to the optimal policy as the maximum number of tokens increases.
Submitted 3 June, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
-
State-Aware Timeliness in Energy Harvesting IoT Systems Monitoring a Markovian Source
Authors:
Erfan Delfani,
George J. Stamatakis,
Nikolaos Pappas
Abstract:
In this study, we investigate the optimal transmission policies within an energy harvesting status update system, where the demand for status updates depends on the state of the source. The system monitors a two-state Markovian source that characterizes a stochastic process, which can be in either a normal state or an alarm state, with a higher demand for fresh updates when the source is in the alarm state. We propose a metric to capture the freshness of status updates for each state of the stochastic process by introducing two Age of Information (AoI) variables, extending the definition of AoI to account for the state changes of the stochastic process. We formulate the problem as a Markov Decision Process (MDP), utilizing a transition cost function that applies linear and non-linear penalties based on AoI and the state of the stochastic process. Through analytical investigation, we delve into the structure of the optimal transmission policy for the resulting MDP problem. Furthermore, we evaluate the derived policies via numerical results and demonstrate their effectiveness in reserving energy in anticipation of forthcoming alarm states.
Submitted 6 May, 2024;
originally announced May 2024.
-
Numeric Reward Machines
Authors:
Kristina Levina,
Nikolaos Pappas,
Athanasios Karapantelakis,
Aneta Vulgarakis Feljan,
Jendrik Seipp
Abstract:
Reward machines inform reinforcement learning agents about the reward structure of the environment and often drastically speed up the learning process. However, reward machines only accept Boolean features such as robot-reached-gold. Consequently, many inherently numeric tasks cannot profit from the guidance offered by reward machines. To address this gap, we aim to extend reward machines with numeric features such as distance-to-gold. For this, we present two types of reward machines: numeric-Boolean and numeric. In a numeric-Boolean reward machine, distance-to-gold is emulated by two Boolean features distance-to-gold-decreased and robot-reached-gold. In a numeric reward machine, distance-to-gold is used directly alongside the Boolean feature robot-reached-gold. We compare our new approaches to a baseline reward machine in the Craft domain, where the numeric feature is the agent-to-target distance. We use cross-product Q-learning, Q-learning with counter-factual experiences, and the options framework for learning. Our experimental results show that our new approaches significantly outperform the baseline approach. Extending reward machines with numeric features opens up new possibilities of using reward machines in inherently numeric tasks.
Submitted 30 April, 2024;
originally announced April 2024.
-
Semantic-Aware Remote Estimation of Multiple Markov Sources Under Constraints
Authors:
Jiping Luo,
Nikolaos Pappas
Abstract:
This paper studies semantic-aware communication for remote estimation of multiple Markov sources over a lossy and rate-constrained channel. Unlike most existing studies that treat all source states equally, we exploit the semantics of information and consider that the remote actuator has different tolerances for the estimation errors of different states. We aim to find an optimal scheduling policy that minimizes the long-term state-dependent costs of estimation errors under a transmission frequency constraint. We theoretically show the structure of the optimal policy by leveraging the average-cost Constrained Markov Decision Process (CMDP) theory and the Lagrangian dynamic programming. By exploiting the optimal structural results, we develop a novel policy search algorithm, termed intersection search plus relative value iteration (Insec-RVI), that can find the optimal policy using only a few iterations. To avoid the ``curse of dimensionality'' of MDPs, we propose an online low-complexity drift-plus-penalty (DPP) scheduling algorithm based on the Lyapunov optimization theorem. We also design an efficient average-cost Q-learning algorithm to estimate the optimal policy without knowing a priori the channel and source statistics. Numerical results show that continuous transmission is inefficient, and remarkably, our semantic-aware policies can attain the optimum by strategically utilizing fewer transmissions by exploiting the timing of the important information.
Submitted 25 March, 2024;
originally announced March 2024.
-
Adaptive Federated Learning Over the Air
Authors:
Chenhao Wang,
Zihan Chen,
Nikolaos Pappas,
Howard H. Yang,
Tony Q. S. Quek,
H. Vincent Poor
Abstract:
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training. This approach capitalizes on the inherent superposition property of wireless channels, facilitating fast and scalable parameter aggregation. Meanwhile, it enhances the robustness of the model training process by dynamically adjusting the stepsize in accordance with the global gradient update. We derive the convergence rate of the training algorithms, encompassing the effects of channel fading and interference, for a broad spectrum of nonconvex loss functions. Our analysis shows that the AdaGrad-based algorithm converges to a stationary point at the rate of $\mathcal{O}\left( \ln(T) / T^{1 - \frac{1}{\alpha}} \right)$, where $\alpha$ represents the tail index of the electromagnetic interference. This result indicates that the level of heavy-tailedness in interference distribution plays a crucial role in the training efficiency: the heavier the tail, the slower the algorithm converges. In contrast, an Adam-like algorithm converges at the $\mathcal{O}(1/T)$ rate, demonstrating its advantage in expediting the model training process. We conduct extensive experiments that corroborate our theoretical findings and affirm the practical efficacy of our proposed federated adaptive gradient methods.
Submitted 11 March, 2024;
originally announced March 2024.
-
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets
Authors:
Hossein Aboutalebi,
Hwanjun Song,
Yusheng Xie,
Arshit Gupta,
Justin Sun,
Hang Su,
Igor Shalyminov,
Nikolaos Pappas,
Siffi Singh,
Saab Mansour
Abstract:
Development of multimodal interactive systems is hindered by the lack of rich, multimodal (text, images) conversational data, which is needed in large quantities for LLMs. Previous approaches augment textual dialogues with retrieved images, posing privacy, diversity, and quality constraints. In this work, we introduce Multimodal Augmented Generative Images Dialogues (MAGID), a framework to augment text-only dialogues with diverse and high-quality images. Subsequently, a diffusion model is applied to craft corresponding images, ensuring alignment with the identified text. Finally, MAGID incorporates an innovative feedback loop between an image description generation module (textual LLM) and image quality modules (addressing aesthetics, image-text matching, and safety), that work in tandem to generate high-quality and multi-modal dialogues. We compare MAGID to other SOTA baselines on three dialogue datasets, using automated and human evaluation. Our results show that MAGID is comparable to or better than baselines, with significant improvements in human evaluation, especially against retrieval baselines where the image database is small.
Submitted 5 March, 2024;
originally announced March 2024.
-
Eliciting Better Multilingual Structured Reasoning from LLMs through Code
Authors:
Bryan Li,
Tamer Alkhouli,
Daniele Bonadiman,
Nikolaos Pappas,
Saab Mansour
Abstract:
The development of large language models (LLM) has shown progress on reasoning, though studies have largely considered either English or simple reasoning tasks. To address this, we introduce a multilingual structured reasoning and explanation dataset, termed xSTREET, that covers four tasks across six languages. xSTREET exposes a gap in base LLM performance between English and non-English reasoning tasks.
We then propose two methods to remedy this gap, building on the insight that LLMs trained on code are better reasoners. First, at training time, we augment a code dataset with multilingual comments using machine translation while keeping program code as-is. Second, at inference time, we bridge the gap between training and inference by employing a prompt structure that incorporates step-by-step code primitives to derive new facts and find a solution. Our methods show improved multilingual performance on xSTREET, most notably on the scientific commonsense reasoning subtask. Furthermore, the models show no regression on non-reasoning tasks, thus demonstrating our techniques maintain general-purpose abilities.
Submitted 12 June, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Semantic Text Transmission via Prediction with Small Language Models: Cost-Similarity Trade-off
Authors:
Bhavani A Madhabhavi,
Gangadhar Karevvanavar,
Rajshekhar V Bhat,
Nikolaos Pappas
Abstract:
We consider the communication of natural language text from a source to a destination over noiseless and character-erasure channels. We exploit language's inherent correlations and predictability to constrain transmission costs by allowing the destination to predict or complete words with potential dissimilarity with the source text. Concretely, our objective is to obtain achievable $(\bar{c}, \bar{s})$ pairs, where $\bar{c}$ is the average transmission cost at the source and $\bar{s}$ is the average semantic similarity measured via cosine similarity between vector embedding of words at the source and those predicted/completed at the destination. We obtain $(\bar{c}, \bar{s})$ pairs for neural language and first-order Markov chain-based small language models (SLM) for prediction, using both a threshold policy that transmits a word if its cosine similarity with that predicted/completed at the destination is below a threshold, and a periodic policy, which transmits words after a specific interval and predicts/completes the words in between, at the destination. We adopt an SLM for word completion. We demonstrate that, when communication occurs over a noiseless channel, the threshold policy achieves a higher $\bar{s}$ for a given $\bar{c}$ than the periodic policy and that the $\bar{s}$ achieved with the neural SLM is greater than or equal to that of the Markov chain-based algorithm for the same $\bar{c}$. The improved performance comes with a higher complexity in terms of time and computing requirements. However, when communication occurs over a character-erasure channel, all prediction algorithms and scheduling policies perform poorly. Furthermore, if character-level Huffman coding is used, the required $\bar{c}$ to achieve a given $\bar{s}$ is reduced, but the above observations still apply.
Submitted 1 March, 2024;
originally announced March 2024.
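The threshold policy described in the abstract above (transmit a word when its cosine similarity with the word predicted at the destination falls below a threshold) can be sketched as follows. The function names, the plain-list embeddings, and the convention of returning similarity 0 for a zero vector are assumptions made for illustration; the abstract does not specify the embedding model or the threshold value.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for degenerate embeddings
    return dot / (norm_u * norm_v)


def should_transmit(source_vec, predicted_vec, threshold):
    """Threshold policy: send the word only if the prediction is too dissimilar."""
    return cosine_similarity(source_vec, predicted_vec) < threshold
```

Raising the threshold makes the source transmit more often, which trades a higher average cost for a higher average similarity at the destination.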
-
Enabling Communication and Control Co-Design in 6G Networks
Authors:
Onur Ayan,
Nikolaos Pappas,
Miguel Angel Gutierrez Estevez,
Xueli An,
Wolfgang Kellerer
Abstract:
Networked control systems (NCSs), which are feedback control loops closed over a communication network, have been a popular research topic over the past decades. Numerous works in the literature propose novel algorithms and protocols with joint consideration of communication and control. However, the vast majority of the recent research results, which have shown remarkable performance improvements if a cross-layer methodology is followed, have not been widely adopted by the industry. In this work, we review the shortcomings of today's mobile networks that render cross-layer solutions, such as semantic and goal-oriented communications, very challenging in practice. To tackle this, we propose a new framework for 6G user plane design that simplifies the adoption of recent research results in networked control, thereby facilitating the joint communication and control design in next-generation mobile networks.
Submitted 26 February, 2024;
originally announced February 2024.
-
DeAL: Decoding-time Alignment for Large Language Models
Authors:
James Y. Huang,
Sailik Sengupta,
Daniele Bonadiman,
Yi-an Lai,
Arshit Gupta,
Nikolaos Pappas,
Saab Mansour,
Katrin Kirchhoff,
Dan Roth
Abstract:
Large Language Models (LLMs) are nowadays expected to generate content aligned with human preferences. Current work focuses on alignment at model training time, through techniques such as Reinforcement Learning with Human Feedback (RLHF). However, it is unclear if such methods are an effective choice to teach alignment objectives to the model. First, the inability to incorporate multiple, custom rewards and reliance on a model developer's view of universal and static principles are key limitations. Second, the residual gaps in model training and the reliability of such approaches are also questionable (e.g. susceptibility to jail-breaking even after safety training). To address these, we propose DeAL, a framework that allows the user to customize reward functions and enables Decoding-time Alignment of LLMs (DeAL). At its core, we view decoding as a heuristic-guided search process and facilitate the use of a wide variety of alignment objectives. Our experiments with programmatic constraints such as keyword and length constraints (studied widely in the pre-LLM era) and abstract objectives such as harmlessness and helpfulness (proposed in the post-LLM era) show that we can DeAL with fine-grained trade-offs, improve adherence to alignment objectives, and address residual gaps in LLMs. Lastly, while DeAL can be effectively paired with RLHF and prompting techniques, its generality makes decoding slower, an optimization we leave for future work.
Submitted 20 February, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Boosting Dynamic TDD in Small Cell Networks by the Multiplicative Weight Update Method
Authors:
Jiaqi Zhu,
Nikolaos Pappas,
Howard H. Yang
Abstract:
We leverage the Multiplicative Weight Update (MWU) method to develop a decentralized algorithm that significantly improves the performance of dynamic time division duplexing (D-TDD) in small cell networks. The proposed algorithm adaptively adjusts the time portion allocated to uplink (UL) and downlink (DL) transmissions at every node during each scheduled time slot, aligning the packet transmissions toward the most appropriate link directions according to the feedback of signal-to-interference ratio information. Our simulation results reveal that compared to the (conventional) fixed configuration of UL/DL transmission probabilities in D-TDD, incorporating MWU into D-TDD brings about a two-fold improvement of mean packet throughput in the DL and a three-fold improvement of the same performance metric in the UL, resulting in the D-TDD even outperforming Static-TDD in the UL. It also shows that the proposed scheme maintains a consistent performance gain in the presence of an ascending traffic load, validating its effectiveness in boosting the network performance. This work also demonstrates an approach that accounts for algorithmic considerations at the forefront when solving stochastic problems.
Submitted 8 February, 2024;
originally announced February 2024.
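For readers unfamiliar with the Multiplicative Weight Update method named in the abstract above, here is a sketch of one generic MWU step over two link directions (UL and DL). It uses the textbook rule `w_i <- w_i * (1 - eta * loss_i)` followed by normalization. How the paper maps signal-to-interference feedback to losses, and which step size it uses, is not stated in the abstract, so `losses` and `eta` here are hypothetical inputs.

```python
def mwu_update(weights, losses, eta=0.5):
    """One generic multiplicative-weights step.

    `weights` are the current probabilities of choosing each option (for
    example UL and DL), `losses` are per-option losses assumed to lie in
    [0, 1], and `eta` is a step size in (0, 0.5].
    """
    # Shrink each weight in proportion to the loss that option incurred.
    updated = [w * (1.0 - eta * loss) for w, loss in zip(weights, losses)]
    total = sum(updated)
    # Renormalize so the result is again a probability distribution.
    return [u / total for u in updated]
```

Options that repeatedly incur high loss see their probability decay geometrically, which is the mechanism that would steer transmissions toward the better link direction.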
-
Version age-based client scheduling policy for federated learning
Authors:
Xinyi Hu,
Nikolaos Pappas,
Howard H. Yang
Abstract:
Federated Learning (FL) has emerged as a privacy-preserving machine learning paradigm facilitating collaborative training across multiple clients without sharing local data. Despite advancements in edge device capabilities, communication bottlenecks present challenges in aggregating a large number of clients; only a portion of the clients can update their parameters upon each global aggregation. This phenomenon introduces the critical challenge of stragglers in FL and the profound impact of client scheduling policies on global model convergence and stability. Existing scheduling strategies address staleness but predominantly focus on either timeliness or content. Motivated by this, we introduce the novel concept of Version Age of Information (VAoI) to FL. Unlike traditional Age of Information metrics, VAoI considers both timeliness and content staleness. Each client's version age is updated discretely, indicating the freshness of information. VAoI is incorporated into the client scheduling policy to minimize the average VAoI, mitigating the impact of outdated local updates and enhancing the stability of FL systems.
Submitted 7 February, 2024;
originally announced February 2024.
-
Version Innovation Age and Age of Incorrect Version for Monitoring Markovian Sources
Authors:
Mehrdad Salimnejad,
Marios Kountouris,
Anthony Ephremides,
Nikolaos Pappas
Abstract:
In this paper, we propose two new performance metrics, coined the Version Innovation Age (VIA) and the Age of Incorrect Version (AoIV) for real-time monitoring of a two-state Markov process over an unreliable channel. We analyze their performance under the change-aware, semantics-aware, and randomized stationary sampling and transmission policies. We derive closed-form expressions for the distribution and the average of VIA, AoIV, and AoII for these policies. We then formulate and solve an optimization problem to minimize the average VIA, subject to constraints on the time-averaged sampling cost and time-averaged reconstruction error. Finally, we compare the performance of various sampling and transmission policies and identify the conditions under which each policy outperforms the others in optimizing the proposed metrics.
Submitted 31 January, 2024;
originally announced January 2024.
-
Age of Actuated Information and Age of Actuation in a Data-Caching Energy Harvesting Actuator
Authors:
Ali Nikkhah,
Anthony Ephremides,
Nikolaos Pappas
Abstract:
In this paper, we introduce two metrics, namely, age of actuation (AoA) and age of actuated information (AoAI), within a discrete-time system model that integrates data caching and energy harvesting (EH). AoA evaluates the timeliness of actions irrespective of the age of the information, while AoAI considers the freshness of the utilized data packet. We use Markov Chain analysis to model the system's evolution. Furthermore, we employ three-dimensional Markov Chain analysis to characterize the stationary distributions for AoA and AoAI and calculate their average values. Our findings from the analysis, validated by simulations, show that while AoAI consistently decreases with increased data and energy packet arrival rates, AoA presents a more complex behavior, with potential increases under conditions of limited data or energy resources. These metrics go towards the semantics of information and goal-oriented communications since they consider the timeliness of utilizing the information to perform an action.
Submitted 30 January, 2024;
originally announced January 2024.
-
Goal-Oriented Multiple Access Connectivity for Networked Intelligent Systems
Authors:
Pouya Agheli,
Nikolaos Pappas,
Marios Kountouris
Abstract:
We design a self-decision goal-oriented multiple access scheme, where sensing agents observe a common event and individually decide to communicate the event's attributes as updates to the monitoring agents, to satisfy a certain goal. Decisions are based on the usefulness of updates, generated under uniform, change- and semantics-aware acquisition, as well as statistics and updates of other agents. We obtain optimal activation probabilities and threshold criteria for decision-making under all schemes, maximizing a grade of effectiveness metric. Alongside studying the effect of different parameters on effectiveness, our simulation results show that the self-decision scheme may attain at least 92% of optimal performance.
Submitted 14 June, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Age of Actuation and Timeliness: Semantics in a Wireless Power Transfer System
Authors:
Ali Nikkhah,
Anthony Ephremides,
Nikolaos Pappas
Abstract:
In this paper, we investigate a model relevant to semantics-aware goal-oriented communications, and we propose a new metric that incorporates the utilization of information in addition to its timeliness. Specifically, we consider the transmission of observations from an external process to a battery-powered receiver through status updates. These updates inform the receiver about the process status and enable actuation if sufficient energy is available to achieve a goal. We focus on a wireless power transfer (WPT) model, where the receiver receives energy from a dedicated power transmitter and occasionally from the data transmitter when they share a common channel. We analyze the Age of Information (AoI) and propose a new metric, the Age of Actuation (AoA), which is relevant when the receiver utilizes the status updates to perform actions in a timely manner. We provide analytical characterizations of the average AoA and the violation probability of the AoA, demonstrating that AoA generalizes AoI. Moreover, we introduce and analytically characterize the Probability of Missing Actuation (PoMA); this metric is also relevant for quantifying the incurred cost of a missed action. We formulate unconstrained and constrained optimization problems for all the metrics and present numerical evaluations of our analytical results. This proposed set of metrics goes beyond the traditional timeliness metrics since the synergy of different flows is now considered.
Submitted 21 December, 2023;
originally announced December 2023.
-
Age-Threshold Slotted ALOHA for Optimizing Information Freshness in Mobile Networks
Authors:
Fangming Zhao,
Nikolaos Pappas,
Chuan Ma,
Xinghua Sun,
Tony Q. S. Quek,
Howard H. Yang
Abstract:
We optimize the Age of Information (AoI) in mobile networks using the age-threshold slotted ALOHA (TSA) protocol. The network comprises multiple source-destination pairs, where each source sends a sequence of status update packets to its destination over a shared spectrum. The TSA protocol stipulates that a source node must remain silent until its AoI reaches a predefined threshold, after which the node accesses the radio channel with a certain probability. Using stochastic geometry tools, we derive analytical expressions for the transmission success probability, mean peak AoI, and time-average AoI. Subsequently, we obtain closed-form expressions for the optimal update rate and age threshold that minimize the mean peak and time-average AoI, respectively. In addition, we establish a scaling law for the mean peak AoI and time-average AoI in mobile networks, revealing that the optimal mean peak AoI and time-average AoI increase linearly with the deployment density. Notably, the growth rate of time-average AoI under TSA is half of that under conventional slotted ALOHA. When considering the optimal mean peak AoI, the TSA protocol exhibits comparable performance to the traditional slotted ALOHA protocol. These findings conclusively affirm the advantage of TSA in reducing higher-order AoI, particularly in densely deployed networks.
Submitted 5 June, 2024; v1 submitted 17 December, 2023;
originally announced December 2023.
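The TSA rule described in this abstract can be sketched for a single source as follows. This is only a minimal illustration, not the authors' model: the fixed per-attempt success probability `success_prob` stands in for the interference-dependent success probability that the paper derives with stochastic geometry, the convention that the age resets to 1 on delivery is an assumption, and the names `tsa_step` and `average_aoi` are hypothetical.

```python
import random


def tsa_step(age, threshold, access_prob, success_prob, rng):
    """Advance one slot of age-threshold slotted ALOHA for a single source.

    The source stays silent while its age is below the threshold; once the
    threshold is reached, it accesses the channel with probability access_prob.
    Assumption: a delivered update resets the age to 1, otherwise it grows by 1.
    """
    transmit = age >= threshold and rng.random() < access_prob
    # Assumption: each attempt succeeds independently with a fixed probability.
    delivered = transmit and rng.random() < success_prob
    return 1 if delivered else age + 1


def average_aoi(threshold, access_prob, success_prob, slots=100_000, seed=0):
    """Estimate the time-average AoI of one source by simulation."""
    rng = random.Random(seed)
    age, total = 1, 0
    for _ in range(slots):
        age = tsa_step(age, threshold, access_prob, success_prob, rng)
        total += age
    return total / slots
```

Sweeping `threshold` and `access_prob` in `average_aoi` gives a rough feel for the trade-off the abstract refers to, although a single-source simulation with a fixed success probability cannot reproduce the density-dependent scaling results.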
-
Value of Information and Timing-aware Scheduling for Federated Learning
Authors:
Muhammad Azeem Khan,
Howard H. Yang,
Zihan Chen,
Antonio Iera,
Nikolaos Pappas
Abstract:
Data possesses significant value as it fuels advancements in AI. However, protecting the privacy of the data generated by end-user devices has become crucial. Federated Learning (FL) offers a solution by preserving data privacy during training. FL brings the model directly to User Equipments (UEs) for local training by an access point (AP). The AP periodically aggregates trained parameters from UEs, enhancing the model and sending it back to them. However, due to communication constraints, only a subset of UEs can update parameters during each global aggregation. Consequently, developing innovative scheduling algorithms is vital to enable complete FL implementation and enhance FL convergence. In this paper, we present a scheduling policy combining Age of Update (AoU) concepts and data Shapley metrics. This policy considers the freshness and value of received parameter updates from individual data sources and real-time channel conditions to enhance FL's operational efficiency. The proposed algorithm is simple, and its effectiveness is demonstrated through simulations.
Submitted 16 December, 2023;
originally announced December 2023.
-
Optimizing Information Freshness over a Channel that Wears Out
Authors:
George J. Stamatakis,
Osvaldo Simeone,
Nikolaos Pappas
Abstract:
A sensor samples and transmits status updates to a destination through a wireless channel that wears out over time and with every use. At each time slot, the sensor can decide to sample and transmit a fresh status update, restore the initial quality of the channel, or remain silent. The actions impose different costs on the operation of the system, and we study the problem of optimally selecting the actions at the transmitter so as to maximize the freshness of the information at the receiver, while minimizing the communication cost. Freshness is measured by the age of information (AoI). The problem is addressed using dynamic programming, and numerical results are presented to provide insights into the optimal transmission policy.
Submitted 1 December, 2023;
originally announced December 2023.
-
Version Age of Information Minimization over Fading Broadcast Channels
Authors:
Gangadhar Karevvanavar,
Hrishikesh Pable,
Om Patil,
Rajshekhar V Bhat,
Nikolaos Pappas
Abstract:
We consider a base station (BS) that receives version update packets from multiple exogenous streams and broadcasts them to corresponding users over a fading broadcast channel using a non-orthogonal multiple access (NOMA) scheme. Sequentially indexed packets arrive randomly in each stream, with new packets making the previous ones obsolete. In this case, we consider the version age of information (VAoI) at a user, defined as the difference in the version index of the latest available packet at the BS and that at the user, as a metric of freshness of information. Our objective is to minimize a weighted sum of average VAoI across users subject to an average power constraint at the BS by optimally scheduling the update packets from various streams for transmission and transmitting them with sufficient powers to guarantee their successful delivery. We consider the class of channel-only stationary randomized policies (CO-SRP), which rely solely on channel power gains for transmission decisions. We solve the resulting non-convex problem optimally and show that the VAoI achieved under the optimal CO-SRP is within twice the optimal achievable VAoI. We also obtained a Constrained Markov Decision Process (CMDP)-based solution and its structural properties. Numerical simulations show a close performance between the optimal CO-SRP and CMDP-based solutions. Additionally, a time division multiple access (TDMA) scheme, which allows transmission to at most one user at a time, matches NOMA's performance under tight average power constraints. However, NOMA outperforms TDMA as the constraint is relaxed.
Submitted 12 February, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
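The VAoI definition in this abstract (the version index of the latest packet at the BS minus the version index held by the user) can be illustrated with a short sketch. The function names `version_age` and `advance_slot` are hypothetical, and the within-slot ordering (a new version arrives first, then a successful delivery carries the latest version) is an assumption made for the example rather than the paper's stated model.

```python
def version_age(bs_version, user_version):
    """VAoI at a user: how many versions the user lags behind the BS."""
    return bs_version - user_version


def advance_slot(bs_version, user_version, new_arrival, delivered):
    """Update both version indices over one slot.

    Assumption: an arrival is registered before any delivery in the same slot,
    and a successful delivery always carries the latest version at the BS.
    """
    if new_arrival:
        bs_version += 1  # a new packet makes the previous version obsolete
    if delivered:
        user_version = bs_version  # the user catches up, so its VAoI is 0
    return bs_version, user_version
```

Unlike the classical AoI, this quantity grows only when a new version arrives at the BS, so a user can stay fresh without any transmission as long as the stream is idle.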
-
Goal-oriented Estimation of Multiple Markov Sources in Resource-constrained Systems
Authors:
Jiping Luo,
Nikolaos Pappas
Abstract:
This paper investigates goal-oriented communication for remote estimation of multiple Markov sources in resource-constrained networks. An agent decides the updating times of the sources and transmits the packet to a remote destination over an unreliable channel with delay. The destination is tasked with source reconstruction for actuation. We utilize the metric cost of actuation error (CAE) to capture the state-dependent actuation costs. We aim for a sampling policy that minimizes the long-term average CAE subject to an average resource constraint. We formulate this problem as an average-cost constrained Markov Decision Process (CMDP) and relax it into an unconstrained problem by utilizing Lyapunov drift techniques. Then, we propose a low-complexity drift-plus-penalty (DPP) policy for systems with known source/channel statistics and a Lyapunov optimization-based deep reinforcement learning (LO-DRL) policy for unknown environments. Our policies significantly reduce the number of uninformative transmissions by exploiting the timing of the important information.
Submitted 3 June, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Effective Communication: When to Pull Updates?
Authors:
Pouya Agheli,
Nikolaos Pappas,
Petar Popovski,
Marios Kountouris
Abstract:
We study a pull-based communication system where a sensing agent updates an actuation agent using a query control policy, which is adjusted in the evolution of an observed information source and the usefulness of each update for achieving a specific goal. For that, a controller decides whether to pull an update at each slot, predicting what is probably occurring at the source and how much effective impact that update could have at the endpoint. Thus, temporal changes in the source evolution could modify the query arrivals so as to capture important updates. The amount of impact is determined by a grade of effectiveness (GoE) metric, which incorporates both freshness and usefulness attributes of the communicated updates. Applying an iterative algorithm, we derive query decisions that maximize the long-term average GoE for the communicated packets, subject to cost constraints. Our analytical and numerical results show that the proposed query policy exhibits higher effectiveness than existing periodic and probabilistic query policies for a wide range of query arrival rates.
Submitted 14 February, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Version Age-Optimal Cached Status Updates in a Gossiping Network with Energy Harvesting Sensor
Authors:
Erfan Delfani,
Nikolaos Pappas
Abstract:
In this work, we consider a real-time IoT monitoring system in which an energy harvesting sensor with a finite-size battery measures a physical process and transmits the status updates to an aggregator. The aggregator, equipped with caching capabilities, can serve the external requests of a destination network with either a stored update or a fresh update from the sensor. We assume the destination network acts as a gossiping network in which the update packets are forwarded among the nodes in a randomized setting. We utilize the Markov Decision Process framework to model and optimize the network's average Version Age of Information (AoI) and obtain the optimal policy at the aggregator. The structure of the optimal policy is analytically demonstrated and numerically verified. Numerical results highlight the effect of the system parameters on the average Version AoI. The simulations reveal the superior performance of the optimal policy compared to a set of baseline policies.
Submitted 3 November, 2023;
originally announced November 2023.
-
Optimal Status Updates for Minimizing Age of Correlated Information in IoT Networks with Energy Harvesting Sensors
Authors:
Chao Xu,
Xinyan Zhang,
Howard H. Yang,
Xijun Wang,
Nikolaos Pappas,
Dusit Niyato,
Tony Q. S. Quek
Abstract:
Many real-time applications of the Internet of Things (IoT) need to deal with correlated information generated by multiple sensors. The design of efficient status update strategies that minimize the Age of Correlated Information (AoCI) is a key factor. In this paper, we consider an IoT network consisting of sensors equipped with the energy harvesting (EH) capability. We optimize the average AoCI at the data fusion center (DFC) by appropriately managing the energy harvested by sensors, whose true battery states are unobservable during the decision-making process. Particularly, we first formulate the dynamic status update procedure as a partially observable Markov decision process (POMDP), where the environmental dynamics are unknown to the DFC. In order to address the challenges arising from the causality of energy usage, unknown environmental dynamics, unobservability of sensors' true battery states, and large-scale discrete action space, we devise a deep reinforcement learning (DRL)-based dynamic status update algorithm. The algorithm leverages the advantages of the soft actor-critic and long short-term memory techniques. Meanwhile, it incorporates our proposed action decomposition and mapping mechanism. Extensive simulations are conducted to validate the effectiveness of our proposed algorithm by comparing it with available DRL algorithms for POMDPs.
Submitted 29 October, 2023;
originally announced October 2023.
-
State-aware Real-time Tracking and Remote Reconstruction of a Markov Source
Authors:
Mehrdad Salimnejad,
Marios Kountouris,
Nikolaos Pappas
Abstract:
The problem of real-time remote tracking and reconstruction of a two-state Markov process is considered here. A transmitter sends samples from an observed information source to a remote monitor over an unreliable wireless channel. The receiver, in turn, performs an action according to the state of the reconstructed source. We propose a state-aware randomized stationary sampling and transmission policy which accounts for the importance of different states of the information source, and their impact on the goal of the communication process. We then analyze the performance of the proposed policy, and compare it with existing goal-oriented joint sampling and transmission policies, with respect to a set of performance metrics. Specifically, we study the real-time reconstruction error, the cost of actuation error, the consecutive error, and a new metric, coined importance-aware consecutive error. In addition, we formulate and solve a constrained optimization problem that aims to obtain the optimal sampling probabilities that minimize the average cost of actuation error. Our results show that in the scenario of constrained sampling generation, the optimal state-aware randomized stationary policy outperforms all other sampling policies for fast evolving sources, and, under certain conditions, for slowly varying sources. Otherwise, a semantics-aware policy performs better only when the source is slowly varying.
Submitted 21 September, 2023;
originally announced September 2023.
-
Age of Information in Locally Adaptive Frame Slotted ALOHA
Authors:
Zhiling Yue,
Howard H. Yang,
Meng Zhang,
Nikolaos Pappas
Abstract:
We consider a random access network consisting of source-destination pairs. Each source node generates status updates and transmits this information to its intended destination over a shared spectrum. The goal is to minimize the network-wide Age of Information (AoI). We develop a frame slotted ALOHA (FSA)-based policy for generating and transmitting status updates, where the frame size of each source node is adjusted according to its local environment. The proposed policy is of low complexity and can be implemented in a distributed manner. Additionally, it significantly improves the network AoI performance by (a) equalizing the update generation intervals at each source and (b) reducing interference across the network. Furthermore, we derive an analytical expression for the average network AoI attained for that policy. We evaluate the performance of the proposed scheme through simulations, which demonstrate that the locally adaptive FSA policy achieves a remarkable gain in terms of AoI compared to the slotted ALOHA counterpart, confirming the effectiveness of the proposed method.
Submitted 17 July, 2023;
originally announced July 2023.
-
Experimental Study of Transport Layer Protocols for Wireless Networked Control Systems
Authors:
Polina Kutsevol,
Onur Ayan,
Nikolaos Pappas,
Wolfgang Kellerer
Abstract:
In Wireless Networked Control Systems (WNCSs), the feedback control loops are closed over a wireless communication network. The proliferation of WNCSs requires efficient network resource management mechanisms since the control performance is significantly affected by the impairments caused by network limitations. In conventional communication networks, the amount of transmitted data is one of the key performance indicators. In contrast, in WNCSs, the efficiency of the network is measured by its ability to facilitate control applications, and the data transmission rate should be limited to avoid network congestion. In this work, we consider an experimental setup where multiple control loops share a wireless communication network. Our testbed comprises up to five control loops that include Zolertia Re-Mote devices implementing the IEEE 802.15.4 standard. We propose a novel relevance- and network-aware transport layer (TL) scheme for WNCSs. The proposed scheme admits the most important measurements for the control process into the network while taking current network conditions into account. Moreover, we propose a mechanism for adapting the scheme parameters in dynamic scenarios with unknown network statistics. Unlike the conventional TL mechanisms, which fail to provide adequate control performance due to either congestion in the network or inefficient utilization of available resources, our method prevents network congestion while keeping the control performance high. We argue that relevance- and network-awareness are critical components of network protocol design to avoid control performance degradation in practice.
Submitted 21 June, 2023; v1 submitted 18 June, 2023;
originally announced June 2023.
-
Analysis of the Age of Information in Age-Threshold Slotted ALOHA
Authors:
Howard H. Yang,
Nikolaos Pappas,
Tony Q. S. Quek,
Martin Haenggi
Abstract:
We investigate the performance of a random access network consisting of source-destination dipoles. The source nodes transmit information packets to their destinations over a shared spectrum. All the transmitters in this network adhere to an age threshold slotted ALOHA (TSA) protocol: every source node remains silent until the age of information (AoI) reaches a threshold, after which the source accesses the radio channel with a certain probability. We derive a tight approximation for the signal-to-interference-plus-noise ratio (SINR) meta distribution and verify its accuracy through simulations. We also obtain analytical expressions for the average AoI. Our analysis reveals that when the network is densely deployed, employing TSA significantly decreases the average AoI. The update rate and age threshold must be jointly optimized to fully exploit the potential of the TSA protocol.
Submitted 16 June, 2023;
originally announced June 2023.
-
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
Authors:
Shufan Wang,
Sebastien Jean,
Sailik Sengupta,
James Gung,
Nikolaos Pappas,
Yi Zhang
Abstract:
In executable task-oriented semantic parsing, the system aims to translate users' utterances in natural language to machine-interpretable programs (API calls) that can be executed according to pre-defined API specifications. With the popularity of Large Language Models (LLMs), in-context learning offers a strong baseline for such scenarios, especially in data-limited regimes. However, LLMs are known to hallucinate and therefore pose a formidable challenge in constraining generated content. Thus, it remains uncertain if LLMs can effectively perform task-oriented utterance-to-API generation where respecting API's structural and task-specific constraints is crucial.
In this work, we seek to measure, analyze and mitigate such constraint violations. First, we identify the categories of various constraints in obtaining API-semantics from task-oriented utterances, and define fine-grained metrics that complement traditional ones. Second, we leverage these metrics to conduct a detailed error analysis of constraint violations seen in state-of-the-art LLMs, which motivates us to investigate two mitigation strategies: Semantic-Retrieval of Demonstrations (SRD) and API-aware Constrained Decoding (API-CD). Our experiments show that these strategies are effective at reducing constraint violations and improving the quality of the generated API calls, but require careful consideration given their implementation complexity and latency.
Submitted 24 May, 2023;
originally announced May 2023.
-
Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification
Authors:
Mujeen Sung,
James Gung,
Elman Mansimov,
Nikolaos Pappas,
Raphael Shu,
Salvatore Romeo,
Yi Zhang,
Vittorio Castelli
Abstract:
Intent classification (IC) plays an important role in task-oriented dialogue systems. However, IC models often generalize poorly when training without sufficient annotated examples for each user intent. We propose a novel pre-training method for text encoders that uses contrastive learning with intent pseudo-labels to produce embeddings that are well-suited for IC tasks, reducing the need for manual annotations. By applying this pre-training strategy, we also introduce Pre-trained Intent-aware Encoder (PIE), which is designed to align encodings of utterances with their intent names. Specifically, we first train a tagger to identify key phrases within utterances that are crucial for interpreting intents. We then use these extracted phrases to create examples for pre-training a text encoder in a contrastive manner. As a result, our PIE model achieves up to 5.4% and 4.0% higher accuracy than the previous state-of-the-art text encoder for the N-way zero- and one-shot settings on four IC datasets.
Submitted 13 November, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Semantic Filtering and Source Coding in Distributed Wireless Monitoring Systems
Authors:
Pouya Agheli,
Nikolaos Pappas,
Marios Kountouris
Abstract:
The problem of goal-oriented semantic filtering and timely source coding in multiuser communication systems is considered here. We study a distributed monitoring system in which multiple information sources, each observing a physical process, provide status update packets to multiple monitors having heterogeneous goals. Two semantic filtering schemes are first proposed as a means to admit or drop arrival packets based on their goal-dependent importance, which is a function of the intrinsic and extrinsic attributes of information and the probability of occurrence of each realization. Admitted packets at each sensor are then encoded and transmitted over block-fading wireless channels so that served monitors can timely fulfill their goals. A truncated error control scheme is derived, which allows transmitters to drop or retransmit undelivered packets based on their significance. Then, we formulate the timely source encoding optimization problem and analytically derive the optimal codeword lengths assigned to the admitted packets which maximize a weighted sum of semantic utility functions for all pairs of communicating sensors and monitors. Our analytical and numerical results provide the optimal design parameters for different arrival rates and highlight the improvement in timely status update delivery using the proposed semantic filtering, source coding, and error control schemes.
Submitted 14 February, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Authors:
Zhanpeng Zeng,
Cole Hawkins,
Mingyi Hong,
Aston Zhang,
Nikolaos Pappas,
Vikas Singh,
Shuai Zheng
Abstract:
Transformers are central in modern natural language processing and computer vision applications. Despite recent works devoted to reducing the quadratic cost of such models (as a function of the sequence length), dealing with ultra long sequences (e.g., with more than 16K tokens) remains challenging. Applications such as answering questions based on a book or summarizing a scientific article are inefficient or infeasible. Here, we propose to significantly improve the efficiency of Transformers for ultra long sequences, by compressing the sequence into a much smaller representation at each layer. Specifically, by exploiting the fact that in many tasks, only a small subset of special tokens (we call VIP-tokens) are most relevant to the final prediction, we propose a VIP-token centric compression (VCC) scheme which selectively compresses the sequence based on their impact on approximating the representation of the VIP-tokens. Compared with competitive baselines, our algorithm is not only efficient (achieving more than $3\times$ efficiency gain compared to baselines on 4K and 16K lengths), but also offers competitive/better performance on a large number of tasks. Further, we show that our algorithm scales to 128K tokens (or more) while consistently offering accuracy improvement.
Submitted 27 May, 2023; v1 submitted 7 May, 2023;
originally announced May 2023.
-
Goal-oriented Policies for Cost of Actuation Error Minimization in Wireless Autonomous Systems
Authors:
Emmanouil Fountoulakis,
Nikolaos Pappas,
Marios Kountouris
Abstract:
We consider the minimization of the cost of actuation error under resource constraints for real-time tracking in wireless autonomous systems. A transmitter monitors the state of a discrete random process and sends updates to the receiver over an unreliable wireless channel. The receiver takes actions according to the estimated state of the source. For each discrepancy between the real state of the source and the estimated one, we consider a different cost of actuation error. This models the case where some states, and consequently the corresponding actions to be taken, are more important than others. We provide two algorithms: one reaching an optimal solution but of high complexity, and one providing a suboptimal solution but with low complexity. The performance of the two algorithms is quite close, as shown by the simulations.
Submitted 8 March, 2023;
originally announced March 2023.
-
Age of Information Under Frame Slotted ALOHA-Based Status Updating Protocol
Authors:
Zhiling Yue,
Howard H. Yang,
Meng Zhang,
Nikolaos Pappas
Abstract:
We propose a frame slotted ALOHA (FSA)-based protocol for a random access network where sources transmit status updates to their intended destinations. We evaluate the effect of such a protocol on the network's timeliness performance using the Age of Information (AoI) metric. Specifically, we leverage tools from stochastic geometry to model the spatial positions of the source-destination pairs and capture the entanglement amongst the nodes' spatial-temporal attributes through the interference they cause to each other. We derive analytical expressions for the average and variance of AoI over a typical transmission link in Poisson bipolar and cellular networks, respectively. Our analysis shows that in densely deployed networks, the FSA-based status updating protocol can significantly decrease the average AoI and, in addition, stabilize the age performance by substantially reducing the variance of AoI. Furthermore, under the same updating frequency, converting a slotted ALOHA protocol into an FSA-based one always leads to a reduction in the average AoI. Moreover, implementing FSA in conjunction with power control can further benefit the AoI performance, although the particular values of frame size and power control factor must be adequately tuned to achieve the optimal gain.
Submitted 7 March, 2023;
originally announced March 2023.
-
Distortion Minimization with Age of Information and Cost Constraints
Authors:
Jayanth S,
Nikolaos Pappas,
Rajshekhar V Bhat
Abstract:
We consider a source monitoring a stochastic process with a transmitter to transmit timely information through a wireless ON/OFF channel to a destination. We assume that once the source samples the data, the sampled data has to be processed to identify the state of the stochastic process. The processing can take place either at the source before transmission or after transmission at the destination. The objective is to minimize the distortion while keeping the age of information (AoI) that measures the timeliness of information under a certain threshold. We use a stationary randomized policy (SRP) framework to solve the formulated problem. We show that the two-dimensional discrete-time Markov chain considering the AoI and instantaneous distortion as the state is lumpable and we obtain the expression for the expected AoI under the SRP.
Submitted 24 June, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Age of Actuation in a Wireless Power Transfer System
Authors:
Ali Nikkhah,
Anthony Ephremides,
Nikolaos Pappas
Abstract:
In this paper, we study a model relevant to semantics-aware goal-oriented communications. More specifically, observations from an external process are transmitted through status updates to a battery-powered receiver. From these updates, the receiver is informed about the status of the process and, if there is sufficient energy, uses them to perform an actuation to achieve a goal. We consider a wireless power transfer model where the destination receives energy from a dedicated power transmitter and occasionally from the data transmitter. We provide the analysis for the Age of Information (AoI). Furthermore, we propose a new metric, namely the \textit{Age of Actuation (AoA)}, which is relevant when the receiver utilizes the status updates to perform actions in a timely manner. We analytically characterize the AoA and we show that it is a more general metric than AoI. We provide the optimization problems for both metrics, and we numerically evaluate our analytical results.
Submitted 1 March, 2023;
originally announced March 2023.
-
Real-time Reconstruction of Markov Sources and Remote Actuation over Wireless Channels
Authors:
Mehrdad Salimnejad,
Marios Kountouris,
Nikolaos Pappas
Abstract:
In this work, we study the real-time tracking and reconstruction of an information source with the purpose of actuation. A device monitors the state of the information source and transmits status updates to a receiver over a wireless erasure channel. We consider two models for the source, namely an $N$-state Markov chain and an $N$-state Birth-Death Markov process. We investigate several joint sampling and transmission policies, including a semantics-aware one, and we study their performance with respect to a set of metrics. Specifically, we investigate the real-time reconstruction error and its variance, the cost of actuation error, the consecutive error, and the cost of memory error. These metrics capture different characteristics of the system performance, such as the impact of erroneous actions and the timing of errors. In addition, we propose a randomized stationary sampling and transmission policy and we derive closed-form expressions for the aforementioned metrics. We then formulate two optimization problems. The first optimization problem aims to minimize the time-averaged reconstruction error subject to a time-averaged sampling cost constraint. Then, we compare the optimal randomized stationary policy with uniform, change-aware, and semantics-aware sampling policies. Our results show that in the scenario of constrained sampling generation, the optimal randomized stationary policy outperforms all other sampling policies when the source is rapidly evolving. Otherwise, the semantics-aware policy performs the best. The objective of the second optimization problem is to obtain an optimal sampling policy that minimizes the average consecutive error with a constraint on the time-averaged sampling cost. Based on this, we propose a \emph{wait-then-generate} sampling policy which is simple to implement.
Submitted 27 February, 2023;
originally announced February 2023.
-
Conversation Style Transfer using Few-Shot Learning
Authors:
Shamik Roy,
Raphael Shu,
Nikolaos Pappas,
Elman Mansimov,
Yi Zhang,
Saab Mansour,
Dan Roth
Abstract:
Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e.g., formality). When applying style transfer in conversations such as task-oriented dialogues, existing approaches suffer from these limitations as context can play an important role and the style attributes are often difficult to define in conversations. In this paper, we introduce conversation style transfer as a few-shot learning problem, where the model learns to perform style transfer by observing only a few example dialogues in the target style. We propose a novel in-context learning approach to solve the task with style-free dialogues as a pivot. Human evaluation shows that by incorporating multi-turn context, the model is able to match the target style while having better appropriateness and semantic correctness compared to utterance/sentence-level style transfer. Additionally, we show that conversation style transfer can also benefit downstream tasks. For example, in multi-domain intent classification tasks, the F1 scores improve after transferring the style of training data to match the style of the test data.
Submitted 21 September, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Real-time Remote Reconstruction of a Markov Source and Actuation over Wireless
Authors:
Mehrdad Salimnejad,
Marios Kountouris,
Nikolaos Pappas
Abstract:
In this work, we study the problem of real-time tracking and reconstruction of an information source with the purpose of actuation. A device monitors an $N$-state Markov process and transmits status updates to a receiver over a wireless erasure channel. We consider a set of joint sampling and transmission policies, including a semantics-aware one, and we study their performance with respect to relevant metrics. Specifically, we investigate the real-time reconstruction error and its variance, the consecutive error, the cost of memory error, and the cost of actuation error. Furthermore, we propose a randomized stationary sampling and transmission policy and derive closed-form expressions for all aforementioned metrics. We then formulate an optimization problem for minimizing the real-time reconstruction error subject to a sampling cost constraint. Our results show that in the scenario of constrained sampling generation, the optimal randomized stationary policy outperforms all other sampling policies when the source is rapidly evolving. Otherwise, the semantics-aware policy performs the best.
Submitted 31 March, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Backward Compatibility During Data Updates by Weight Interpolation
Authors:
Raphael Schumann,
Elman Mansimov,
Yi-An Lai,
Nikolaos Pappas,
Xibin Gao,
Yi Zhang
Abstract:
Backward compatibility of model predictions is a desired property when updating a machine learning driven application. It allows the underlying model to be improved seamlessly without introducing regression bugs. In classification tasks these bugs occur in the form of negative flips. This means an instance that was correctly classified by the old model is now classified incorrectly by the updated model. This has a direct negative impact on the user experience of such systems, e.g., a frequently used voice assistant query is suddenly misclassified. A common reason to update the model is when new training data becomes available and needs to be incorporated. Simply retraining the model with the updated data introduces the unwanted negative flips. We study the problem of regression during data updates and propose Backward Compatible Weight Interpolation (BCWI). This method interpolates between the weights of the old and new model and we show in extensive experiments that it reduces negative flips without sacrificing the improved accuracy of the new model. BCWI is straightforward to implement and does not increase inference cost. We also explore the use of importance weighting during interpolation and averaging the weights of multiple new models in order to further reduce negative flips.
Submitted 25 January, 2023;
originally announced January 2023.
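The two notions this abstract relies on, negative flips and interpolation between old and new weights, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the plain-dictionary weight format, and the convention that `alpha` is the share given to the old model are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical weight format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(old: Weights, new: Weights, alpha: float) -> Weights:
    """Return alpha * old + (1 - alpha) * new for every parameter.

    Assumption: alpha = 1 recovers the old model and alpha = 0 the new one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if old.keys() != new.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [alpha * o + (1.0 - alpha) * n for o, n in zip(old[name], new[name])]
        for name in old
    }


def negative_flip_rate(
    labels: Sequence[int], old_preds: Sequence[int], new_preds: Sequence[int]
) -> float:
    """Fraction of instances the old model got right and the new model gets wrong."""
    if not labels:
        raise ValueError("labels must be non-empty")
    flips = sum(
        1 for y, o, n in zip(labels, old_preds, new_preds) if o == y and n != y
    )
    return flips / len(labels)
```

In such a sketch one would typically sweep `alpha` on held-out data and pick the value that lowers the negative flip rate while keeping accuracy close to that of the new model; how the paper selects the interpolation weight is not stated in the abstract.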
-
Texture Representation via Analysis and Synthesis with Generative Adversarial Networks
Authors:
Jue Lin,
Gaurav Sharma,
Thrasyvoulos N. Pappas
Abstract:
We investigate data-driven texture modeling via analysis and synthesis with generative adversarial networks. For network training and testing, we have compiled a diverse set of spatially homogeneous textures, ranging from stochastic to regular. We adopt StyleGAN3 for synthesis and demonstrate that it produces diverse textures beyond those represented in the training data. For texture analysis, we propose GAN inversion using a novel latent domain reconstruction consistency criterion for synthesized textures, and iterative refinement with Gramian loss for real textures. We propose perceptual procedures for evaluating network capabilities, exploring the global and local behavior of latent space trajectories, and comparing with existing texture analysis-synthesis techniques.
Submitted 19 December, 2022;
originally announced December 2022.
-
Dialog2API: Task-Oriented Dialogue with API Description and Example Programs
Authors:
Raphael Shu,
Elman Mansimov,
Tamer Alkhouli,
Nikolaos Pappas,
Salvatore Romeo,
Arshit Gupta,
Saab Mansour,
Yi Zhang,
Dan Roth
Abstract:
Functionality and dialogue experience are two important factors of task-oriented dialogue systems. Conventional approaches with closed schema (e.g., conversational semantic parsing) often fail as both the functionality and dialogue experience are strongly constrained by the underlying schema. We introduce a new paradigm for task-oriented dialogue - Dialog2API - to greatly expand the functionality and provide a seamless dialogue experience. The conversational model interacts with the environment by generating and executing programs triggering a set of pre-defined APIs. The model also manages the dialogue policy and interacts with the user through generating appropriate natural language responses. By allowing the generation of free-form programs, Dialog2API supports composite goals by combining different APIs, whereas unrestricted program revision provides a natural and robust dialogue experience. To facilitate Dialog2API, the core model is provided with API documents, an execution environment and optionally some example dialogues annotated with programs. We propose an approach tailored for Dialog2API, where the dialogue states are represented by a stack of programs, with the most recently mentioned program on the top of the stack. Dialog2API can work with many application scenarios such as software automation and customer service. In this paper, we construct a dataset for AWS S3 APIs and present evaluation results of in-context learning baselines.
Submitted 19 December, 2022;
originally announced December 2022.
-
Age of Information with On-Off Service
Authors:
Ashirwad Sinha,
Praful D. Mankar,
Nikolaos Pappas,
Harpreet S. Dhillon
Abstract:
This paper considers a communication system where a source sends time-sensitive information to its destination. We assume that both arrival and service processes of the messages are memoryless and the source has a single server with no buffer. Besides, we consider that the service is interrupted by an independent random process, which we model using the On-Off process. For this setup, we study the age of information for two queueing disciplines: 1) non-preemptive, where the messages arriving while the server is occupied are discarded, and 2) preemptive, where the in-service messages are replaced with newly arriving messages in the Off states. For these disciplines, we derive closed-form expressions for the mean peak age and mean age.
Submitted 8 December, 2022;
originally announced December 2022.
-
Modeling Context With Linear Attention for Scalable Document-Level Translation
Authors:
Zhaofeng Wu,
Hao Peng,
Nikolaos Pappas,
Noah A. Smith
Abstract:
Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents as their attention layers have quadratic complexity in the sequence length. Recent efforts on efficient attention improve scalability, but their effect on document translation remains unexplored. In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. We evaluate the model on IWSLT 2015 and OpenSubtitles 2018 against the transformer, demonstrating substantially increased decoding speed on long sequences with similar or better BLEU scores. We show that sentential gating further improves translation quality on IWSLT.
Submitted 15 October, 2022;
originally announced October 2022.
-
On the Interplay Between Deadline-Constrained Traffic and the Number of Allowed Retransmissions in Random Access Networks
Authors:
Nikolaos Nomikos,
Themistoklis Charalambous,
Yvonne-Anne Pignolet,
Nikolaos Pappas
Abstract:
In this paper, a network comprising wireless devices equipped with buffers transmitting deadline-constrained data packets over a slotted-ALOHA random-access channel is studied. Although communication protocols facilitating retransmissions increase reliability, packet transmission from the queue experiences delays and thus, packets with time constraints might be dropped before being successfully transmitted, while at the same time causing the queue size of the buffer to increase. Towards understanding the trade-off between reliability and delays that might lead to packet drops due to the deadline-constrained bursty traffic with retransmissions, a scenario of a wireless network utilizing a slotted-ALOHA random-access channel is investigated. The main focus is to reveal and investigate further the trade-off between the number of retransmissions and the packet deadline as a function of the arrival rate. Hence, we are able to determine numerically the optimal probability of transmissions and number of retransmissions, given the packet arrival rate and the packet deadline. The analysis of the system was done by means of discrete-time Markov chains. Two scenarios are studied: i) the collision channel model (in which a receiver can decode only when a single packet is transmitted), and ii) the case for which receivers have multi-packet reception capabilities. A performance evaluation for a user with different transmit probability and number of retransmissions is conducted, demonstrating their impact on the average drop rate and throughput, while at the same time showing that there exists a set of values, under which improved performance can be acquired.
Submitted 6 October, 2022;
originally announced October 2022.