-
Lyapunov-Based Deep Residual Neural Network (ResNet) Adaptive Control
Authors:
Omkar Sudhir Patil,
Duc M. Le,
Emily J. Griffis,
Warren E. Dixon
Abstract:
Deep Neural Network (DNN)-based controllers have emerged as a tool to compensate for unstructured uncertainties in nonlinear dynamical systems. A recent breakthrough in the adaptive control literature provides a Lyapunov-based approach to derive weight adaptation laws for each layer of a fully-connected feedforward DNN-based adaptive controller. However, deriving weight adaptation laws from a Lyapunov-based analysis remains an open problem for deep residual neural networks (ResNets). This paper provides the first result on Lyapunov-derived weight adaptation for a ResNet-based adaptive controller. A nonsmooth Lyapunov-based analysis is provided to guarantee asymptotic tracking error convergence. Comparative Monte Carlo simulations are provided to demonstrate the performance of the developed ResNet-based adaptive controller. The ResNet-based adaptive controller shows a 64% improvement in the tracking and function approximation performance, in comparison to a fully-connected DNN-based adaptive controller.
Submitted 10 April, 2024;
originally announced April 2024.
-
Constrained Decoding for Cross-lingual Label Projection
Authors:
Duong Minh Le,
Yang Chen,
Alan Ritter,
Wei Xu
Abstract:
Zero-shot cross-lingual transfer utilizing multilingual LLMs has become a popular learning paradigm for low-resource languages with no labeled training data. However, for NLP tasks that involve fine-grained predictions on words and phrases, the performance of zero-shot cross-lingual transfer learning lags far behind supervised fine-tuning methods. Therefore, it is common to exploit translation and label projection to further improve the performance by (1) translating training data that is available in a high-resource language (e.g., English) together with the gold labels into low-resource languages, and/or (2) translating test data in low-resource languages to a high-resource language to run inference on, then projecting the predicted span-level labels back onto the original test data. However, state-of-the-art marker-based label projection methods suffer from translation quality degradation due to the extra label markers injected in the input to the translation model. In this work, we explore a new direction that leverages constrained decoding for label projection to overcome this issue. Our new method can not only preserve the quality of translated texts but is also applicable to both strategies: translating training data and translating test data. This versatility is crucial, as our experiments reveal that translating test data can lead to a considerable boost in performance compared to translating only training data. We evaluate on two cross-lingual transfer tasks, namely Named Entity Recognition and Event Argument Extraction, spanning 20 languages. The results demonstrate that our approach outperforms the state-of-the-art marker-based method by a large margin and also shows better performance than other label projection methods that rely on external word alignment.
Submitted 5 February, 2024;
originally announced February 2024.
-
Improved Instruction Ordering in Recipe-Grounded Conversation
Authors:
Duong Minh Le,
Ruohao Guo,
Wei Xu,
Alan Ritter
Abstract:
In this paper, we study the task of instructional dialogue and focus on the cooking domain. Analyzing the generated output of the GPT-J model, we reveal that the primary challenge for a recipe-grounded dialog system is how to provide the instructions in the correct order. We hypothesize that this is due to the model's lack of understanding of user intent and inability to track the instruction state (i.e., which step was last instructed). Therefore, we propose to explore two auxiliary subtasks, namely User Intent Detection and Instruction State Tracking, to support Response Generation with improved instruction grounding. Experimenting with our newly collected dataset, ChattyChef, shows that incorporating user intent and instruction state information helps the response generation model mitigate the incorrect order issue. Furthermore, to investigate whether ChatGPT has completely solved this task, we analyze its outputs and find that it also makes mistakes (10.7% of the responses), about half of which are out-of-order instructions. We will release ChattyChef to facilitate further research in this area at: https://github.com/octaviaguo/ChattyChef.
Submitted 26 May, 2023;
originally announced May 2023.
-
Tombo Propeller: Bio-Inspired Deformable Structure toward Collision-Accommodated Control for Drones
Authors:
Son Tien Bui,
Quan Khanh Luu,
Dinh Quang Nguyen,
Nhat Dinh Minh Le,
Giuseppe Loianno,
Van Anh Ho
Abstract:
There is a growing need for vertical take-off and landing vehicles, including drones, which are safe to use and can adapt to collisions. The risks of damage by collision, to humans, obstacles in the environment, and drones themselves, are significant. This has prompted a search into nature for a highly resilient structure that can inform the design of propellers to reduce those risks and enhance safety. Inspired by the flexibility and resilience of dragonfly wings, we propose a novel design for a biomimetic drone propeller called the Tombo propeller. Here, we report on the design and fabrication process of this biomimetic propeller that can accommodate collisions and recover quickly, while maintaining sufficient thrust force to hover and fly. We describe the development of an aerodynamic model and experiments conducted to investigate performance characteristics for various configurations of the propeller morphology, and related properties, such as generated thrust force, thrust force deviation, collision force, recovery time, lift-to-drag ratio, and noise. Finally, we design and showcase a control strategy for a drone equipped with Tombo propellers that collides in mid-air with an obstacle, recovers from the collision, and continues flying. The results show that the maximum collision force generated by the proposed Tombo propeller is less than two-thirds that of a traditional rigid propeller, which suggests the concrete possibility of employing deformable propellers for drones flying in a cluttered environment. This research can contribute to the morphological design of flying vehicles for agile and resilient performance.
Submitted 14 February, 2022;
originally announced February 2022.
-
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Authors:
Nguyen Luong Tran,
Duong Minh Le,
Dat Quoc Nguyen
Abstract:
We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, and is thus especially suitable for generative NLP tasks. We conduct experiments to compare our BARTpho with its competitor mBART on a downstream task of Vietnamese text summarization and show that, in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state of the art. We further evaluate and compare BARTpho and mBART on the Vietnamese capitalization and punctuation restoration tasks and also find that BARTpho is more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications of generative Vietnamese NLP tasks. Our BARTpho models are available at https://github.com/VinAIResearch/BARTpho
Submitted 27 June, 2022; v1 submitted 20 September, 2021;
originally announced September 2021.
-
Architectural Archipelagos: Technical Debt in Long-Lived Software Research Platforms
Authors:
Marcelo Schmitt Laser,
Duc Minh Le,
Joshua Garcia,
Nenad Medvidović
Abstract:
This paper identifies a model of software evolution that is prevalent in large, long-lived academic research tool suites (3L-ARTS). This model results in an "archipelago" of related but haphazardly organized architectural "islands", and inherently induces technical debt. We illustrate the archipelago model with examples from two 3L-ARTS archipelagos identified in literature.
Submitted 16 April, 2021;
originally announced April 2021.
-
Architectural Decay as Predictor of Issue- and Change-Proneness
Authors:
Duc Minh Le,
Suhrid Karthik,
Marcelo Schmitt Laser,
Nenad Medvidović
Abstract:
Architectural decay imposes real costs in terms of developer effort, system correctness, and performance. Over time, those problems are likely to be revealed as explicit implementation issues (defects, feature changes, etc.). Recent empirical studies have demonstrated that there is a significant correlation between architectural "smells" -- manifestations of architectural decay -- and implementation issues. In this paper, we take a step further in exploring this phenomenon. We analyze the available development data from 10 open-source software systems and show that information regarding current architectural decay in these systems can be used to build models that accurately predict future issue-proneness and change-proneness of the systems' implementations. As a less intuitive result, we also show that, in cases where historical data for a system is unavailable, such data from other, unrelated systems can provide reasonably accurate issue- and change-proneness prediction capabilities.
Submitted 19 February, 2021;
originally announced February 2021.