-
Adversarial Patch for 3D Local Feature Extractor
Authors:
Yu Wen Pao,
Li Chang Lai,
Hong-Yi Lin
Abstract:
Local feature extractors are the cornerstone of many computer vision tasks. However, their vulnerability to adversarial attacks can significantly compromise their effectiveness. This paper discusses approaches to attack sophisticated local feature extraction algorithms and models to achieve two distinct goals: (1) forcing a match between originally non-matching image regions, and (2) preventing a match between originally matching regions. At the end of the paper, we discuss the performance and drawbacks of different patch generation methods.
Submitted 12 June, 2024;
originally announced June 2024.
-
A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy
Authors:
Puning Zhao,
Lifeng Lai,
Li Shen,
Qingming Li,
Jiafei Wu,
Zhe Liu
Abstract:
Privacy protection of users' entire contribution of samples is important in distributed systems. The most effective approach is the two-stage scheme, which finds a small interval first and then gets a refined estimate by clipping samples into the interval. However, the clipping operation induces bias, which is serious if the sample distribution is heavy-tailed. Besides, users with large local sample sizes can make the sensitivity much larger, thus the method is not suitable for imbalanced users. Motivated by these challenges, we propose a Huber loss minimization approach to mean estimation under user-level differential privacy. The connecting points of Huber loss can be adaptively adjusted to deal with imbalanced users. Moreover, it avoids the clipping operation, thus significantly reducing the bias compared with the two-stage approach. We provide a theoretical analysis of our approach, which gives the noise strength needed for privacy protection, as well as the bound of mean squared error. The result shows that the new method is much less sensitive to the imbalance of user-wise sample sizes and the tail of sample distributions. Finally, we perform numerical experiments to validate our theoretical analysis.
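For orientation only, the core idea of estimating a mean by minimizing a Huber loss can be sketched as below. This is a generic, non-private illustration and not the authors' algorithm: it omits the privacy noise and the adaptive connecting points described in the abstract, and the names `huber_grad` and `huber_mean`, the fixed threshold `delta`, and the gradient-descent settings are all assumptions made for the example.

```python
def huber_grad(r, delta):
    """Derivative of the Huber loss at residual r with threshold delta."""
    if abs(r) <= delta:
        return r                      # quadratic region: gradient is the residual
    return delta if r > 0 else -delta # linear region: gradient is capped at +/- delta


def huber_mean(values, delta, steps=500, lr=0.5):
    """Minimize the average Huber loss of (mu - x_i) by plain gradient descent."""
    mu = sorted(values)[len(values) // 2]  # start from a median-like point
    n = len(values)
    for _ in range(steps):
        grad = sum(huber_grad(mu - x, delta) for x in values) / n
        mu -= lr * grad
    return mu


# One extreme value pulls the plain average far away, while the Huber
# estimate caps its influence instead of clipping the sample itself.
samples = [0.9, 1.0, 1.1, 1.0, 100.0]
robust = huber_mean(samples, delta=1.0)
plain = sum(samples) / len(samples)
```

With a very large `delta` the loss is quadratic everywhere and the estimate coincides with the ordinary sample mean, which is one way to see the threshold as a knob between the mean and a more outlier-resistant estimate.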
Submitted 22 May, 2024;
originally announced May 2024.
-
PipeOrgan: Efficient Inter-operation Pipelining with Flexible Spatial Organization and Interconnects
Authors:
Raveesh Garg,
Hyoukjun Kwon,
Eric Qin,
Yu-Hsin Chen,
Tushar Krishna,
Liangzhen Lai
Abstract:
Because of the recent trends in Deep Neural Networks (DNN) models being memory-bound, inter-operator pipelining for DNN accelerators is emerging as a promising optimization. Inter-operator pipelining reduces costly on-chip global memory and off-chip memory accesses by forwarding the output of a layer as the input of the next layer within the compute array, which is proven to be an effective optimization by previous works.
However, the design space of inter-operator pipelining is huge, and the space is not yet fully explored. In particular, identifying the right depth and granularity of pipelining (or no pipelining at all) is significantly dependent on the layer shapes and data volumes of weights and activations, and these are different even within a domain.
Moreover, existing works divide the substrate into large chunks and map one layer onto each chunk, which requires communicating halfway through or through the global buffer. However, for fine-grained inter-operation pipelining, placing the corresponding consumer of the next layer tile close to the producer tile of the current layer is a better way to exploit fine-grained spatial reuse.
In order to support a variable number of layers (i.e., the right depth) and multiple spatial organizations of layers (in accordance with the pipelining granularity) on the substrate, we propose PipeOrgan, a new class of spatial data organization strategy for energy-efficient and congestion-free communication between the PEs for various pipeline depths and granularities. PipeOrgan takes advantage of flexible spatial organization and can allocate layers to PEs based on the granularity of pipelining. We also propose changes to the conventional mesh topology to improve the performance of coarse-grained allocation. PipeOrgan achieves 1.95x performance improvement over the state-of-the-art pipelined dataflow on XR-bench workloads.
Submitted 2 May, 2024;
originally announced May 2024.
-
Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk
Authors:
Xinyi Ni,
Lifeng Lai
Abstract:
Robust Markov Decision Processes (RMDPs) have received significant research interest, offering an alternative to standard Markov Decision Processes (MDPs) that often assume fixed transition probabilities. RMDPs address this by optimizing for the worst-case scenarios within ambiguity sets. While earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), with the goal of minimizing expected total discounted costs, in this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDP. Firstly, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, thus, techniques in risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve this, we define a new risk measure named NCVaR and build the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
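As background for the risk measure named in this abstract, the standard empirical CVaR of a set of sampled costs is the average of the worst (1 - alpha) fraction of them. The sketch below illustrates only that textbook quantity; it does not implement the robust formulation, the NCVaR measure, or the value iteration algorithms of the paper, and the function name `empirical_cvar` and the tail-size rounding rule are assumptions made for the example.

```python
import math


def empirical_cvar(costs, alpha):
    """Average of the worst (1 - alpha) fraction of sampled costs."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(costs)
    # Number of tail samples; rounding guards against float noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    tail = sorted(costs, reverse=True)[:k]  # the k largest costs
    return sum(tail) / k


# With alpha = 0 the tail is the whole sample, so CVaR reduces to the
# risk-neutral expected cost; larger alpha focuses on the worst outcomes.
costs = list(range(1, 11))
risk_neutral = empirical_cvar(costs, 0.0)
risk_averse = empirical_cvar(costs, 0.8)
```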
Submitted 2 May, 2024;
originally announced May 2024.
-
Target-Specific De Novo Peptide Binder Design with DiffPepBuilder
Authors:
Fanhao Wang,
Yuzhe Wang,
Laiyi Feng,
Changsheng Zhang,
Luhua Lai
Abstract:
Despite the exciting progress in target-specific de novo protein binder design, peptide binder design remains challenging due to the flexibility of peptide structures and the scarcity of protein-peptide complex structure data. In this study, we curated a large synthetic dataset, referred to as PepPC-F, from the abundant protein-protein interface data and developed DiffPepBuilder, a de novo target-specific peptide binder generation method that utilizes an SE(3)-equivariant diffusion model trained on PepPC-F to co-design peptide sequences and structures. DiffPepBuilder also introduces disulfide bonds to stabilize the generated peptide structures. We tested DiffPepBuilder on 30 experimentally verified strong peptide binders with available protein-peptide complex structures. DiffPepBuilder was able to effectively recall the native structures and sequences of the peptide ligands and to generate novel peptide binders with improved binding free energy. We subsequently conducted de novo generation case studies on three targets. In both the regeneration test and case studies, DiffPepBuilder outperformed AfDesign and RFdiffusion coupled with ProteinMPNN, in terms of sequence and structure recall, interface quality, and structural diversity. Molecular dynamics simulations confirmed that the introduction of disulfide bonds enhanced the structural rigidity and binding performance of the generated peptides. As a general peptide binder de novo design tool, DiffPepBuilder can be used to design peptide binders for given protein targets with three dimensional and binding site information.
Submitted 30 April, 2024;
originally announced May 2024.
-
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Authors:
Mostafa Elhoushi,
Akshat Shrivastava,
Diana Liskovich,
Basil Hosmer,
Bram Wasti,
Liangzhen Lai,
Anas Mahmoud,
Bilge Acun,
Saurabh Agarwal,
Ahmed Roman,
Ahmed A Aly,
Beidi Chen,
Carole-Jean Wu
Abstract:
We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exit at earlier layers, without adding any auxiliary layers or modules to the model. Third, we present a novel self-speculative decoding solution where we exit at early layers and verify and correct with remaining layers of the model. Our proposed self-speculative decoding approach has less memory footprint than other speculative decoding approaches and benefits from shared compute and activations of the draft and verification stages. We run experiments on different Llama model sizes on different types of training: pretraining from scratch, continual pretraining, finetuning on specific data domain, and finetuning on specific task. We implement our inference solution and show speedups of up to 2.16x on summarization for CNN/DM documents, 1.82x on coding, and 2.0x on TOPv2 semantic parsing task.
Submitted 29 April, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
A Survey of Distributed Graph Algorithms on Massive Graphs
Authors:
Lingkai Meng,
Yu Shao,
Long Yuan,
Longbin Lai,
Peng Cheng,
Xue Li,
Wenyuan Yu,
Wenjie Zhang,
Xuemin Lin,
Jingren Zhou
Abstract:
Distributed processing of large-scale graph data has many practical applications and has been widely studied. In recent years, many distributed graph processing frameworks and algorithms have been proposed. While many efforts have been devoted to analyzing these, mostly on the basis of programming models, less research focuses on understanding their challenges in distributed environments. Our analysis shows that applying graph tasks to distributed environments is not easy and often faces numerous challenges, including parallelism, load balancing, communication overhead, and bandwidth. In this paper, we provide an extensive overview of the current state-of-the-art in this field by outlining the challenges and solutions of distributed graph algorithms. We first conduct a systematic analysis of the inherent challenges in distributed graph processing, followed by presenting an overview of existing general solutions. Subsequently, we survey the challenges highlighted in recent distributed graph processing papers and the strategies adopted to address them. Finally, we discuss the current research trends and identify potential future opportunities.
Submitted 9 April, 2024;
originally announced April 2024.
-
Generally covariant geometric momentum and geometric potential for a Dirac fermion on a two-dimensional hypersurface
Authors:
Z. Li,
L. Q. Lai
Abstract:
Geometric momentum is the proper momentum for a moving particle constrained on a curved surface, which depends on the outer curvature and has observable effects. In the context of multi-component quantum states, geometric momentum should be rewritten as generally covariant geometric momentum. For a Dirac fermion constrained on a two-dimensional hypersurface, we give the generally covariant geometric momentum, and show that on the pseudosphere and the helical surface there exist no curvature-induced geometric potentials. These results verify that the dynamical quantization conditions are effective in dealing with constrained systems on hypersurfaces, and one could obtain the generally covariant geometric momentum and the geometric potential of a spin particle constrained on surfaces with definite parametric equations.
Submitted 23 March, 2024;
originally announced March 2024.
-
Improving SDSS Cosmological Constraints through $β$-Skeleton Weighted Correlation Functions
Authors:
Fenfen Yin,
Jiacheng Ding,
Limin Lai,
Wei Zhang,
Liang Xiao,
Zihan Wang,
Jaime Forero-Romero,
Le Zhang,
Xiao-Dong Li
Abstract:
The $β$-skeleton approach can be conveniently utilized to construct the cosmic web based on the spatial geometry distribution of galaxies, particularly in sparse samples. This method plays a key role in establishing the three-dimensional structure of the Universe and serves as a tool for quantitatively characterizing the nature of the cosmic web. This study is the first application of $β$-skeleton information as weights in mark weighted correlation functions (MCFs), presenting a novel statistical measure. We have applied the $β$-skeleton approach to the CMASS NGC galaxy samples from SDSS BOSS DR12 in the redshift interval $0.45 \leq z \leq 0.55$. Additionally, we applied this approach to three COLA cosmological simulations with different settings ($Ω_m=0.25, Ω_m=0.31, Ω_m=0.4$) for comparison. We measured three MCFs, each weighted by: i) the number of neighboring galaxies around each galaxy; ii) the average distance of each galaxy from its surrounding neighbors; iii) the reciprocal of the average distance of each galaxy from its surrounding neighbors. By comparing measurements and calculating corresponding $χ^2$ statistics, we observe high sensitivity to the cosmological parameter $Ω_m$ through a joint analysis of the two-point correlation and three MCFs.
Submitted 25 March, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
The PSF Smoothing Effect on Concentration-Related Parameters of High Redshift Galaxies in HST and JWST
Authors:
Jia-Hui Wang,
Zhao-Yu Li,
Ming-Yang Zhuang,
Luis C. Ho,
Li-Min Lai
Abstract:
We perform a comprehensive investigation of the PSF smoothing effect on the measurement of concentration-related parameters ($C$, Gini, $M_{20}$) of high redshift galaxies in the HST and JWST surveys. Our sample contains massive galaxies from the CANDELS/EGS survey (0 < z < 2), and the CEERS survey (1 < z < 3). The non-parametric concentration-related parameters ($C$, Gini, $M_{20}$) and the model-dependent parameters (n, Re) of these galaxies are derived from Statmorph and GALFIT, respectively. We try to evaluate the PSF smoothing effect by comparing the concentration-related parameters to the Sérsic index in both observations and mock images. We find that the concentration index is generally underestimated, especially for smaller galaxies with higher Sérsic index (eventually converging to the concentration index of the PSF). However, galaxies with lower Sérsic index ($n \leq 1$) or larger relative size are less affected by the PSF smoothing effect. The Gini coefficient and the absolute $M_{20}$ statistic also show similar behaviour as the concentration index. Caution should be taken for the possible correction of the concentration-related parameters, where both the relative size and the Sérsic index of the galaxy are important. Compared to the HST images, the PSF smoothing is much less severe for images in the CEERS survey due to the much higher spatial resolution. In fact, it is better to use the Sérsic index rather than the non-parametric morphology indicators to trace the light concentration for galaxies at high redshifts. From the single Sérsic modelling of the HST and JWST images, we also confirm that galaxies at higher redshift are more compact with smaller $R_e$. The lower mass galaxies are more disc-like ($n\sim1$) compared to the higher mass galaxies that are more spheroid dominated ($n\sim3$).
Submitted 14 May, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Computer-Controlled 3D Freeform Surface Weaving
Authors:
Xiangjia Chen,
Lip M. Lai,
Zishun Liu,
Chengkai Dai,
Isaac C. W. Leung,
Charlie C. L. Wang,
Yeung Yam
Abstract:
In this paper, we present a new computer-controlled weaving technology that enables the fabrication of woven structures in the shape of given 3D surfaces by using threads in non-traditional materials with high bending-stiffness, allowing for multiple applications with the resultant woven fabrics. A new weaving machine and a new manufacturing process are developed to realize the function of 3D surface weaving by the principle of short-row shaping. A computational solution is investigated to convert input 3D freeform surfaces into the corresponding weaving operations (indicated as W-code) to guide the operation of this system. A variety of examples using cotton threads, conductive threads and optical fibres are fabricated by our prototype system to demonstrate its functionality.
Submitted 8 May, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Authors:
Zechun Liu,
Changsheng Zhao,
Forrest Iandola,
Chen Lai,
Yuandong Tian,
Igor Fedorov,
Yunyang Xiong,
Ernie Chang,
Yangyang Shi,
Raghuraman Krishnamoorthi,
Liangzhen Lai,
Vikas Chandra
Abstract:
This paper addresses the growing need for efficient large language models (LLMs) on mobile devices, driven by increasing cloud costs and latency concerns. We focus on designing top-quality LLMs with fewer than a billion parameters, a practical choice for mobile deployment. Contrary to prevailing belief emphasizing the pivotal role of data and parameter quantity in determining model quality, our investigation underscores the significance of model architecture for sub-billion scale LLMs. Leveraging deep and thin architectures, coupled with embedding sharing and grouped-query attention mechanisms, we establish a strong baseline network denoted as MobileLLM, which attains a remarkable 2.7%/4.3% accuracy boost over preceding 125M/350M state-of-the-art models. Additionally, we propose an immediate block-wise weight-sharing approach with no increase in model size and only marginal latency overhead. The resultant models, denoted as MobileLLM-LS, demonstrate a further accuracy enhancement of 0.7%/0.8% over MobileLLM 125M/350M. Moreover, the MobileLLM model family shows significant improvements compared to previous sub-billion models on chat benchmarks, and demonstrates close correctness to LLaMA-v2 7B in API calling tasks, highlighting the capability of small models for common on-device use cases.
Submitted 26 June, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Not All Weights Are Created Equal: Enhancing Energy Efficiency in On-Device Streaming Speech Recognition
Authors:
Yang Li,
Yuan Shangguan,
Yuhao Wang,
Liangzhen Lai,
Ernie Chang,
Changsheng Zhao,
Yangyang Shi,
Vikas Chandra
Abstract:
Power consumption plays an important role in on-device streaming speech recognition, as it has a direct impact on the user experience. This study delves into how weight parameters in speech recognition models influence the overall power consumption of these models. We discovered that the impact of weight parameters on power consumption varies, influenced by factors including how often they are invoked and their placement in memory. Armed with this insight, we developed design guidelines aimed at optimizing on-device speech recognition models. These guidelines focus on minimizing power use without substantially affecting accuracy. Our method, which employs targeted compression based on the varying sensitivities of weight parameters, demonstrates superior performance compared to state-of-the-art compression methods. It achieves a reduction in energy usage of up to 47% while maintaining similar model accuracy and improving the real-time factor.
Submitted 20 February, 2024;
originally announced February 2024.
-
A Graph-Native Query Optimization Framework
Authors:
Bingqing Lyu,
Xiaoli Zhou,
Longbin Lai,
Yufan Yang,
Yunkai Lou,
Wenyuan Yu,
Jingren Zhou
Abstract:
Graph queries that combine pattern matching with relational operations, referred to as PatRelQuery, are widely used in many real-world applications. They allow users to identify arbitrary patterns in a graph and further perform in-depth relational analysis on the results. To effectively support PatRelQuery, two key challenges need to be addressed: (1) how to optimize PatRelQuery in a unified framework, and (2) how to handle the arbitrary type constraints in patterns in PatRelQuery. In this paper, we present a graph-native query optimization framework named GOpt to tackle these issues. GOpt is built on top of a unified intermediate representation (IR) that is capable of capturing both graph and relational operations, thereby streamlining the optimization of PatRelQuery. To handle the arbitrary type constraints, GOpt employs an automatic type inference approach to identify implicit type constraints. Additionally, GOpt introduces a graph-native optimizer, which encompasses an extensive collection of optimization rules along with cost-based techniques tailored for arbitrary patterns, to optimize PatRelQuery. Through comprehensive experiments, we demonstrate that GOpt can achieve significant query performance improvements in both crafted benchmarks and real-world applications.
Submitted 5 February, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Camouflage Adversarial Attacks on Multiple Agent Systems
Authors:
Ziqing Lu,
Guanlin Liu,
Lifeng Lai,
Weiyu Xu
Abstract:
Multi-agent reinforcement learning (MARL) systems based on the Markov decision process (MDP) have emerged in many critical applications. To improve the robustness/defense of MARL systems against adversarial attacks, the study of various adversarial attacks on reinforcement learning systems is very important. Previous works on adversarial attacks considered some possible features to attack in MDP, such as the action poisoning attacks, the reward poisoning attacks, and the state perception attacks. In this paper, we propose a brand-new form of attack called the camouflage attack in the MARL systems. In the camouflage attack, the attackers change the appearances of some objects without changing the actual objects themselves, and the camouflaged appearances may look the same to all the targeted recipient (victim) agents. The camouflaged appearances can mislead the recipient agents to misguided actions. We design algorithms that give the optimal camouflage attacks minimizing the rewards of recipient agents. Our numerical and theoretical results show that camouflage attacks can rival the more conventional, but likely more difficult, state perception attacks. We also investigate cost-constrained camouflage attacks and show numerically how cost budgets affect the attack performance.
Submitted 30 January, 2024;
originally announced January 2024.
-
DeepRLI: A Multi-objective Framework for Universal Protein--Ligand Interaction Prediction
Authors:
Haoyu Lin,
Shiwei Wang,
Jintao Zhu,
Yibo Li,
Jianfeng Pei,
Luhua Lai
Abstract:
Protein (receptor)--ligand interaction prediction is a critical component in computer-aided drug design, significantly influencing molecular docking and virtual screening processes. Despite the development of numerous scoring functions in recent years, particularly those employing machine learning, accurately and efficiently predicting binding affinities for protein--ligand complexes remains a formidable challenge. Most contemporary methods are tailored for specific tasks, such as binding affinity prediction, binding pose prediction, or virtual screening, often failing to encompass all aspects. In this study, we put forward DeepRLI, a novel protein--ligand interaction prediction architecture. It encodes each protein--ligand complex into a fully connected graph, retaining the integrity of the topological and spatial structure, and leverages the improved graph transformer layers with cosine envelope as the central module of the neural network, thus exhibiting superior scoring power. In order to equip the model to generalize to conformations beyond the confines of crystal structures and to adapt to molecular docking and virtual screening tasks, we propose a multi-objective strategy, that is, the model outputs three scores for scoring and ranking, docking, and screening, and the training process optimizes these three objectives simultaneously. For the latter two objectives, we augment the dataset through a docking procedure, incorporate suitable physics-informed blocks and employ an effective contrastive learning approach. Eventually, our model manifests a balanced performance across scoring, ranking, docking, and screening, thereby demonstrating its ability to handle a range of tasks. Overall, this research contributes a multi-objective framework for universal protein--ligand interaction prediction, augmenting the landscape of structure-based drug design.
Submitted 19 January, 2024;
originally announced January 2024.
-
Interference-induced suppression of particle emission from a Bose-Einstein condensate in lattice with time-periodic modulations
Authors:
L. Q. Lai,
Z. Li
Abstract:
Collective emission of particles from a parametrically driven condensate has attracted significant experimental and theoretical attention due to the appealing visual effects and potential metrological applications. In this paper, we investigate the particle emission from a Bose-Einstein condensate confined in a one-dimensional lattice with periodically modulated interparticle interactions. We give the regimes for discrete modes, and find that the emission is distinctly suppressed. The configuration induces a broad band, but due to the interference of the matter waves few particles can be ejected. We further qualitatively model the emission process, and demonstrate the short-time behaviors. This engineering provides a way for manipulating the propagation of particles and the corresponding dynamics of condensates in lattices, and may find use in other nonequilibrium problems with time-periodic driving.
Submitted 10 January, 2024;
originally announced January 2024.
-
Accelerating Discovery of Novel and Bioactive Ligands With Pharmacophore-Informed Generative Models
Authors:
Weixin Xie,
Jianhang Zhang,
Qin Xie,
Chaojun Gong,
Youjun Xu,
Luhua Lai,
Jianfeng Pei
Abstract:
Deep generative models have gained significant advancements to accelerate drug discovery by generating bioactive chemicals against desired targets. Nevertheless, most generated compounds that have been validated for potent bioactivity often exhibit structural novelty levels that fall short of satisfaction, thereby providing limited inspiration to human medicinal chemists. The challenge faced by generative models lies in their ability to produce compounds that are both bioactive and novel, rather than merely making minor modifications to known actives present in the training set. Recognizing the utility of pharmacophores in facilitating scaffold hopping, we developed TransPharmer, an innovative generative model that integrates ligand-based interpretable pharmacophore fingerprints with generative pre-training transformer (GPT) for de novo molecule generation. TransPharmer demonstrates superior performance across tasks involving unconditioned distribution learning, de novo generation and scaffold elaboration under pharmacophoric constraints. Its distinct exploration mode within the local chemical space renders it particularly useful for scaffold hopping, producing compounds that are structurally novel while pharmaceutically related. The efficacy of TransPharmer is validated through two case studies involving the dopamine receptor D2 (DRD2) and polo-like kinase 1 (PLK1). Notably in the case of PLK1, three out of four synthesized designed compounds exhibit submicromolar activities, with the most potent one, IIP0943, demonstrating a potency of 5.1 nM. Featuring a new scaffold of 4-(benzo[b]thiophen-7-yloxy)pyrimidine, IIP0943 also exhibits high selectivity for PLK1. It was demonstrated that TransPharmer is a powerful tool for discovery of novel and bioactive ligands.
Submitted 2 January, 2024;
originally announced January 2024.
-
GraphScope Flex: LEGO-like Graph Computing Stack
Authors:
Tao He,
Shuxian Hu,
Longbin Lai,
Dongze Li,
Neng Li,
Xue Li,
Lexiao Liu,
Xiaojian Luo,
Binqing Lyu,
Ke Meng,
Sijie Shen,
Li Su,
Lei Wang,
Jingbo Xu,
Wenyuan Yu,
Weibin Zeng,
Lei Zhang,
Siyuan Zhang,
Jingren Zhou,
Xiaoli Zhou,
Diwen Zhu
Abstract:
Graph computing has become increasingly crucial in processing large-scale graph data, with numerous systems developed for this purpose. Two years ago, we introduced GraphScope as a system addressing a wide array of graph computing needs, including graph traversal, analytics, and learning in one system. Since its inception, GraphScope has achieved significant technological advancements and gained widespread adoption across various industries. However, one key lesson from this journey has been understanding the limitations of a "one-size-fits-all" approach, especially when dealing with the diversity of programming interfaces, applications, and data storage formats in graph computing. In response to these challenges, we present GraphScope Flex, the next iteration of GraphScope. GraphScope Flex is designed to be both resource-efficient and cost-effective, while also providing flexibility and user-friendliness through its LEGO-like modularity. This paper explores the architectural innovations and fundamental design principles of GraphScope Flex, all of which are direct outcomes of the lessons learned during our ongoing development process. We validate the adaptability and efficiency of GraphScope Flex with extensive evaluations on synthetic and real-world datasets. The results show that GraphScope Flex achieves 2.4X throughput and up to 55.7X speedup over other systems on the LDBC Social Network and Graphalytics benchmarks, respectively. Furthermore, GraphScope Flex accomplishes up to a 2,400X performance gain in real-world applications, demonstrating its proficiency across a wide range of graph computing scenarios with increased effectiveness.
Submitted 19 December, 2023;
originally announced December 2023.
-
Parameterized steering criteria via correlation matrices
Authors:
Qing-Hua Zhang,
Lemin Lai,
Shao-Ming Fei
Abstract:
We study the steerability for arbitrary dimensional bipartite systems based on the correlation matrices given by local special unitary groups. We present families of steering criteria for bipartite quantum states in terms of parameterized correlation matrices. We show that these steering criteria may detect more steerable states than the existing steering criteria. The results are illustrated by detailed examples.
Submitted 9 December, 2023;
originally announced December 2023.
-
Improving Constraint on $Ω_{m}$ from SDSS Using Marked Correlation Functions
Authors:
L. M. Lai,
J. C. Ding,
X. L. Luo,
Y. Z. Yang,
Z. H. Wang,
K. S. Liu,
G. F. Liu,
X. Wang,
Y. Zheng,
Z. Y. Li,
L. Zhang,
X. D. Li
Abstract:
Large-scale structure (LSS) surveys will increasingly provide stringent constraints on our cosmological models. Recently, the density-marked correlation function (MCF) has been introduced, offering an easily computable density-correlation statistic. Simulations have demonstrated that MCFs offer additional, independent constraints on cosmological models beyond the standard two-point correlation (2PCF). In this study, we apply MCFs for the first time to SDSS CMASS data, aiming to investigate the statistical information regarding clustering and anisotropy properties in the Universe and assess the performance of various weighting schemes in MCFs. Upon analyzing the CMASS data, we observe that, by combining different weights ($α= [-0.2, 0, 0.2, 0.6]$), the MCFs provide a tight and independent constraint on the cosmological parameter $Ω_m$, yielding $Ω_m = 0.293 \pm0.006$ at the $1σ$ level, which represents a significant reduction in the statistical error by a factor of 3.4 compared to that from 2PCF. Our constraint is consistent with recent findings from the small-scale clustering of BOSS galaxies \cite{arXiv:2203.08999v2} within the 1$σ$ level. However, we also find that our estimate is lower than the Planck measurements by about 2.6$σ$, indicating the potential presence of new physics beyond the standard cosmological model if all the systematics are fully corrected. The method outlined in this study can be extended to other surveys and datasets, allowing for the constraint of other cosmological parameters. Additionally, it serves as a valuable tool for forthcoming emulator analysis on the Chinese Space Station Telescope (CSST).
Submitted 5 December, 2023;
originally announced December 2023.
-
DiffBindFR: An SE(3) Equivariant Network for Flexible Protein-Ligand Docking
Authors:
Jintao Zhu,
Zhonghui Gu,
Jianfeng Pei,
Luhua Lai
Abstract:
Molecular docking, a key technique in structure-based drug design, plays pivotal roles in protein-ligand interaction modeling, hit identification and optimization, in which accurate prediction of protein-ligand binding mode is essential. Conventional docking approaches perform well in redocking tasks with known protein binding pocket conformation in the complex state. However, in real-world docking scenario without knowing the protein binding conformation for a new ligand, accurately modeling the binding complex structure remains challenging as flexible docking is computationally expensive and inaccurate. Typical deep learning-based docking methods do not explicitly consider protein side chain conformations and fail to ensure the physical plausibility and detailed atomic interactions. In this study, we present DiffBindFR, a full-atom diffusion-based flexible docking model that operates over the product space of ligand overall movements and flexibility and pocket side chain torsion changes. We show that DiffBindFR has higher accuracy in producing native-like binding structures with physically plausible and detailed interactions than available docking methods. Furthermore, in the Apo and AlphaFold2 modeled structures, DiffBindFR demonstrates superior advantages in accurate ligand binding pose and protein binding conformation prediction, making it suitable for Apo and AlphaFold2 structure-based drug design. DiffBindFR provides a powerful flexible docking tool for modeling accurate protein-ligand binding structures.
Submitted 19 December, 2023; v1 submitted 26 November, 2023;
originally announced November 2023.
-
Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems
Authors:
Ziqing Lu,
Guanlin Liu,
Lifeng Lai,
Weiyu Xu
Abstract:
Finding optimal adversarial attack strategies is an important topic in reinforcement learning and the Markov decision process. Previous studies usually assume one all-knowing coordinator (attacker) for whom attacking different recipient (victim) agents incurs uniform costs. However, in reality, instead of using one limitless central attacker, the attacks often need to be performed by distributed attack agents. We formulate the problem of performing optimal adversarial agent-to-agent attacks using distributed attack agents, in which we impose distinct cost constraints on each different attacker-victim pair. We propose an optimal method integrating within-step static constrained attack-resource allocation optimization and between-step dynamic programming to achieve the optimal adversarial attack in a multi-agent system. Our numerical results show that the proposed attacks can significantly reduce the rewards received by the attacked agents.
Submitted 1 November, 2023;
originally announced November 2023.
-
A Microwell-Based Microfluidic Device for Single-Cell Trapping and Magnetic Field Gradient Stimulation
Authors:
Richard Lee Lai
Abstract:
We develop a microfluidic platform for the long-term cultivation and observation of THP-1 cells under different physiological conditions. First, we determine optimal seeding conditions and microwell geometry. Next, we observe changes in cell size and circularity. Results show that gradient magnetic forces on the order of $10^2$ T/m result in stunted growth and irregular cell shapes. Finally, we observe the temporal change in ROS signals under control, static and gradient magnetic fields. For exposure to static and gradient magnetic fields, the peak in ROS signals occurs after 24 hours and 36 hours, respectively.
Submitted 19 October, 2023;
originally announced October 2023.
-
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
Authors:
Lei Lai,
Zhongkai Shangguan,
Jimuyang Zhang,
Eshed Ohn-Bar
Abstract:
We propose XVO, a semi-supervised learning method for training generalized monocular Visual Odometry (VO) models with robust off-the-shelf operation across diverse datasets and settings. In contrast to standard monocular VO approaches which often study a known calibration within a single dataset, XVO efficiently learns to recover relative pose with real-world scale from visual scene semantics, i.e., without relying on any known camera parameters. We optimize the motion estimation model via self-training from large amounts of unconstrained and heterogeneous dash camera videos available on YouTube. Our key contribution is twofold. First, we empirically demonstrate the benefits of semi-supervised training for learning a general-purpose direct VO regression network. Second, we demonstrate multi-modal supervision, including segmentation, flow, depth, and audio auxiliary prediction tasks, to facilitate generalized representations for the VO task. Specifically, we find the audio prediction task to significantly enhance the semi-supervised learning process while alleviating noisy pseudo-labels, particularly in highly dynamic and out-of-domain video data. Our proposed teacher network achieves state-of-the-art performance on the commonly used KITTI benchmark despite no multi-frame optimization or knowledge of camera parameters. Combined with the proposed semi-supervised step, XVO demonstrates off-the-shelf knowledge transfer across diverse conditions on KITTI, nuScenes, and Argoverse without fine-tuning.
Submitted 8 October, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Hidden phase uncovered by ultrafast carrier dynamics in thin Bi2O2Se
Authors:
Hao Li,
Adeela Nairan,
Xiaoran Niu,
Yuxiang Chen,
Huarui Sun,
Linqing Lai,
Jingkai Qin,
Leyang Dang,
Guigen Wang,
Usman Khan,
Feng He
Abstract:
Bi2O2Se has attracted intensive attention due to its potential in electronics, optoelectronics, as well as ferroelectric applications. Despite that, there have only been a handful of experimental studies based on ultrafast spectroscopy to elucidate the carrier dynamics in Bi2O2Se thin films. Different groups have reported various ultrafast timescales and associated mechanisms across films of different thicknesses. A comprehensive understanding in relation to thickness and fluence is still lacking. In this work, we have systematically explored the thickness-dependent Raman spectroscopy and ultrafast carrier dynamics in chemical vapor deposition (CVD)-grown Bi2O2Se thin films on mica substrate with thicknesses varying from 22.44 nm down to 4.62 nm at both low and high pump fluence regions. Combining the thickness dependence and fluence dependence of the slow decay time, we demonstrate a ferroelectric transition in the thinner (< 8 nm) Bi2O2Se films, influenced by substrate-induced compressive strain and non-equilibrium states. Moreover, this transition can be manifested under highly non-equilibrium states. Our results deepen the understanding of the interplay between the ferroelectric phase and semiconducting characteristics of Bi2O2Se thin films, providing a new route to manipulate the ferroelectric transition.
Submitted 23 January, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Authors:
Yang Li,
Liangzhen Lai,
Yuan Shangguan,
Forrest N. Iandola,
Zhaoheng Ni,
Ernie Chang,
Yangyang Shi,
Vikas Chandra
Abstract:
Transformer-based models excel in speech recognition. Existing efforts to optimize Transformer inference, typically for long-context applications, center on simplifying attention score calculations. However, streaming speech recognition models usually process a limited number of tokens each time, making attention score calculation less of a bottleneck. Instead, the bottleneck lies in the linear projection layers of multi-head attention and feedforward networks, constituting a substantial portion of the model size and contributing significantly to computation, memory, and power usage.
To address this bottleneck, we propose folding attention, a technique targeting these linear layers, significantly reducing model size and improving memory and power efficiency. Experiments on on-device Transformer-based streaming speech recognition models show that folding attention reduces model size (and corresponding memory consumption) by up to 24% and power consumption by up to 23%, all without compromising model accuracy or computation overhead.
Submitted 18 January, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Distributed Dual Coordinate Ascent with Imbalanced Data on a General Tree Network
Authors:
Myung Cho,
Lifeng Lai,
Weiyu Xu
Abstract:
In this paper, we investigate the impact of imbalanced data on the convergence of distributed dual coordinate ascent in a tree network for solving an empirical loss minimization problem in distributed machine learning. To address this issue, we propose a method called delayed generalized distributed dual coordinate ascent that takes into account the information of the imbalanced data, and provide the analysis of the proposed algorithm. Numerical experiments confirm the effectiveness of our proposed method in improving the convergence speed of distributed dual coordinate ascent in a tree network.
Submitted 28 August, 2023;
originally announced August 2023.
-
Minimax Optimal Q Learning with Nearest Neighbors
Authors:
Puning Zhao,
Lifeng Lai
Abstract:
Analyzing the Markov decision process (MDP) with continuous state spaces is generally challenging. A recent interesting work \cite{shah2018q} solves MDP with bounded continuous state space by a nearest neighbor $Q$ learning approach, which has a sample complexity of $\tilde{O}(\frac{1}{ε^{d+3}(1-γ)^{d+7}})$ for $ε$-accurate $Q$ function estimation with discount factor $γ$. In this paper, we propose two new nearest neighbor $Q$ learning methods, one for the offline setting and the other for the online setting. We show that the sample complexities of these two methods are $\tilde{O}(\frac{1}{ε^{d+2}(1-γ)^{d+2}})$ and $\tilde{O}(\frac{1}{ε^{d+2}(1-γ)^{d+3}})$ for offline and online methods respectively, which significantly improve over existing results and have minimax optimal dependence over $ε$. We achieve such improvement by utilizing the samples more efficiently. In particular, the method in \cite{shah2018q} clears up all samples after each iteration, thus these samples are somewhat wasted. On the other hand, our offline method does not remove any samples, and our online method only removes samples with time earlier than $βt$ at time $t$ with $β$ being a tunable parameter, thus our methods significantly reduce the loss of information. Apart from the sample complexity, our methods also have additional advantages of better computational complexity, as well as suitability to unbounded state spaces.
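The online retention rule described in the abstract (at time $t$, drop samples with time earlier than $βt$) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prune_samples`, the `(timestamp, transition)` layout of the buffer, and the restriction of $β$ to $[0, 1]$ are assumptions made only for illustration.

```python
def prune_samples(samples, t, beta):
    """Keep only samples whose timestamp is at least beta * t.

    `samples` is a list of (timestamp, transition) pairs; this layout is a
    hypothetical choice for the sketch, not taken from the paper.
    """
    # Assumption: beta is treated as a fraction of the elapsed time.
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    cutoff = beta * t
    # Samples with time earlier than beta * t are removed; the rest are kept.
    return [(ts, transition) for ts, transition in samples if ts >= cutoff]


# Ten placeholder transitions collected at times 1..10.
buffer = [(ts, f"transition_{ts}") for ts in range(1, 11)]
kept = prune_samples(buffer, t=10, beta=0.5)
```

With `beta=0.0` nothing is removed, which mirrors the offline method described in the abstract, where no samples are discarded.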
Submitted 17 June, 2024; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Global stability of first-order methods for coercive tame functions
Authors:
Cédric Josz,
Lexiao Lai
Abstract:
We consider first-order methods with constant step size for minimizing locally Lipschitz coercive functions that are tame in an o-minimal structure on the real field. We prove that if the method is approximated by subgradient trajectories, then the iterates eventually remain in a neighborhood of a connected component of the set of critical points. Under suitable method-dependent regularity assumptions, this result applies to the subgradient method with momentum, the stochastic subgradient method with random reshuffling and momentum, and the random-permutations cyclic coordinate descent method.
Submitted 1 August, 2023;
originally announced August 2023.
-
Leveraging Optical Communication Fiber and AI for Distributed Water Pipe Leak Detection
Authors:
Huan Wu,
Huan-Feng Duan,
Wallace W. L. Lai,
Kun Zhu,
Xin Cheng,
Hao Yin,
Bin Zhou,
Chun-Cheung Lai,
Chao Lu,
Xiaoli Ding
Abstract:
Detecting leaks in water networks is a costly challenge. This article introduces a practical solution: the integration of optical networks with water networks for efficient leak detection. Our approach uses a fiber-optic cable to measure vibrations, enabling accurate leak identification and localization by an intelligent algorithm. We also propose a method to assess leak severity for prioritized repairs. Our solution detects even small leaks with flow rates as low as 0.027 L/s. It offers a cost-effective way to improve leak detection, enhance water management, and increase operational efficiency.
Submitted 28 July, 2023;
originally announced July 2023.
-
Audio-Visual Speech Enhancement Using Self-supervised Learning to Improve Speech Intelligibility in Cochlear Implant Simulations
Authors:
Richard Lee Lai,
Jen-Cheng Hou,
Mandar Gogate,
Kia Dashtipour,
Amir Hussain,
Yu Tsao
Abstract:
Individuals with hearing impairments face challenges in their ability to comprehend speech, particularly in noisy environments. The aim of this study is to explore the effectiveness of audio-visual speech enhancement (AVSE) in enhancing the intelligibility of vocoded speech in cochlear implant (CI) simulations. Notably, the study focuses on a challenged scenario where there is limited availability of training data for the AVSE task. To address this problem, we propose a novel deep neural network framework termed Self-Supervised Learning-based AVSE (SSL-AVSE). The proposed SSL-AVSE combines visual cues, such as lip and mouth movements, from the target speakers with corresponding audio signals. The contextually combined audio and visual data are then fed into a Transformer-based SSL AV-HuBERT model to extract features, which are further processed using a BLSTM-based SE model. The results demonstrate several key findings. Firstly, SSL-AVSE successfully overcomes the issue of limited data by leveraging the AV-HuBERT model. Secondly, by fine-tuning the AV-HuBERT model parameters for the target SE task, significant performance improvements are achieved. Specifically, there is a notable enhancement in PESQ (Perceptual Evaluation of Speech Quality) from 1.43 to 1.67 and in STOI (Short-Time Objective Intelligibility) from 0.70 to 0.74. Furthermore, the performance of the SSL-AVSE was evaluated using CI vocoded speech to assess the intelligibility for CI users. Comparative experimental outcomes reveal that in the presence of dynamic noises encountered during human conversations, SSL-AVSE exhibits a substantial improvement. The NCM (Normal Correlation Matrix) values indicate an increase of 26.5% to 87.2% compared to the noisy baseline.
Submitted 15 July, 2023;
originally announced July 2023.
-
Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning
Authors:
Guanlin Liu,
Lifeng Lai
Abstract:
Due to the broad range of applications of multi-agent reinforcement learning (MARL), understanding the effects of adversarial attacks against MARL model is essential for the safe applications of this model. Motivated by this, we investigate the impact of adversarial attacks on MARL. In the considered setup, there is an exogenous attacker who is able to modify the rewards before the agents receive them or manipulate the actions before the environment receives them. The attacker aims to guide each agent into a target policy or maximize the cumulative rewards under some specific reward function chosen by the attacker, while minimizing the amount of manipulation on feedback and action. We first show the limitations of the action poisoning only attacks and the reward poisoning only attacks. We then introduce a mixed attack strategy with both the action poisoning and the reward poisoning. We show that the mixed attack strategy can efficiently attack MARL agents even if the attacker has no prior information about the underlying environment and the agents' algorithms.
Submitted 14 July, 2023;
originally announced July 2023.
-
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Authors:
Guanlin Liu,
Zhihan Zhou,
Han Liu,
Lifeng Lai
Abstract:
Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance in the face of uncertainties. In this paper, we focus on action robust RL with probabilistic policy execution uncertainty, in which, instead of always carrying out the action specified by the policy, the agent will take the action specified by the policy with probability $1-ρ$ and an alternative adversarial action with probability $ρ$. We establish the existence of an optimal policy on the action robust MDPs with probabilistic policy execution uncertainty and provide the action robust Bellman optimality equation for its solution. Furthermore, we develop the Action Robust Reinforcement Learning with Certificates (ARRLC) algorithm, which achieves minimax optimal regret and sample complexity. Finally, we conduct numerical experiments to validate our approach's robustness, demonstrating that ARRLC outperforms non-robust RL algorithms and converges faster than the robust TD algorithm in the presence of action perturbations.
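The execution model in the abstract (policy action with probability $1-ρ$, adversarial action with probability $ρ$) can be sketched as below. This only illustrates the uncertainty model, not the ARRLC algorithm; the function name `execute_action`, the string-valued actions, and the choice of $ρ = 0.2$ are hypothetical.

```python
import random


def execute_action(policy_action, adversarial_action, rho, rng):
    """Return the action that is actually carried out.

    The policy's action is executed with probability 1 - rho and the
    adversarial alternative with probability rho.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this branch fires with probability rho.
    if rng.random() < rho:
        return adversarial_action
    return policy_action


# Empirical check: roughly a fraction rho of executed actions are adversarial.
rng = random.Random(0)
executed = [execute_action("left", "right", 0.2, rng) for _ in range(10_000)]
adversarial_fraction = executed.count("right") / len(executed)
```

Setting `rho=0.0` recovers ordinary execution, where the policy's action is always carried out.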
Submitted 20 July, 2023; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Probing new physics with polarized $τ$ and $Λ_c$ in quasielastic $ν_τ\!+\!n\!\to\! τ^-\!+\!Λ_c$ scattering process
Authors:
Ya-Ru Kong,
Li-Fen Lai,
Xin-Qiang Li,
Xin-Shuai Yan,
Ya-Dong Yang,
Dong-Hui Zheng
Abstract:
The absence of semitauonic decays of charmed hadrons makes the decay processes mediated by the quark-level $c\to d τ^+ ν_τ$ transition inadequate for probing a generic new physics (NP) with all kinds of Dirac structures. To fill in this gap, we consider in this paper the quasielastic neutrino scattering process $ν_τ+n\to τ^-+Λ_c$, and propose searching for NP through the polarizations of the $τ$ lepton and the $Λ_c$ baryon. In the framework of a general low-energy effective Lagrangian, we perform a comprehensive analysis of the (differential) cross sections and polarization vectors of the process both within the Standard Model and in various NP scenarios, and scrutinize possible NP signals. We also explore the influence on our findings due to the uncertainties and the different parametrizations of the $Λ_c \to N$ transition form factors, and show that they have become one of the major challenges to further constrain possible NP through the quasielastic scattering process.
Submitted 14 November, 2023; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Convergence of the momentum method for semialgebraic functions with locally Lipschitz gradients
Authors:
Cédric Josz,
Lexiao Lai,
Xiaopeng Li
Abstract:
We propose a new length formula that governs the iterates of the momentum method when minimizing differentiable semialgebraic functions with locally Lipschitz gradients. It enables us to establish local convergence, global convergence, and convergence to local minimizers without assuming global Lipschitz continuity of the gradient, coercivity, and a global growth condition, as is done in the literature. As a result, we provide the first convergence guarantee of the momentum method starting from arbitrary initial points when applied to principal component analysis, matrix sensing, and linear neural networks.
Submitted 7 January, 2024; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Many $p$-adic odd zeta values are irrational
Authors:
Li Lai,
Johannes Sprang
Abstract:
For any prime $p$ and $\varepsilon>0$ we prove that for any sufficiently large positive odd integer $s$ at least $(c_p-\varepsilon) \sqrt{\frac{s}{\log s}}$ of the $p$-adic zeta values $ζ_p(3),ζ_p(5),\dots,ζ_p(s)$ are irrational. The constant $c_p$ is positive and depends only on $p$. This result establishes a $p$-adic version of the elimination technique used by Fischler--Sprang--Zudilin and Lai--Yu to prove a similar result on classical zeta values. The main difficulty consists in proving the non-vanishing of the resulting linear forms. We overcome this problem by using a new irrationality criterion.
Submitted 17 June, 2023;
originally announced June 2023.
-
The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI
Authors:
Ahmed W. Moawad,
Anastasia Janas,
Ujjwal Baid,
Divya Ramakrishnan,
Rachit Saluja,
Nader Ashraf,
Leon Jekel,
Raisa Amiruddin,
Maruf Adewole,
Jake Albrecht,
Udunna Anazodo,
Sanjay Aneja,
Syed Muhammad Anwar,
Timothy Bergquist,
Evan Calabrese,
Veronica Chiang,
Verena Chung,
Gian Marco Conte,
Farouk Dako,
James Eddy,
Ivan Ezhov,
Ariana Familiar,
Keyvan Farahani,
Juan Eugenio Iglesias,
Zhifan Jiang
, et al. (206 additional authors not shown)
Abstract:
The translation of AI-generated brain metastases (BM) segmentation into clinical practice relies heavily on diverse, high-quality annotated medical imaging datasets. The BraTS-METS 2023 challenge has gained momentum for testing and benchmarking algorithms using rigorously annotated, internationally compiled real-world datasets. This study presents the results of the segmentation challenge and characterizes the challenging cases that impacted the performance of the winning algorithms. Untreated brain metastases on standard anatomic MRI sequences (T1, T2, FLAIR, T1PG) from eight contributed international datasets were annotated in a stepwise method: published UNET algorithms, student, neuroradiologist, final approver neuroradiologist. Segmentations were ranked based on lesion-wise Dice and Hausdorff distance (HD95) scores. False positives (FP) and false negatives (FN) were rigorously penalized, receiving a score of 0 for Dice and a fixed penalty of 374 for HD95. Eight datasets comprising 1303 studies were annotated, with 402 studies (3076 lesions) released on Synapse as publicly available datasets to challenge competitors. Additionally, 31 studies (139 lesions) were held out for validation, and 59 studies (218 lesions) were used for testing. Segmentation accuracy was measured as rank across subjects, with the winning team achieving a LesionWise mean score of 7.9. Common errors among the leading teams included false negatives for small lesions and misregistration of masks in space. The BraTS-METS 2023 challenge successfully curated well-annotated, diverse datasets and identified common errors, facilitating the translation of BM segmentation across varied clinical environments and providing personalized volumetric reports to patients undergoing BM treatment.
Submitted 17 June, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Multi-factor Sequential Re-ranking with Perception-Aware Diversification
Authors:
Yue Xu,
Hao Chen,
Zefan Wang,
Jianwen Yin,
Qijie Shen,
Dimin Wang,
Feiran Huang,
Lixiang Lai,
Tao Zhuang,
Junfeng Ge,
Xia Hu
Abstract:
Feed recommendation systems, which recommend a sequence of items for users to browse and interact with, have gained significant popularity in practical applications. In feed products, users tend to browse a large number of items in succession, so the previously viewed items have a significant impact on users' behavior towards the following items. Therefore, traditional methods that mainly focus on improving the accuracy of recommended items are suboptimal for feed recommendations because they may recommend highly similar items. For feed recommendation, it is crucial to consider both the accuracy and diversity of the recommended item sequences in order to satisfy users' evolving interest when consecutively viewing items. To this end, this work proposes a general re-ranking framework named Multi-factor Sequential Re-ranking with Perception-Aware Diversification (MPAD) to jointly optimize accuracy and diversity for feed recommendation in a sequential manner. Specifically, MPAD first extracts users' different scales of interests from their behavior sequences through graph clustering-based aggregations. Then, MPAD proposes two sub-models to respectively evaluate the accuracy and diversity of a given item by capturing users' evolving interest due to the ever-changing context and users' personal perception of diversity from an item sequence perspective. This is consistent with the browsing nature of the feed scenario. Finally, MPAD generates the return list by sequentially selecting optimal items from the candidate set to maximize the joint benefits of accuracy and diversity of the entire list. MPAD has been implemented in Taobao's homepage feed to serve the main traffic and provide services to recommend billions of items to hundreds of millions of users every day.
Submitted 21 May, 2023;
originally announced May 2023.
-
Multi-channel Integrated Recommendation with Exposure Constraints
Authors:
Yue Xu,
Qijie Shen,
Jianwen Yin,
Zengde Deng,
Dimin Wang,
Hao Chen,
Lixiang Lai,
Tao Zhuang,
Junfeng Ge
Abstract:
Integrated recommendation, which aims at jointly recommending heterogeneous items from different channels in a main feed, has been widely applied to various online platforms. Though attractive, integrated recommendation requires the ranking methods to migrate from conventional user-item models to the new user-channel-item paradigm in order to better capture users' preferences on both item and channel levels. Moreover, practical feed recommendation systems usually impose exposure constraints on different channels to ensure user experience. This leads to greater difficulty in the joint ranking of heterogeneous items. In this paper, we investigate the integrated recommendation task with exposure constraints in practical recommender systems. Our contribution is four-fold. First, we formulate this task as a binary online linear programming problem and propose a two-layer framework named Multi-channel Integrated Recommendation with Exposure Constraints (MIREC) to obtain the optimal solution. Second, we propose an efficient online allocation algorithm to determine the optimal exposure assignment of different channels from a global view of all user requests over the entire time horizon. We prove that this algorithm reaches the optimal point under a regret bound of $\mathcal{O}(\sqrt{T})$ with linear complexity. Third, we propose a series of collaborative models to determine the optimal layout of heterogeneous items at each user request. The joint modeling of user interests, cross-channel correlation, and page context in our models aligns more with the browsing nature of feed products than existing models. Finally, we conduct extensive experiments on both offline datasets and online A/B tests to verify the effectiveness of MIREC. The proposed framework has now been implemented on the homepage of Taobao to serve the main traffic.
Submitted 20 May, 2023;
originally announced May 2023.
-
Comparing Machines and Children: Using Developmental Psychology Experiments to Assess the Strengths and Weaknesses of LaMDA Responses
Authors:
Eliza Kosoy,
Emily Rose Reagan,
Leslie Lai,
Alison Gopnik,
Danielle Krettek Cobb
Abstract:
Developmental psychologists have spent decades devising experiments to test the intelligence and knowledge of infants and children, tracing the origin of crucial concepts and capacities. Moreover, experimental techniques in developmental psychology have been carefully designed to discriminate the cognitive capacities that underlie particular behaviors. We propose that using classical experiments from child development is a particularly effective way to probe the computational abilities of AI models, in general, and LLMs in particular. First, the methodological techniques of developmental psychology, such as the use of novel stimuli to control for past experience or control conditions to determine whether children are using simple associations, can be equally helpful for assessing the capacities of LLMs. In parallel, testing LLMs in this way can tell us whether the information that is encoded in text is sufficient to enable particular responses, or whether those responses depend on other kinds of information, such as information from exploration of the physical world. In this work we adapt classical developmental experiments to evaluate the capabilities of LaMDA, a large language model from Google. We propose a novel LLM Response Score (LRS) metric which can be used to evaluate other language models, such as GPT. We find that LaMDA generates appropriate responses that are similar to those of children in experiments involving social understanding, perhaps providing evidence that knowledge of these domains is discovered through language. On the other hand, LaMDA's responses in early object and action understanding, theory of mind, and especially causal reasoning tasks are very different from those of young children, perhaps showing that these domains require more real-world, self-initiated exploration and cannot simply be learned from patterns in language input.
Submitted 7 November, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
A Novel Reward Shaping Function for Single-Player Mahjong
Authors:
Kai Jun Chen,
Lok Him Lai,
Zi Iun Lai
Abstract:
Mahjong is a complex game with an intractably large state space and extremely sparse rewards, which poses challenges for developing an agent to play Mahjong. To overcome this, the ShangTing function was adopted as a reward shaping function. This was combined with a forward-search algorithm to create an agent capable of completing a winning hand in Single-player Mahjong (an average of 35 actions over 10,000 games). To increase performance, we propose a novel bonus reward shaping function, which assigns higher relative values to synergistic Mahjong hands. In a simulated 1-v-1 battle, usage of the new reward function outperformed the default ShangTing function, winning an average of $1.37 over 1000 games.
Submitted 6 May, 2023;
originally announced May 2023.
-
On the irrationality of certain $2$-adic zeta values
Authors:
Li Lai
Abstract:
Let $ζ_2(\cdot)$ be the Kubota-Leopoldt $2$-adic zeta function. We prove that, for every nonnegative integer $s$, there exists an odd integer $j$ in the interval $[s+3,3s+5]$ such that $ζ_2(j)$ is irrational. In particular, at least one of $ζ_2(7),ζ_2(9),ζ_2(11),ζ_2(13)$ is irrational.
Our approach is inspired by the recent work of Sprang. We construct explicit rational functions. The Volkenborn integrals of these rational functions' (higher-order) derivatives produce good linear combinations of $1$ and $2$-adic Hurwitz zeta values. The most difficult step is proving that certain Volkenborn integrals are nonzero, which is resolved by carefully manipulating the binomial coefficients.
Submitted 3 April, 2023;
originally announced April 2023.
-
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
Authors:
Maximilian Lam,
Jeff Johnson,
Wenjie Xiong,
Kiwan Maeng,
Udit Gupta,
Yang Li,
Liangzhen Lai,
Ilias Leontiadis,
Minsoo Rhu,
Hsien-Hsin S. Lee,
Vijay Janapa Reddi,
Gu-Yeon Wei,
David Brooks,
G. Edward Suh
Abstract:
On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the order of 1-10 GBs of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than $20 \times$ over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over $5 \times$ additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to $100,000$ queries per second -- a $>100 \times$ throughput improvement over a CPU-based baseline -- while maintaining model accuracy.
Submitted 25 September, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Synthesis-driven design of 3D molecules for structure-based drug discovery using geometric transformers
Authors:
Yibo Li,
Jianfeng Pei,
Luhua Lai
Abstract:
Finding drug-like compounds with high bioactivity is essential for drug discovery, but the task is complicated by the high cost of chemical synthesis and validation. With their outstanding performance in de novo drug design, deep generative models represent promising tools for tackling this challenge. In recent years, 3D molecule generative models have gained increasing attention due to their ability to directly utilize the 3D interaction information between the target and ligand. However, it remains challenging to synthesize the molecules generated by these models, limiting the speed of bioactivity validation and further structure optimization. In this work, we propose DeepLigBuilder+, a deep generative model for 3D molecules that combines structure-based de novo drug design with a reaction-based generation framework. Besides producing 3D molecular structures, the model also proposes synthetic pathways for generated molecules, which greatly assists the retro-synthetic analysis. To achieve this, we developed a new way to enforce the synthesizability constraint using a tree-based organization of purchasable building blocks. This method enjoys high scalability and is compatible with existing atom-based generative models. Additionally, for structure-based design tasks, we developed an SE(3)-equivariant transformer conditioned on the shape and pharmacophore-based inputs, and combined it with Monte Carlo tree search. Using the ATP-binding pocket of BTK and the NAD+ binding pocket of PHGDH for case studies, we demonstrate that DeepLigBuilder+ is capable of enriching drug-like molecules with high predicted binding affinity and desirable interaction modes while maintaining the synthesizability constraint. We believe that DeepLigBuilder+ is a powerful tool for accelerating the process of drug discovery.
Submitted 31 December, 2022;
originally announced January 2023.
-
DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads
Authors:
Seah Kim,
Hyoukjun Kwon,
Jinook Song,
Jihyuck Jo,
Yu-Hsin Chen,
Liangzhen Lai,
Vikas Chandra
Abstract:
Emerging real-time multi-model ML (RTMM) workloads such as AR/VR and drone control involve dynamic behaviors at various granularities: task, model, and layers within a model. Such dynamic behaviors introduce new challenges to the system software in an ML system since the overall system load is not completely predictable, unlike traditional ML workloads. In addition, RTMM workloads require real-time processing, involve highly heterogeneous models, and target resource-constrained devices. Under such circumstances, developing an effective scheduler gains more importance to better utilize underlying hardware considering the unique characteristics of RTMM workloads. Therefore, we propose a new scheduler, DREAM, which effectively handles various dynamicity in RTMM workloads targeting multi-accelerator systems. DREAM quantifies the unique requirements for RTMM workloads and utilizes the quantified scores to drive scheduling decisions, considering the current system load and other inference jobs on different models and input frames. DREAM utilizes tunable parameters that provide fast and effective adaptivity to dynamic workload changes. In our evaluation of five scenarios of RTMM workload, DREAM reduces the overall UXCost, which is an equivalent metric of the energy-delay product (EDP) for RTMM defined in the paper, by 32.2% and 50.0% in the geometric mean (up to 80.8% and 97.6%) compared to state-of-the-art baselines, which shows the efficacy of our scheduling methodology.
Submitted 20 September, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Sufficient conditions for instability of the subgradient method with constant step size
Authors:
Cédric Josz,
Lexiao Lai
Abstract:
We provide sufficient conditions for instability of the subgradient method with constant step size around a local minimum of a locally Lipschitz semi-algebraic function. They are satisfied by several spurious local minima arising in robust principal component analysis and neural networks.
Submitted 29 June, 2023; v1 submitted 27 November, 2022;
originally announced November 2022.
-
Lyapunov stability of the subgradient method with constant step size
Authors:
Cédric Josz,
Lexiao Lai
Abstract:
We consider the subgradient method with constant step size for minimizing locally Lipschitz semi-algebraic functions. In order to analyze the behavior of its iterates in the vicinity of a local minimum, we introduce a notion of discrete Lyapunov stability and propose necessary and sufficient conditions for stability.
Submitted 6 March, 2023; v1 submitted 27 November, 2022;
originally announced November 2022.
-
Nonsmooth rank-one matrix factorization landscape
Authors:
Cédric Josz,
Lexiao Lai
Abstract:
We provide the first positive result on the nonsmooth optimization landscape of robust principal component analysis, to the best of our knowledge. It is the object of several conjectures and remains mostly uncharted territory. We identify a necessary and sufficient condition for the absence of spurious local minima in the rank-one case. Our proof exploits the subdifferential regularity of the objective function in order to eliminate the existence quantifier from the first-order optimality condition known as Fermat's rule.
Submitted 27 November, 2022;
originally announced November 2022.
-
XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse
Authors:
Hyoukjun Kwon,
Krishnakumar Nair,
Jamin Seo,
Jason Yik,
Debabrata Mohapatra,
Dongyuan Zhan,
Jinook Song,
Peter Capak,
Peizhao Zhang,
Peter Vajda,
Colby Banbury,
Mark Mazumder,
Liangzhen Lai,
Ashish Sirasao,
Tushar Krishna,
Harshit Khaitan,
Vikas Chandra,
Vijay Janapa Reddi
Abstract:
Real-time multi-task multi-model (MTMM) workloads, a new form of deep learning inference workloads, are emerging for application areas like extended reality (XR) to support metaverse use cases. These workloads combine user interactivity with computationally complex machine learning (ML) activities. Compared to standard ML applications, these ML workloads present unique difficulties and constraints. Real-time MTMM workloads impose heterogeneity and concurrency requirements on future ML systems and devices, necessitating the development of new capabilities. This paper begins with a discussion of the various characteristics of these real-time MTMM ML workloads and presents an ontology for evaluating the performance of future ML hardware for XR systems. Next, we present XRBench, a collection of MTMM ML tasks, models, and usage scenarios that execute these models in three representative ways: cascaded, concurrent, and cascaded-concurrent for XR use cases. Finally, we emphasize the need for new metrics that capture the requirements properly. We hope that our work will stimulate research and lead to the development of a new generation of ML systems for XR use cases. XRBench is available as an open-source project: https://github.com/XRBench
Submitted 19 May, 2023; v1 submitted 16 November, 2022;
originally announced November 2022.