Search | arXiv e-print repository

arXiv:2407.02031 [pdf, other]

SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generating images for commercial applications. Despite their efficacy, these add-on modules incur high loading overhead, prolong the serving latency, and swallow up expensive GPU resources. Driven by our characterization study, we present SwiftDiffusion, a system that efficiently generates high-quality images using stable diffusion models and add-on modules. To achieve this, SwiftDiffusion reconstructs the existing text-to-image serving workflow by identifying the opportunities for parallel computation and distributing ControlNet computations across multiple GPUs. Further, SwiftDiffusion thoroughly analyzes the dynamics of image generation and develops techniques to eliminate the overhead associated with LoRA loading and patching while preserving the image quality. Last, SwiftDiffusion proposes specialized optimizations in the backbone architecture of the stable diffusion models, which are also compatible with the efficient serving of add-on modules. Compared to state-of-the-art text-to-image serving systems, SwiftDiffusion reduces serving latency by up to 5x and improves serving throughput by up to 2x without compromising image quality.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.18518 [pdf, other]

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong

Abstract: The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each data in our dataset is verified through three hierarchical stages: format checking, actual function executions, and semantic verification, ensuring its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agent domains. The dataset is available on Huggingface: https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k and the project homepage: https://apigen-pipeline.github.io/

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2405.19878 [pdf, other]

Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models

Authors: Zeyu Fang, Tian Lan

Abstract: Generative models such as diffusion have been employed as world models in offline reinforcement learning to generate synthetic data for more effective learning. Existing work either generates diffusion models one-time prior to training or requires additional interaction data to update it. In this paper, we propose a novel approach for offline reinforcement learning with closed-loop policy evaluation and world-model adaptation. It iteratively leverages a guided diffusion world model to directly evaluate the offline target policy with actions drawn from it, and then performs an importance-sampled world model update to adaptively align the world model with the updated policy. We analyzed the performance of the proposed method and provided an upper bound on the return gap between our method and the real environment under an optimal policy. The result sheds light on various factors affecting learning performance. Evaluations in the D4RL environment show significant improvement over state-of-the-art baselines, especially when only random or medium-expertise demonstrations are available -- thus requiring improved alignment between the world model and offline policy evaluation.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.16657 [pdf, other]

ELG Spectroscopic Systematics Analysis of the DESI Data Release 1

Authors: Jiaxi Yu, Ashley J. Ross, Antoine Rocher, Otávio Alves, Arnaud de Mattia, Daniel Forero-Sánchez, Jean-Paul Kneib, Alex Krolewski, TingWen Lan, Michael Rashkovetskyi, Jessica Nicole Aguilar, Steven Ahlen, Stephen Bailey, David Brooks, Edmond Chaussidon, Todd Claybaugh, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kevin Fanning, Jaime E. Forero-Romero, Enrique Gaztañaga, Satya Gontcho A Gontcho, Klaus Honscheid, et al. (36 additional authors not shown)

Abstract: Dark Energy Spectroscopic Instrument (DESI) uses more than 2.4 million Emission Line Galaxies (ELGs) for 3D large-scale structure (LSS) analyses in its Data Release 1 (DR1). Such large statistics enable thorough research on systematic uncertainties. In this study, we focus on spectroscopic systematics of ELGs. The redshift success rate ($f_{\rm goodz}$) is the relative fraction of secure redshifts among all measurements. It depends on observing conditions, and thus introduces non-cosmological variations to the LSS. We, therefore, develop the redshift failure weight ($w_{\rm zfail}$) and a per-fibre correction ($η_{\rm zfail}$) to mitigate these dependences. They have minor influences on the galaxy clustering. For ELGs with a secure redshift, there are two subtypes of systematics: 1) catastrophics (large) that only occur in a few samples; 2) redshift uncertainty (small) that exists for all samples. The catastrophics represent 0.26\% of the total DR1 ELGs, composed of the confusion between O\,\textsc{ii} and sky residuals, double objects, total catastrophics and others. We simulate the realistic 0.26\% catastrophics of DR1 ELGs, the hypothetical 1\% catastrophics, and the truncation of the contaminated $1.31<z<1.33$ in the \textsc{AbacusSummit} ELG mocks. Their $P_\ell$ show non-negligible bias from the uncontaminated mocks. But their influences on the redshift space distortions (RSD) parameters are smaller than $0.2σ$. The redshift uncertainty of Y1 ELGs is 8.5 km/s with a Lorentzian profile.
The code for implementing the catastrophics and redshift uncertainty on mocks can be found in https://github.com/Jiaxi-Yu/modelling_spectro_sys.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16386 [pdf, other]

Variational Offline Multi-agent Skill Discovery

Authors: Jiayu Chen, Bhargav Ganguly, Tian Lan, Vaneet Aggarwal

Abstract: Skills are effective temporal abstractions established for sequential decision making tasks, which enable efficient hierarchical learning for long-horizon tasks and facilitate multi-task learning through their transferability. Despite extensive research, research gaps remain in multi-agent scenarios, particularly for automatically extracting subgroup coordination patterns in a multi-agent task. In this case, we propose two novel auto-encoder schemes: VO-MASD-3D and VO-MASD-Hier, to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills, which firstly solves the aforementioned challenge. An essential algorithm component of these schemes is a dynamic grouping function that can automatically detect latent subgroups based on agent interactions in a task. Notably, our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing methods regarding applying skills in multi-agent reinforcement learning (MARL). Moreover, skills discovered using our method can effectively reduce the learning difficulty in MARL scenarios with delayed and sparse reward signals.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.14122 [pdf, other]

Modeling Other Players with Bayesian Beliefs for Games with Incomplete Information

Authors: Zuyuan Zhang, Mahdi Imani, Tian Lan

Abstract: Bayesian games model interactive decision-making where players have incomplete information -- e.g., regarding payoffs and private data on players' strategies and preferences -- and must actively reason and update their belief models (with regard to such information) using observation and interaction history. Existing work on counterfactual regret minimization has shown great success for games with complete or imperfect information, but not for Bayesian games. To this end, we introduce a new CFR algorithm, Bayesian-CFR, and analyze its regret bound with respect to Bayesian Nash Equilibria in Bayesian games. First, we present a method for updating the posterior distribution of beliefs about the game and other players' types. The method uses a kernel-density estimate and is shown to converge to the true distribution. Second, we define Bayesian regret and present a Bayesian-CFR minimization algorithm for computing the Bayesian Nash equilibrium. Finally, we extend this new approach to other existing algorithms, such as Bayesian-CFR+ and Deep Bayesian CFR. Experimental results show that our proposed solutions significantly outperform existing methods in classical Texas Hold'em games.

Submitted 22 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2105.08440 by other authors

arXiv:2405.13748 [pdf, other]

Monocular Gaussian SLAM with Language Extended Loop Closure

Authors: Tian Lan, Qinwei Lin, Haoqian Wang

Abstract: Recently, 3D Gaussian Splatting has shown great potential in visual Simultaneous Localization And Mapping (SLAM). Existing methods have achieved encouraging results on RGB-D SLAM, but studies of the monocular case are still scarce. Moreover, they also fail to correct drift errors due to the lack of loop closure and global optimization. In this paper, we present MG-SLAM, a monocular Gaussian SLAM with a language-extended loop closure module capable of performing drift-corrected tracking and high-fidelity reconstruction while achieving a high-level understanding of the environment. Our key idea is to represent the global map as 3D Gaussian and use it to guide the estimation of the scene geometry, thus mitigating the effects of missing depth information. Further, an additional language-extended loop closure module which is based on CLIP feature is designed to continually perform global optimization to correct drift errors accumulated as the system runs. Our system shows promising results on multiple challenging datasets in both tracking and mapping and even surpasses some existing RGB-D methods.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.08314 [pdf, other]

Probing the impact of radio-mode feedback on the properties of the cool circumgalactic medium

Authors: Yu-Ling Chang, Ting-Wen Lan, J. Xavier Prochaska, Lucas Napolitano, Abhijeet Anand, J. Aguilar, S. Ahlen, D. Brooks, T. Claybaugh, A. de la Macorra, Arjun Dey, P. Doel, S. Gontcho A Gontcho, J. Guy, S. Juneau, T. Kisner, A. Lambert, M. Landriau, L. Le Guillou, M. Manera, P. Martini, A. Meisner, R. Miquel, J. Moustakas, A. D. Myers, et al. (11 additional authors not shown)

Abstract: We explore the influence of radio-mode feedback on the properties of the cool circumgalactic medium (CGM). To this end, we assemble a statistical sample of approximately 30,000 radio galaxies with background quasars by combining optical spectroscopic measurements of luminous red galaxies (LRGs) and quasars from the year 1 dataset of Dark Energy Spectroscopic Instrument (DESI) and radio sources from the LOw-Frequency ARray Two-metre Sky Survey (LoTSS) DR2 catalog and the Very Large Array Sky Survey (VLASS) quick look catalog. Galaxies with similar optical properties but with no radio counterparts in LoTSS and VLASS are selected as the control group. We measure the cool CGM properties of radio galaxies and their control samples traced by MgII absorption lines, including covering fraction, rest equivalent width, and gas kinematics. Our results show no significant difference in the properties of gas around radio galaxies and their control sample, indicating that the operating radio-mode feedback of massive galaxies does not produce detectable effects on the properties of the cool CGM. Finally, we show that the CGM of radio galaxies contains a non-negligible amount of cool gas with approximately 10^10 solar masses. This abundance can place a stringent constraint on the radio-mode feedback models.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 20 pages, 12 figures

arXiv:2405.03967 [pdf, other]

SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems

Authors: Kailash Gogineni, Sai Santosh Dayapule, Juan Gómez-Luna, Karthikeya Gogineni, Peng Wei, Tian Lan, Mohammad Sadrosadati, Onur Mutlu, Guru Venkataramani

Abstract: Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward signals from experience datasets. However, RL training often faces memory limitations, leading to execution latencies and prolonged training times. To overcome this, SwiftRL explores Processing-In-Memory (PIM) architectures to accelerate RL workloads. We achieve near-linear performance scaling by implementing RL algorithms like Tabular Q-learning and SARSA on UPMEM PIM systems and optimizing for hardware. Our experiments on OpenAI GYM environments using UPMEM hardware demonstrate superior performance compared to CPU and GPU implementations.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.13836 [pdf, other]

MultiFun-DAG: Multivariate Functional Directed Acyclic Graph

Authors: Tian Lan, Ziyue Li, Junpeng Lin, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Rui Zhao, Chen Zhang

Abstract: Directed Acyclic Graphical (DAG) models efficiently formulate causal relationships in complex systems. Traditional DAGs assume nodes to be scalar variables, characterizing complex systems under a facile and oversimplified form. This paper considers that nodes can be multivariate functional data and thus proposes a multivariate functional DAG (MultiFun-DAG). It constructs a hidden bilinear multivariate function-to-function regression to describe the causal relationships between different nodes. Then an Expectation-Maximum algorithm is used to learn the graph structure as a score-based algorithm with acyclic constraints. Theoretical properties are diligently derived. Prudent numerical studies and a case study from urban traffic congestion analysis are conducted to show MultiFun-DAG's effectiveness.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.03002 [pdf, other]

DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, B. Bahr-Kalus, S. Bailey, C. Baltay, A. Bault, J. Behera, S. BenZvi, A. Bera, F. Beutler, D. Bianchi, C. Blake, R. Blum, et al. (178 additional authors not shown)

Abstract: We present cosmological results from the measurement of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$α$ forest tracers from the first year of observations from the Dark Energy Spectroscopic Instrument (DESI), to be released in the DESI Data Release 1. DESI BAO provide robust measurements of the transverse comoving distance and Hubble rate, or their combination, relative to the sound horizon, in seven redshift bins from over 6 million extragalactic objects in the redshift range $0.1<z<4.2$. DESI BAO data alone are consistent with the standard flat $Λ$CDM cosmological model with a matter density $Ω_\mathrm{m}=0.295\pm 0.015$. Paired with a BBN prior and the robustly measured acoustic angular scale from the CMB, DESI requires $H_0=(68.52\pm0.62)$ km/s/Mpc. In conjunction with CMB anisotropies from Planck and CMB lensing data from Planck and ACT, we find $Ω_\mathrm{m}=0.307\pm 0.005$ and $H_0=(67.97\pm0.38)$ km/s/Mpc. Extending the baseline model with a constant dark energy equation of state parameter $w$, DESI BAO alone require $w=-0.99^{+0.15}_{-0.13}$. In models with a time-varying dark energy equation of state parametrized by $w_0$ and $w_a$, combinations of DESI with CMB or with SN~Ia individually prefer $w_0>-1$ and $w_a<0$. This preference is 2.6$σ$ for the DESI+CMB combination, and persists or grows when SN~Ia are added in, giving results discrepant with the $Λ$CDM model at the $2.5σ$, $3.5σ$ or $3.9σ$ levels for the addition of Pantheon+, Union3, or DES-SN5YR datasets respectively.
For the flat $Λ$CDM model with the sum of neutrino mass $\sum m_ν$ free, combining the DESI and CMB data yields an upper limit $\sum m_ν< 0.072$ $(0.113)$ eV at 95% confidence for a $\sum m_ν>0$ $(\sum m_ν>0.059)$ eV prior. These neutrino-mass constraints are substantially relaxed in models beyond $Λ$CDM. [Abridged.]

Submitted 24 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers). Typos corrected and a new figure and discussion added to Appendix A

arXiv:2404.03001 [pdf, other]

DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman Alpha Forest

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, S. Bailey, C. Baltay, A. Bault, J. Bautista, J. Behera, S. BenZvi, F. Beutler, D. Bianchi, C. Blake, R. Blum, S. Brieden, et al. (174 additional authors not shown)

Abstract: We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-$α$ (Ly$α$) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over $420\,000$ Ly$α$ forest spectra and their correlation with the spatial distribution of more than $700\,000$ quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon ($r_d$), we measure the expansion at $z_{\rm eff}=2.33$ with 2\% precision, $H(z_{\rm eff}) = (239.2 \pm 4.8) (147.09~{\rm Mpc} /r_d)$ km/s/Mpc. Similarly, we present a 2.4\% measurement of the transverse comoving distance to the same redshift, $D_M(z_{\rm eff}) = (5.84 \pm 0.14) (r_d/147.09~{\rm Mpc})$ Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.

Submitted 12 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers)

arXiv:2404.03000 [pdf, other]

DESI 2024 III: Baryon Acoustic Oscillations from Galaxies and Quasars

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, S. Bailey, C. Baltay, A. Bault, J. Behera, S. BenZvi, F. Beutler, D. Bianchi, C. Blake, R. Blum, S. Brieden, A. Brodzeller, et al. (171 additional authors not shown)

Abstract: We present the DESI 2024 galaxy and quasar baryon acoustic oscillations (BAO) measurements using over 5.7 million unique galaxy and quasar redshifts in the range 0.1<z<2.1. Divided by tracer type, we utilize 300,017 galaxies from the magnitude-limited Bright Galaxy Survey with 0.1<z<0.4, 2,138,600 Luminous Red Galaxies with 0.4<z<1.1, 2,432,022 Emission Line Galaxies with 0.8<z<1.6, and 856,652 quasars with 0.8<z<2.1, over a ~7,500 square degree footprint. The analysis was blinded at the catalog-level to avoid confirmation bias. All fiducial choices of the BAO fitting and reconstruction methodology, as well as the size of the systematic errors, were determined on the basis of the tests with mock catalogs and the blinded data catalogs. We present several improvements to the BAO analysis pipeline, including enhancing the BAO fitting and reconstruction methods in a more physically-motivated direction, and also present results using combinations of tracers. We present a re-analysis of SDSS BOSS and eBOSS results applying the improved DESI methodology and find scatter consistent with the level of the quoted SDSS theoretical systematic uncertainties. With the total effective survey volume of ~ 18 Gpc$^3$, the combined precision of the BAO measurements across the six different redshift bins is ~0.52%, marking a 1.2-fold improvement over the previous state-of-the-art results using only first-year data. We detect the BAO in all of these six redshift bins.
The highest significance of BAO detection is $9.1σ$ at the effective redshift of 0.93, with a constraint of 0.86% placed on the BAO scale. We find our measurements are systematically larger than the prediction of Planck-2018 LCDM model at z<0.8. We translate the results into transverse comoving distance and radial Hubble distance measurements, which are used to constrain cosmological models in our companion paper [abridged].

Submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers)

arXiv:2403.15341 [pdf, other]

Collaborative AI Teaming in Unknown Environments via Active Goal Deduction

Authors: Zuyuan Zhang, Hanhan Zhou, Mahdi Imani, Taeyoung Lee, Tian Lan

Abstract: With the advancements of artificial intelligence (AI), we're seeing more scenarios that require AI to work closely with other agents, whose goals and strategies might not be known beforehand. However, existing approaches for training collaborative agents often require defined and known reward signals and cannot address the problem of teaming with unknown agents that often have latent objectives/rewards. In response to this challenge, we propose a teaming with unknown agents framework, which leverages a kernel density Bayesian inverse learning method for active goal deduction and utilizes pre-trained, goal-conditioned policies to enable zero-shot policy adaptation. We prove that unbiased reward estimates in our framework are sufficient for optimal teaming with unknown agents. We further evaluate the framework on redesigned multi-agent particle and StarCraft II micromanagement environments with diverse unknown agents of different behaviors/rewards. Empirical results demonstrate that our framework significantly advances the teaming performance of AI and unknown agents in a wide range of collaborative scenarios.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.01954 [pdf, other]

DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation

Authors: Chen Xu, Tian Lan, Changlong Yu, Wei Wang, Jun Gao, Yu Ji, Qunxi Dong, Kun Qian, Piji Li, Wei Bi, Bin Hu

Abstract: Constrained decoding approaches aim to control the meaning or style of text generated by a Pre-trained Language Model (PLM) using specific target words during inference. However, these methods often guide plausible continuations by greedily selecting targets, which, while completing the task, may disrupt the natural patterns of human language generation. In this work, we propose a novel decoding framework, DECIDER, which enables us to program rules on how we complete tasks to control a PLM. Differing from previous work, our framework transforms the encouragement of target words into the encouragement of all words that satisfy the rule. Specifically, DECIDER is a dual system where a PLM is equipped with a First-Order Logic (FOL) reasoner to express and evaluate the rules, and a decision function to merge the outputs from both systems to steer the generation. Experiments on CommonGen and PersonaChat demonstrate that DECIDER can effectively follow given rules to achieve generation tasks in a more human-like manner.

Submitted 7 July, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: Submitted to IEEE TKDE (Major Revision), 13 pages, 6 figures

arXiv:2403.01890 [pdf, other]

Aerial Tensile Perching and Disentangling Mechanism for Long-Term Environmental Monitoring

Authors: Tian Lan, Luca Romanello, Mirko Kovac, Sophie F. Armanini, Basaran Bahadir Kocer

Abstract: Aerial robots show significant potential for forest canopy research and environmental monitoring by providing data collection capabilities at high spatial and temporal resolutions. However, limited flight endurance hinders their application. Inspired by natural perching behaviours, we propose a multi-modal aerial robot system that integrates tensile perching for energy conservation and a suspended actuated pod for data collection. The system consists of a quadrotor drone, a slewing ring mechanism allowing 360° tether rotation, and a streamlined pod with two ducted propellers connected via a tether. Winding and unwinding the tether allows the pod to move within the canopy, and activating the propellers allows the tether to be wrapped around branches for perching or disentangling. We experimentally determined the minimum counterweights required for stable perching under various conditions. Building on this, we devised and evaluated multiple perching and disentangling strategies. Comparisons of perching and disentangling manoeuvres demonstrate energy savings that could be further maximized with the use of the pod or tether winding. These approaches can reduce energy consumption to only 22% and 1.5%, respectively, compared to a drone disentangling manoeuvre. We also calculated the minimum idle time required by the proposed system after the system perching and motor shut down to save energy on a mission, which is 48.9% of the operating time. Overall, the integrated system expands the operational capabilities and enhances the energy efficiency of aerial robots for long-term monitoring tasks.

Submitted 5 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 7 pages, 8 figures, Accepted in IEEE International Conference on Robotics and Automation (ICRA) 2024

arXiv:2403.01642 [pdf]

Blue and Green-Mode Energy-Efficient Chemiresistive Sensor Array Realized by Rapid Ensemble Learning

Authors: Zeheng Wang, James Cooper, Muhammad Usman, Timothy van der Laan

Abstract: The rapid advancement of the Internet of Things (IoT) necessitates the development of optimized Chemiresistive Sensor (CRS) arrays that are both energy-efficient and capable. This study introduces a novel optimization strategy that employs a rapid ensemble learning-based model committee approach to achieve these goals. Utilizing machine learning models such as Elastic Net Regression, Random Forests, and XGBoost, among others, the strategy identifies the most impactful sensors in a CRS array for accurate classification. A weighted voting mechanism is introduced to aggregate the models' opinions in sensor selection, thereby setting up two distinct working modes, termed "Blue" and "Green". The Blue mode operates with all sensors for maximum detection capability, while the Green mode selectively activates only key sensors, significantly reducing energy consumption without compromising detection accuracy. The strategy is validated through theoretical calculations and Monte Carlo simulations, demonstrating its effectiveness and accuracy. The proposed optimization strategy not only elevates the detection capability of CRS arrays but also brings it closer to theoretical limits, promising significant implications for the development of low-cost, easily fabricable next-generation IoT sensor terminals.

Submitted 3 March, 2024; originally announced March 2024.

Comments: First version before submission

arXiv:2403.01577 [pdf, ps, other]

Torus algebra and logical operators at low energy

Authors: Ying Chan, Tian Lan, Linqian Wu

Abstract: Given a modular tensor category $\mathscr{C}$, we construct an associative algebra $\mathrm{Tor}(\mathscr{C})$, which we call the torus algebra. We prove that the torus algebra is semisimple by explicitly constructing all the simple modules. Suppose that a topologically ordered phase described by $\mathscr{C}$ is put on a torus. Physically, each simple module over $\mathrm{Tor}(\mathscr{C})$ consists of the low energy states on the torus with one anyon excitation, or equivalently, the ground states on a punctured torus where the anyon is enclosed by the puncture. Elements in $\mathrm{Tor}(\mathscr{C})$ can be physically interpreted as anyon hopping processes on the torus. We give the precise formula for how an arbitrary logical operator on the low energy states on a torus can be realized by moving anyons on the torus. Our work thus provides a theoretical proposal that the low energy states on a torus can serve as topological qudits and one can arbitrarily manipulate them by moving anyons around.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 22 pages, 1 figure

arXiv:2402.19253 [pdf, ps, other]

Condensation Completion and Defects in 2+1D Topological Orders

Authors: Gen Yue, Longye Wang, Tian Lan

Abstract: We review the condensation completion of a modular tensor category, which yields a fusion 2-category of codimension-1 and higher defects in a $2+1$D topological order. We apply the condensation completion to the $2+1$D toric code model and a $\mathbbm Z_4$ chiral topological order. In both cases, we explicitly enumerate the $1$d and $0$d defects present in these topological orders, along with their fusion rules. We also discuss other applications of condensation completion: alternative interpretations of condensation completion of a braided fusion category; condensation completion of the category of symmetry charges and its correspondence to gapped phases with symmetry; for a topological order $\cC$, one can also find all gapped boundaries of the stacking of $\cC$ with its time-reversal conjugate through computing the condensation completion of $\cC$.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.15538 [pdf, other]

AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System

Authors: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese

Abstract: The booming success of LLMs initiates rapid development in LLM agents. Though the foundation of an LLM agent is the generative model, it is critical to devise the optimal reasoning strategies and agent architectures. Accordingly, LLM agent research advances from the simple chain-of-thought prompting to more complex ReAct and Reflection reasoning strategy; agent architecture also evolves from single agent generation to multi-agent conversation, as well as multi-LLM multi-agent group chat. However, with the existing intricate frameworks and libraries, creating and evaluating new reasoning strategies and agent architectures has become a complex challenge, which hinders research investigation into LLM agents. Thus, we open-source a new AI agent library, AgentLite, which simplifies this process by offering a lightweight, user-friendly platform for innovating LLM agent reasoning, architectures, and applications with ease. AgentLite is a task-oriented framework designed to enhance the ability of agents to break down tasks and facilitate the development of multi-agent systems. Furthermore, we introduce multiple practical applications developed with AgentLite to demonstrate its convenience and flexibility. Get started now at: https://github.com/SalesforceAIResearch/AgentLite.

Submitted 23 February, 2024; originally announced February 2024.

Comments: preprint. Library is available at https://github.com/SalesforceAIResearch/AgentLite

arXiv:2402.15506 [pdf, other]

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Authors: Jianguo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

Abstract: Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce AgentOhana as a comprehensive solution to address these challenges. AgentOhana aggregates agent trajectories from distinct environments, spanning a wide array of scenarios. It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training. Leveraging the data unification, our training pipeline maintains equilibrium across different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present xLAM-v0.1, a large action model tailored for AI agents, which demonstrates exceptional performance across various benchmarks. Begin the exploration at https://github.com/SalesforceAIResearch/xLAM.

Submitted 20 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: Add GitHub repo link at https://github.com/SalesforceAIResearch/xLAM and HuggingFace model link at https://huggingface.co/Salesforce/xLAM-v0.1-r

arXiv:2402.13777 [pdf, other]

Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

Authors: Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal

Abstract: Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline policy learning exhibits great potential, and numerous studies have explored this direction. However, this field still lacks a comprehensive review, so developments in its different branches have remained relatively independent. In this paper, we provide the first systematic review on the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are two main branches of offline policy learning and are widely-adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of algorithms in that field. Subsequent to the main content, we provide in-depth discussions on deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for the research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list on https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.

Submitted 25 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: We restructured the paper and added more discussion

arXiv:2402.13764 [pdf, other]

CriticBench: Evaluating Large Language Models as Critic

Authors: Tian Lan, Wenwei Zhang, Chen Xu, Heyan Huang, Dahua Lin, Kai Chen, Xian-ling Mao

Abstract: Critique ability is crucial in the scalable oversight and self-improvement of Large Language Models (LLMs). While many recent studies explore the critique ability of LLMs to judge and refine flaws in generations, how to comprehensively and reliably measure the critique abilities of LLMs is under-explored. This paper introduces CriticBench, a novel benchmark designed to comprehensively and reliably evaluate four key critique ability dimensions of LLMs: feedback, comparison, refinement and meta-feedback. CriticBench encompasses nine diverse tasks, each assessing the LLMs' ability to critique responses at varying levels of quality granularity. Our extensive evaluations of open-source and closed-source LLMs reveal intriguing relationships between the critique ability and tasks, response qualities, and model scales. Datasets, resources and evaluation toolkit for CriticBench will be publicly released at https://github.com/open-compass/CriticBench.

Submitted 22 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

arXiv:2402.12417 [pdf]

Predicting trucking accidents with truck drivers' safety climate perception across companies: A transfer learning approach

Authors: Kailai Sun, Tianxiang Lan, Say Hong Kam, Yang Miang Goh, Yueng-Hsiang Huang

Abstract: There is a rising interest in using artificial intelligence (AI)-powered safety analytics to predict accidents in the trucking industry. Companies may face the practical challenge, however, of not having enough data to develop good safety analytics models. Although pretrained models may offer a solution for such companies, existing safety research using transfer learning has mostly focused on computer vision and natural language processing, rather than accident analytics. To fill the above gap, we propose a pretrain-then-fine-tune transfer learning approach to help any company leverage other companies' data to develop AI models for a more accurate prediction of accident risk. We also develop SafeNet, a deep neural network algorithm for classification tasks suitable for accident prediction. Using the safety climate survey data from seven trucking companies with different data sizes, we show that our proposed approach results in better model performance compared to training the model from scratch using only the target company's data. We also show that for the transfer learning model to be effective, the pretrained model should be developed with larger datasets from diverse sources. The trucking industry may, thus, consider pooling safety analytics data from a wide range of companies to develop pretrained models and share them within the industry for better knowledge and resource transfer. The above contributions point to the promise of advanced safety analytics to make the industry safer and more sustainable.

Submitted 19 February, 2024; originally announced February 2024.

Comments: submitted to journal: accident analysis and prevention

arXiv:2402.10941 [pdf, other]

Text2Data: Low-Resource Data Generation with Textual Control

Authors: Shiyu Wang, Yihao Feng, Tian Lan, Ning Yu, Yu Bai, Ran Xu, Huan Wang, Caiming Xiong, Silvio Savarese

Abstract: Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines. Recognizing the importance of this interface, the machine learning community is investing considerable effort in generating data that is semantically coherent with textual instructions. While strides have been made in text-to-data generation spanning image editing, audio synthesis, video creation, and beyond, low-resource areas characterized by expensive annotations or complex data structures, such as molecules, motion dynamics, and time series, often lack textual labels. This deficiency impedes supervised learning, thereby constraining the application of advanced generative models for text-to-data tasks. In response to these challenges in the low-resource scenario, we propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model. Subsequently, it undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting. Comprehensive experiments demonstrate that Text2Data is able to achieve enhanced performance regarding controllability across various modalities, including molecules, motions and time series, when compared to existing baselines.

Submitted 7 February, 2024; originally announced February 2024.

Comments: We propose a method that can achieve text-to-data generation under low-resource situation

arXiv:2401.14604 [pdf, other]

doi 10.1088/1741-4326/ad39d9

Effects of Magnetic Helicity on 3D Equilibria and Self-Organized States in KTX Reversed Field Pinch

Authors: Ke Liu, Guodong Yu, Yuhua Huang, Wenzhe Mao, Yidong Xie, Xianyi Nie, Hong Li, Tao Lan, Jinlin Xie, Weixing Ding, Wandong Liu, Ge Zhuang, Caoxiang Zhu

Abstract: The RFP is a toroidal magnetic configuration in which plasmas can spontaneously transform into different self-organized states. Among various states, the QSH state has a dominant component for the magnetic field and significantly improves confinement. Many theoretical and experimental efforts have investigated the transitions among different states. This paper employs the MRxMHD model to study the properties of QSH and other states. The SPEC is used to compute MHD equilibria for the KTX. The toroidal volume of KTX is partitioned into two subvolumes by an internal transport barrier. The geometry of this barrier is adjusted to achieve force balance across the interface, ensuring that the plasma in each subvolume is force-free and that magnetic helicity is conserved. By varying the parameters, we generate distinct self-organized states in KTX. Our findings highlight the crucial role of magnetic helicity in shaping these states. In states with low magnetic helicity in both subvolumes, the plasma exhibits axisymmetric behavior. With increasing core helicity, the plasma gradually transforms from an axisymmetric state to a double-axis helical state and finally to a single-helical-axis state. Elevated core magnetic helicity leads to a more pronounced dominant mode of the boundary magnetic field and a reduced core magnetic shear. This is consistent with previous experimental and numerical results in other RFP devices. We find a linear relationship between the plasma current and helicity in different self-organized states. Our findings suggest that KTX may enter the QSH state when the toroidal current reaches 0.72 MA. This study demonstrates that the stellarator equilibrium code SPEC unveils crucial RFP equilibrium properties, rendering it applicable to a broad range of RFP devices and other toroidal configurations.

Submitted 6 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.14544 [pdf, other]

Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data

Authors: Yongsheng Mei, Mahdi Imani, Tian Lan

Abstract: Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel maximum a posteriori inference of Gaussian Cox processes. It leverages the Laplace approximation and change of kernel technique to transform the problem into a new reproducing kernel Hilbert space, where it becomes more tractable computationally. It enables us to obtain both a functional posterior of the latent intensity function and the covariance of the posterior, thus extending existing works that often focus on specific link functions or estimating the posterior mean. Using the result, we propose a BO framework based on the Gaussian Cox process model and further develop a Nyström approximation for efficient computation. Extensive evaluations on various synthetic and real-world datasets demonstrate significant improvement over state-of-the-art inference solutions for Gaussian Cox processes, as well as effective BO with a wide range of acquisition functions designed through the underlying Gaussian Cox process model.

Submitted 25 January, 2024; originally announced January 2024.

Comments: 2024 International Conference on Learning Representations (ICLR)

arXiv:2312.15958 [pdf, other]

Category of SET orders

Authors: Tian Lan, Gen Yue, Longye Wang

Abstract: We propose the representation principle to study physical systems with a given symmetry. In the context of symmetry enriched topological orders, we give the appropriate representation category, the category of SET orders. For fusion n-category symmetries, we show that the category of SET orders encodes almost all information about the interplay between symmetry and topological orders, in a natural and canonical way. This information includes defects and boundaries of SET orders, symmetry charges, explicit and spontaneous symmetry breaking, stacking of SET orders, gauging of generalized symmetry, as well as quantum currents (SymTFT or symmetry TO). We also provide a detailed categorical algorithm to compute the generalized gauging. In particular, we prove that gauging is always reversible, as a special type of Morita-equivalence. The explicit data for ungauging, the inverse to gauging, is given.

Submitted 1 July, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: 21 pages, 8 figures, 1 table. Major revision: The perspective of representation principle is proposed, and the generalized gauging arises as a natural substructure of the category of SET orders

arXiv:2312.15947 [pdf, ps, other]

On a class of fusion 2-category symmetry: condensation completion of braided fusion category

Authors: Wenjie Xi, Tian Lan, Longye Wang, Chenjie Wang, Wei-Qiang Chen

Abstract: Recently, many studies have focused on generalized global symmetry, a mixture of both invertible and non-invertible symmetries in various space-time dimensions. The complete structure of generalized global symmetry is described by higher fusion category theory. In this paper, we first review the construction of fusion 2-category symmetry $Σ\cal B$ where $\cal B$ is a braided fusion category. In particular, we elaborate on the monoidal structure of $Σ\cal B$ which determines fusion rules and controls the dynamics of topological operators/defects. We then take $Σ\mathrm{sVec}$ as an example to demonstrate how we calculate the fusion rule, quantum dimension and 10j-symbol of the fusion 2-category. With our algorithm, all these data can be efficiently encoded and computed in a computer program. The complete program will be uploaded to github soon. Our work can be thought of as explicitly computing the representation theory of $\cal B$, in analogy to, for example, the representation theory of $SU(2)$. The choice of basis bimodule maps is in analogy to the Clebsch-Gordan coefficients and the 10j-symbol is in analogy to the 6j-symbol.

Submitted 10 May, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: 42 pages, 3 figures, All the 10j-symbols of $Σ\mathrm{sVec}$ and the complete computer program has been uploaded on github: https://github.com/WJXI/2sVec.git

arXiv:2312.15555 [pdf, other]

ConcaveQ: Non-Monotonic Value Function Factorization via Concave Representations in Deep Multi-Agent Reinforcement Learning

Authors: Huiqun Li, Hanhan Zhou, Yifei Zou, Dongxiao Yu, Tian Lan

Abstract: Value function factorization has achieved great success in multi-agent reinforcement learning by optimizing joint action-value functions through the maximization of factorized per-agent utilities. To ensure the Individual-Global-Maximum property, existing works often focus on value factorization using monotonic functions, which are known to result in restricted representation expressiveness. In this paper, we analyze the limitations of monotonic factorization and present ConcaveQ, a novel non-monotonic value function factorization approach that goes beyond monotonic mixing functions and employs neural network representations of concave mixing functions. Leveraging the concave property in factorization, an iterative action selection scheme is developed to obtain optimal joint actions during training. It is used to update agents' local policy networks, enabling fully decentralized execution. The effectiveness of the proposed ConcaveQ is validated across scenarios involving multi-agent predator-prey environment and StarCraft II micromanagement tasks. Empirical results exhibit significant improvement of ConcaveQ over state-of-the-art multi-agent reinforcement learning approaches.

Submitted 24 December, 2023; originally announced December 2023.

Comments: Accepted at AAAI 2024

Journal ref: AAAI 2024

arXiv:2312.11742 [pdf, other]

ACCL+: an FPGA-Based Collective Engine for Distributed Applications

Authors: Zhenhao He, Dario Korolija, Yu Zhu, Benjamin Ramhorst, Tristan Laan, Lucian Petrica, Michaela Blott, Gustavo Alonso

Abstract: FPGAs are increasingly prevalent in cloud deployments, serving as Smart NICs or network-attached accelerators. Despite their potential, developing distributed FPGA-accelerated applications remains cumbersome due to the lack of appropriate infrastructure and communication abstractions. To facilitate the development of distributed applications with FPGAs, in this paper we propose ACCL+, an open-source versatile FPGA-based collective communication library. Portable across different platforms and supporting UDP, TCP, as well as RDMA, ACCL+ empowers FPGA applications to initiate direct FPGA-to-FPGA collective communication. Additionally, it can serve as a collective offload engine for CPU applications, freeing the CPU from networking tasks. It is user-extensible, allowing new collectives to be implemented and deployed without having to re-synthesize the FPGA circuit. We evaluated ACCL+ on an FPGA cluster with 100 Gb/s networking, comparing its performance against software MPI over RDMA. The results demonstrate ACCL+'s significant advantages for FPGA-based distributed applications and highly competitive performance for CPU applications. We showcase ACCL+'s dual role with two use cases: seamlessly integrating as a collective offload engine to distribute CPU-based vector-matrix multiplication, and serving as a crucial and efficient component in designing fully FPGA-based distributed deep-learning recommendation inference.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.07696 [pdf, ps, other]

Real-time Network Intrusion Detection via Decision Transformers

Authors: Jingdi Chen, Hanhan Zhou, Yongsheng Mei, Gina Adam, Nathaniel D. Bastian, Tian Lan

Abstract: Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as a sequence modeling problem, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches like reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markovian property may not necessarily hold and the underlying network states are often not observable. In this paper, we cast the problem of real-time network intrusion detection as causal sequence modeling and draw upon the power of the transformer architecture for real-time decision-making. By conditioning a causal decision transformer on past trajectories, consisting of the rewards, network packets, and detection decisions, our proposed framework will generate future detection decisions to achieve the desired return. It enables decision transformers to be applied to real-time network intrusion detection, as well as a novel tradeoff between the accuracy and timeliness of detection. The proposed solution is evaluated on public network intrusion detection datasets and outperforms several baseline algorithms using reinforcement learning and sequence modeling, in terms of detection accuracy and timeliness.

Submitted 16 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.07060 [pdf, other]

Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning

Authors: Guangfeng Yan, Tan Li, Tian Lan, Kui Wu, Linqi Song

Abstract: Next-generation wireless networks, such as edge intelligence and wireless distributed learning, face two critical challenges: communication efficiency and privacy protection. In this work, our focus is on addressing these issues in a distributed learning framework. We consider a new approach that simultaneously achieves communication efficiency and privacy protection by exploiting the privacy advantage offered by quantization. Specifically, we use a quantization scheme called Gaussian Layered Randomized Quantization (Gau-LRQ) that compresses the raw model gradients using a layer multishift coupler. By adjusting the parameters of Gau-LRQ, we shape the quantization error to follow the expected Gaussian distribution, thus ensuring client-level differential privacy (CLDP). We demonstrate the effectiveness of our proposed Gau-LRQ in the distributed stochastic gradient descent (SGD) framework and theoretically quantify the trade-offs between communication, privacy, and convergence performance. We further improve the convergence performance by enabling dynamic private budget and quantization bit allocation. We achieve this by using an optimization formula that minimizes convergence error subject to the privacy budget constraint. We evaluate our approach on multiple datasets, including MNIST, CIFAR-10, and CIFAR-100, and show that our proposed method outperforms the baselines in terms of learning performance under various privacy constraints. Moreover, we observe that dynamic privacy allocation yields additional accuracy improvements for the models compared to the fixed scheme.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.02515 [pdf, other]

ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU

Authors: Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang

Abstract: Transformer-based large language models (LLMs) have demonstrated outstanding performance across diverse domains, particularly when fine-tuned for specific domains. Recent studies suggest that the resources required for fine-tuning LLMs can be economized through parameter-efficient methods such as Low-Rank Adaptation (LoRA). While LoRA effectively reduces computational burdens and resource demands, it currently supports only a single-job fine-tuning setup. In this paper, we present ASPEN, a high-throughput framework for fine-tuning LLMs. ASPEN efficiently trains multiple jobs on a single GPU using the LoRA method, leveraging a shared pre-trained model and adaptive scheduling. ASPEN is compatible with transformer-based language models such as LLaMA and ChatGLM. Experiments show that ASPEN saves 53% of GPU memory when training multiple LLaMA-7B models on an NVIDIA A100 80GB GPU and boosts training throughput by about 17% compared to existing methods when training with various pre-trained models on different GPUs. The adaptive scheduling algorithm reduces turnaround time by 24% and end-to-end training latency by 12%, while prioritizing jobs and preventing out-of-memory issues.

Submitted 5 December, 2023; originally announced December 2023.

Comments: 14 pages, 14 figures

arXiv:2311.17630 [pdf, other]

Optimization in Mobile Augmented Reality Systems for the Metaverse over Wireless Communications

Authors: Tianming Lan, Jun Zhao

Abstract: As the essential technical support for the Metaverse, Mobile Augmented Reality (MAR) has attracted the attention of many researchers. MAR applications rely on real-time processing of visual and audio data, and thus those heavy workloads can quickly drain the battery of a mobile device. To address this problem, edge-based solutions have appeared for handling some tasks that require more computing power. However, such strategies introduce a new trade-off: reducing the network latency and overall energy consumption requires limiting the size of the data sent to the edge server, which, in turn, results in lower accuracy. In this paper, we design an edge-based MAR system and propose a mathematical model to describe it and analyze the trade-off between latency, accuracy, server resource allocation and energy consumption. Furthermore, an algorithm named LEAO is proposed to solve this problem. We evaluate the performance of LEAO and other related algorithms across various simulation scenarios. The results demonstrate the superiority of the LEAO algorithm. Finally, our work provides insight into the optimization problem in edge-based MAR systems for the Metaverse.

Submitted 29 November, 2023; originally announced November 2023.

Comments: This paper appears in IEEE Global Communications Conference (GLOBECOM) 2023

arXiv:2311.16018 [pdf, other]

RIDE: Real-time Intrusion Detection via Explainable Machine Learning Implemented in a Memristor Hardware Architecture

Authors: Jingdi Chen, Lei Zhang, Joseph Riem, Gina Adam, Nathaniel D. Bastian, Tian Lan

Abstract: Deep Learning (DL) based methods have shown great promise in network intrusion detection by identifying malicious network traffic behavior patterns with high accuracy, but their applications to real-time, packet-level detections in high-speed communication networks are challenging due to the high computation time and resource requirements of Deep Neural Networks (DNNs), as well as lack of explainability. To this end, we propose a packet-level network intrusion detection solution that makes novel use of Recurrent Autoencoders to integrate an arbitrary-length sequence of packets into a more compact joint feature embedding, which is fed into a DNN-based classifier. To enable explainability and support real-time detections at micro-second speed, we further develop a Software-Hardware Co-Design approach to efficiently realize the proposed solution by converting the learned detection policies into decision trees and implementing them using an emerging architecture based on memristor devices. By jointly optimizing associated software and hardware constraints, we show that our approach leads to an extremely efficient, real-time solution with high detection accuracy at the packet level. Evaluation results on real-world datasets (e.g., UNSW and CIC-IDS datasets) demonstrate nearly three-nines detection accuracy with a substantial speedup of nearly four orders of magnitude.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.13235 [pdf]

Strong Light-Matter Coupling Facilitated Charge Carrier Transport in Cavity Organic Solar Cells

Authors: Yahui Tang, Alexandra Stuart, Timothy van der Laan, Girish Lakhwani

Abstract: Strong light-matter coupling has shown great potential for modifying the electro-optical properties of semiconducting materials in recent years. In the strong coupling regime, excitons and cavity photons form new states named exciton-polaritons, with their properties a hybrid of each constituent. Herein, we report strong coupling observed in solution-processed donor:acceptor bulk-heterojunction organic solar cells (OSCs) evidenced by the observed Rabi splitting of ~300 meV and the effects of strong coupling on OSC operations. Combining the transient photovoltage decay measurement and nanosecond transient absorption spectroscopy, our results reveal that the effective charge carrier lifetimes are longer in cavity devices, attributed to the reduced bimolecular recombination. It is also found that access to CT state(s) of higher energy is enabled in cavity devices. This study demonstrates that strong coupling can effectively modify the device- and photo-physics in OSCs and opens a new pathway for engineering more efficient OSCs.

Submitted 22 November, 2023; originally announced November 2023.

arXiv:2310.19841 [pdf]

An interpretable clustering approach to safety climate analysis: examining driver group distinction in safety climate perceptions

Authors: Kailai Sun, Tianxiang Lan, Yang Miang Goh, Sufiana Safiena, Yueng-Hsiang Huang, Bailey Lytle, Yimin He

Abstract: The transportation industry, particularly the trucking sector, is prone to workplace accidents and fatalities. Accidents involving large trucks accounted for a considerable percentage of overall traffic fatalities. Recognizing the crucial role of safety climate in accident prevention, researchers have sought to understand its factors and measure its impact within organizations. While existing data-driven safety climate studies have made remarkable progress, clustering employees based on their safety climate perception is innovative and has not been extensively utilized in research. Identifying clusters of drivers based on their safety climate perception allows the organization to profile its workforce and devise more impactful interventions. The lack of utilizing the clustering approach could be due to difficulties interpreting or explaining the factors influencing employees' cluster membership. Moreover, existing safety-related studies did not compare multiple clustering algorithms, resulting in potential bias. To address these issues, this study introduces an interpretable clustering approach for safety climate analysis. This study compares 5 algorithms for clustering truck drivers based on their safety climate perceptions. It proposes a novel method for quantitatively evaluating partial dependence plots (QPDP). To better interpret the clustering results, this study introduces different interpretable machine learning measures (SHAP, PFI, and QPDP). Drawing on data collected from more than 7,000 American truck drivers, this study significantly contributes to the scientific literature. It highlights the critical role of supervisory care promotion in distinguishing various driver groups. The Python code is available at https://github.com/NUS-DBE/truck-driver-safety-climate.

Submitted 30 October, 2023; originally announced October 2023.

Comments: Submitted to Journal: Accident Analysis and Prevention

arXiv:2310.10226 [pdf, other]

Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective

Authors: Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su

Abstract: There are a number of diverging hypotheses about the neural text degeneration problem, i.e., generating repetitive and dull loops, which makes this problem both interesting and confusing. In this work, we aim to advance our understanding by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data. Subsequent experiments also demonstrate that by selectively dropping out the attention to repetitive words in training data, degeneration can be significantly minimized. Furthermore, our empirical analysis illustrates that prior works addressing the degeneration issue from various standpoints, such as the high-inflow words, the likelihood objective, and the self-reinforcement phenomenon, can be interpreted by one simple explanation. That is, penalizing the repetitions in training data is a common and fundamental factor for their effectiveness. Moreover, our experiments reveal that penalizing the repetitions in training data remains critical even when considering larger model sizes and instruction tuning.

Submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted to NeurIPS 2023

arXiv:2310.08670 [pdf, other]

Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction

Authors: Hanhan Zhou, Tian Lan, Guru Venkataramani, Wenbo Ding

Abstract: Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks. Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly. Despite the empirical success, general theoretical guarantees of convergence on this method remain an open question. This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time. In particular, we prove that under certain sufficient conditions and for both IID and non-IID data, these algorithms converge to a stationary point of standard FL for general smooth cost functions. Moreover, we introduce the concept of minimum coverage index, together with model reduction noise, which will determine the convergence of heterogeneous federated learning, and therefore we advocate for a holistic approach that considers both factors to enhance the efficiency of heterogeneous federated learning.

Submitted 26 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2309.12606 [pdf, other]

Stable Reconstruction of Anisotropic Objects from Near-Field Electromagnetic Data

Authors: Tran H. Lan, Dinh-Liem Nguyen

Abstract: This paper addresses the electromagnetic inverse scattering problem of determining the location and shape of anisotropic objects from near-field data. We investigate both cases involving the Helmholtz equation and Maxwell's equations for this inverse problem. Our study focuses on developing efficient imaging functionals that enable a fast and stable recovery of the anisotropic object. The implementation of the imaging functionals is simple and avoids the need to solve an ill-posed problem. The resolution analysis of the imaging functionals is conducted using the Green representation formula. Furthermore, we establish stability estimates for these imaging functionals when noise is present in the data. To illustrate the effectiveness of the methods, we present numerical examples showcasing their performance.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 22 pages

arXiv:2309.04707 [pdf, other]

Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective

Authors: Muzhe Guo, Feixu Yu, Tian Lan, Fang Jin

Abstract: Reinforcement learning (RL) is a powerful tool for solving complex decision-making problems, but its lack of transparency and interpretability has been a major challenge in domains where decisions have significant real-world consequences. In this paper, we propose a novel Advantage Actor-Critic with Reasoner (A2CR), which can be easily applied to Actor-Critic-based RL models and make them interpretable. A2CR consists of three interconnected networks: the Policy Network, the Value Network, and the Reasoner Network. By predefining and classifying the underlying purpose of the actor's actions, A2CR automatically generates a more comprehensive and interpretable paradigm for understanding the agent's decision-making process. It offers a range of functionalities such as purpose-based saliency, early failure detection, and model supervision, thereby promoting responsible and trustworthy RL. Evaluations conducted in action-rich Super Mario Bros environments yield intriguing findings: Reasoner-predicted label proportions decrease for "Breakout" and increase for "Hovering" as the exploration level of the RL algorithm intensifies. Additionally, purpose-based saliencies are more focused and comprehensible.

Submitted 9 September, 2023; originally announced September 2023.

arXiv:2308.14897 [pdf, other]

Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

Authors: Hanhan Zhou, Tian Lan, Vaneet Aggarwal

Abstract: Offline reinforcement learning aims to utilize datasets of previously gathered environment-action interaction records to learn a policy without access to the real environment. Recent work has shown that offline reinforcement learning can be formulated as a sequence modeling problem and solved via supervised learning with approaches such as decision transformer. While these sequence-based methods achieve competitive results over return-to-go methods, especially on tasks that require longer episodes or with scarce rewards, importance sampling is not considered to correct the policy bias when dealing with off-policy data, mainly due to the absence of behavior policy and the use of deterministic evaluation policies. To this end, we propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation (DPE) in a unified framework with statistically proven properties on variance reduction. We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks. Our method improves the performance of selected methods and outperforms SOTA baselines in several tasks, demonstrating the advantages of enabling double policy estimation for sequence-modeled reinforcement learning.

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2308.03358 [pdf, other]

RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning

Authors: Jingdi Chen, Tian Lan, Carlee Joe-Wong

Abstract: Communication is crucial for solving cooperative Multi-Agent Reinforcement Learning tasks in partially observable Markov Decision Processes. Existing works often rely on black-box methods to encode local information/features into messages shared with other agents, leading to the generation of continuous messages with high communication overhead and poor interpretability. Prior attempts at discrete communication methods generate one-hot vectors trained as part of agents' actions and use the Gumbel softmax operation for calculating message gradients, which are all heuristic designs that do not provide any quantitative guarantees on the expected return. This paper establishes an upper bound on the return gap between an ideal policy with full observability and an optimal partially observable policy with discrete communication. This result enables us to recast multi-agent communication into a novel online clustering problem over the local observations at each agent, with messages as cluster labels and the upper bound on the return gap as clustering loss. To minimize the return gap, we propose the Return-Gap-Minimization Communication (RGMComm) algorithm, which is a surprisingly simple design of discrete message generation functions and is integrated with reinforcement learning through the utilization of a novel Regularized Information Maximization loss function, which incorporates cosine-distance as the clustering metric. Evaluations show that RGMComm significantly outperforms state-of-the-art multi-agent communication baselines and can achieve nearly optimal returns with few-bit messages that are naturally interpretable.

Submitted 18 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

arXiv:2308.00258 [pdf, other]

AQUILA: Communication Efficient Federated Learning with Adaptive Quantization in Device Selection Strategy

Authors: Zihao Zhao, Yuzhu Mao, Zhenpeng Shi, Yang Liu, Tian Lan, Wenbo Ding, Xiao-Ping Zhang

Abstract: The widespread adoption of Federated Learning (FL), a privacy-preserving distributed learning methodology, has been impeded by the challenge of high communication overheads, typically arising from the transmission of large-scale models. Existing adaptive quantization methods, designed to mitigate these overheads, operate under the impractical assumption of uniform device participation in every training round. Additionally, these methods are limited in their adaptability due to the necessity of manual quantization level selection and often overlook biases inherent in local devices' data, thereby affecting the robustness of the global model. In response, this paper introduces AQUILA (adaptive quantization in device selection strategy), a novel adaptive framework devised to effectively handle these issues, enhancing the efficiency and robustness of FL. AQUILA integrates a sophisticated device selection method that prioritizes the quality and usefulness of device updates. Utilizing the exact global model stored by devices, it enables a more precise device selection criterion, reduces model deviation, and limits the need for hyperparameter adjustments. Furthermore, AQUILA presents an innovative quantization criterion, optimized to improve communication efficiency while assuring model convergence. Our experiments demonstrate that AQUILA significantly decreases communication costs compared to existing methods, while maintaining comparable model performance across diverse non-homogeneous FL settings, such as Non-IID data and heterogeneous model architectures.

Submitted 4 October, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

arXiv:2307.11629 [pdf, other]

Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs

Authors: Jiayu Chen, Jingdi Chen, Tian Lan, Vaneet Aggarwal

Abstract: Covering skill (a.k.a., option) discovery has been developed to improve the exploration of RL in single-agent scenarios with sparse reward signals, through connecting the most distant states in the embedding space provided by the Fiedler vector of the state transition graph. Given that the joint state space grows exponentially with the number of agents in multi-agent systems, existing approaches that still rely on single-agent skill discovery either become prohibitive or fail to directly discover joint skills that improve the connectivity of the joint state space. In this paper, we propose multi-agent skill discovery which enables ease of decomposition. Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector using the Laplacian spectrum of individual agents' transition graphs. Further, considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we propose a deep learning extension of our method by estimating eigenfunctions through NN-based representation learning techniques. The evaluation on multi-agent tasks built with simulators like Mujoco shows that the proposed algorithm can successfully identify multi-agent skills and significantly outperforms the state-of-the-art. Code is available at: https://github.itap.purdue.edu/Clan-labs/Scalable_MAOD_via_KP.

Submitted 20 August, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: Accepted to NeurIPS 2022. arXiv admin note: substantial text overlap with arXiv:2201.08227

arXiv:2307.06962 [pdf, other]

Copy Is All You Need

Authors: Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao

Abstract: The dominant text generation models compose the output by sequentially selecting words from a fixed vocabulary. In this paper, we formulate text generation as progressively copying text segments (e.g., words or phrases) from an existing text collection. We compute the contextualized representations of meaningful text segments and index them using efficient vector search toolkits. The task of text generation is then decomposed into a series of copy-and-paste operations: at each time step, we seek suitable text spans from the text collection rather than selecting from a standalone vocabulary. Experiments on the standard language modeling benchmark (WikiText-103) show that our approach achieves better generation quality according to both automatic and human evaluations. Besides, its inference efficiency is comparable to token-level autoregressive models thanks to the reduction of decoding steps. We also show that our approach allows for effective domain adaptation by simply switching to domain-specific text collection without extra training. Finally, we observe that our approach attains additional performance gains by simply scaling up to larger text collections, again without further training. Our source code is publicly available at https://github.com/gmftbyGMFTBY/Copyisallyouneed.

Submitted 13 July, 2023; originally announced July 2023.

Journal ref: The Eleventh International Conference on Learning Representations (ICLR 2023)

arXiv:2307.02099 [pdf, other]

The Predictability of Stock Price: Empirical Study on Tick Data in Chinese Stock Market

Authors: Yueshan Chen, Xingyu Xu, Tian Lan, Sihai Zhang

Abstract: Whether or not stocks are predictable has been a topic of concern for decades. The efficient market hypothesis (EMH) says that it is difficult for investors to make extra profits by predicting stock prices, but this may not be true, especially for the Chinese stock market. Therefore, we explore the predictability of the Chinese stock market based on tick data, a widely studied type of high-frequency data. We obtain the predictability of 3,834 Chinese stocks by adopting the concept of true entropy, which is calculated by the Lempel-Ziv data compression method. The Markov chain model and the diffusion kernel model are used to compare the upper bounds on predictability, and it is concluded that there is still a significant performance gap between the forecasting models used and the theoretical upper bounds. Our work shows that more than 73% of stocks have prediction accuracy greater than 70% and RMSE less than 2 CNY under different quantification intervals with different models. We further use Spearman's correlation to reveal that the average stock price and price volatility may have a negative impact on prediction accuracy, which may be helpful for stock investors.

Submitted 5 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

arXiv:2306.17054 [pdf, other]

Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning

Authors: Chang-Lin Chen, Hanhan Zhou, Jiayu Chen, Mohammad Pedramfar, Vaneet Aggarwal, Tian Lan, Zheqing Zhu, Chi Zhou, Tim Gasser, Pol Mauri Ruiz, Vijay Menon, Neeraj Kumar, Hongbo Dong

Abstract: This paper addresses the important need for advanced techniques in continuously allocating workloads on shared infrastructures in data centers, a problem arising due to the growing popularity and scale of cloud computing. It particularly emphasizes the scarcity of research ensuring guaranteed capacity in capacity reservations during large-scale failures. To tackle these issues, the paper presents scalable solutions for resource management. It builds on the prior establishment of capacity reservation in cluster management systems and the two-level resource allocation problem addressed by the Resource Allowance System (RAS). Recognizing the limitations of Mixed Integer Linear Programming (MILP) for server assignment in a dynamic environment, this paper proposes the use of Deep Reinforcement Learning (DRL), which has been successful in achieving long-term optimal results for time-varying systems. A novel two-level design that utilizes a DRL-based algorithm is introduced to solve optimal server-to-reservation assignment, taking into account fault tolerance, server movement minimization, and network affinity requirements, due to the impracticality of directly applying DRL algorithms to large-scale instances with millions of decision variables. The paper explores the interconnection of these levels and the benefits of such an approach for achieving long-term optimal results in the context of large-scale cloud systems.
We further show in the experiment section that our two-level DRL approach outperforms the MIP solver and heuristic approaches and exhibits significantly reduced computation time compared to the MIP solver. Specifically, our two-level DRL approach performs 15% better than the MIP solver on minimizing the overall cost. Also, it uses only 26 seconds to execute 30 rounds of decision making, while the MIP solver needs nearly an hour.

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.14616 [pdf]

A Cu3BHT-Graphene van der Waals Heterostructure with Strong Interlayer Coupling

Authors: Zhiyong Wang, Shuai Fu, Wenjie Zhang, Baokun Liang, Tsai Jung Liu, Mike Hambsch, Jonas F. Pöhls, Yufeng Wu, Jianjun Zhang, Tianshu Lan, Xiaodong Li, Haoyuan Qi, Miroslav Polozij, Stefan C. B. Mannsfeld, Ute Kaiser, Mischa Bonn, R. Thomas Weitz, Thomas Heine, Stuart S. P. Parkin, Hai I Wang, Renhao Dong, Xinliang Feng

Abstract: Two-dimensional van der Waals heterostructures (2D vdWhs) are of significant interest due to their intriguing physical properties, which are critically defined by the constituent monolayers and their interlayer coupling. However, typical inorganic 2D vdWhs fall into the weakly coupled region, limiting the efficient interfacial charge flow crucial for developing high-performance quantum optoelectronics. Here, we demonstrate strong interlayer coupling in Cu3BHT (BHT = benzenehexathiol)-graphene vdWhs, an organic-inorganic bilayer characterized by prominent interlayer charge transfer. Monolayer Cu3BHT with a Kagome lattice is synthesized on the water surface and then coupled with graphene to produce a cm2-scale 2D vdWh. Spectroscopic and electrical studies, along with theoretical calculations, show significant hole transfer from monolayer Cu3BHT to graphene upon contact, a characteristic fingerprint of strong interlayer coupling. This study unveils the great potential of integrating highly pi-conjugated 2D coordination polymers (2DCPs) into 2D vdWhs to explore intriguing physical phenomena.

Submitted 26 June, 2023; originally announced June 2023.

Showing 1–50 of 255 results for author: Lan, T