Showing 1–50 of 123 results for author: Jiao, H

Searching in archive cs.

  1. arXiv:2406.18900  [pdf, other]

    cs.CY cs.AI

    The Rise of Artificial Intelligence in Educational Measurement: Opportunities and Ethical Challenges

    Authors: Okan Bulut, Maggie Beiting-Parrish, Jodi M. Casabianca, Sharon C. Slater, Hong Jiao, Dan Song, Christopher M. Ormerod, Deborah Gbemisola Fabiyi, Rodica Ivan, Cole Walsh, Oscar Rios, Joshua Wilson, Seyma N. Yildirim-Erbasli, Tarid Wongvorachan, Joyce Xinle Liu, Bin Tan, Polina Morilova

    Abstract: The integration of artificial intelligence (AI) in educational measurement has revolutionized assessment methods, enabling automated scoring, rapid content analysis, and personalized feedback through machine learning and natural language processing. These advancements provide timely, consistent feedback and valuable insights into student performance, thereby enhancing the assessment experience. Ho…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 59 pages, 3 figures, a joint work of the Special Interest Group on Artificial Intelligence in Measurement and Education (AIME) from the National Council of Measurement in Education (NCME)

  2. arXiv:2406.06443  [pdf, other]

    cs.LG cs.CL cs.CR

    LLM Dataset Inference: Did you train on my dataset?

    Authors: Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic

    Abstract: The proliferation of large language models (LLMs) in the real world has come with a rise in copyright cases against companies for training their models on unlicensed data from the internet. Recent works have presented methods to identify if individual text sequences were members of the model's training data, known as membership inference attacks (MIAs). We demonstrate that the apparent success of…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Code is available at https://github.com/pratyushmaini/llm_dataset_inference/

  3. arXiv:2406.04594  [pdf, other]

    cs.DC cs.AI cs.LG

    Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

    Authors: Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

    Abstract: The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2406.01014  [pdf, other]

    cs.CL cs.CV

    Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

    Authors: Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

    Abstract: Mobile device operation tasks are increasingly becoming a popular multi-modal AI application scenario. Current Multi-modal Large Language Models (MLLMs), constrained by their training data, lack the capability to function effectively as operation assistants. Instead, MLLM-based agents, which enhance capabilities through tool invocation, are gradually being applied to this scenario. However, the tw…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 11 figures, 10 Tables

  5. arXiv:2406.00440  [pdf, other]

    cs.CV

    Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture

    Authors: Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan

    Abstract: 4D head capture aims to generate dynamic topological meshes and corresponding texture maps from videos, which is widely utilized in movies and games for its ability to simulate facial muscle movements and recover dynamic textures in pore-squeezing. The industry often adopts the method involving multi-view stereo and non-rigid alignment. However, this approach is prone to errors and heavily reliant…

    Submitted 1 July, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  6. arXiv:2405.20641  [pdf, other]

    cs.CR

    Query Provenance Analysis for Robust and Efficient Query-based Black-box Attack Defense

    Authors: Shaofei Li, Ziqi Zhang, Haomin Jia, Ding Li, Yao Guo, Xiangqun Chen

    Abstract: Query-based black-box attacks have emerged as a significant threat to machine learning systems, where adversaries can manipulate the input queries to generate adversarial examples that can cause misclassification of the model. To counter these attacks, researchers have proposed Stateful Defense Models (SDMs) for detecting adversarial query sequences and rejecting queries that are "similar" to the…

    Submitted 31 May, 2024; originally announced May 2024.

  7. arXiv:2405.00438  [pdf, other]

    cs.LG cs.CL

    MetaRM: Shifted Distributions Alignment via Meta-Learning

    Authors: Shihan Dou, Yan Liu, Enyu Zhou, Tianlong Li, Haoxiang Jia, Limao Xiong, Xin Zhao, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as the training process progresses, the output distribution of the policy model shifts, leading to the RM's reduced ability to distinguish between responses. This issue is further compounded when the RM, trained on a specific data…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures. arXiv admin note: text overlap with arXiv:2401.06080

  8. arXiv:2405.00428  [pdf, other]

    cs.SE

    CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection

    Authors: Shihan Dou, Yueming Wu, Haoxiang Jia, Yuhao Zhou, Yan Liu, Yang Liu

    Abstract: With the development of the open source community, the code is often copied, spread, and evolved in multiple software systems, which brings uncertainty and risk to the software system (e.g., bug propagation and copyright infringement). Therefore, it is important to conduct code clone detection to discover similar code pairs. Many approaches have been proposed to detect code clones where token-base…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 21 pages, 7 figures

  9. arXiv:2404.17701  [pdf, other]

    cs.AR cs.LG physics.ins-det

    Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

    Authors: Julia Gonski, Aseem Gupta, Haoyi Jia, Hyunjoon Kim, Lorenzo Rota, Larry Ruckman, Angelo Dragone, Ryan Herbst

    Abstract: Embedded field programmable gate array (eFPGA) technology allows the implementation of reconfigurable logic within the design of an application-specific integrated circuit (ASIC). This approach offers the low power and efficiency of an ASIC along with the ease of FPGA configuration, particularly beneficial for the use case of machine learning in the data pipeline of next-generation collider experi…

    Submitted 1 July, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 16 pages, 12 figures

  10. arXiv:2404.13991  [pdf, other]

    cs.NI

    5GC$^2$ache: Improving 5G UPF Performance via Cache Optimization

    Authors: Haonan Jia, Meng Wang, Biyi Li, Yirui Liu, Junchen Guo, Pengyu Zhang

    Abstract: Last Level Cache (LLC) is a precious and critical resource that impacts the performance of applications running on top of CPUs. In this paper, we reveal the significant impact of LLC on the performance of the 5G user plane function (UPF) when running a cloudified 5G core on general-purposed servers. With extensive measurements showing that the throughput can degrade by over 50% when the precious…

    Submitted 22 April, 2024; originally announced April 2024.

  11. arXiv:2404.13430  [pdf, other]

    physics.chem-ph cs.LG

    React-OT: Optimal Transport for Generating Transition State in Chemical Reactions

    Authors: Chenru Duan, Guan-Horng Liu, Yuanqi Du, Tianrong Chen, Qiyuan Zhao, Haojun Jia, Carla P. Gomes, Evangelos A. Theodorou, Heather J. Kulik

    Abstract: Transition states (TSs) are transient structures that are key in understanding reaction mechanisms and designing catalysts but challenging to be captured in experiments. Alternatively, many optimization algorithms have been developed to search for TSs computationally. Yet the cost of these algorithms driven by quantum chemistry methods (usually density functional theory) is still high, posing chal…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 5 figures, 1 table

  12. arXiv:2404.01941  [pdf, other]

    cs.CV

    LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging

    Authors: Haoyang Ge, Qiao Feng, Hailong Jia, Xiongzheng Li, Xiangjun Yin, You Zhou, Jingyu Yang, Kun Li

    Abstract: Human pose and shape (HPS) estimation with lensless imaging is not only beneficial to privacy protection but also can be used in covert surveillance scenarios due to the small size and simple structure of this device. However, this task presents significant challenges due to the inherent ambiguity of the captured measurements and lacks effective methods for directly estimating human pose and shape…

    Submitted 8 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024. More results available at https://cic.tju.edu.cn/faculty/likun/projects/LPSNet

  13. arXiv:2403.14487  [pdf, other]

    cs.CV

    DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing

    Authors: Yueru Jia, Yuhui Yuan, Aosong Cheng, Chuke Wang, Ji Li, Huizhu Jia, Shanghang Zhang

    Abstract: Recently, how to achieve precise image editing has attracted increasing attention, especially given the remarkable success of text-to-image generation models. To unify various spatial-aware image editing abilities into one framework, we adopt the concept of layers from the design domain to manipulate objects flexibly with various operations. The key insight is to transform the spatial-aware image…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: technical report, 15 pages, webpage: https://design-edit.github.io/

  14. arXiv:2403.13248  [pdf, other]

    cs.CV

    Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

    Authors: Zhengqing Yuan, Ruoxi Chen, Zhaoxu Li, Haolong Jia, Lifang He, Chi Wang, Lichao Sun

    Abstract: Sora is the first large-scale generalist video generation model that garnered significant attention across society. Since its launch by OpenAI in February 2024, no other video generation models have paralleled Sora's performance or its capacity to support a broad spectrum of video generation tasks. Additionally, there are only a few fully published video generation models, with the majority bein…

    Submitted 22 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  15. arXiv:2403.12363  [pdf, other]

    cs.CR cs.NI

    E-DoH: Elegantly Detecting the Depths of Open DoH Service on the Internet

    Authors: Cong Dong, Jiahai Yang, Yun Li, Yue Wu, Yufan Chen, Chenglong Li, Haoran Jiao, Xia Yin, Yuling Liu

    Abstract: In recent years, DNS over Encrypted (DoE) methods have been regarded as a novel trend within the realm of the DNS ecosystem. In these DoE methods, DNS over HTTPS (DoH) provides encryption to protect data confidentiality while providing better obfuscation to avoid censorship by multiplexing port 443 with web services. This development introduced certain inconveniences in discovering publicly availa…

    Submitted 18 March, 2024; originally announced March 2024.

  16. arXiv:2403.07500  [pdf, other]

    cs.CV cs.AI

    Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation

    Authors: Likun Li, Haoqi Zeng, Changpeng Yang, Haozhe Jia, Di Xu

    Abstract: The objective of personalization and stylization in text-to-image is to instruct a pre-trained diffusion model to analyze new concepts introduced by users and incorporate them into expected styles. Recently, parameter-efficient fine-tuning (PEFT) approaches have been widely adopted to address this task and have greatly propelled the development of this field. Despite their popularity, existing eff…

    Submitted 12 March, 2024; originally announced March 2024.

  17. arXiv:2403.01444  [pdf, other]

    cs.CV

    3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos

    Authors: Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing

    Abstract: Constructing photo-realistic Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos remains a challenging endeavor. Despite the remarkable advancements achieved by current neural rendering techniques, these methods generally require complete video sequences for offline training and are not capable of real-time rendering. To address these constraints, we introduce 3DGStream, a method…

    Submitted 11 June, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 Accepted (Highlight). Project Page: https://sjojok.github.io/3dgstream

  18. arXiv:2403.00486  [pdf, other]

    cs.CV

    Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching

    Authors: Xianqi Wang, Gangwei Xu, Hao Jia, Xin Yang

    Abstract: Stereo matching methods based on iterative optimization, like RAFT-Stereo and IGEV-Stereo, have evolved into a cornerstone in the field of stereo matching. However, these methods struggle to simultaneously capture high-frequency information in edges and low-frequency information in smooth regions due to the fixed receptive field. As a result, they tend to lose details, blur edges, and produce fals…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  19. arXiv:2402.15721  [pdf, other]

    cs.AI cs.CL

    Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models

    Authors: Chaoya Jiang, Wei Ye, Mengfan Dong, Hongrui Jia, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang

    Abstract: Large Vision Language Models exhibit remarkable capabilities but struggle with hallucinations: inconsistencies between images and their descriptions. Previous hallucination evaluation studies on LVLMs have identified hallucinations in terms of objects, attributes, and relations but overlooked complex hallucinations that create an entire narrative around a fictional entity. In this paper, we introdu…

    Submitted 24 February, 2024; originally announced February 2024.

  20. arXiv:2402.10616  [pdf]

    cs.CR

    Credential Control Balance: A Universal Blockchain Account Model Abstract From Bank to Bitcoin, Ethereum External Owned Account and Account Abstraction

    Authors: Huifeng Jiao, Dr. Nathapon Udomlertsakul, Dr. Anukul Tamprasirt

    Abstract: Blockchain market value peaked at $3 trillion, fell to $1 trillion, then recovered to $1.5 trillion and is rising again. Blockchain accounts secure most on-chain assets in this huge market (Web-12). This paper initiates a universal blockchain account model from a comprehensive review of blockchain account development, encompassing both academic and industry perspectives. This paper uses a model an…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 22 pages, 15 figures, conference paper (Thailand International College Conference 2024)

  21. arXiv:2402.09264  [pdf, other]

    cs.LG cs.HC

    UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers

    Authors: Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo

    Abstract: Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's outp…

    Submitted 12 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  22. arXiv:2402.01391  [pdf, other]

    cs.SE cs.CL

    StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

    Authors: Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

    Abstract: The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation quality. However, the lengthy code generated by LLMs in response to complex human requirements makes RL exploration a challenge. Also, since the unit te…

    Submitted 5 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 13 pages, 5 figures

  23. arXiv:2401.09984  [pdf, other]

    cs.CL

    Gradable ChatGPT Translation Evaluation

    Authors: Hui Jiao, Bei Peng, Lu Zong, Xiaojun Zhang, Xinwei Li

    Abstract: ChatGPT, as a language model based on large-scale pre-training, has exerted a profound influence on the domain of machine translation. In ChatGPT, a "Prompt" refers to a segment of text or instruction employed to steer the model towards generating a specific category of response. The design of the translation prompt emerges as a key aspect that can wield influence over factors such as the style, p…

    Submitted 4 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Published in the journal Procesamiento del Lenguaje Natural

  24. arXiv:2401.02413  [pdf, other]

    stat.ML cs.LG

    Simulation-Based Inference with Quantile Regression

    Authors: He Jia

    Abstract: We present Neural Quantile Estimation (NQE), a novel Simulation-Based Inference (SBI) method based on conditional quantile regression. NQE autoregressively learns individual one dimensional quantiles for each posterior dimension, conditioned on the data and previous posterior dimensions. Posterior samples are obtained by interpolating the predicted quantiles using monotonic cubic Hermite spline, w…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 8+13 pages, 7+7 figures

  25. arXiv:2401.02326  [pdf, other]

    cs.CV

    ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation

    Authors: Xinyang Pu, Hecheng Jia, Linghao Zheng, Feng Wang, Feng Xu

    Abstract: In the realm of artificial intelligence, the emergence of foundation models, backed by high computing capabilities and extensive data, has been revolutionary. Segment Anything Model (SAM), built on the Vision Transformer (ViT) model with millions of parameters and vast training dataset SA-1B, excels in various segmentation scenarios relying on its significance of semantic information and generaliz…

    Submitted 4 January, 2024; originally announced January 2024.

  26. arXiv:2401.01165  [pdf, other]

    cs.LG eess.SP

    Reinforcement Learning for SAR View Angle Inversion with Differentiable SAR Renderer

    Authors: Yanni Wang, Hecheng Jia, Shilei Fu, Hui** Lin, Feng Xu

    Abstract: The electromagnetic inverse problem has long been a research hotspot. This study aims to reverse radar view angles in synthetic aperture radar (SAR) images given a target model. Nonetheless, the scarcity of SAR data, combined with the intricate background interference and imaging mechanisms, limit the applications of existing learning-based approaches. To address these challenges, we propose an in…

    Submitted 2 January, 2024; originally announced January 2024.

  27. arXiv:2312.16995  [pdf, other]

    cs.CV

    FlowDA: Unsupervised Domain Adaptive Framework for Optical Flow Estimation

    Authors: Miaojie Feng, Longliang Liu, Hao Jia, Gangwei Xu, Xin Yang

    Abstract: Collecting real-world optical flow datasets is a formidable challenge due to the high cost of labeling. A shortage of datasets significantly constrains the real-world performance of optical flow models. Building virtual datasets that resemble real scenarios offers a potential solution for performance enhancement, yet a domain gap separates virtual and real datasets. This paper introduces FlowDA, a…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 11 pages, 5 figures

  28. arXiv:2312.16807  [pdf, other]

    cs.NI eess.SY

    Efficient Interference Graph Estimation via Concurrent Flooding

    Authors: Haifeng Jia, Yichen Wei, Zhan Wang, Jiani **, Haorui Li, Yibo Pi

    Abstract: Traditional wisdom for network management allocates network resources separately for the measurement and data transmission tasks. Heavy measurement tasks may take up resources for data transmission and significantly reduce network performance. It is therefore challenging for interference graphs, deemed as incurring heavy measurement overhead, to be used in practice in wireless networks. To address…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: Accepted by International Conference on Embedded Wireless Systems and Networking 2023 (EWSN'23), 7 pages with 9 figures, equal contribution by Haifeng Jia and Yichen Wei

    ACM Class: C.2

  29. arXiv:2312.06193  [pdf, other]

    cs.CV

    DisControlFace: Disentangled Control for Personalized Facial Image Editing

    Authors: Haozhe Jia, Yan Li, Hengfei Cui, Di Xu, Changpeng Yang, Yuwang Wang, Tao Yu

    Abstract: In this work, we focus on exploring explicit fine-grained control of generative facial image editing, all while generating faithful and consistent personalized facial appearances. We identify the key challenge of this task as the exploration of disentangled conditional control in the generation process, and accordingly propose a novel diffusion-based framework, named DisControlFace, comprising two…

    Submitted 11 December, 2023; originally announced December 2023.

  30. arXiv:2312.03790  [pdf, other]

    cs.CV

    Memory-Efficient Optical Flow via Radius-Distribution Orthogonal Cost Volume

    Authors: Gangwei Xu, Shujun Chen, Hao Jia, Miaojie Feng, Xin Yang

    Abstract: The full 4D cost volume in Recurrent All-Pairs Field Transforms (RAFT) or global matching by Transformer achieves impressive performance for optical flow estimation. However, their memory consumption increases quadratically with input resolution, rendering them impractical for high-resolution images. In this paper, we present MeFlow, a novel memory-efficient method for high-resolution optical flow…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 10 pages, 9 figures

  31. arXiv:2312.03179  [pdf, other]

    hep-ex cs.LG quant-ph

    CaloQVAE: Simulating high-energy particle-calorimeter interactions using hybrid quantum-classical generative models

    Authors: Sehmimul Hoque, Hao Jia, Abhishek Abhishek, Mojde Fadaie, J. Quetzalcoatl Toledo-Marín, Tiago Vale, Roger G. Melko, Maximilian Swiatlowski, Wojciech T. Fedorko

    Abstract: The Large Hadron Collider's high luminosity era presents major computational challenges in the analysis of collision events. Large amounts of Monte Carlo (MC) simulation will be required to constrain the statistical uncertainties of the simulated datasets below those of the experimental data. Modelling of high-energy particles propagating through the calorimeter section of the detector is the most…

    Submitted 10 May, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 6 pages, 3 figures

    MSC Class: 81P68; 68T07; 81V99

  32. arXiv:2311.18712  [pdf, other]

    cs.CL

    CoRec: An Easy Approach for Coordination Recognition

    Authors: Qing Wang, Haojie Jia, Wenfei Song, Qi Li

    Abstract: In this paper, we observe and address the challenges of the coordination recognition task. Most existing methods rely on syntactic parsers to identify the coordinators in a sentence and detect the coordination boundaries. However, state-of-the-art syntactic parsers are slow and suffer from errors, especially for long and complicated sentences. To better solve the problems, we propose a pipeline mo…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted by EMNLP 2023 Main Conference (oral presentation)

  33. arXiv:2311.16567  [pdf, other]

    cs.CV

    MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

    Authors: Yang Zhao, Yanwu Xu, Zhisheng Xiao, Haolin Jia, Tingbo Hou

    Abstract: The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image diffusion model obtained through extensive optimizations in both architecture and sampling techniques. We conduct a comprehensive examination of model architecture des…

    Submitted 12 June, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  34. arXiv:2311.11420  [pdf, other]

    cs.LG cs.AI cs.CV

    LifeLearner: Hardware-Aware Meta Continual Learning System for Embedded Computing Platforms

    Authors: Young D. Kwon, Jagmohan Chauhan, Hong Jia, Stylianos I. Venieris, Cecilia Mascolo

    Abstract: Continual Learning (CL) allows applications such as user personalization and household robots to learn on the fly and adapt to context. This is an important feature when context, actions, and users change. However, enabling CL on resource-constrained embedded systems is challenging due to the limited labeled data, memory, and computing capacity. In this paper, we propose LifeLearner, a hardware-aw…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: Accepted for publication at SenSys 2023

  35. arXiv:2311.10463  [pdf, other]

    eess.IV cs.CV

    Correlation-Distance Graph Learning for Treatment Response Prediction from rs-fMRI

    Authors: Xiatian Zhang, Sisi Zheng, Hubert P. H. Shum, Haozheng Zhang, Nan Song, Mingkang Song, Hongxiao Jia

    Abstract: Resting-state fMRI (rs-fMRI) functional connectivity (FC) analysis provides valuable insights into the relationships between different brain regions and their potential implications for neurological or psychiatric disorders. However, specific design efforts to predict treatment response from rs-fMRI remain limited due to difficulties in understanding the current brain state and the underlying mech…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: Proceedings of the 2023 International Conference on Neural Information Processing (ICONIP)

  36. arXiv:2311.07397  [pdf, other]

    cs.CL cs.CV

    AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation

    Authors: Junyang Wang, Yuhang Wang, Guohai Xu, Jing Zhang, Yukai Gu, Haitao Jia, Jiaqi Wang, Haiyang Xu, Ming Yan, Ji Zhang, Jitao Sang

    Abstract: Despite making significant progress in multi-modal tasks, current Multi-modal Large Language Models (MLLMs) encounter the significant challenge of hallucinations, which may lead to harmful consequences. Therefore, evaluating MLLMs' hallucinations is becoming increasingly important in model improvement and practical application deployment. Previous works are limited in high evaluation costs (e.g.,…

    Submitted 23 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 14 pages, 9 figures

  37. arXiv:2311.02340  [pdf, other]

    cs.CV

    MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching

    Authors: Miaojie Feng, Junda Cheng, Hao Jia, Longliang Liu, Gangwei Xu, Qingyong Hu, Xin Yang

    Abstract: Stereo matching is a fundamental task in scene comprehension. In recent years, the method based on iterative optimization has shown promise in stereo matching. However, the current iteration framework employs a single-peak lookup, which struggles to handle the multi-peak problem effectively. Additionally, the fixed search range used during the iteration process limits the final convergence effects…

    Submitted 27 January, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: Accepted to 3DV 2024

  38. arXiv:2310.05620  [pdf, other]

    cs.CL

    LAiW: A Chinese Legal Large Language Models Benchmark

    Authors: Yongfu Dai, Duanyu Feng, Jimin Huang, Haochen Jia, Qianqian Xie, Yifang Zhang, Weiguang Han, Wei Tian, Hao Wang

    Abstract: General and legal domain LLMs have demonstrated strong performance in various tasks of LegalAI. However, the current evaluations of these LLMs in LegalAI are defined by the experts of computer science, lacking consistency with the logic of legal practice, making it difficult to judge their practical capabilities. To address this challenge, we are the first to build the Chinese legal LLMs benchmark…

    Submitted 18 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

  39. arXiv:2308.10334  [pdf, other]

    cs.CV

    Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos

    Authors: Haoyuan Li, Haoye Dong, Hanchao Jia, Dong Huang, Michael C. Kampffmeyer, Liang Lin, Xiaodan Liang

    Abstract: Multi-person 3D mesh recovery from videos is a critical first step towards automatic perception of group behavior in virtual reality, physical therapy and beyond. However, existing approaches rely on multi-stage paradigms, where the person detection and tracking stages are performed in a multi-person setting, while temporal dynamics are only modeled for one person at a time. Consequently, their pe…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  40. arXiv:2308.06817  [pdf, other]

    cs.IT

    The Asymptotic Capacity of $X$-Secure $T$-Private Linear Computation with Graph Based Replicated Storage

    Authors: Haobo Jia, Zhuqing Jia

    Abstract: The problem of $X$-secure $T$-private linear computation with graph based replicated storage (GXSTPLC) is to enable the user to retrieve a linear combination of messages privately from a set of $N$ distributed servers where every message is only allowed to store among a subset of servers subject to an $X$-security constraint, i.e., any groups of up to $X$ colluding servers must reveal nothing abou…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: 39 pages, 2 figures

  41. arXiv:2308.01191  [pdf, other]

    cs.SE

    Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey

    Authors: Shihan Dou, Junjie Shan, Haoxiang Jia, Wenhao Deng, Zhiheng Xi, Wei He, Yueming Wu, Tao Gui, Yang Liu, Xuanjing Huang

    Abstract: Code cloning, the duplication of code fragments, is common in software development. While some reuse aids productivity, excessive cloning hurts maintainability and introduces bugs. Hence, automatic code clone detection is vital. Meanwhile, large language models (LLMs) possess diverse code-related knowledge, making them versatile for various software engineering challenges. However, LLMs' performan…

    Submitted 5 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: 13 pages, 3 figures

  42. arXiv:2307.16651  [pdf, other]

    cs.LG

    UDAMA: Unsupervised Domain Adaptation through Multi-discriminator Adversarial Training with Noisy Labels Improves Cardio-fitness Prediction

    Authors: Yu Wu, Dimitris Spathis, Hong Jia, Ignacio Perez-Pozuelo, Tomas Gonzales, Soren Brage, Nicholas Wareham, Cecilia Mascolo

    Abstract: Deep learning models have shown great promise in various healthcare monitoring applications. However, most healthcare datasets with high-quality (gold-standard) labels are small-scale, as directly collecting ground truth is often costly and time-consuming. As a result, models developed and validated on small-scale datasets often suffer from overfitting and do not generalize well to unseen scenario…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted at Machine Learning for Healthcare (MLHC) 2023

  43. arXiv:2307.00310  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD

    Authors: Anvith Thudi, Hengrui Jia, Casey Meehan, Ilia Shumailov, Nicolas Papernot

    Abstract: Differentially private stochastic gradient descent (DP-SGD) is the canonical approach to private deep learning. While the current privacy analysis of DP-SGD is known to be tight in some settings, several empirical results suggest that models trained on common benchmark datasets leak significantly less privacy for many datapoints. Yet, despite past attempts, a rigorous explanation for why this is t…

    Submitted 15 November, 2023; v1 submitted 1 July, 2023; originally announced July 2023.

  44. arXiv:2306.03460  [pdf, other]

    cs.LG cs.CL cs.HC

    Natural Language Commanding via Program Synthesis

    Authors: Apurva Gandhi, Thong Q. Nguyen, Huitian Jiao, Robert Steen, Ameya Bhatawdekar

    Abstract: We present Semantic Interpreter, a natural language-friendly AI system for productivity software such as Microsoft Office that leverages large language models (LLMs) to execute user intent across application features. While LLMs are excellent at understanding user intent expressed as natural language, they are not sufficient for fulfilling application-specific user intent that requires more than t…

    Submitted 6 June, 2023; originally announced June 2023.

  45. arXiv:2305.12481  [pdf, ps, other]

    cs.CR

    Compact Lattice Gadget and Its Applications to Hash-and-Sign Signatures

    Authors: Yang Yu, Huiwen Jia, Xiaoyun Wang

    Abstract: This work aims to improve the practicality of gadget-based cryptosystems, with a focus on hash-and-sign signatures. To this end, we develop a compact gadget framework in which the used gadget is a square matrix instead of the short and fat one used in previous constructions. To work with this compact gadget, we devise a specialized gadget sampler, called semi-random sampler, to compute the approxi…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted to Crypto 2023

  46. arXiv:2304.06174  [pdf, other]

    physics.chem-ph cs.LG

    Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model

    Authors: Chenru Duan, Yuanqi Du, Haojun Jia, Heather J. Kulik

    Abstract: Transition state (TS) search is key in chemistry for elucidating reaction mechanisms and exploring reaction networks. The search for accurate 3D TS structures, however, requires numerous computationally intensive quantum chemistry calculations due to the complexity of potential energy surfaces. Here, we developed an object-aware SE(3) equivariant diffusion model that satisfies all physical symmetr…

    Submitted 30 October, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: 5 figures and 1 table

  47. arXiv:2303.10894  [pdf, other]

    cs.CV

    M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation

    Authors: Xiaoqi Zhao, Hongpeng Jia, Youwei Pang, Long Lv, Feng Tian, Lihe Zhang, Weibing Sun, Huchuan Lu

    Abstract: Accurate medical image segmentation is critical for early medical diagnosis. Most existing methods are based on U-shape structure and use element-wise addition or concatenation to fuse different level features progressively in decoder. However, both the two operations easily generate plenty of redundant information, which will weaken the complementarity between different level features, resulting…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: Submitted to IEEE TMI

  48. arXiv:2303.08365  [pdf, other]

    cs.DC

    Gamify Stencil Dwarf on Cloud for Democratizing Scientific Computing

    Authors: Kun Li, Zhichun Li, Yuetao Chen, Zixuan Wang, Yiwei Zhang, Liang Yuan, Haipeng Jia, Yunquan Zhang, Ting Cao, Mao Yang

    Abstract: Stencil computation is one of the most important kernels in various scientific computing. Nowadays, most Stencil-driven scientific computing still relies heavily on supercomputers, suffering from expensive access, poor scalability, and duplicated optimizations. This paper proposes Tetris, the first system for high-performance Stencil on heterogeneous CPU+GPU, towards democratizing Stencil-driven…

    Submitted 15 March, 2023; originally announced March 2023.

  49. arXiv:2302.12289  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Beyond Moments: Robustly Learning Affine Transformations with Asymptotically Optimal Error

    Authors: He Jia, Pravesh K. Kothari, Santosh S. Vempala

    Abstract: We present a polynomial-time algorithm for robustly learning an unknown affine transformation of the standard hypercube from samples, an important and well-studied setting for independent component analysis (ICA). Specifically, given an $ε$-corrupted sample from a distribution $D$ obtained by applying an unknown affine transformation $x \rightarrow Ax+s$ to the uniform distribution on a $d$-dimens…

    Submitted 23 February, 2023; originally announced February 2023.

  50. Towards Blind Watermarking: Combining Invertible and Non-invertible Mechanisms

    Authors: Rui Ma, Mengxi Guo, Yi Hou, Fan Yang, Yuan Li, Huizhu Jia, Xiaodong Xie

    Abstract: Blind watermarking provides powerful evidence for copyright protection, image authentication, and tampering identification. However, it remains a challenge to design a watermarking model with high imperceptibility and robustness against strong noise attacks. To resolve this issue, we present a framework Combining the Invertible and Non-invertible (CIN) mechanisms. The CIN is composed of the invert…

    Submitted 24 December, 2022; originally announced December 2022.

    Comments: 9 pages, 9 figures, 5 tables