Showing 1–50 of 784 results for author: Loe

  1. arXiv:2407.03122  [pdf, other]

    cs.RO

    IntentionNet: Map-Lite Visual Navigation at the Kilometre Scale

    Authors: Wei Gao, Bo Ai, Joel Loo, Vinay, David Hsu

    Abstract: This work explores the challenges of creating a scalable and robust robot navigation system that can traverse both indoor and outdoor environments to reach distant goals. We propose a navigation system architecture called IntentionNet that employs a monolithic neural network as the low-level planner/controller, and uses a general interface that we call intentions to steer the controller. The paper…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2407.02824  [pdf, other]

    cs.SE

    Exploring the Capabilities of LLMs for Code Change Related Tasks

    Authors: Lishui Fan, Jiakun Liu, Zhongxin Liu, David Lo, Xin Xia, Shanping Li

    Abstract: Developers deal with code-change-related tasks daily, e.g., reviewing code. Pre-trained code and code-change-oriented models have been adapted to help developers with such tasks. Recently, large language models (LLMs) have shown their effectiveness in code-related tasks. However, existing LLMs for code focus on general code syntax and semantics rather than the differences between two code versions…

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.02473  [pdf, other]

    cs.RO

    Open Scene Graphs for Open World Object-Goal Navigation

    Authors: Joel Loo, Zhanxin Wu, David Hsu

    Abstract: How can we build robots for open-world semantic navigation tasks, like searching for target objects in novel scenes? While foundation models have the rich knowledge and generalisation needed for these tasks, a suitable scene representation is needed to connect them into a complete robot system. We address this with Open Scene Graphs (OSGs), a topo-semantic representation that retains and organises…

    Submitted 2 July, 2024; originally announced July 2024.

  4. arXiv:2407.02290  [pdf, other]

    cs.SI

    A systematic comparison of measures for k-anonymity in networks

    Authors: Rachel G. de Jong, Mark P. J. van der Loo, Frank W. Takes

    Abstract: Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how in a particular situation the privacy of an individual node should be measured. To that end, in this paper we propose a set of aspects that one should consider when choosing a measure for privacy. These asp…

    Submitted 2 July, 2024; originally announced July 2024.

  5. arXiv:2407.00225  [pdf, other]

    cs.SE

    Large-scale, Independent and Comprehensive study of the power of LLMs for test case generation

    Authors: Wendkûuni C. Ouédraogo, Kader Kaboré, Haoye Tian, Yewei Song, Anil Koyuncu, Jacques Klein, David Lo, Tegawendé F. Bissyandé

    Abstract: Unit testing, crucial for identifying bugs in code modules like classes and methods, is often neglected by developers due to time constraints. Automated test generation techniques have emerged to address this, but often lack readability and require developer intervention. Large Language Models (LLMs), like GPT and Mistral, show promise in software engineering, including in test generation. However…

    Submitted 28 June, 2024; originally announced July 2024.

  6. arXiv:2406.19867  [pdf, other]

    cs.SI cs.CY

    Sampled Datasets Risk Substantial Bias in the Identification of Political Polarization on Social Media

    Authors: Gabriele Di Bona, Emma Fraxanet, Björn Komander, Andrea Lo Sasso, Virginia Morini, Antoine Vendeville, Max Falkenberg, Alessandro Galeazzi

    Abstract: Following recent policy changes by X (Twitter) and other social media platforms, user interaction data has become increasingly difficult to access. These restrictions are impeding robust research pertaining to social and political phenomena online, which is critical due to the profound impact social media platforms may have on our societies. Here, we investigate the reliability of polarization mea…

    Submitted 28 June, 2024; originally announced June 2024.

  7. arXiv:2406.18219  [pdf, other]

    cs.CL cs.LG

    A Closer Look into Mixture-of-Experts in Large Language Models

    Authors: Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu

    Abstract: Mixture-of-experts (MoE) is gaining increasing attention due to its unique properties and remarkable performance, especially for language tasks. By sparsely activating a subset of parameters for each token, MoE architecture could increase the model size without sacrificing computational efficiency, achieving a better trade-off between performance and training costs. However, the underlying mechani…

    Submitted 26 June, 2024; originally announced June 2024.

  8. arXiv:2406.17654  [pdf, other]

    cs.RO cs.AI

    MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection

    Authors: Michelle Adeline, Junn Yong Loo, Vishnu Monn Baskaran

    Abstract: Multi-view 3D object detection is a crucial component of autonomous driving systems. Contemporary query-based methods primarily depend either on dataset-specific initialization of 3D anchors, introducing bias, or utilize dense attention mechanisms, which are computationally inefficient and unscalable. To overcome these issues, we present MDHA, a novel sparse query-based framework, which constructs…

    Submitted 25 June, 2024; originally announced June 2024.

  9. arXiv:2406.16746  [pdf, other]

    cs.LG cs.AI cs.CL

    The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

    Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

    Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  10. arXiv:2406.16264  [pdf, other]

    cs.CL cs.AI

    One Thousand and One Pairs: A "novel" challenge for long-context language models

    Authors: Marzena Karpinska, Katherine Thai, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and reason over information across book-length inputs? We address this question by creating NoCha, a dataset of 1,001 minimally different pairs of true and false claims about 67 recently-published English fictional books, wr…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: preprint, 29 pages

  11. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires…

    Submitted 26 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)

  12. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  13. arXiv:2406.11271  [pdf, other]

    cs.CV cs.LG

    MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

    Authors: Anas Awadalla, Le Xue, Oscar Lo, Manli Shu, Hannah Lee, Etash Kumar Guha, Matt Jordan, Sheng Shen, Mohamed Awadalla, Silvio Savarese, Caiming Xiong, Ran Xu, Yejin Choi, Ludwig Schmidt

    Abstract: Multimodal interleaved datasets featuring free-form interleaved sequences of images and text are crucial for training frontier large multimodal models (LMMs). Despite the rapid progression of open-source LMMs, there remains a pronounced scarcity of large-scale, diverse open-source multimodal interleaved datasets. In response, we introduce MINT-1T, the most extensive and diverse open-source Multimo…

    Submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2406.08832  [pdf, other]

    quant-ph cs.NI

    Multiplexed Quantum Communication with Surface and Hypergraph Product Codes

    Authors: Shin Nishio, Nicholas Connolly, Nicolò Lo Piparo, William John Munro, Thomas Rowan Scruby, Kae Nemoto

    Abstract: Connecting multiple processors via quantum interconnect technologies could help to overcome issues of scalability in single-processor quantum computers. Transmission via these interconnects can be performed more efficiently using quantum multiplexing, where information is encoded in high-dimensional photonic degrees of freedom. We explore the effects of multiplexing on logical error rates in surfa…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 12 pages + 12-page appendices, 19 figures

    ACM Class: E.4; C.2; G.2

  15. arXiv:2406.07835  [pdf, other]

    cs.CL cs.AI

    SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    Authors: David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan

    Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS Datasets and Benchmarks 2024

  16. arXiv:2406.06386  [pdf, other]

    cs.CV

    FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography

    Authors: Julia Yang, Alina Jade Barnett, Jon Donnelly, Satvik Kishore, Jerry Fang, Fides Regina Schwartz, Chaofan Chen, Joseph Y. Lo, Cynthia Rudin

    Abstract: Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable and there is a call in these fields to make interpretable models. Recent work in interpretable computer vision provides transparency t…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures, Accepted for oral presentation at the 2024 CVPR Workshop on Domain adaptation, Explainability, Fairness in AI for Medical Image Analysis (DEF-AI-MIA)

  17. arXiv:2406.05712  [pdf, other]

    cs.SE

    Demystifying the Characteristics for Smart Contract Upgrades

    Authors: Ye Liu, Shuo Li, Xiuheng Wu, Yi Li, Zhiyang Chen, David Lo

    Abstract: Upgradable smart contracts play an important role in the decentralized application ecosystem, to support routine maintenance, security patching, and feature additions. In this paper, we conduct an empirical study on proxy-based upgradable smart contracts to understand the characteristics of contract upgrading. Through our study on 57,118 open source proxy contracts, we found that 583 contracts hav…

    Submitted 9 June, 2024; originally announced June 2024.

  18. arXiv:2406.02523  [pdf, other]

    cs.RO cs.AI cs.LG

    RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

    Authors: Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, Yuke Zhu

    Abstract: Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate using realistic physical simulation as a means to scale environments, tasks, and datasets for robot learning methods. We present RoboCasa, a large-scale simulation framework for training generalist robots in everyd…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: RSS 2024

  19. arXiv:2406.00720  [pdf, other]

    cs.IT

    Age-Gain-Dependent Random Access for Event-Driven Periodic Updating

    Authors: Yuqing Zhu, Yiwen Zhu, Aoyu Gong, Yan Lin, Yuan-Hsuan Lo, Yijin Zhang

    Abstract: This paper considers utilizing the knowledge of age gains to reduce the network average age of information (AoI) in random access with event-driven periodic updating for the first time. Built on the form of slotted ALOHA, we require each device to determine its age gain threshold and transmission probability in an easily implementable decentralized manner, so that the unavoided contention can be l…

    Submitted 27 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  20. arXiv:2405.20305  [pdf, other]

    cs.CV

    Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models

    Authors: Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, Kwonjoon Lee

    Abstract: We introduce PlausiVL, a large video-language model for anticipating action sequences that are plausible in the real world. While significant efforts have been made towards anticipating future actions, prior approaches do not take into account the aspect of plausibility in an action sequence. To address this limitation, we explore the generative capability of a large video-language model in our wo…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  21. Wavefront Threading Enables Effective High-Level Synthesis

    Authors: Blake Pelton, Adam Sapek, Ken Eguro, Daniel Lo, Alessandro Forin, Matt Humphrey, Jinwen Xi, David Cox, Rajas Karandikar, Johannes de Fine Licht, Evgeny Babin, Adrian Caulfield, Doug Burger

    Abstract: Digital systems are growing in importance and computing hardware is growing more heterogeneous. Hardware design, however, remains laborious and expensive, in part due to the limitations of conventional hardware description languages (HDLs) like VHDL and Verilog. A longstanding research goal has been programming hardware like software, with high-level languages that can generate efficient hardware…

    Submitted 10 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to PLDI'24

  22. arXiv:2405.19413  [pdf, other]

    cs.CV cs.AI

    VisTA-SR: Improving the Accuracy and Resolution of Low-Cost Thermal Imaging Cameras for Agriculture

    Authors: Heesup Yun, Sassoum Lo, Christine H. Diepenbrock, Brian N. Bailey, J. Mason Earles

    Abstract: Thermal cameras are an important tool for agricultural research because they allow for non-invasive measurement of plant temperature, which relates to important photochemical, hydraulic, and agronomic traits. Utilizing low-cost thermal cameras can lower the barrier to introducing thermal imaging in agricultural research and production. This paper presents an approach to improve the temperature acc…

    Submitted 29 May, 2024; originally announced May 2024.

  23. arXiv:2405.16746  [pdf, other]

    cs.SE

    Ecosystem of Large Language Models for Code

    Authors: Zhou Yang, Jieke Shi, David Lo

    Abstract: The availability of vast amounts of publicly accessible data of source code and the advances in modern language models, coupled with increasing computational resources, have led to a remarkable surge in the development of large language models for code (LLM4Code, for short). The interaction between code datasets and models gives rise to a complex ecosystem characterized by intricate dependencies t…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Work in progress

  24. arXiv:2405.16545  [pdf, other]

    cs.RO

    VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation

    Authors: Kuo-Han Hung, Pang-Chi Lo, Jia-Fong Yeh, Han-Yuan Hsu, Yi-Ting Chen, Winston H. Hsu

    Abstract: We study reward models for long-horizon manipulation tasks by learning from action-free videos and language instructions, which we term the visual-instruction correlation (VIC) problem. Recent advancements in cross-modality modeling have highlighted the potential of reward modeling through visual and language correlations. However, existing VIC methods face challenges in learning rewards for long-…

    Submitted 26 May, 2024; originally announced May 2024.

  25. arXiv:2405.14502  [pdf, other]

    cs.DB cs.DC

    DEX: Scalable Range Indexing on Disaggregated Memory [Extended Version]

    Authors: Baotong Lu, Kaisong Huang, Chieh-Jan Mike Liang, Tianzheng Wang, Eric Lo

    Abstract: Memory disaggregation can potentially allow memory-optimized range indexes such as B+-trees to scale beyond one machine while attaining high hardware utilization and low cost. Designing scalable indexes on disaggregated memory, however, is challenging due to rudimentary caching, unprincipled offloading and excessive inconsistency among servers. This paper proposes DEX, a new scalable B+-tree for…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 16 pages; To appear at VLDB 2024

  26. arXiv:2405.13744  [pdf, other]

    cs.CR cs.NI cs.SI

    A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web

    Authors: Stephan Wiefling, Marian Hönscheid, Luigi Lo Iacono

    Abstract: HTTP client hints are a set of standardized HTTP request headers designed to modernize and potentially replace the traditional user agent string. While the user agent string exposes a wide range of information about the client's browser and device, client hints provide a controlled and structured approach for clients to selectively disclose their capabilities and preferences to servers. Essentiall…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures, 5 tables

    Journal ref: The 19th International Conference on Availability, Reliability and Security (ARES 2024), July 30-August 2, 2024, Vienna, Austria. ACM

  27. arXiv:2405.11708  [pdf, other]

    cs.LG cs.CV

    Adaptive Batch Normalization Networks for Adversarial Robustness

    Authors: Shao-Yuan Lo, Vishal M. Patel

    Abstract: Deep networks are vulnerable to adversarial examples. Adversarial Training (AT) has been a standard foundation of modern adversarial defense approaches due to its remarkable effectiveness. However, AT is extremely time-consuming, preventing its wide deployment in practical applications. In this paper, we aim at a non-AT defense: How to design a defense method that gets rid of AT but is still r…

    Submitted 26 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Accepted at IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS) 2024

  28. arXiv:2405.11191  [pdf, other]

    cs.DB cs.LG

    Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines

    Authors: Chaokun Chang, Eric Lo, Chunxiao Ye

    Abstract: Machine learning inference pipelines commonly encountered in data science and industries often require real-time responsiveness due to their user-facing nature. However, meeting this requirement becomes particularly challenging when certain input features require aggregating a large volume of data online. Recent literature on interpretable machine learning reveals that most machine learning models…

    Submitted 18 May, 2024; originally announced May 2024.

  29. arXiv:2405.11133  [pdf]

    eess.IV cs.CV

    XCAT-3.0: A Comprehensive Library of Personalized Digital Twins Derived from CT Scans

    Authors: Lavsen Dahal, Mobina Ghojoghnejad, Dhrubajyoti Ghosh, Yubraj Bhandari, David Kim, Fong Chi Ho, Fakrul Islam Tushar, Sheng Luoa, Kyle J. Lafata, Ehsan Abadi, Ehsan Samei, Joseph Y. Lo, W. Paul Segars

    Abstract: Virtual Imaging Trials (VIT) offer a cost-effective and scalable approach for evaluating medical imaging technologies. Computational phantoms, which mimic real patient anatomy and physiology, play a central role in VIT. However, the current libraries of computational phantoms face limitations, particularly in terms of sample size and diversity. Insufficient representation of the population hampers…

    Submitted 1 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  30. arXiv:2405.10467  [pdf, other]

    cs.AI cs.SE

    Agent Design Pattern Catalogue: A Collection of Architectural Patterns for Foundation Model based Agents

    Authors: Yue Liu, Sin Kit Lo, Qinghua Lu, Liming Zhu, Dehai Zhao, Xiwei Xu, Stefan Harrer, Jon Whittle

    Abstract: Foundation model-enabled generative artificial intelligence facilitates the development and implementation of agents, which can leverage distinguished reasoning and language processing capabilities to take a proactive, autonomous role to pursue users' goals. Nevertheless, there is a lack of systematic knowledge to guide practitioners in designing the agents considering challenges of goal-seeking…

    Submitted 24 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  31. arXiv:2405.07948  [pdf, other]

    cs.RO

    Scene Action Maps: Behavioural Maps for Navigation without Metric Information

    Authors: Joel Loo, David Hsu

    Abstract: Humans are remarkable in their ability to navigate without metric information. We can read abstract 2D maps, such as floor-plans or hand-drawn sketches, and use them to navigate in unseen rich 3D environments, without requiring prior traversals to map out these scenes in detail. We posit that this is enabled by the ability to represent the environment abstractly as interconnected navigational beha…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: ICRA 2024

  32. arXiv:2405.04605  [pdf]

    cs.CV cs.AI cs.LG

    AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets

    Authors: Fakrul Islam Tushar, Avivah Wang, Lavsen Dahal, Michael R. Harowicz, Kyle J. Lafata, Tina D. Tailor, Joseph Y. Lo

    Abstract: Lung cancer's high mortality rate can be mitigated by early detection, increasingly reliant on AI for diagnostic imaging. However, AI model performance depends on training and validation datasets. This study develops and validates AI models for both nodule detection and cancer classification tasks. For detection, two models (DLCSD-mD and LUNA16-mD) were developed using the Duke Lung Cancer Screeni…

    Submitted 12 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 16 pages, 2 tables, 5 figures

  33. arXiv:2404.19360  [pdf, other]

    cs.CV cs.CL cs.IR

    Large Language Model Informed Patent Image Retrieval

    Authors: Hao-Cheng Lo, Jung-Mei Chu, Jieh Hsiang, Chun-Chieh Cho

    Abstract: In patent prosecution, image-based retrieval systems for identifying similarities between current patent images and prior art are pivotal to ensure the novelty and non-obviousness of patent applications. Despite their growing popularity in recent years, existing attempts, while effective at recognizing images within the same patent, fail to deliver practical value due to their limited generalizabi…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 8 pages. Under review

  34. arXiv:2404.16333  [pdf, other]

    cs.SE cs.AI cs.PL

    AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation

    Authors: Zhensu Sun, Xiaoning Du, Zhou Yang, Li Li, David Lo

    Abstract: Besides humans and machines, Artificial Intelligence (AI) models have emerged to be another important audience of programming languages, as we come to the era of large language models (LLMs). LLMs can now excel at coding competitions and even program like developers to address various tasks, such as math calculation. Yet, the grammar and layout of existing programs are designed for humans. Particu…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: under review

  35. arXiv:2404.15596  [pdf, other]

    cs.SE cs.CR

    VulEval: Towards Repository-Level Evaluation of Software Vulnerability Detection

    Authors: Xin-Cheng Wen, Xinchen Wang, Yujia Chen, Ruida Hu, David Lo, Cuiyun Gao

    Abstract: Deep Learning (DL)-based methods have proven to be effective for software vulnerability detection, with a potential for substantial productivity enhancements for detecting vulnerabilities. Current methods mainly focus on detecting single functions (i.e., intra-procedural vulnerabilities), ignoring the more complex inter-procedural vulnerability detection scenarios in practice. For example, develop…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 12 pages

  36. arXiv:2404.09290  [pdf, other]

    cs.CV eess.IV

    RoofDiffusion: Constructing Roofs from Severely Corrupted Point Data via Diffusion

    Authors: Kyle Shih-Huang Lo, Jörg Peters, Eric Spellman

    Abstract: Accurate completion and denoising of roof height maps are crucial to reconstructing high-quality 3D buildings. Repairing sparse points can enhance low-cost sensor use and reduce UAV flight overlap. RoofDiffusion is a new end-to-end self-supervised diffusion technique for robustly completing, in particular difficult, roof height maps. RoofDiffusion leverages widely-available curated footprints and…

    Submitted 14 April, 2024; originally announced April 2024.

  37. arXiv:2404.07575  [pdf]

    cs.SD cs.AI eess.AS

    An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

    Authors: Tien-Hong Lo, Fu-An Chao, Tzu-I Wu, Yao-Ting Sung, Berlin Chen

    Abstract: Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner's speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems are faced with at least three data-related challenges: limited annotated data, uneven distri…

    Submitted 11 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Findings

  38. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  39. arXiv:2404.05583  [pdf, other]

    cs.CV

    Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model

    Authors: Yue-Hua Han, Tai-Ming Huang, Shu-Tzu Lo, Po-Han Huang, Kai-Lung Hua, Jun-Cheng Chen

    Abstract: With the rise of deep learning, generative models have enabled the creation of highly realistic synthetic images, presenting challenges due to their potential misuse. While research in Deepfake detection has grown rapidly in response, many detection methods struggle with unseen Deepfakes generated by new synthesis techniques. To address this generalisation challenge, we propose a novel Deepfake de…

    Submitted 5 June, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  40. arXiv:2404.04834  [pdf, ps, other]

    cs.SE

    LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead

    Authors: Junda He, Christoph Treude, David Lo

    Abstract: Integrating Large Language Models (LLMs) into autonomous agents marks a significant shift in the research landscape by offering cognitive abilities competitive to human planning and reasoning. This paper envisions the evolution of LLM-based Multi-Agent (LMA) systems in addressing complex and multi-faceted software engineering challenges. LMA systems introduce numerous benefits, including enhanced r…

    Submitted 7 April, 2024; originally announced April 2024.

  41. arXiv:2404.04567  [pdf, other]

    cs.CR cs.LG

    Optimization of Lightweight Malware Detection Models For AIoT Devices

    Authors: Felicia Lo, Shin-Ming Cheng, Rafael Kaliski

    Abstract: Malware intrusion is problematic for Internet of Things (IoT) and Artificial Intelligence of Things (AIoT) devices as they often reside in an ecosystem of connected devices, such as a smart home. If any devices are infected, the whole ecosystem can be compromised. Although various Machine Learning (ML) models are deployed to detect malware and network intrusion, generally speaking, robust high-acc…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Accepted by WF-IOT 2023, 7 pages, 4 figures

  42. arXiv:2404.04566  [pdf, other]

    cs.SE

    Efficient and Green Large Language Models for Software Engineering: Vision and the Road Ahead

    Authors: Jieke Shi, Zhou Yang, David Lo

    Abstract: Large Language Models (LLMs) have recently shown remarkable capabilities in various software engineering tasks, spurring the rapid development of the Large Language Models for Software Engineering (LLM4SE) area. However, limited attention has been paid to crafting efficient LLM4SE solutions that demand minimal time and memory resources, as well as green LLM4SE solutions that reduce energy consumpt…

    Submitted 6 April, 2024; originally announced April 2024.

  43. arXiv:2404.02525  [pdf, other]

    cs.SE

    Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead

    Authors: Xin Zhou, Sicong Cao, Xiaobing Sun, David Lo

    Abstract: The significant advancements in Large Language Models (LLMs) have resulted in their widespread adoption across various tasks within Software Engineering (SE), including vulnerability detection and repair. Numerous recent studies have investigated the application of LLMs to enhance vulnerability detection and repair tasks. Despite the increasing research interest, there is currently no existing sur…

    Submitted 6 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 11 pages

  44. arXiv:2404.01261  [pdf, other]

    cs.CL cs.AI

    FABLES: Evaluating faithfulness and content selection in book-length summarization

    Authors: Yekyung Kim, Yapei Chang, Marzena Karpinska, Aparna Garimella, Varun Manjunatha, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: While long-context large language models (LLMs) can technically summarize book-length documents (>100K tokens), the length and complexity of the documents have so far prohibited evaluations of input-dependent aspects like faithfulness. In this paper, we conduct the first large-scale human evaluation of faithfulness and content selection on LLM-generated summaries of fictional books. Our study miti…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: preprint - 39 pages

  45. arXiv:2403.19001  [pdf, other]

    cs.CV cs.AI eess.IV q-bio.NC

    Cross-domain Fiber Cluster Shape Analysis for Language Performance Cognitive Score Prediction

    Authors: Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, Fan Zhang, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell

    Abstract: Shape plays an important role in computer graphics, offering informative features to convey an object's morphology and functionality. Shape analysis in brain imaging can help interpret structural and functionality correlations of the human brain. In this work, we investigate the shape of the brain's 3D white matter connections and its potential predictive relationship to human cognitive function.…

    Submitted 29 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: 2 figures, 11 pages

  46. arXiv:2403.15675  [pdf, other]

    cs.CV

    An active learning model to classify animal species in Hong Kong

    Authors: Gareth Lamb, Ching Hei Lo, ** Wu, Calvin K. F. Lee

    Abstract: Camera traps are used by ecologists globally as an efficient and non-invasive method to monitor animals. While it is time-consuming to manually label the collected images, recent advances in deep learning and computer vision have made it possible to automate this process [1]. A major obstacle to this is the generalisability of these models when applying these images to independently collected dat…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 6 pages, 2 figures, 1 table

  47. arXiv:2403.15246  [pdf, other]

    cs.IR cs.CL cs.LG

    FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

    Authors: Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

    Abstract: Modern Language Models (LMs) are capable of following long and complex instructions that enable a large and diverse set of user requests. While Information Retrieval (IR) models use these LMs as the backbone of their architectures, virtually none of them allow users to provide detailed instructions alongside queries, thus limiting their ability to satisfy complex information needs. In this work, w…

    Submitted 7 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  48. arXiv:2403.14371  [pdf]

    cs.LG cs.AI cs.DC

    Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server

    Authors: Fei Li, Chu Kiong Loo, Wei Shiung Liew, Xiaofeng Liu

    Abstract: In federated learning, data heterogeneity significantly impacts performance. A typical solution involves segregating these parameters into shared and personalized components, a concept also relevant in multi-task learning. Addressing this, we propose "Loop Improvement" (LI), a novel method enhancing this separation and feature extraction without necessitating a central server or data interchange a…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 11 pages, 11 figures

  49. Is It Really You Who Forgot the Password? When Account Recovery Meets Risk-Based Authentication

    Authors: Andre Büttner, Andreas Thue Pedersen, Stephan Wiefling, Nils Gruschka, Luigi Lo Iacono

    Abstract: Risk-based authentication (RBA) is used in online services to protect user accounts from unauthorized takeover. RBA commonly uses contextual features that indicate a suspicious login attempt when the characteristic attributes of the login context deviate from known and thus expected values. Previous research on RBA and anomaly detection in authentication has mainly focused on the login process. Ho…

    Submitted 18 March, 2024; originally announced March 2024.

  50. arXiv:2403.11079  [pdf, other]

    cs.SE cs.LG

    Bridging Expert Knowledge with Deep Learning Techniques for Just-In-Time Defect Prediction

    Authors: Xin Zhou, DongGyun Han, David Lo

    Abstract: Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted features, and 2) complex models using deep learning techniques to automatically extract features from co…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 48 pages