Showing 1–50 of 99 results for author: Hooi, B

Searching in archive cs.
  1. arXiv:2406.13839  [pdf, other]

    q-bio.BM cs.LG q-bio.GN

    RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design

    Authors: Rishabh Anand, Chaitanya K. Joshi, Alex Morehead, Arian R. Jamasb, Charles Harris, Simon V. Mathis, Kieran Didi, Bryan Hooi, Pietro Liò

    Abstract: We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone design. We build upon SE(3) flow matching for protein backbone generation and establish protocols for data preparation and evaluation to address unique challenges posed by RNA modeling. We formulate RNA structures as a set of rigid-body frames and associated loss functions which account for larger, more conformationally fle…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: To be presented as an Oral at ICML 2024 Structured Probabilistic Inference & Generative Modeling Workshop, and a Spotlight at ICML 2024 AI4Science Workshop

  2. arXiv:2406.04975  [pdf, other]

    cs.LG cs.AI

    UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting

    Authors: Juncheng Liu, Chenghao Liu, Gerald Woo, Yiwei Wang, Bryan Hooi, Caiming Xiong, Doyen Sahoo

    Abstract: Transformer-based models have emerged as powerful tools for multivariate time series forecasting (MTSF). However, existing Transformer models often fall short of capturing both intricate dependencies across variate and temporal dimensions in MTS data. Some recent models are proposed to separately capture variate and temporal dependencies through either two sequential or parallel attention mechanis…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.13873  [pdf, other]

    cs.AI cs.CL

    FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering

    Authors: Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi

    Abstract: While large language models (LLMs) have achieved significant success in various applications, they often struggle with hallucinations, especially in scenarios that require deep and responsible reasoning. These issues could be partially mitigated by integrating external knowledge graphs (KG) in LLM reasoning. However, the method of their incorporation is still largely unexplored. In this paper, we p…

    Submitted 22 May, 2024; originally announced May 2024.

  4. arXiv:2403.02253  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

    Authors: Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, Nay Oo, Hoon Wei Lim, Bryan Hooi

    Abstract: Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that the…

    Submitted 15 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by USENIX Security 2024

  5. arXiv:2402.15300  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding

    Authors: Ailin Deng, Zhirui Chen, Bryan Hooi

    Abstract: Large Vision-Language Models (LVLMs) are susceptible to object hallucinations, an issue in which their generated text contains non-existent objects, greatly limiting their reliability and practicality. Current approaches often rely on the model's token likelihoods or other internal information, instruction tuning on additional datasets, or incorporating complex external tools. We first perform emp…

    Submitted 23 April, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Code URL: https://github.com/d-ailin/CLIP-Guided-Decoding

  6. arXiv:2402.13630  [pdf, other]

    cs.LG

    UniGraph: Learning a Cross-Domain Graph Foundation Model From Natural Language

    Authors: Yufei He, Bryan Hooi

    Abstract: Foundation models like ChatGPT and GPT-4 have revolutionized artificial intelligence, exhibiting remarkable abilities to generalize across a wide array of tasks and applications beyond their initial training objectives. However, when this concept is applied to graph learning, a stark contrast emerges. Graph learning has predominantly focused on single-graph models, tailored to specific tasks or da…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 16 pages, 1 figure. Preliminary work

  7. arXiv:2402.11816  [pdf, other]

    cs.CV cs.LG

    Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

    Authors: Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi

    Abstract: Self-Supervised Contrastive Learning has proven effective in deriving high-quality representations from unlabeled data. However, a major challenge that hinders both unimodal and multimodal contrastive learning is feature suppression, a phenomenon where the trained model captures only a limited portion of the information from the input data while overlooking other potentially valuable content. This…

    Submitted 11 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  8. arXiv:2402.07630  [pdf, other]

    cs.LG

    G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering

    Authors: Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V. Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, Bryan Hooi

    Abstract: Given a graph with textual attributes, we enable users to "chat with their graph": that is, to ask questions about the graph using a conversational interface. In response to a user's questions, our method provides textual replies and highlights the relevant parts of the graph. While existing works integrate large language models (LLMs) and graph neural networks (GNNs) in various ways, they mostly…

    Submitted 27 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  9. arXiv:2402.03271  [pdf, other]

    cs.CL cs.AI cs.LG

    Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

    Authors: Zhiyuan Hu, Chumin Liu, Xidong Feng, Yilun Zhao, See-Kiong Ng, Anh Tuan Luu, Junxian He, Pang Wei Koh, Bryan Hooi

    Abstract: In the face of uncertainty, the ability to *seek information* is of fundamental importance. In many practical applications, such as medical diagnosis and troubleshooting, the information needed to solve the task is not initially given and has to be actively sought by asking follow-up questions (for example, a doctor asking a patient for more details about their symptoms). In this work, we introduc…

    Submitted 30 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Update Results

  10. arXiv:2311.18158  [pdf, other]

    cs.CV

    HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation

    Authors: Yifan Zhang, Bryan Hooi

    Abstract: Diffusion models have revolutionized text-to-image generation, but their real-world applications are hampered by the extensive time needed for hundreds of diffusion steps. Although progressive distillation has been proposed to speed up diffusion sampling to 2-8 steps, it still falls short in one-step generation, and necessitates training multiple student models, which is highly parameter-extensive…

    Submitted 29 November, 2023; originally announced November 2023.

  11. arXiv:2311.09101  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Towards A Unified View of Answer Calibration for Multi-Step Reasoning

    Authors: Shumin Deng, Ningyu Zhang, Nay Oo, Bryan Hooi

    Abstract: Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have broadened the scope for improving multi-step reasoning capabilities. We generally divide multi-step reasoning into two phases: path generation to generate the reasoning path(s); and answer calibration post-processing the reasoning path(s) to obtain a final answer. However, the existing literature lacks systematic analysis…

    Submitted 25 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Work in Progress

  12. arXiv:2310.14481  [pdf, other]

    cs.LG cs.SI

    Efficient Heterogeneous Graph Learning via Random Projection

    Authors: Jun Hu, Bryan Hooi, Bingsheng He

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs. Typical HGNNs require repetitive message passing during training, limiting efficiency for large-scale real-world graphs. Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors, enabling efficient mini-batch training. Exist…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  13. arXiv:2310.13206  [pdf, other]

    cs.CL cs.AI

    Primacy Effect of ChatGPT

    Authors: Yiwei Wang, Yujun Cai, Muhao Chen, Yuxuan Liang, Bryan Hooi

    Abstract: Instruction-tuned large language models (LLMs), such as ChatGPT, have led to promising zero-shot performance in discriminative natural language understanding (NLU) tasks. This involves querying the LLM using a prompt containing the question, and the candidate labels to choose from. The question-answering capabilities of ChatGPT arise from its pre-training on large amounts of human-written text, as…

    Submitted 14 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 short paper

  14. arXiv:2310.10830  [pdf, other]

    cs.CL

    Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks

    Authors: Jiaying Wu, Bryan Hooi

    Abstract: It is commonly perceived that online fake news and reliable news exhibit stark differences in writing styles, such as the use of sensationalist versus objective language. However, we emphasize that style-related features can also be exploited for style-based attacks. Notably, the rise of powerful Large Language Models (LLMs) has enabled malicious users to mimic the style of trustworthy news outlet…

    Submitted 16 October, 2023; originally announced October 2023.

  15. arXiv:2310.09751  [pdf, other]

    cs.LG

    UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting

    Authors: Xu Liu, Junfeng Hu, Yuan Li, Shizhe Diao, Yuxuan Liang, Bryan Hooi, Roger Zimmermann

    Abstract: Multivariate time series forecasting plays a pivotal role in contemporary web technologies. In contrast to conventional methods that involve creating dedicated models for specific time series application domains, this research advocates for a unified model paradigm that transcends domain boundaries. However, learning an effective cross-domain model presents the following challenges. First, various…

    Submitted 23 February, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

  16. arXiv:2310.07478  [pdf, other]

    cs.AI

    Multimodal Graph Learning for Generative Tasks

    Authors: Minji Yoon, Jing Yu Koh, Bryan Hooi, Ruslan Salakhutdinov

    Abstract: Multimodal learning combines multiple data modalities, broadening the types and complexity of data our models can utilize: for example, from plain text to image-caption pairs. Most multimodal learning algorithms focus on modeling simple one-to-one pairs of data from two modalities, such as image-caption pairs, or audio-text pairs. However, in most real-world settings, entities of different modalit…

    Submitted 12 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  17. arXiv:2310.02124  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG cs.MA

    Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View

    Authors: Jintian Zhang, Xin Xu, Ningyu Zhang, Ruibo Liu, Bryan Hooi, Shumin Deng

    Abstract: As Natural Language Processing (NLP) systems are increasingly employed in intricate social environments, a pressing query emerges: Can these NLP systems mirror human-esque collaborative intelligence, in a multi-agent society consisting of multiple large language models (LLMs)? This paper probes the collaboration mechanisms among contemporary NLP systems by melding practical experiments with theore…

    Submitted 27 May, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ACL 2024 Main Conference. 64 pages (8 main), 70 figures, 37 tables. Blog: https://www.zjukg.org/project/MachineSoM

  18. arXiv:2309.16424  [pdf, other]

    cs.CL cs.AI cs.SI

    Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection

    Authors: Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi

    Abstract: Despite considerable advances in automated fake news detection, due to the timely nature of news, it remains a critical open question how to effectively predict the veracity of news articles based on limited fact-checks. Existing approaches typically follow a "Train-from-Scratch" paradigm, which is fundamentally bounded by the availability of large-scale annotated data. While expressive pre-traine…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted to CIKM 2023 (Full Paper)

  19. arXiv:2309.08949  [pdf, other]

    cs.CL

    Enhancing Large Language Model Induced Task-Oriented Dialogue Systems Through Look-Forward Motivated Goals

    Authors: Zhiyuan Hu, Yue Feng, Yang Deng, Zekun Li, See-Kiong Ng, Anh Tuan Luu, Bryan Hooi

    Abstract: Recently, the development of large language models (LLMs) has significantly enhanced question answering and dialogue generation, making them increasingly popular in current practical scenarios. Unlike general dialogue systems, which emphasize semantic performance, task-oriented dialogue (ToD) systems aim to achieve the dialogue goal efficiently and successfully…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: 7 Pages

  20. arXiv:2308.13821  [pdf, other]

    cs.LG cs.AI

    A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions

    Authors: Zemin Liu, Yuan Li, Nan Chen, Qian Wang, Bryan Hooi, Bingsheng He

    Abstract: Graphs represent interconnected structures prevalent in a myriad of real-world scenarios. Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data, underpinning various tasks including node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess…

    Submitted 29 August, 2023; v1 submitted 26 August, 2023; originally announced August 2023.

    Comments: The collection of awesome literature on imbalanced learning on graphs: https://github.com/Xtra-Computing/Awesome-Literature-ILoGs

  21. arXiv:2307.11572  [pdf, other]

    cs.SI

    Prompt-Based Zero- and Few-Shot Node Classification: A Multimodal Approach

    Authors: Yuexin Li, Bryan Hooi

    Abstract: Multimodal data empowers machine learning models to better understand the world from various perspectives. In this work, we study the combination of *text and graph* modalities, a challenging but understudied combination which is prevalent across multiple settings including citation networks, social media, and the web. We focus on the popular task of node classification using limited labels;…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Work in progress

  22. DECOR: Degree-Corrected Social Graph Refinement for Fake News Detection

    Authors: Jiaying Wu, Bryan Hooi

    Abstract: Recent efforts in fake news detection have witnessed a surge of interest in using graph neural networks (GNNs) to exploit rich social context. Existing studies generally leverage fixed graph structures, assuming that the graphs accurately represent the related social engagements. However, edge noise remains a critical challenge in real-world graphs, as training on suboptimal structures can severel…

    Submitted 30 June, 2023; originally announced July 2023.

    Comments: Accepted to KDD 2023 (Research Track)

  23. arXiv:2306.13063  [pdf, other]

    cs.CL

    Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

    Authors: Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi

    Abstract: Empowering large language models to accurately express confidence in their answers is essential for trustworthy decision-making. Previous confidence elicitation methods, which primarily rely on white-box access to internal model information or model fine-tuning, have become less suitable for LLMs, especially closed-source commercial APIs. This leads to a growing need to explore the untapped area o…

    Submitted 17 March, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: The paper is accepted by ICLR 2024. The code is publicly available at https://github.com/MiaoXiong2320/llm-uncertainty

  24. Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System

    Authors: Zhiyuan Hu, Yue Feng, Anh Tuan Luu, Bryan Hooi, Aldo Lipani

    Abstract: Dialogue systems and large language models (LLMs) have gained considerable attention. However, the direct utilization of LLMs as task-oriented dialogue (TOD) models has been found to underperform compared to smaller task-specific models. Nonetheless, it is crucial to acknowledge the significant potential of LLMs and explore improved approaches for leveraging their impressive abilities. Motivated b…

    Submitted 19 October, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by CIKM 2023

  25. arXiv:2306.08456  [pdf, other]

    cs.CL

    PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation

    Authors: Zhiyuan Hu, Chumin Liu, Yue Feng, Anh Tuan Luu, Bryan Hooi

    Abstract: Controllable text generation is a challenging and meaningful field in natural language generation (NLG). Especially, poetry generation is a typical one with well-defined and strict conditions for text generation which is an ideal playground for the assessment of current methodologies. While prior works succeeded in controlling either semantic or metrical aspects of poetry generation, simultaneousl…

    Submitted 19 December, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted by AAAI 2024

  26. arXiv:2306.08259  [pdf, other]

    cs.LG

    LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting

    Authors: Xu Liu, Yutong Xia, Yuxuan Liang, Junfeng Hu, Yiwei Wang, Lei Bai, Chao Huang, Zhenguang Liu, Bryan Hooi, Roger Zimmermann

    Abstract: Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning in capturing non-linear patterns of traffic data. However, the promising results achieved on current public datasets may not be applicable to practical scenarios due to limitations within these datasets. First, the limited sizes of them may not…

    Submitted 28 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

  27. arXiv:2306.04590  [pdf, other]

    cs.LG cs.AI

    Proximity-Informed Calibration for Deep Neural Networks

    Authors: Miao Xiong, Ailin Deng, Pang Wei Koh, Jiaying Wu, Shen Li, Jianqing Xu, Bryan Hooi

    Abstract: Confidence calibration is central to providing accurate and interpretable uncertainty estimates, especially under safety-critical scenarios. However, we find that existing calibration algorithms often overlook the issue of *proximity bias*, a phenomenon where models tend to be more overconfident in low proximity data (i.e., data lying in the sparse region of the data distribution) compared to high…

    Submitted 17 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: The paper is accepted by NeurIPS 2023. The code is available at: https://github.com/MiaoXiong2320/ProximityBias-Calibration

  28. arXiv:2306.00015  [pdf, other]

    cs.LG cs.AI

    GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

    Authors: Yuwen Li, Miao Xiong, Bryan Hooi

    Abstract: Label errors have been found to be prevalent in popular text, vision, and audio datasets, which heavily influence the safe development and evaluation of machine learning algorithms. Despite increasing efforts towards improving the quality of generic data types, such as images and texts, the problem of mislabel detection in graph data remains underexplored. To bridge the gap, we explore mislabellin…

    Submitted 30 May, 2023; originally announced June 2023.

    Comments: ICML 2023

  29. arXiv:2305.19523  [pdf, other]

    cs.LG

    Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning

    Authors: Xiaoxin He, Xavier Bresson, Thomas Laurent, Adam Perold, Yann LeCun, Bryan Hooi

    Abstract: Representation learning on text-attributed graphs (TAGs) has become a critical research problem in recent years. A typical example of a TAG is a paper citation graph, where the text of each paper serves as node attributes. Initial graph neural network (GNN) pipelines handled these text attributes by transforming them into shallow or hand-crafted features, such as skip-gram or bag-of-words features…

    Submitted 6 March, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: In Proceedings of ICLR 2024

  30. arXiv:2305.13617  [pdf, other]

    cs.CL cs.AI cs.LG

    SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres

    Authors: Shumin Deng, Shengyu Mao, Ningyu Zhang, Bryan Hooi

    Abstract: Event-centric structured prediction involves predicting structured outputs of events. In most NLP cases, event structures are complex with manifold dependency, and it is challenging to effectively represent these complicated structured events. To address these issues, we propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH). SPEECH models complex dependency among even…

    Submitted 18 September, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main Conference. Code is released at https://github.com/zjunlp/SPEECH

  31. arXiv:2305.13551  [pdf, other]

    cs.CL cs.AI

    How Fragile is Relation Extraction under Entity Replacements?

    Authors: Yiwei Wang, Bryan Hooi, Fei Wang, Yujun Cai, Yuxuan Liang, Wenxuan Zhou, Jing Tang, Manjuan Duan, Muhao Chen

    Abstract: Relation extraction (RE) aims to extract the relations between entity names from the textual context. In principle, textual context determines the ground-truth relation and the RE models should be able to correctly identify the relations reflected by the textual context. However, existing work has found that the RE models memorize the entity name patterns to make RE predictions while ignoring the…

    Submitted 7 May, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

  32. arXiv:2305.06102  [pdf, other]

    cs.LG cs.AI

    Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering

    Authors: Mingqi Yang, Wenjie Feng, Yanming Shen, Bryan Hooi

    Abstract: Proposing an effective and flexible matrix to represent a graph is a fundamental challenge that has been explored from multiple perspectives, e.g., filtering in Graph Fourier Transforms. In this work, we develop a novel and general framework which unifies many existing GNN models from the view of parameterized decomposition and filtering, and show how it helps to enhance the flexibility of GNNs wh…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  33. arXiv:2305.01481  [pdf, other]

    cs.LG cs.AI cs.CV

    Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement

    Authors: Ailin Deng, Miao Xiong, Bryan Hooi

    Abstract: Reliable application of machine learning is of primary importance to the practical deployment of deep learning methods. A fundamental challenge is that models are often unreliable due to overconfidence. In this paper, we estimate a model's reliability by measuring *the agreement between its latent space, and the latent space of a foundation model*. However, it is challenging to measure the ag…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  34. arXiv:2304.08640  [pdf, other]

    cs.LG cs.SI

    TAP: A Comprehensive Data Repository for Traffic Accident Prediction in Road Networks

    Authors: Baixiang Huang, Bryan Hooi, Kai Shu

    Abstract: Road safety is a major global public health concern. Effective traffic crash prediction can play a critical role in reducing road traffic accidents. However, existing machine learning approaches tend to focus on predicting traffic accidents in isolation, without considering the potential relationships between different accident locations within road networks. To incorporate graph structure informa…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: 10 pages, 5 figures

  35. arXiv:2302.13053  [pdf, other]

    cs.LG cs.AI cs.IR

    Scalable Neural Network Training over Distributed Graphs

    Authors: Aashish Kolluri, Sarthak Choudhary, Bryan Hooi, Prateek Saxena

    Abstract: Graph neural networks (GNNs) fuel diverse machine learning tasks involving graph-structured data, ranging from predicting protein structures to serving personalized recommendations. Real-world graph data must often be stored distributed across many machines not just because of capacity constraints, but because of compliance with data residency or privacy laws. In such setups, network communication…

    Submitted 11 February, 2024; v1 submitted 25 February, 2023; originally announced February 2023.

  36. arXiv:2302.02628  [pdf, other]

    cs.LG cs.AI

    Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness

    Authors: Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi

    Abstract: Trustworthy machine learning is of primary importance to the practical deployment of deep learning models. While state-of-the-art models achieve astonishingly good performance in terms of accuracy, recent literature reveals that their predictive confidence scores unfortunately cannot be trusted: e.g., they are often overconfident when wrong predictions are made, or so even for obvious outliers. In…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: European Conference on Computer Vision 2022

  37. arXiv:2301.12603  [pdf, other]

    cs.LG cs.SI

    Do We Really Need Graph Neural Networks for Traffic Forecasting?

    Authors: Xu Liu, Yuxuan Liang, Chao Huang, Hengchang Hu, Yushi Cao, Bryan Hooi, Roger Zimmermann

    Abstract: Spatio-temporal graph neural networks (STGNN) have become the most popular solution to traffic forecasting. While successful, they rely on the message passing scheme of GNNs to establish spatial dependencies between nodes, and thus inevitably inherit GNNs' notorious inefficiency. Given these facts, in this paper, we propose an embarrassingly simple yet remarkably effective spatio-temporal learning…

    Submitted 29 January, 2023; originally announced January 2023.

  38. arXiv:2212.13350  [pdf, other]

    cs.CV

    A Generalization of ViT/MLP-Mixer to Graphs

    Authors: Xiaoxin He, Bryan Hooi, Thomas Laurent, Adam Perold, Yann LeCun, Xavier Bresson

    Abstract: Graph Neural Networks (GNNs) have shown great potential in the field of graph representation learning. Standard GNNs define a local message-passing mechanism which propagates information over the whole graph domain by stacking multiple layers. This paradigm suffers from two major limitations, over-squashing and poor long-range dependencies, that can be solved using global attention but significant…

    Submitted 30 May, 2023; v1 submitted 26 December, 2022; originally announced December 2022.

    Comments: In Proceedings of ICML 2023

  39. arXiv:2211.16466  [pdf, other]

    cs.LG cs.AI cs.CV

    Birds of a Feather Trust Together: Knowing When to Trust a Classifier via Adaptive Neighborhood Aggregation

    Authors: Miao Xiong, Shen Li, Wenjie Feng, Ailin Deng, Jihai Zhang, Bryan Hooi

    Abstract: How do we know when the predictions made by a classifier can be trusted? This is a fundamental problem that also has immense practical applicability, especially in safety-critical areas such as medicine and autonomous driving. The de facto approach of using the classifier's softmax outputs as a proxy for trustworthiness suffers from the over-confidence issue; while the most recent works incur prob…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR) 2022

    Journal ref: Transactions on Machine Learning Research 08/2022

  40. arXiv:2211.13976  [pdf, other]

    cs.CV cs.LG

    Expanding Small-Scale Datasets with Guided Imagination

    Authors: Yifan Zhang, Daquan Zhou, Bryan Hooi, Kai Wang, Jiashi Feng

    Abstract: The power of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data on a large scale is often expensive and time-consuming. To address this issue, we explore a new task, termed dataset expansion, aimed at expanding a ready-to-use small dataset by automatically creating new labeled samples. To this end, we present a Guided Imagination Framework (GI…

    Submitted 10 October, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2023. Source code: https://github.com/Vanint/DatasetExpansion

  41. arXiv:2211.06977  [pdf, ps, other]

    cs.DB

    Spade: A Real-Time Fraud Detection Framework on Evolving Graphs (Complete Version)

    Authors: Jiaxin Jiang, Yuan Li, Bingsheng He, Bryan Hooi, Jia Chen, Johan Kok Zhi Kang

    Abstract: Real-time fraud detection is a challenge for most financial and electronic commercial platforms. To identify fraudulent communities, Grab, one of the largest technology companies in Southeast Asia, forms a graph from a set of transactions and detects dense subgraphs arising from abnormally large numbers of connections among fraudsters. Existing dense subgraph detection approaches focus on static g…

    Submitted 13 November, 2022; originally announced November 2022.

  42. arXiv:2210.13153  [pdf, other]

    cs.LG cs.AI

    Reachability-Aware Laplacian Representation in Reinforcement Learning

    Authors: Kaixin Wang, Kuangqi Zhou, Jiashi Feng, Bryan Hooi, Xinchao Wang

    Abstract: In Reinforcement Learning (RL), Laplacian Representation (LapRep) is a task-agnostic state representation that encodes the geometry of the environment. A desirable property of LapRep stated in prior works is that the Euclidean distance in the LapRep space roughly reflects the reachability between states, which motivates the usage of this distance for reward shaping. However, we find that LapRep do…

    Submitted 24 October, 2022; originally announced October 2022.

  43. arXiv:2210.08353  [pdf, other]

    cs.LG cs.AI

    MGNNI: Multiscale Graph Neural Networks with Implicit Layers

    Authors: Juncheng Liu, Bryan Hooi, Kenji Kawaguchi, Xiaokui Xiao

    Abstract: Recently, implicit graph neural networks (GNNs) have been proposed to capture long-range dependencies in underlying graphs. In this paper, we introduce and justify two weaknesses of implicit GNNs: the constrained expressiveness due to their limited effective range for capturing long-range dependencies, and their lack of ability to capture multiscale information on graphs at multiple resolutions. T…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  44. arXiv:2209.15214  [pdf, other]

    cs.AI cs.CL cs.IR cs.LG

    Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph

    Authors: Shumin Deng, Chengming Wang, Zhoubo Li, Ningyu Zhang, Zelin Dai, Hehong Chen, Feiyu Xiong, Ming Yan, Qiang Chen, Mosha Chen, Jiaoyan Chen, Jeff Z. Pan, Bryan Hooi, Huajun Chen

    Abstract: Business Knowledge Graphs (KGs) are important to many enterprises today, providing factual knowledge and structured data that steer many products and make them more intelligent. Despite their promising benefits, building business KG necessitates solving prohibitive issues of deficient structure and multiple modalities. In this paper, we advance the understanding of the practical challenges related…

    Submitted 19 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: OpenBG. Accepted by ICDE 2023. The project is released at https://github.com/OpenBGBenchmark/OpenBG . Website: https://kg.alibaba.com/ , Leaderboard: https://tianchi.aliyun.com/dataset/dataDetail?dataId=122271

  45. arXiv:2209.12162  [pdf, other]

    cs.IR cs.AI cs.LG cs.SI

    Joint Triplet Loss Learning for Next New POI Recommendation

    Authors: Nicholas Lim, Bryan Hooi, See-Kiong Ng, Yong Liang Goh

    Abstract: Sparsity of the User-POI matrix is a well established problem for next POI recommendation, which hinders effective learning of user preferences. Focusing on a more granular extension of the problem, we propose a Joint Triplet Loss Learning (JTLL) module for the Next New ($N^2$) POI recommendation task, which is more challenging. Our JTLL module first computes additional training samples from the u…

    Submitted 25 September, 2022; originally announced September 2022.

  46. arXiv:2209.10100  [pdf, other]

    cs.SI cs.AI cs.LG

    Flashlight: Scalable Link Prediction with Effective Decoders

    Authors: Yiwei Wang, Bryan Hooi, Yozen Liu, Tong Zhao, Zhichun Guo, Neil Shah

    Abstract: Link prediction (LP) has been recognized as an important task in graph learning with its broad practical applications. A typical application of LP is to retrieve the top scoring neighbors for a given source node, such as the friend recommendation. These services desire the high inference scalability to find the top scoring neighbors from many candidate nodes at low latencies. There are two popular…

    Submitted 3 December, 2022; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2112.02936 by other authors

  47. arXiv:2209.08799  [pdf, other]

    cs.SI cs.AI cs.CL

    Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks

    Authors: Jiaying Wu, Bryan Hooi

    Abstract: As social media becomes a hotbed for the spread of misinformation, the crucial task of rumor detection has witnessed promising advances fostered by open-source benchmark datasets. Despite being widely used, we find that these datasets suffer from spurious correlations, which are ignored by existing studies and lead to severe overestimation of existing rumor detection performance. The spurious corr…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted to ECML-PKDD 2022

  48. arXiv:2208.10753  [pdf, other]

    cs.CV cs.LG

    Neural PCA for Flow-Based Representation Learning

    Authors: Shen Li, Bryan Hooi

    Abstract: Of particular interest is to discover useful representations solely from observations in an unsupervised generative manner. However, the question of whether existing normalizing flows provide effective representations for downstream tasks remains mostly unanswered despite their strong ability for sample generation and density estimation. This paper investigates this problem for such a family of ge…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted to IJCAI 2022

  49. arXiv:2208.08609  [pdf, other]

    cs.AI cs.FL cs.LG cs.SC

    A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables

    Authors: Adrien Benamira, Tristan Guérand, Thomas Peyrin, Trevor Yap, Bryan Hooi

    Abstract: We propose $\mathcal{T}$ruth $\mathcal{T}$able net ($\mathcal{TT}$net), a novel Convolutional Neural Network (CNN) architecture that addresses, by design, the open challenges of interpretability, formal verification, and logic gate conversion. $\mathcal{TT}$net is built using CNNs' filters that are equivalent to tractable truth tables and that we call Learning Truth Table (LTT) blocks. The dual fo…

    Submitted 2 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

  50. arXiv:2206.07604  [pdf, other]

    cs.LG cs.AI

    ARES: Locally Adaptive Reconstruction-based Anomaly Scoring

    Authors: Adam Goodge, Bryan Hooi, See Kiong Ng, Wee Siong Ng

    Abstract: How can we detect anomalies: that is, samples that significantly differ from a given set of high-dimensional data, such as images or sensor data? This is a practical problem with numerous applications and is also relevant to the goal of making learning algorithms more robust to unexpected inputs. Autoencoders are a popular approach, partly due to their simplicity and their ability to perform dimen…

    Submitted 15 June, 2022; originally announced June 2022.

    Journal ref: ECMLPKDD2022