Skip to main content

Showing 1–50 of 191 results for author: Shih

Searching in archive cs. Search in all archives.
.
  1. Efficient IoT Devices Localization Through Wi-Fi CSI Feature Fusion and Anomaly Detection

    Authors: Yan Li, Jie Yang, Shang-Ling Shih, Wan-Ting Shih, Chao-Kai Wen, Shi **

    Abstract: Internet of Things (IoT) device localization is fundamental to smart home functionalities, including indoor navigation and tracking of individuals. Traditional localization relies on relative methods utilizing the positions of anchors within a home environment, yet struggles with precision due to inherent inaccuracies in these anchor positions. In response, we introduce a cutting-edge smartphone-b… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted in IEEE Internet of Things Journal, Early Access, 2024

    Journal ref: IEEE Internet of Things Journal, Early Access, 2024

  2. arXiv:2406.18087  [pdf, other

    cs.SE cs.AI cs.CL

    EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models

    Authors: Chun-Chieh Liao, Wei-Ting Kuo, I-Hsuan Hu, Yen-Chen Shih, Jun-En Ding, Feng Liu, Fang-Ming Hung

    Abstract: Traditional diagnosis of chronic diseases involves in-person consultations with physicians to identify the disease. However, there is a lack of research focused on predicting and develo** application systems using clinical notes and blood test values. We collected five years of Electronic Health Records (EHRs) from Taiwan's hospital database between 2017 and 2021 as an AI database. Furthermore,… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.13578  [pdf, other

    cs.CL

    Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration

    Authors: Han-Cheng Yu, Yu-An Shih, Kin-Man Law, Kai-Yu Hsieh, Yu-Chen Cheng, Hsin-Chih Ho, Zih-An Lin, Wen-Chuan Hsu, Yao-Chung Fan

    Abstract: In this paper, we tackle the task of distractor generation (DG) for multiple-choice questions. Our study introduces two key designs. First, we propose \textit{retrieval augmented pretraining}, which involves refining the language model pretraining to align it more closely with the downstream task of DG. Second, we explore the integration of knowledge graphs to enhance the performance of DG. Throug… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Findings at ACL 2024

  4. arXiv:2406.12209  [pdf, other

    cs.SD cs.CL eess.AS

    Interface Design for Self-Supervised Speech Models

    Authors: Yi-Jen Shih, David Harwath

    Abstract: Self-supervised speech (SSL) models have recently become widely adopted for many downstream speech processing tasks. The general usage pattern is to employ SSL models as feature extractors, and then train a downstream prediction head to solve a specific task. However, different layers of SSL models have been shown to capture different types of information, and the methods of combining them are not… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech2024

  5. arXiv:2406.09395  [pdf, other

    cs.CV

    Modeling Ambient Scene Dynamics for Free-view Synthesis

    Authors: Meng-Li Shih, Jia-Bin Huang, Changil Kim, Rajvi Shah, Johannes Kopf, Chen Gao

    Abstract: We introduce a novel method for dynamic free-view synthesis of an ambient scenes from a monocular capture bringing a immersive quality to the viewing experience. Our method builds upon the recent advancements in 3D Gaussian Splatting (3DGS) that can faithfully reconstruct complex static scenes. Previous attempts to extend 3DGS to represent dynamics have been confined to bounded scenes or require m… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.06133  [pdf, other

    cs.CV

    ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models

    Authors: Meng-Li Shih, Wei-Chiu Ma, Lorenzo Boyice, Aleksander Holynski, Forrester Cole, Brian L. Curless, Janne Kontkanen

    Abstract: We propose ExtraNeRF, a novel method for extrapolating the range of views handled by a Neural Radiance Field (NeRF). Our main idea is to leverage NeRFs to model scene-specific, fine-grained details, while capitalizing on diffusion models to extrapolate beyond our observed data. A key ingredient is to track visibility to determine what portions of the scene have not been observed, and focus on reco… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 8 pages, 8 figures, CVPR2024

  7. arXiv:2406.01033  [pdf

    cs.CV cs.LG cs.MM

    Generalized Jersey Number Recognition Using Multi-task Learning With Orientation-guided Weight Refinement

    Authors: Yung-Hui Lin, Yu-Wen Chang, Huang-Chia Shih, Takahiro Ogawa

    Abstract: Jersey number recognition (JNR) has always been an important task in sports analytics. Improving recognition accuracy remains an ongoing challenge because images are subject to blurring, occlusion, deformity, and low resolution. Recent research has addressed these problems using number localization and optical character recognition. Some approaches apply player identification schemes to image sequ… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages, 6 figures, 5 tables

  8. arXiv:2406.00936  [pdf, other

    cs.CL

    A Survey of Useful LLM Evaluation

    Authors: Ji-Lun Peng, Sijia Cheng, Egil Diau, Yung-Yu Shih, Po-Heng Chen, Yen-Ting Lin, Yun-Nung Chen

    Abstract: LLMs have gotten attention across various research domains due to their exceptional performance on a wide range of complex tasks. Therefore, refined methods to evaluate the capabilities of LLMs are needed to determine the tasks and responsibility they should undertake. Our study mainly discussed how LLMs, as useful tools, should be effectively assessed. We proposed the two-stage framework: from ``… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  9. arXiv:2405.20407  [pdf, other

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Convolutional L2LFlows: Generating Accurate Showers in Highly Granular Calorimeters Using Convolutional Normalizing Flows

    Authors: Thorsten Buss, Frank Gaede, Gregor Kasieczka, Claudius Krause, David Shih

    Abstract: In the quest to build generative surrogate models as computationally efficient alternatives to rule-based simulations, the quality of the generated samples remains a crucial frontier. So far, normalizing flows have been among the models with the best fidelity. However, as the latent space in such models is required to have the same dimensionality as the data space, scaling up normalizing flows to… ▽ More

    Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Report number: HEPHY-ML-24-02

  10. arXiv:2405.19595  [pdf

    cs.CV

    The RSNA Abdominal Traumatic Injury CT (RATIC) Dataset

    Authors: Jeffrey D. Rudie, Hui-Ming Lin, Robyn L. Ball, Sabeena Jalal, Luciano M. Prevedello, Savvas Nicolaou, Brett S. Marinelli, Adam E. Flanders, Kirti Magudia, George Shih, Melissa A. Davis, John Mongan, Peter D. Chang, Ferco H. Berger, Sebastiaan Hermans, Meng Law, Tyler Richards, Jan-Peter Grunz, Andreas Steven Kunz, Shobhit Mathur, Sandro Galea-Soler, Andrew D. Chung, Saif Afat, Chin-Chi Kuo, Layal Aweidah , et al. (15 additional authors not shown)

    Abstract: The RSNA Abdominal Traumatic Injury CT (RATIC) dataset is the largest publicly available collection of adult abdominal CT studies annotated for traumatic injuries. This dataset includes 4,274 studies from 23 institutions across 14 countries. The dataset is freely available for non-commercial use via Kaggle at https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection. Created for the… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 40 pages, 2 figures, 3 tables

  11. arXiv:2405.19166  [pdf, other

    cs.LG cs.AI

    Transformers as Neural Operators for Solutions of Differential Equations with Finite Regularity

    Authors: Benjamin Shih, Ahmad Peyvan, Zhongqiang Zhang, George Em Karniadakis

    Abstract: Neural operator learning models have emerged as very effective surrogates in data-driven methods for partial differential equations (PDEs) across different applications from computational science and engineering. Such operator learning models not only predict particular instances of a physical or biological system in real-time but also forecast classes of solutions corresponding to a distribution… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  12. arXiv:2405.02260  [pdf, other

    cs.HC

    Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows

    Authors: Jasmine Y. Shih, Vishal Mohanty, Yannis Katsis, Hariharan Subramonyam

    Abstract: Domain experts can play a crucial role in guiding data scientists to optimize machine learning models while ensuring contextual relevance for downstream use. However, in current workflows, such collaboration is challenging due to differing expertise, abstract documentation practices, and lack of access and visibility into low-level implementation artifacts. To address these challenges and enable d… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  13. arXiv:2403.15528  [pdf, other

    eess.IV cs.AI cs.CV

    Evaluating GPT-4 with Vision on Detection of Radiological Findings on Chest Radiographs

    Authors: Yiliang Zhou, Hanley Ong, Patrick Kennedy, Carol Wu, Jacob Kazam, Keith Hentel, Adam Flanders, George Shih, Yifan Peng

    Abstract: The study examines the application of GPT-4V, a multi-modal large language model equipped with visual recognition, in detecting radiological findings from a set of 100 chest radiographs and suggests that GPT-4V is currently not ready for real-world diagnostic usage in interpreting chest radiographs.

    Submitted 12 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  14. arXiv:2403.14711  [pdf, other

    cs.CY cs.AI cs.HC cs.LG

    Human-in-the-Loop AI for Cheating Ring Detection

    Authors: Yong-Siang Shih, Manqian Liao, Ruidong Liu, Mirza Basim Baig

    Abstract: Online exams have become popular in recent years due to their accessibility. However, some concerns have been raised about the security of the online exams, particularly in the context of professional cheating services aiding malicious test takers in passing exams, forming so-called "cheating rings". In this paper, we introduce a human-in-the-loop AI cheating ring detection system designed to dete… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted to the AI4Ed Workshop at AAAI 2024 as a short paper

  15. arXiv:2403.13178  [pdf, other

    stat.ML cs.AI cs.LG

    Fast Value Tracking for Deep Reinforcement Learning

    Authors: Frank Shih, Faming Liang

    Abstract: Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interacts with their environment. However, existing algorithms often view these problem as static, focusing on point estimates for model parameters to maximize expected rewards, neglecting the stochastic dynamics of agent-environment interactions and the critical role of uncertainty quantification. Our… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

  16. arXiv:2403.09188  [pdf

    cs.LG eess.SP

    Design of an basis-projected layer for sparse datasets in deep learning training using gc-ms spectra as a case study

    Authors: Yu Tang Chang, Shih Fang Chen

    Abstract: Deep learning (DL) models encompass millions or even billions of parameters and learn complex patterns from big data. However, not all data are initially stored in a suitable formation to effectively train a DL model, e.g., gas chromatography-mass spectrometry (GC-MS) spectra and DNA sequence. These datasets commonly contain many zero values, and the sparse data formation causes difficulties in op… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 5 pages, 2 figures, 2 tables, conference

    MSC Class: 68-06 ACM Class: I.2.4; J.2

  17. arXiv:2403.03218  [pdf, other

    cs.LG cs.AI cs.CL cs.CY

    The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

    Authors: Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer , et al. (32 additional authors not shown)

    Abstract: The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in develo** biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are develo** evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing furthe… ▽ More

    Submitted 15 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: See the project page at https://wmdp.ai

  18. arXiv:2402.06982  [pdf, other

    cs.CV cs.AI physics.med-ph

    Treatment-wise Glioblastoma Survival Inference with Multi-parametric Preoperative MRI

    Authors: Xiaofeng Liu, Nadya Shusharina, Helen A Shih, C. -C. Jay Kuo, Georges El Fakhri, Jonghye Woo

    Abstract: In this work, we aim to predict the survival time (ST) of glioblastoma (GBM) patients undergoing different treatments based on preoperative magnetic resonance (MR) scans. The personalized and precise treatment planning can be achieved by comparing the ST of different treatments. It is well established that both the current status of the patient (as represented by the MR scans) and the choice of tr… ▽ More

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: SPIE Medical Imaging 2024: Computer-Aided Diagnosis

  19. arXiv:2402.06959  [pdf, other

    cs.CL cs.SD eess.AS

    SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

    Authors: Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath

    Abstract: The recently proposed visually grounded speech model SpeechCLIP is an innovative framework that bridges speech and text through images via CLIP without relying on text transcription. On this basis, this paper introduces two extensions to SpeechCLIP. First, we apply the Continuous Integrate-and-Fire (CIF) module to replace a fixed number of CLS tokens in the cascaded architecture. Second, we propos… ▽ More

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024, Self-supervision in Audio, Speech, and Beyond (SASB) workshop

  20. arXiv:2402.05819  [pdf, other

    eess.AS cs.CL cs.LG

    Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model

    Authors: Hung-Chieh Fang, Nai-Xuan Ye, Yi-Jen Shih, Puyuan Peng, Hsuan-Fu Wang, Layne Berry, Hung-yi Lee, David Harwath

    Abstract: Recent advances in self-supervised speech models have shown significant improvement in many downstream tasks. However, these models predominantly centered on frame-level training objectives, which can fall short in spoken language understanding tasks that require semantic comprehension. Existing works often rely on additional speech-text data as intermediate targets, which is costly in the real-wo… ▽ More

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024 workshop on Self-supervision in Audio, Speech, and Beyond (SASB)

  21. arXiv:2401.15111  [pdf, other

    eess.IV cs.CV cs.LG

    Improving Fairness of Automated Chest X-ray Diagnosis by Contrastive Learning

    Authors: Mingquan Lin, Tianhao Li, Zhaoyi Sun, Gregory Holste, Ying Ding, Fei Wang, George Shih, Yifan Peng

    Abstract: Purpose: Limited studies exploring concrete methods or approaches to tackle and enhance model fairness in the radiology domain. Our proposed AI model utilizes supervised contrastive learning to minimize bias in CXR diagnosis. Materials and Methods: In this retrospective study, we evaluated our proposed method on two datasets: the Medical Imaging and Data Resource Center (MIDRC) dataset with 77,8… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 23 pages, 5 figures

    MSC Class: arms.org

  22. arXiv:2401.08864  [pdf, other

    eess.AS cs.LG cs.SD

    Binaural Angular Separation Network

    Authors: Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann

    Abstract: We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones. The model is trained with simulated room impulse responses (RIRs) using omni-directional microphones without needing to collect real RIRs. By relying on specific angular regions and multiple room simulations, the model utilizes consistent time diffe… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  23. arXiv:2401.01461  [pdf, other

    cs.CV

    Efficient Hybrid Zoom using Camera Fusion on Mobile Phones

    Authors: Xiaotong Wu, Wei-Sheng Lai, YiChang Shih, Charles Herrmann, Michael Krainin, Deqing Sun, Chia-Kai Liang

    Abstract: DSLR cameras can achieve multiple zoom levels via shifting lens distances or swap** lens types. However, these techniques are not possible on smartphone devices due to space constraints. Most smartphone manufacturers adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems cr… ▽ More

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted to SIGGRAPH Asia 2023 (ACM TOG). Project website: https://www.wslai.net/publications/fusion_zoom

  24. arXiv:2312.14495  [pdf, other

    cs.SI cs.IT eess.SP

    Beam Foreseeing in Millimeter-Wave Systems with Situational Awareness: Fundamental Limits via Cramér-Rao Lower Bound

    Authors: Wan-Ting Shih, Chao-Kai Wen, Shang-Ho Tsai, Shi **, Chau Yuen

    Abstract: Millimeter-wave (mmWave) networks offer the potential for high-speed data transfer and precise localization, leveraging large antenna arrays and extensive bandwidths. However, these networks are challenged by significant path loss and susceptibility to blockages. In this study, we delve into the use of situational awareness for beam prediction within the 5G NR beam management framework. We introdu… ▽ More

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 16 pages, 10 figures; IEEE Transactions on Wireless Communications

  25. arXiv:2312.11822  [pdf, other

    cond-mat.soft cs.LG

    Classification of complex local environments in systems of particle shapes through shape-symmetry encoded data augmentation

    Authors: Shih-Kuang, Lee, Sun-Ting Tsai, Sharon Glotzer

    Abstract: Detecting and analyzing the local environment is crucial for investigating the dynamical processes of crystal nucleation and shape colloidal particle self-assembly. Recent developments in machine learning provide a promising avenue for better order parameters in complex systems that are challenging to study using traditional approaches. However, the application of machine learning to self-assembly… ▽ More

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 14 pages, 9 figures

  26. arXiv:2312.11629  [pdf, other

    hep-ph cs.LG hep-ex physics.data-an

    Residual ANODE

    Authors: Ranit Das, Gregor Kasieczka, David Shih

    Abstract: We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sid… ▽ More

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 9 pages, 6 figures

  27. arXiv:2312.09781  [pdf, other

    cs.CL cs.AI

    GSQA: An End-to-End Model for Generative Spoken Question Answering

    Authors: Min-Han Shih, Ho-Lam Chung, Yu-Chi Pai, Ming-Hao Hsu, Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee

    Abstract: In recent advancements in spoken question answering (QA), end-to-end models have made significant strides. However, previous research has primarily focused on extractive span selection. While this extractive-based approach is effective when answers are present directly within the input, it falls short in addressing abstractive questions, where answers are not directly extracted but inferred from t… ▽ More

    Submitted 2 July, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 5 pages, 2 figures, Interspeech 2024

  28. arXiv:2312.00123  [pdf, other

    hep-ph cs.LG hep-ex physics.data-an

    Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information

    Authors: Joschka Birk, Erik Buhmann, Cedric Ewen, Gregor Kasieczka, David Shih

    Abstract: We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a g… ▽ More

    Submitted 30 November, 2023; originally announced December 2023.

  29. arXiv:2311.18695  [pdf, other

    cs.CV cs.LG

    Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction

    Authors: Cheng Sun, Wei-En Tai, Yu-Lin Shih, Kuan-Wei Chen, Yong-**g Syu, Kent Selwyn The, Yu-Chiang Frank Wang, Hwann-Tzong Chen

    Abstract: State-of-the-art single-view 360-degree room layout reconstruction methods formulate the problem as a high-level 1D (per-column) regression task. On the other hand, traditional low-level 2D layout segmentation is simpler to learn and can represent occluded regions, but it requires complex post-processing for the targeting layout polygon and sacrifices accuracy. We present Seg2Reg to render 1D layo… ▽ More

    Submitted 30 November, 2023; originally announced November 2023.

  30. arXiv:2311.17082  [pdf, other

    cs.CV stat.ML

    DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

    Authors: Linqi Zhou, Andy Shih, Chenlin Meng, Stefano Ermon

    Abstract: Recent methods such as Score Distillation Sampling (SDS) and Variational Score Distillation (VSD) using 2D diffusion models for text-to-3D generation have demonstrated impressive generation quality. However, the long generation time of such algorithms significantly degrades the user experience. To tackle this problem, we propose DreamPropeller, a drop-in acceleration algorithm that can be wrapped… ▽ More

    Submitted 20 May, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Github repo: https://github.com/alexzhou907/DreamPropeller; Project page: https://alexzhou907.github.io/dreampropeller_page/

  31. arXiv:2311.07178  [pdf, other

    cs.AI cs.GT cs.LG

    Game Solving with Online Fine-Tuning

    Authors: Ti-Rong Wu, Hung Guei, Ting Han Wei, Chung-Chin Shih, Jui-Te Chin, I-Chen Wu

    Abstract: Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means to find the game-theoretic value (outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play, and its powerful policy and value predictions have also served as heuristics in game solving… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  32. arXiv:2311.05477  [pdf, other

    eess.IV cs.CV cs.LG

    Using ResNet to Utilize 4-class T2-FLAIR Slice Classification Based on the Cholinergic Pathways Hyperintensities Scale for Pathological Aging

    Authors: Wei-Chun Kevin Tsai, Yi-Chien Liu, Ming-Chun Yu, Chia-Ju Chou, Sui-Hing Yan, Yang-Teng Fan, Yan-Hsiang Huang, Yen-Ling Chiu, Yi-Fang Chuang, Ran-Zan Wang, Yao-Chia Shih

    Abstract: The Cholinergic Pathways Hyperintensities Scale (CHIPS) is a visual rating scale used to assess the extent of cholinergic white matter hyperintensities in T2-FLAIR images, serving as an indicator of dementia severity. However, the manual selection of four specific slices for rating throughout the entire brain is a time-consuming process. Our goal was to develop a deep learning-based model capable… ▽ More

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 8 pages, 2 figures, 2 tables

  33. arXiv:2310.16112  [pdf, other

    cs.CV

    Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge

    Authors: Gregory Holste, Yiliang Zhou, Song Wang, Ajay Jaiswal, Mingquan Lin, Sherry Zhuge, Yuzhe Yang, Dongkyun Kim, Trong-Hieu Nguyen-Mau, Minh-Triet Tran, Jaehyup Jeong, Wongi Park, Jongbin Ryu, Feng Hong, Arsh Verma, Yosuke Yamagishi, Changhyun Kim, Hyeryeong Seo, Myungjoo Kang, Leo Anthony Celi, Zhiyong Lu, Ronald M. Summers, George Shih, Zhangyang Wang, Yifan Peng

    Abstract: Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" $\unicode{x2013}$ there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of… ▽ More

    Submitted 1 April, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Update after major revision

  34. arXiv:2310.15414  [pdf, other

    cs.AI cs.LG cs.MA

    Diverse Conventions for Human-AI Collaboration

    Authors: Bidipta Sarkar, Andy Shih, Dorsa Sadigh

    Abstract: Conventions are crucial for strong performance in cooperative multi-agent games, because they allow players to coordinate on a shared strategy without explicit communication. Unfortunately, standard multi-agent reinforcement learning techniques, such as self-play, converge to conventions that are arbitrary and non-diverse, leading to poor generalization when interacting with new partners. In this… ▽ More

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 25 pages, 9 figures, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  35. arXiv:2310.12209  [pdf, other

    astro-ph.IM astro-ph.HE cs.LG gr-qc hep-ph

    Fast Parameter Inference on Pulsar Timing Arrays with Normalizing Flows

    Authors: David Shih, Marat Freytsis, Stephen R. Taylor, Jeff A. Dror, Nolan Smyth

    Abstract: Pulsar timing arrays (PTAs) perform Bayesian posterior inference with expensive MCMC methods. Given a dataset of ~10-100 pulsars and O(10^3) timing residuals each, producing a posterior distribution for the stochastic gravitational wave background (SGWB) can take days to a week. The computational bottleneck arises because the likelihood evaluation required for MCMC is extremely costly when conside… ▽ More

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 8 pages, 3 figures

  36. arXiv:2310.11305  [pdf, other

    cs.AI cs.LG

    MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

    Authors: Ti-Rong Wu, Hung Guei, Pei-Chiun Peng, Po-Wei Huang, Ting Han Wei, Chung-Chin Shih, Yun-Jui Tsai

    Abstract: This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms, including AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated super-human performance in many games, it remains unclear which among them is most suitable or efficient for specific tasks. Through MiniZero, we systematically evaluate the perfo… ▽ More

    Submitted 26 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Transactions on Games

  37. arXiv:2310.00049  [pdf, other

    hep-ph cs.LG

    EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion

    Authors: Erik Buhmann, Cedric Ewen, Darius A. Faroughy, Tobias Golling, Gregor Kasieczka, Matthew Leigh, Guillaume Quétant, John Andrew Raine, Debajyoti Sengupta, David Shih

    Abstract: Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce \epcjedi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep s… ▽ More

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: 21 pages, 8 figures

  38. arXiv:2309.10787  [pdf, other

    eess.AS cs.CV cs.MM cs.SD

    AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models

    Authors: Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-yi Lee

    Abstract: Audio-visual representation learning aims to develop systems with human-like perception by utilizing correlation between auditory and visual information. However, current models often focus on a limited set of tasks, and generalization abilities of learned representations are unclear. To this end, we propose the AV-SUPERB benchmark that enables general-purpose evaluation of unimodal audio/visual a… ▽ More

    Submitted 19 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024; Evaluation Code: https://github.com/roger-tseng/av-superb Submission Platform: https://av.superbbenchmark.org

  39. arXiv:2308.11700  [pdf, other

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Calorimeter shower superresolution

    Authors: Ian Pang, John Andrew Raine, David Shih

    Abstract: Calorimeter shower simulation is a major bottleneck in the Large Hadron Collider computational pipeline. There have been recent efforts to employ deep-generative surrogate models to overcome this challenge. However, many of best performing models have training and generation times that do not scale well to high-dimensional calorimeter showers. In this work, we introduce SuperCalo, a flow-based sup… ▽ More

    Submitted 15 May, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 16 pages, 13 figures, v3: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 092009 (2024)

  40. arXiv:2308.09180  [pdf, other

    cs.CV cs.AI

    How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?

    Authors: Gregory Holste, Ziyu Jiang, Ajay Jaiswal, Maria Hanna, Shlomo Minkowitz, Alan C. Legasto, Joanna G. Escalon, Sharon Steinberger, Mark Bittman, Thomas C. Shen, Ying Ding, Ronald M. Summers, George Shih, Yifan Peng, Zhangyang Wang

    Abstract: Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance. However, the nuanced ways in which pruning impacts model behavior are not well understood, particularly for long-tailed, multi-label datasets commonly found in clinical settings. This knowledge gap could have dangerous impli… ▽ More

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Early accepted to MICCAI 2023

  41. arXiv:2308.04571  [pdf, other

    cs.RO cs.CV cs.HC

    Optimizing Algorithms From Pairwise User Preferences

    Authors: Leonid Keselman, Katherine Shih, Martial Hebert, Aaron Steinfeld

    Abstract: Typical black-box optimization approaches in robotics focus on learning from metric scores. However, that is not always possible, as not all developers have ground truth available. Learning appropriate robot behavior in human-centric contexts often requires querying users, who typically cannot provide precise metric scores. Existing approaches leverage human feedback in an attempt to model an impl… ▽ More

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted at IROS 2023

    ACM Class: I.2.9; H.1.2; I.2.8

  42. arXiv:2307.08593  [pdf, other

    physics.acc-ph cs.LG hep-ex nucl-ex nucl-th

    Artificial Intelligence for the Electron Ion Collider (AI4EIC)

    Authors: C. Allaire, R. Ammendola, E. -C. Aschenauer, M. Balandat, M. Battaglieri, J. Bernauer, M. Bondì, N. Branson, T. Britton, A. Butter, I. Chahrour, P. Chatagnon, E. Cisbani, E. W. Cline, S. Dash, C. Dean, W. Deconinck, A. Deshpande, M. Diefenthaler, R. Ent, C. Fanelli, M. Finger, M. Finger, Jr., E. Fol, S. Furletov , et al. (70 additional authors not shown)

    Abstract: The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took… ▽ More

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 27 pages, 11 figures, AI4EIC workshop, tutorials and hackathon

  43. arXiv:2307.08230  [pdf, other

    cs.RO eess.SY

    Image-based Regularization for Action Smoothness in Autonomous Miniature Racing Car with Deep Reinforcement Learning

    Authors: Hoang-Giang Cao, I Lee, Bo-Jiun Hsu, Zheng-Yi Lee, Yu-Wei Shih, Hsueh-Cheng Wang, I-Chen Wu

    Abstract: Deep reinforcement learning has achieved significant results in low-level controlling tasks. However, for some applications like autonomous driving and drone flying, it is difficult to control behavior stably since the agent may suddenly change its actions which often lowers the controlling system's efficiency, induces excessive mechanical wear, and causes uncontrollable, dangerous behavior to the… ▽ More

    Submitted 10 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)2023

  44. arXiv:2307.05564  [pdf, other

    cs.CL

    Augmenters at SemEval-2023 Task 1: Enhancing CLIP in Handling Compositionality and Ambiguity for Zero-Shot Visual WSD through Prompt Augmentation and Text-To-Image Diffusion

    Authors: Jie S. Li, Yow-Ting Shiue, Yong-Siang Shih, Jonas Gei**

    Abstract: This paper describes our zero-shot approaches for the Visual Word Sense Disambiguation (VWSD) Task in English. Our preliminary study shows that the simple approach of matching candidate images with the phrase using CLIP suffers from the many-to-many nature of image-text pairs. We find that the CLIP text encoder may have limited abilities in capturing the compositionality in natural language. Conve… ▽ More

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

  45. arXiv:2305.19404  [pdf, other

    cs.CV cs.AI cs.LG physics.med-ph

    Incremental Learning for Heterogeneous Structure Segmentation in Brain Tumor MRI

    Authors: Xiaofeng Liu, Helen A. Shih, Fangxu Xing, Emiliano Santarnecchi, Georges El Fakhri, Jonghye Woo

    Abstract: Deep learning (DL) models for segmenting various anatomical structures have achieved great success via a static DL model that is trained in a single source domain. Yet, the static DL model is likely to perform poorly in a continually evolving environment, requiring appropriate model updates. In an incremental learning setting, we would expect that well-trained static models are updated, following… ▽ More

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Early Accept to MICCAI 2023

  46. arXiv:2305.18211  [pdf

    eess.SP cs.CV cs.LG

    WiFi-TCN: Temporal Convolution for Human Interaction Recognition based on WiFi signal

    Authors: Chih-Yang Lin, Chia-Yu Lin, Yu-Tso Liu, Timothy K. Shih

    Abstract: The utilization of Wi-Fi based human activity recognition has gained considerable interest in recent times, primarily owing to its applications in various domains such as healthcare for monitoring breath and heart rate, security, elderly care. These Wi-Fi-based methods exhibit several advantages over conventional state-of-the-art techniques that rely on cameras and sensors, including lower costs a… ▽ More

    Submitted 11 January, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Paper is currently under review at IEEE Access

  47. arXiv:2305.16317  [pdf, other

    cs.LG cs.AI

    Parallel Sampling of Diffusion Models

    Authors: Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari

    Abstract: Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal app… ▽ More

    Submitted 15 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 37th Conference on Neural Information Processing Systems

  48. arXiv:2305.12308  [pdf, other

    cs.IT

    MIMO Evolution toward 6G: End-User-Centric Collaborative MIMO

    Authors: Lung-Sheng Tsai, Shang-Ling Shih, Pei-Kai Liao, Chao-Kai Wen

    Abstract: In 6G, the trend of transitioning from massive antenna elements to even more massive ones is continued. However, installing additional antennas in the limited space of user equipment (UE) is challenging, resulting in limited capacity scaling gain for end users, despite network side support for increasing numbers of antennas. To address this issue, we propose an end-user-centric collaborative MIMO… ▽ More

    Submitted 14 November, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: 7 pages, 5 figures, 1 table. This work has been accepted in IEEE Communications Magazine

  49. arXiv:2305.11934  [pdf, other

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Inductive Simulation of Calorimeter Showers with Normalizing Flows

    Authors: Matthew R. Buckley, Claudius Krause, Ian Pang, David Shih

    Abstract: Simulating particle detector response is the single most expensive step in the Large Hadron Collider computational pipeline. Recently it was shown that normalizing flows can accelerate this process while achieving unprecedented levels of accuracy, but scaling this approach up to higher resolutions relevant for future detector upgrades leads to prohibitive memory constraints. To overcome this probl… ▽ More

    Submitted 13 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 19 pages, 15 figures; v2: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 033006 (2024)

  50. arXiv:2305.03761  [pdf, other

    astro-ph.GA cs.LG hep-ph physics.data-an

    Weakly-Supervised Anomaly Detection in the Milky Way

    Authors: Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman, David Shih, Matthew R. Buckley, Jack H. Collins

    Abstract: Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satelli… ▽ More

    Submitted 5 May, 2023; originally announced May 2023.