Showing 1–12 of 12 results for author: Li, J Y

Searching in archive cs.
  1. arXiv:2406.02844  [pdf, other]

    cs.IR cs.CL

    Item-Language Model for Conversational Recommendation

    Authors: Li Yang, Anushya Subbiah, Hardik Patel, Judith Yue Li, Yanwei Song, Reza Mirghaderi, Vikram Aggarwal

    Abstract: Large-language Models (LLMs) have been extremely successful at tasks like complex dialogue understanding, reasoning and coding due to their emergent abilities. These emergent abilities have been extended with multi-modality to include image, audio, and video capabilities. Recommender systems, on the other hand, have been critical for information seeking and item discovery needs. Recently, there ha…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 15 pages, 3 figures

  2. arXiv:2307.08996  [pdf, other]

    cs.CV

    Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

    Authors: Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann

    Abstract: An authentic face restoration system is becoming increasingly demanding in many computer vision applications, e.g., image enhancement, video communication, and taking portraits. Most of the advanced face restoration models can recover high-quality faces from low-quality ones but usually fail to faithfully generate realistic and high-frequency details that are favored by users. To achieve authentic…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: ICCV 2023

  3. arXiv:2305.06594  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

    Authors: Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk

    Abstract: Video-to-music generation demands both a temporally localized high-quality listening experience and globally aligned video-acoustic signatures. While recent music generation models excel at the former through advanced audio codecs, the exploration of video-acoustic signatures has been confined to specific visual scenarios. In contrast, our research confronts the challenge of learning globally alig…

    Submitted 22 February, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted at AAAI 2024; music samples available at https://tinyurl.com/v2meow

  4. arXiv:2305.06218  [pdf, other]

    cs.CL cs.AI cs.IR

    Multi-Task End-to-End Training Improves Conversational Recommendation

    Authors: Naveen Ram, Dima Kuzmin, Ellie Ka In Chio, Moustafa Farid Alzantot, Santiago Ontanon, Ambarish Jash, Judith Yue Li

    Abstract: In this paper, we analyze the performance of a multitask end-to-end transformer model on the task of conversational recommendations, which aim to provide recommendations based on a user's explicit preferences expressed in dialogue. While previous works in this area adopt complex multi-component approaches where the dialogue management and entity recommendation tasks are handled by separate compone…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: 10 pages, 4 tables, 1 figure

  5. arXiv:2301.03238  [pdf, ps, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    MAQA: A Multimodal QA Benchmark for Negation

    Authors: Judith Yue Li, Aren Jansen, Qingqing Huang, Joonseok Lee, Ravi Ganti, Dima Kuzmin

    Abstract: Multimodal learning can benefit from the representation power of pretrained Large Language Models (LLMs). However, state-of-the-art transformer-based LLMs often ignore negations in natural language, and there is no existing benchmark to quantitatively evaluate whether multimodal transformers inherit this weakness. In this study, we present a new multimodal question answering (QA) benchmark adapted…

    Submitted 9 January, 2023; originally announced January 2023.

    Comments: NeurIPS 2022 SyntheticData4ML Workshop

  6. arXiv:2212.05716  [pdf, ps, other]

    cs.LG math.OC

    On Generalization and Regularization via Wasserstein Distributionally Robust Optimization

    Authors: Qinyu Wu, Jonathan Yu-Meng Li, Tiantian Mao

    Abstract: Wasserstein distributionally robust optimization (DRO) has found success in operations research and machine learning applications as a powerful means to obtain solutions with favourable out-of-sample performances. Two compelling explanations for the success are the generalization bounds derived from Wasserstein DRO and the equivalency between Wasserstein DRO and the regularization scheme commonly…

    Submitted 12 December, 2022; originally announced December 2022.

  7. arXiv:2208.12415  [pdf, other]

    eess.AS cs.CL cs.SD stat.ML

    MuLan: A Joint Embedding of Music Audio and Natural Language

    Authors: Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis

    Abstract: Music tagging and content-based retrieval systems have traditionally been constructed using pre-defined ontologies covering a rigid set of music attributes or text queries. This paper presents MuLan: a first attempt at a new generation of acoustic models that link music audio directly to unconstrained natural language music descriptions. MuLan takes the form of a two-tower, joint audio-text embedd…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: To appear in ISMIR 2022

  8. arXiv:2109.07005  [pdf, other]

    q-fin.PM cs.LG

    WaveCorr: Correlation-savvy Deep Reinforcement Learning for Portfolio Management

    Authors: Saeed Marzban, Erick Delage, Jonathan Yumeng Li, Jeremie Desgagne-Bouchard, Carl Dussault

    Abstract: The problem of portfolio management represents an important and challenging class of dynamic decision making problems, where rebalancing decisions need to be made over time with the consideration of many factors such as investors' preferences, trading environments, and market conditions. In this paper, we present a new portfolio policy network architecture for deep reinforcement learning (DRL) that…

    Submitted 28 September, 2021; v1 submitted 14 September, 2021; originally announced September 2021.

  9. arXiv:2109.04001  [pdf, other]

    q-fin.PR cs.LG

    Deep Reinforcement Learning for Equal Risk Pricing and Hedging under Dynamic Expectile Risk Measures

    Authors: Saeed Marzban, Erick Delage, Jonathan Yumeng Li

    Abstract: Recently equal risk pricing, a framework for fair derivative pricing, was extended to consider dynamic risk measures. However, all current implementations either employ a static risk measure that violates time consistency, or are based on traditional dynamic programming solution schemes that are impracticable in problems with a large number of underlying assets (due to the curse of dimensionality)…

    Submitted 8 September, 2021; originally announced September 2021.

  10. arXiv:1909.05675  [pdf, other]

    cs.CV cs.AI

    Accelerating Training using Tensor Decomposition

    Authors: Mostafa Elhoushi, Ye Henry Tian, Zihao Chen, Farhan Shafiq, Joey Yiwei Li

    Abstract: Tensor decomposition is one of the well-known approaches to reduce the latency time and number of parameters of a pre-trained model. However, in this paper, we propose an approach to use tensor decomposition to reduce training time of training a model from scratch. In our approach, we train the model from scratch (i.e., randomly initialized weights) with its original architecture for a small numbe…

    Submitted 10 September, 2019; originally announced September 2019.

    Journal ref: AAAI 2020 Artificial Intelligence of Things Workshop

  11. arXiv:1905.13298  [pdf, other]

    cs.LG cs.NE

    DeepShift: Towards Multiplication-Less Neural Networks

    Authors: Mostafa Elhoushi, Zihao Chen, Farhan Shafiq, Ye Henry Tian, Joey Yiwei Li

    Abstract: The high computation, memory, and power budgets of inferring convolutional neural networks (CNNs) are major bottlenecks of model deployment to edge computing platforms, e.g., mobile devices and IoT. Moreover, training CNNs is time and energy-intensive even on high-grade servers. Convolution layers and fully connected layers, because of their intense use of multiplications, are the dominant contrib…

    Submitted 7 July, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: Added results for 8-bit and 16-bit fixed point activations, as well as 5-bit, 4-bit, 3-bit, and 2-bit weights; added link to GitHub code; updated and fixed the training algorithm; introduced 2 approaches for backward and forward passes; showed better results for training from scratch on CIFAR10 and Imagenet; added implementation on NVIDIA's GPU; accepted in CVPR Mobile AI 2021 Workshop

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

  12. arXiv:1805.02306  [pdf, other]

    stat.ME cs.LG stat.ML

    Semi-orthogonal Non-negative Matrix Factorization with an Application in Text Mining

    Authors: Jack Yutong Li, Ruoqing Zhu, Annie Qu, Han Ye, Zhankun Sun

    Abstract: Emergency Department (ED) crowding is a worldwide issue that affects the efficiency of hospital management and the quality of patient care. This occurs when the request for an admit ward-bed to receive a patient is delayed until an admission decision is made by a doctor. To reduce the overcrowding and waiting time of ED, we build a classifier to predict the disposition of patients using manually-t…

    Submitted 4 July, 2019; v1 submitted 6 May, 2018; originally announced May 2018.

    MSC Class: 97R40; 68T10