Showing 1–21 of 21 results for author: Leung, T

Searching in archive cs.
  1. arXiv:2406.08603  [pdf, other]

    cs.CV cs.AI cs.LG

    FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

    Authors: George Cazenavette, Avneesh Sud, Thomas Leung, Ben Usman

    Abstract: Due to the high potential for abuse of GenAI systems, the task of detecting synthetic images has recently become of great interest to the research community. Unfortunately, existing image-space detectors quickly become obsolete as new high-fidelity text-to-image models are developed at blinding speed. In this work, we propose a new synthetic image detector that uses features obtained by inverting…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Project page: https://fake-inversion.github.io

    Journal ref: CVPR 2024

  2. arXiv:2403.14084  [pdf, other]

    math.NA cs.LG

    Learning-based Multi-continuum Model for Multiscale Flow Problems

    Authors: Fan Wang, Yating Wang, Wing Tat Leung, Zongben Xu

    Abstract: Multiscale problems can usually be approximated through numerical homogenization by an equation with some effective parameters that can capture the macroscopic behavior of the original system on the coarse grid to speed up the simulation. However, this approach usually assumes scale separation and that the heterogeneity of the solution can be approximated by the solution average in each coarse blo…

    Submitted 20 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Corrected typos

  3. arXiv:2403.03691  [pdf, other]

    cs.CV cs.AI

    MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition

    Authors: Yufan Chen, Ching Ting Leung, Yong Huang, Jianwei Sun, Hao Chen, Hanyu Gao

    Abstract: In the field of chemical structure recognition, the task of converting molecular images into graph structures and SMILES string stands as a significant challenge, primarily due to the varied drawing styles and conventions prevalent in chemical literature. To bridge this gap, we proposed MolNexTR, a novel image-to-graph deep learning model that collaborates to fuse the strengths of ConvNext, a powe…

    Submitted 8 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: Submitted to the Journal of Cheminformatics

  4. arXiv:2305.04004  [pdf, other]

    cs.RO cs.AI cs.AR eess.IV physics.med-ph

    Towards a Simple Framework of Skill Transfer Learning for Robotic Ultrasound-guidance Procedures

    Authors: Tsz Yan Leung, Miguel Xochicale

    Abstract: In this paper, we present a simple framework of skill transfer learning for robotic ultrasound-guidance procedures. We briefly review challenges in skill transfer learning for robotic ultrasound-guidance procedures. We then identify the need of appropriate sampling techniques, computationally efficient neural networks models that lead to the proposal of a simple framework of skill transfer learnin…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: 2 pages, 2 figures and code and demo data

  5. arXiv:2302.13153  [pdf, other]

    cs.CV cs.GR cs.LG

    Directed Diffusion: Direct Control of Object Placement through Attention Guidance

    Authors: Wan-Duo Kurt Ma, J. P. Lewis, Avisek Lahiri, Thomas Leung, W. Bastiaan Kleijn

    Abstract: Text-guided diffusion models such as DALLE-2, Imagen, eDiff-I, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content. In many cases the images are of very high quality. However, these models often struggle to compose scenes containing several key objects such as characters in specified positional relat…

    Submitted 26 September, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: Our project page: https://hohonu-vicml.github.io/DirectedDiffusion.Page

  6. arXiv:2207.13061  [pdf, other]

    cs.CV cs.AI cs.CL

    NewsStories: Illustrating articles with visual summaries

    Authors: Reuben Tan, Bryan A. Plummer, Kate Saenko, JP Lewis, Avneesh Sud, Thomas Leung

    Abstract: Recent self-supervised approaches have used large-scale image-text datasets to learn powerful representations that transfer to many tasks without finetuning. These methods often assume that there is one-to-one correspondence between its images and their (short) captions. However, many tasks require reasoning about multiple images and long text narratives, such as describing news articles with visu…

    Submitted 14 August, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  7. arXiv:2207.11735  [pdf, other]

    cs.LG

    AMS-Net: Adaptive Multiscale Sparse Neural Network with Interpretable Basis Expansion for Multiphase Flow Problems

    Authors: Yating Wang, Wing Tat Leung, Guang Lin

    Abstract: In this work, we propose an adaptive sparse learning algorithm that can be applied to learn the physical processes and obtain a sparse representation of the solution given a large snapshot space. Assume that there is a rich class of precomputed basis functions that can be used to approximate the quantity of interest. We then design a neural network architecture to learn the coefficients of solutio…

    Submitted 24 July, 2022; originally announced July 2022.

  8. arXiv:2202.00476  [pdf]

    cs.CL

    Exploring COVID-19 Related Stressors Using Topic Modeling

    Authors: Yue Tong Leung, Farzad Khalvati

    Abstract: The COVID-19 pandemic has affected lives of people from different countries for almost two years. The changes on lifestyles due to the pandemic may cause psychosocial stressors for individuals, and have a potential to lead to mental health problems. To provide high quality mental health supports, healthcare organization need to identify the COVID-19 specific stressors, and notice the trends of pre…

    Submitted 12 January, 2022; originally announced February 2022.

  9. arXiv:2102.00917  [pdf, other]

    cs.CL

    Counting Protests in News Articles: A Dataset and Semi-Automated Data Collection Pipeline

    Authors: Tommy Leung, L. Nathan Perkins

    Abstract: Between January 2017 and January 2021, thousands of local news sources in the United States reported on over 42,000 protests about topics such as civil rights, immigration, guns, and the environment. Given the vast number of local journalists that report on protests daily, extracting these events as structured data to understand temporal and geographic trends can empower civic decision-making. How…

    Submitted 1 February, 2021; originally announced February 2021.

  10. arXiv:2011.08954  [pdf, other]

    cs.LG math.NA

    Multi-agent Reinforcement Learning Accelerated MCMC on Multiscale Inversion Problem

    Authors: Eric Chung, Yalchin Efendiev, Wing Tat Leung, Sai-Mang Pun, Zecheng Zhang

    Abstract: In this work, we propose a multi-agent actor-critic reinforcement learning (RL) algorithm to accelerate the multi-level Monte Carlo Markov Chain (MCMC) sampling algorithms. The policies (actors) of the agents are used to generate the proposal in the MCMC steps; and the critic, which is centralized, is in charge of estimating the long term reward. We verify our proposed algorithm by solving an inve…

    Submitted 17 November, 2020; originally announced November 2020.

  11. arXiv:2009.11917  [pdf, other]

    econ.TH cs.AI econ.GN

    Learning in a Small/Big World

    Authors: Benson Tsz Kin Leung

    Abstract: Complexity and limited ability have profound effect on how we learn and make decisions under uncertainty. Using the theory of finite automaton to model belief formation, this paper studies the characteristics of optimal learning behavior in small and big worlds, where the complexity of the environment is low and high, respectively, relative to the cognitive ability of the decision maker. Optimal b…

    Submitted 30 March, 2023; v1 submitted 24 September, 2020; originally announced September 2020.

  12. Cross-view Semantic Segmentation for Sensing Surroundings

    Authors: Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou

    Abstract: Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To facilitate the robot perception with such a surrounding sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In th…

    Submitted 18 June, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

    Journal ref: IEEE Robotics and Automation Letters (Volume: 5, Issue: 3, July 2020)

  13. arXiv:1906.01737  [pdf, other]

    cs.CV

    Geo-Aware Networks for Fine-Grained Recognition

    Authors: Grace Chu, Brian Potetz, Weijun Wang, Andrew Howard, Yang Song, Fernando Brucher, Thomas Leung, Hartwig Adam

    Abstract: Fine-grained recognition distinguishes among categories with subtle visual differences. In order to differentiate between these challenging visual categories, it is helpful to leverage additional information. Geolocation is a rich source of additional information that can be used to improve fine-grained classification accuracy, but has been understudied. Our contributions to this field are twofold…

    Submitted 4 September, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ICCVW 2019

  14. arXiv:1809.08585  [pdf]

    cs.HC cs.AI

    The use of Virtual Reality in Enhancing Interdisciplinary Research and Education

    Authors: Tiffany Leung, Farhana Zulkernine, Haruna Isah

    Abstract: Virtual Reality (VR) is increasingly being recognized for its educational potential and as an effective way to convey new knowledge to people, it supports interactive and collaborative activities. Affordable VR powered by mobile technologies is opening a new world of opportunities that can transform the ways in which we learn and engage with others. This paper reports our study regarding the appli…

    Submitted 23 September, 2018; originally announced September 2018.

    Comments: 6 Pages

    ACM Class: F.2.2, I.2.7

  15. arXiv:1808.00447  [pdf, other]

    cs.CV

    Towards a Semantic Perceptual Image Metric

    Authors: Troy Chinen, Johannes Ballé, Chunhui Gu, Sung Jin Hwang, Sergey Ioffe, Nick Johnston, Thomas Leung, David Minnen, Sean O'Malley, Charles Rosenberg, George Toderici

    Abstract: We present a full reference, perceptual image metric based on VGG-16, an artificial neural network trained on object classification. We fit the metric to a new database based on 140k unique images annotated with ground truth by human raters who received minimal instruction. The resulting metric shows competitive performance on TID 2013, a database widely used to assess image quality assessments me…

    Submitted 1 August, 2018; originally announced August 2018.

  16. arXiv:1712.05055  [pdf, other]

    cs.CV

    MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels

    Authors: Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei

    Abstract: Recent deep networks are capable of memorizing the entire data even when the labels are completely random. To overcome the overfitting on corrupted labels, we propose a novel technique of learning another neural network, called MentorNet, to supervise the training of the base deep networks, namely, StudentNet. During training, MentorNet provides a curriculum (sample weighting scheme) for StudentNe…

    Submitted 13 August, 2018; v1 submitted 13 December, 2017; originally announced December 2017.

    Journal ref: published at ICML 2018

  17. arXiv:1703.07464  [pdf, other]

    cs.CV

    No Fuss Distance Metric Learning using Proxies

    Authors: Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, Saurabh Singh

    Abstract: We address the problem of distance metric learning (DML), defined as learning a distance consistent with a notion of semantic similarity. Traditionally, for this problem supervision is expressed in the form of sets of points that follow an ordinal relationship -- an anchor point $x$ is similar to a set of positive points $Y$, and dissimilar to a set of negative points $Z$, and a loss defined over…

    Submitted 1 August, 2017; v1 submitted 21 March, 2017; originally announced March 2017.

    Comments: To be presented in ICCV 2017

  18. arXiv:1604.04326  [pdf, other]

    cs.CV cs.LG

    Improving the Robustness of Deep Neural Networks via Stability Training

    Authors: Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow

    Abstract: In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network. Such instability affects many deep architectures with state-of-the-art performance on a wide range of computer vision tasks. We present a general stability training method to stabilize deep network…

    Submitted 14 April, 2016; originally announced April 2016.

    Comments: Published in CVPR 2016

  19. arXiv:1507.00302  [pdf, other]

    cs.CV

    Pose Embeddings: A Deep Architecture for Learning to Match Human Poses

    Authors: Greg Mori, Caroline Pantofaru, Nisarg Kothari, Thomas Leung, George Toderici, Alexander Toshev, Weilong Yang

    Abstract: We present a method for learning an embedding that places images of humans in similar poses nearby. This embedding can be used as a direct method of comparing images based on human pose, avoiding potential challenges of estimating body joint positions. Pose embedding learning is formulated under a triplet-based distance criterion. A deep architecture is used to allow learning of a representation c…

    Submitted 1 July, 2015; originally announced July 2015.

  20. arXiv:1404.4661  [pdf, ps, other]

    cs.CV

    Learning Fine-grained Image Similarity with Deep Ranking

    Authors: Jiang Wang, Yang Song, Thomas Leung, Chuck Rosenberg, Jingbin Wang, James Philbin, Bo Chen, Ying Wu

    Abstract: Learning fine-grained image similarity is a challenging task. It needs to capture between-class and within-class image differences. This paper proposes a deep ranking model that employs deep learning techniques to learn similarity metric directly from images. It has higher learning capability than models based on hand-crafted features. A novel multiscale network structure has been developed to desc…

    Submitted 17 April, 2014; originally announced April 2014.

    Comments: CVPR 2014

  21. arXiv:1312.4894  [pdf, other]

    cs.CV

    Deep Convolutional Ranking for Multilabel Image Annotation

    Authors: Yunchao Gong, Yangqing Jia, Thomas Leung, Alexander Toshev, Sergey Ioffe

    Abstract: Multilabel image annotation is one of the most important challenges in computer vision with many real-world applications. While existing work usually use conventional visual features for multilabel annotation, features based on Deep Neural Networks have shown potential to significantly boost performance. In this work, we propose to leverage the advantage of such features and analyze key components…

    Submitted 14 April, 2014; v1 submitted 17 December, 2013; originally announced December 2013.