Showing 1–37 of 37 results for author: Chai, M

Searching in archive cs.
  1. arXiv:2404.01296 [pdf, other]

    cs.CV

    MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space

    Authors: Armand Comas-Massagué, Di Qiu, Menglei Chai, Marcel Bühler, Amit Raj, Ruiqi Gao, Qiangeng Xu, Mark Matthews, Paulo Gotardo, Octavia Camps, Sergio Orts-Escolano, Thabo Beeler

    Abstract: We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts to enhance user engagement and customization. Central to our approach are key innovations aimed at overcoming the challenges in photo-realistic avatar synthesis. Firstly, we utilize a conditional Neural Radiance Fields (NeRF) model, trained on a large-scale unannotated multi-view dataset, to…

    Submitted 1 April, 2024; originally announced April 2024.

  2. arXiv:2403.12171 [pdf, other]

    cs.CL cs.AI

    EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

    Authors: Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Jailbreak attacks are crucial for identifying and mitigating the security vulnerabilities of Large Language Models (LLMs). They are designed to bypass safeguards and elicit prohibited outputs. However, due to significant differences among various jailbreak methods, there is no standard implementation framework available for the community, which limits comprehensive security evaluations. This paper…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2312.04875 [pdf, other]

    cs.CV

    MVDD: Multi-View Depth Diffusion Models

    Authors: Zhen Wang, Qiangeng Xu, Feitong Tan, Menglei Chai, Shichen Liu, Rohit Pandey, Sean Fanello, Achuta Kadambi, Yinda Zhang

    Abstract: Denoising diffusion models have demonstrated outstanding results in 2D image generation, yet it remains a challenge to replicate its success in 3D shape generation. In this paper, we propose leveraging multi-view depth, which represents complex 3D shapes in a 2D data format that is easy to denoise. We pair this representation with a diffusion model, MVDD, that is capable of generating high-quality…

    Submitted 19 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

  4. arXiv:2312.02157 [pdf, other]

    cs.CV

    Mesh-Guided Neural Implicit Field Editing

    Authors: Can Wang, Mingming He, Menglei Chai, Dongdong Chen, Jing Liao

    Abstract: Neural implicit fields have emerged as a powerful 3D representation for reconstructing and rendering photo-realistic views, yet they possess limited editability. Conversely, explicit 3D representations, such as polygonal meshes, offer ease of editing but may not be as suitable for rendering high-quality novel views. To harness the strengths of both representations, we propose a new approach that e…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Project page: https://cassiepython.github.io/MNeuEdit/

  5. GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations

    Authors: Yuxiao Zhou, Menglei Chai, Alessandro Pepe, Markus Gross, Thabo Beeler

    Abstract: Despite recent successes in hair acquisition that fits a high-dimensional hair model to a specific input subject, generative hair models, which establish general embedding spaces for encoding, editing, and sampling diverse hairstyles, are way less explored. In this paper, we present GroomGen, the first generative model designed for hair geometry composed of highly-detailed dense strands. Our appro…

    Submitted 16 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: SIGGRAPH Asia 2023

    Journal ref: ACM Trans. Graph. 42, 6, Article 267 (December 2023)

  6. arXiv:2307.05462 [pdf, other]

    cs.CV

    Efficient 3D Articulated Human Generation with Layered Surface Volumes

    Authors: Yinghao Xu, Wang Yifan, Alexander W. Bergman, Menglei Chai, Bolei Zhou, Gordon Wetzstein

    Abstract: Access to high-quality and diverse 3D articulated digital human assets is crucial in various applications, ranging from virtual reality to social platforms. Generative approaches, such as 3D generative adversarial networks (GANs), are rapidly replacing laborious manual content creation tools. However, existing 3D GAN frameworks typically rely on scene representations that leverage either template…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: Project page: https://www.computationalimaging.org/publications/lsv/ Demo: https://www.youtube.com/watch?v=vahgMFCM3j4

  7. arXiv:2303.17606 [pdf, other]

    cs.CV

    AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

    Authors: Ruixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

    Abstract: Neural implicit fields are powerful for representing 3D scenes and generating high-quality novel views, but it remains challenging to use such implicit representations for creating a 3D human avatar with a specific identity and artistic style that can be easily animated. Our proposed method, AvatarCraft, addresses this challenge by using diffusion models to guide the learning of geometry and textu…

    Submitted 21 August, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: ICCV 2023 Camera Ready

  8. arXiv:2302.09227 [pdf, other]

    cs.CV cs.GR

    Invertible Neural Skinning

    Authors: Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

    Abstract: Building animatable and editable models of clothed humans from raw 3D scans and poses is a challenging problem. Existing reposing methods suffer from the limited expressiveness of Linear Blend Skinning (LBS), require costly mesh extraction to generate each new pose, and typically do not preserve surface correspondences across different poses. In this work, we introduce Invertible Neural Skinning (…

    Submitted 4 March, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

  9. arXiv:2301.11326 [pdf, other]

    cs.CV

    Unsupervised Volumetric Animation

    Authors: Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Kyle Olszewski, Jian Ren, Hsin-Ying Lee, Menglei Chai, Sergey Tulyakov

    Abstract: We propose a novel approach for unsupervised 3D animation of non-rigid deformable objects. Our method learns the 3D structure and dynamics of objects solely from single-view RGB videos, and can decompose them into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable PnP algorithm, our model learns th…

    Submitted 26 January, 2023; originally announced January 2023.

  10. arXiv:2301.09637 [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    InfiniCity: Infinite-Scale City Synthesis

    Authors: Chieh Hubert Lin, Hsin-Ying Lee, Willi Menapace, Menglei Chai, Aliaksandr Siarohin, Ming-Hsuan Yang, Sergey Tulyakov

    Abstract: Toward infinite-scale 3D city synthesis, we propose a novel framework, InfiniCity, which constructs and renders an unconstrainedly large and 3D-grounded environment from random noises. InfiniCity decomposes the seemingly impractical task into three feasible modules, taking advantage of both 2D and 3D data. First, an infinite-pixel image synthesis module generates arbitrary-scale 2D maps from the b…

    Submitted 14 August, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

  11. arXiv:2301.02700 [pdf, other]

    cs.CV cs.GR

    3DAvatarGAN: Bridging Domains for Personalized Editable Avatars

    Authors: Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

    Abstract: Modern 3D-GANs synthesize geometry and texture by training on large-scale datasets with a consistent structure. Training such models on stylized, artistic data, with often unknown, highly variable geometry, and camera information has not yet been shown possible. Can we train a 3D GAN on such artistic data, while maintaining multi-view consistency and texture quality? To this end, we propose an ada…

    Submitted 26 March, 2023; v1 submitted 6 January, 2023; originally announced January 2023.

    Comments: Project Page: https://rameenabdal.github.io/3DAvatarGAN/

  12. arXiv:2212.11984 [pdf, other]

    cs.CV

    DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

    Authors: Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, Sergey Tulyakov

    Abstract: Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: Project page: https://snap-research.github.io/discoscene/

  13. arXiv:2212.08070 [pdf, other]

    cs.CV cs.GR

    NeRF-Art: Text-Driven Neural Radiance Fields Stylization

    Authors: Can Wang, Ruixiang Jiang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

    Abstract: As a powerful representation of 3D scenes, the neural radiance field (NeRF) enables high-quality novel view synthesis from multi-view images. Stylizing NeRF, however, remains challenging, especially on simulating a text-guided style with both the appearance and the geometry altered simultaneously. In this paper, we present NeRF-Art, a text-guided NeRF stylization approach that manipulates the styl…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: Project page: https://cassiepython.github.io/nerfart/

  14. arXiv:2210.02573 [pdf, other]

    cs.LG

    Efficient Learning of Mesh-Based Physical Simulation with BSMS-GNN

    Authors: Yadi Cao, Menglei Chai, Minchen Li, Chenfanfu Jiang

    Abstract: Learning the physical simulation on large-scale meshes with flat Graph Neural Networks (GNNs) and stacking Message Passings (MPs) is challenging due to the scaling complexity w.r.t. the number of nodes and over-smoothing. There has been growing interest in the community to introduce multi-scale structures to GNNs for physical simulation. However, current state-of-the-art methods are limit…

    Submitted 18 June, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Updates summary: * update to the new ICML style

  15. arXiv:2207.11795 [pdf, other]

    cs.CV

    Cross-Modal 3D Shape Generation and Manipulation

    Authors: Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov

    Abstract: Creating and editing the shape and color of 3D objects require tremendous human effort and expertise. Compared to direct manipulation in 3D interfaces, 2D interactions such as sketches and scribbles are usually much more natural and intuitive for the users. In this paper, we propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: ECCV 2022. Project page: https://people.cs.umass.edu/~zezhoucheng/edit3d/

  16. arXiv:2204.00604 [pdf, other]

    cs.CV cs.SD eess.AS

    Quantized GAN for Complex Music Generation from Dance Videos

    Authors: Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov

    Abstract: We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos. Our proposed framework takes dance video frames and human body motions as input, and learns to generate music samples that plausibly accompany the corresponding input. Unlike most existing conditional music generation works that generate specific types…

    Submitted 19 July, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: Dataset and code at https://github.com/L-YeZhu/D2M-GAN

  17. arXiv:2203.17261 [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

    Authors: Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov

    Abstract: Recent research explosion on Neural Radiance Field (NeRF) shows the encouraging potential to represent complex scenes with neural networks. One major drawback of NeRF is its prohibitive inference time: Rendering a single pixel requires querying the NeRF network hundreds of times. To resolve it, existing efforts mainly attempt to reduce the number of required sampled points. However, the problem of…

    Submitted 22 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted by ECCV 2022. Code: https://github.com/snap-research/R2L

  18. arXiv:2201.02533 [pdf, other]

    cs.CV

    NeROIC: Neural Rendering of Objects from Online Image Collections

    Authors: Zhengfei Kuang, Kyle Olszewski, Menglei Chai, Zeng Huang, Panos Achlioptas, Sergey Tulyakov

    Abstract: We present a novel method to acquire object representations from online image collections, capturing high-quality geometry and material properties of arbitrary objects from photographs with varying cameras, illumination, and backgrounds. This enables various object-centric rendering applications such as novel-view synthesis, relighting, and harmonized background composition from challenging in-the…

    Submitted 1 September, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

    Comments: SIGGRAPH 2022 (Journal Track). Project page: https://formyfamily.github.io/NeROIC/ Code repository: https://github.com/snap-research/NeROIC/

  19. arXiv:2112.05139 [pdf, other]

    cs.CV cs.GR

    CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields

    Authors: Can Wang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

    Abstract: We present CLIP-NeRF, a multi-modal 3D object manipulation method for neural radiance fields (NeRF). By leveraging the joint language-image embedding space of the recent Contrastive Language-Image Pre-Training (CLIP) model, we propose a unified framework that allows manipulating NeRF in a user-friendly way, using either a short text prompt or an exemplar image. Specifically, to combine the novel v…

    Submitted 2 March, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: To Appear at CVPR 2022

  20. arXiv:2109.08090 [pdf, other]

    cs.LG cs.CV

    DisUnknown: Distilling Unknown Factors for Disentanglement Learning

    Authors: Sitao Xiang, Yuming Gu, Pengda Xiang, Menglei Chai, Hao Li, Yajie Zhao, Mingming He

    Abstract: Disentangling data into interpretable and independent factors is critical for controllable generation tasks. With the availability of labeled data, supervision can help enforce the separation of specific factors as expected. However, it is often expensive or even impossible to label every single factor to achieve fully-supervised disentanglement. In this paper, we adopt a general setting where all…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: Accepted for publication at ICCV 2021. Videos, demos and updates will be published at project website: https://stormraiser.github.io/disunknown/

  21. arXiv:2106.07771 [pdf, other]

    cs.CV

    Flow Guided Transformable Bottleneck Networks for Motion Retargeting

    Authors: Jian Ren, Menglei Chai, Oliver J. Woodford, Kyle Olszewski, Sergey Tulyakov

    Abstract: Human motion retargeting aims to transfer the motion of one person in a "driving" video or set of images to another person. Existing efforts leverage a long training video from each target person to train a subject-specific motion transfer model. However, the scalability of such methods is limited, as each model can only generate videos for the given target subject, and such training videos are la…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: CVPR 2021

  22. arXiv:2104.15069 [pdf, other]

    cs.CV

    A Good Image Generator Is What You Need for High-Resolution Video Synthesis

    Authors: Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov

    Abstract: Image and video synthesis are closely related areas aiming at generating content from noise. While rapid progress has been demonstrated in improving image-based models to handle large resolutions, high-quality renderings, and wide variations in image content, achieving comparable video generation results remains problematic. We present a framework that leverages contemporary image generators to re…

    Submitted 30 April, 2021; originally announced April 2021.

    Comments: Accepted to ICLR 2021

  23. arXiv:2104.14559 [pdf, other]

    cs.CV cs.GR

    Exemplar-Based 3D Portrait Stylization

    Authors: Fangzhou Han, Shuquan Ye, Mingming He, Menglei Chai, Jing Liao

    Abstract: Exemplar-based portrait stylization is widely attractive and highly desired. Despite recent successes, it remains challenging, especially when considering both texture and geometric styles. In this paper, we present the first framework for one-shot 3D portrait style transfer, which can generate 3D face models with both the geometry exaggerated and the texture stylized while preserving the identity…

    Submitted 29 April, 2021; originally announced April 2021.

    Comments: Project page: https://halfjoe.github.io/projs/3DPS/index.html

  24. arXiv:2104.11280 [pdf, other]

    cs.CV

    Motion Representations for Articulated Animation

    Authors: Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

    Abstract: We propose novel motion representations for animating articulated objects consisting of distinct parts. In a completely unsupervised manner, our method identifies object parts, tracks them in a driving video, and infers their motions by considering their principal axes. In contrast to the previous keypoint-based works, our method extracts meaningful and consistent regions, describing locations, sh…

    Submitted 22 April, 2021; originally announced April 2021.

    Journal ref: CVPR 2021

  25. arXiv:2104.11228 [pdf, other]

    cs.CV cs.GR

    Cross-Domain and Disentangled Face Manipulation with 3D Guidance

    Authors: Can Wang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

    Abstract: Face image manipulation via three-dimensional guidance has been widely applied in various interactive scenarios due to its semantically-meaningful understanding and user-friendly controllability. However, existing 3D-morphable-model-based manipulation methods are not directly applicable to out-of-domain faces, such as non-photorealistic paintings, cartoon portraits, or even animals, mainly due to…

    Submitted 28 February, 2022; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: Accepted by TVCG, final version

  26. arXiv:2103.06878 [pdf, other]

    cs.CV cs.GR

    Diverse Semantic Image Synthesis via Probability Distribution Modeling

    Authors: Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

    Abstract: Semantic image synthesis, translating semantic layouts to photo-realistic images, is a one-to-many mapping problem. Though impressive progress has been recently made, diverse semantic synthesis that can efficiently produce semantic-level multimodal results, still remains a challenge. In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: Accepted By CVPR 2021

  27. arXiv:2012.04644 [pdf, other]

    cs.CV cs.GR

    Efficient Semantic Image Synthesis via Class-Adaptive Normalization

    Authors: Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

    Abstract: Spatially-adaptive normalization (SPADE) is remarkably successful recently in conditional semantic image synthesis \cite{park2019semantic}, which modulates the normalized activation with spatially-varying transformations learned from semantic layouts, to prevent the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of the advantages insi…

    Submitted 4 May, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: To appear at TPAMI 2021, code is available https://github.com/tzt101/CLADE.git

  28. arXiv:2010.16417 [pdf, other]

    cs.CV

    MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

    Authors: Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

    Abstract: Despite the recent success of face image generation with GANs, conditional hair editing remains challenging due to the under-explored complexity of its geometry and appearance. In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation. To provide user control over every major hair visual facto…

    Submitted 30 October, 2020; originally announced October 2020.

    Comments: SIGGRAPH 2020, code is available at https://github.com/tzt101/MichiGAN

  29. arXiv:2004.14489 [pdf, other]

    cs.GR cs.CV

    Interactive Video Stylization Using Few-Shot Patch-Based Training

    Authors: Ondřej Texler, David Futschik, Michal Kučera, Ondřej Jamriška, Šárka Sochorová, Menglei Chai, Sergey Tulyakov, Daniel Sýkora

    Abstract: In this paper, we present a learning-based method to the keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence. Its key advantage is that the resulting stylization is semantically meaningful, i.e., specific parts of moving objects are stylized according to the artist's intention. In contrast to previous style transfe…

    Submitted 29 April, 2020; originally announced April 2020.

  30. arXiv:2004.13297 [pdf, other]

    cs.CV

    Neural Hair Rendering

    Authors: Menglei Chai, Jian Ren, Sergey Tulyakov

    Abstract: In this paper, we propose a generic neural-based hair rendering pipeline that can synthesize photo-realistic images from virtual 3D hair models. Unlike existing supervised translation methods that require model-level similarity to preserve consistent structure representation for both real images and fake renderings, our method adopts an unsupervised solution to work on arbitrary hair models. The k…

    Submitted 21 July, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: ECCV 2020

  31. arXiv:2004.03142 [pdf, other]

    cs.CV

    Human Motion Transfer from Poses in the Wild

    Authors: Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

    Abstract: In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video. It is a video-to-video translation task in which the estimated poses are used to bridge two domains. Despite substantial progress on the topic, there exist several problems with the previous methods. First, there is a domain ga…

    Submitted 7 April, 2020; originally announced April 2020.

  32. arXiv:2004.02867 [pdf, other]

    cs.CV cs.GR

    Rethinking Spatially-Adaptive Normalization

    Authors: Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Nenghai Yu

    Abstract: Spatially-adaptive normalization is remarkably successful recently in conditional semantic image synthesis, which modulates the normalized activation with spatially-varying transformations learned from semantic layouts, to preserve the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of the true advantages inside the box is still highly…

    Submitted 6 April, 2020; originally announced April 2020.

  33. arXiv:1911.11419 [pdf, other]

    cs.CV

    Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning

    Authors: Kekai Sheng, Weiming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma

    Abstract: Visual aesthetic assessment has been an active research field for decades. Although latest methods have achieved promising performance on benchmark datasets, they typically rely on a large number of manual annotations including both aesthetic labels and related image attributes. In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspectiv…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: AAAI Conference on Artificial Intelligence, 2020, accepted

    Journal ref: Proceedings of AAAI Conference on Artificial Intelligence 2020

  34. arXiv:1904.00680 [pdf, other]

    cs.CV

    End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image

    Authors: Seonghyeon Nam, Chongyang Ma, Menglei Chai, William Brendel, Ning Xu, Seon Joo Kim

    Abstract: Time-lapse videos usually contain visually appealing content but are often difficult and costly to create. In this paper, we present an end-to-end solution to synthesize a time-lapse video from a single outdoor image using deep neural networks. Our key idea is to train a conditional generative adversarial network based on existing datasets of time-lapse videos and image sequences. We propose a mul…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: To appear in CVPR 2019

  35. arXiv:1812.04580 [pdf, ps, other]

    cs.LO cs.SC

    BOSPHORUS: Bridging ANF and CNF Solvers

    Authors: Davin Choo, Mate Soos, Kian Ming A. Chai, Kuldeep S. Meel

    Abstract: Algebraic Normal Form (ANF) and Conjunctive Normal Form (CNF) are commonly used to encode problems in Boolean algebra. ANFs are typically solved via Gröbner basis algorithms, often using more memory than is feasible; while CNFs are solved using SAT solvers, which cannot exploit the algebra of polynomials naturally. We propose a paradigm that bridges between ANF and CNF solving techniques: the tec…

    Submitted 11 December, 2018; originally announced December 2018.

    Comments: To Appear in Proceedings of DATE 2019

  36. arXiv:1206.6475 [pdf]

    cs.LG stat.ML

    A Split-Merge Framework for Comparing Clusterings

    Authors: Qiaoliang Xiang, Qi Mao, Kian Ming Chai, Hai Leong Chieu, Ivor Tsang, Zhendong Zhao

    Abstract: Clustering evaluation measures are frequently used to evaluate the performance of algorithms. However, most measures are not properly normalized and ignore some information in the inherent structure of clusterings. We model the relation between two clusterings as a bipartite graph and propose a general component-based decomposition formula based on the components of the graph. Most existing measur…

    Submitted 4 September, 2012; v1 submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  37. arXiv:1206.4625 [pdf]

    cs.LG

    Optimizing F-measure: A Tale of Two Approaches

    Authors: Ye Nan, Kian Ming Chai, Wee Sun Lee, Hai Leong Chieu

    Abstract: F-measures are popular performance metrics, particularly for tasks with imbalanced data sets. Algorithms for learning to maximize F-measures follow two approaches: the empirical utility maximization (EUM) approach learns a classifier having optimal performance on training data, while the decision-theoretic approach learns a probabilistic model and then predicts labels with maximum expected F-measu…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012