Showing 1–28 of 28 results for author: Mai, J

  1. arXiv:2406.08659  [pdf, other]

    cs.CV

    Vivid-ZOO: Multi-View Video Generation with Diffusion Model

    Authors: Bing Li, Cheng Zheng, Wenxuan Zhu, Jinjie Mai, Biao Zhang, Peter Wonka, Bernard Ghanem

    Abstract: While diffusion models have shown impressive performance in 2D image/video generation, diffusion-based Text-to-Multi-view-Video (T2MVid) generation remains underexplored. The new challenges posed by T2MVid generation lie in the lack of massive captioned multi-view videos and the complexity of modeling such multi-dimensional distribution. To this end, we propose a novel diffusion-based pipeline tha…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Our project page is at https://hi-zhengcheng.github.io/vividzoo/

  2. arXiv:2406.05962  [pdf, other]

    cs.DC cs.DB

    Data Caching for Enterprise-Grade Petabyte-Scale OLAP

    Authors: Chunxu Tang, Bin Fan, Jing Zhao, Chen Liang, Yi Wang, Beinan Wang, Ziyue Qiu, Lu Qiu, Bowen Ding, Shouzhuo Sun, Saiguang Che, Jiaming Mai, Shouwei Chen, Yu Zhu, Jianjian Xie, Yutian Sun, Yao Li, Yangjun Zhang, Ke Wang, Mingmin Chen

    Abstract: With the exponential growth of data and evolving use cases, petabyte-scale OLAP data platforms are increasingly adopting a model that decouples compute from storage. This shift, evident in organizations like Uber and Meta, introduces operational challenges including massive, read-heavy I/O traffic with potential throttling, as well as skewed and fragmented data access patterns. Addressing these ch…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to the USENIX Annual Technical Conference (USENIX ATC) 2024

  3. arXiv:2402.10128  [pdf, other]

    cs.CV cs.GR cs.LG

    GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering

    Authors: Abdullah Hamdi, Luke Melas-Kyriazi, Jinjie Mai, Guocheng Qian, Ruoshi Liu, Carl Vondrick, Bernard Ghanem, Andrea Vedaldi

    Abstract: Advancements in 3D Gaussian Splatting have significantly accelerated 3D reconstruction and generation. However, it may require a large number of Gaussians, which creates a substantial memory footprint. This paper introduces GES (Generalized Exponential Splatting), a novel representation that employs Generalized Exponential Function (GEF) to model 3D scenes, requiring far fewer particles to represe…

    Submitted 24 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: CVPR 2024 paper. project website https://abdullahamdi.com/ges

  4. 3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression from Longitudinal OCTs

    Authors: Taha Emre, Arunava Chakravarty, Antoine Rivail, Dmitrii Lachinov, Oliver Leingang, Sophie Riedl, Julia Mai, Hendrik P. N. Scholl, Sobha Sivaprasad, Daniel Rueckert, Andrew Lotery, Ursula Schmidt-Erfurth, Hrvoje Bogunović

    Abstract: Self-supervised learning (SSL) has emerged as a powerful technique for improving the efficiency and effectiveness of deep learning models. Contrastive methods are a prominent family of SSL that extract similar representations of two augmented views of an image while pushing away others in the representation space as negatives. However, the state-of-the-art contrastive methods require large batch s…

    Submitted 13 May, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in IEEE TMI

  5. arXiv:2308.03233  [pdf, other]

    cs.AR

    LEAPS: Topological-Layout-Adaptable Multi-Die FPGA Placement for Super Long Line Minimization

    Authors: Zhixiong Di, Runzhe Tao, Jing Mai, Lin Chen, Yibo Lin

    Abstract: Multi-die FPGAs are crucial components in modern computing systems, particularly for high-performance applications such as artificial intelligence and data centers. Super long lines (SLLs) provide interconnections between super logic regions (SLRs) for a multi-die FPGA on a silicon interposer. They have significantly higher delay compared to regular interconnects, which need to be minimized. With…

    Submitted 2 February, 2024; v1 submitted 6 August, 2023; originally announced August 2023.

  6. Pretrained Deep 2.5D Models for Efficient Predictive Modeling from Retinal OCT

    Authors: Taha Emre, Marzieh Oghbaie, Arunava Chakravarty, Antoine Rivail, Sophie Riedl, Julia Mai, Hendrik P. N. Scholl, Sobha Sivaprasad, Daniel Rueckert, Andrew Lotery, Ursula Schmidt-Erfurth, Hrvoje Bogunović

    Abstract: In the field of medical imaging, 3D deep learning models play a crucial role in building powerful predictive models of disease progression. However, the size of these models presents significant challenges, both in terms of computational resources and data requirements. Moreover, achieving high-quality pretraining of 3D models proves to be even more challenging. To address these issues, hybrid 2.5…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted at OMIA-X MICCAI'23 Workshop

  7. Self-supervised learning via inter-modal reconstruction and feature projection networks for label-efficient 3D-to-2D segmentation

    Authors: José Morano, Guilherme Aresta, Dmitrii Lachinov, Julia Mai, Ursula Schmidt-Erfurth, Hrvoje Bogunović

    Abstract: Deep learning has become a valuable tool for the automation of certain medical image segmentation tasks, significantly relieving the workload of medical specialists. Some of these tasks require segmentation to be performed on a subset of the input dimensions, the most common case being 3D-to-2D. However, the performance of existing methods is strongly conditioned by the amount of labeled data avai…

    Submitted 13 July, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: To appear in MICCAI 2023. Code: https://github.com/j-morano/multimodal-ssl-fpn

  8. arXiv:2306.17843  [pdf, other]

    cs.CV

    Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

    Authors: Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

    Abstract: We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes generation from a single unposed image in the wild using both 2D and 3D priors. In the first stage, we optimize a neural radiance field to produce a coarse geometry. In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing…

    Submitted 23 July, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: webpage: https://guochengqian.github.io/project/magic123/

  9. arXiv:2306.16665  [pdf, other]

    cs.AR

    OpenPARF: An Open-Source Placement and Routing Framework for Large-Scale Heterogeneous FPGAs with Deep Learning Toolkit

    Authors: Jing Mai, Jiarui Wang, Zhixiong Di, Guojie Luo, Yun Liang, Yibo Lin

    Abstract: This paper proposes OpenPARF, an open-source placement and routing framework for large-scale FPGA designs. OpenPARF is implemented with the deep learning toolkit PyTorch and supports massive parallelization on GPU. The framework proposes a novel asymmetric multi-electrostatic field system to solve FPGA placement. It considers fine-grained routing resources inside configurable logic blocks (CLBs) f…

    Submitted 28 June, 2023; originally announced June 2023.

  10. arXiv:2305.17066  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.MA

    Mindstorms in Natural Language-Based Societies of Mind

    Authors: Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, et al. (1 additional author not shown)

    Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overco…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices

    MSC Class: 68T07 ACM Class: I.2.6; I.2.11

  11. arXiv:2304.09349   

    cs.AI cs.CL cs.RO

    LLM as A Robotic Brain: Unifying Egocentric Memory and Control

    Authors: Jinjie Mai, Jun Chen, Bing Li, Guocheng Qian, Mohamed Elhoseiny, Bernard Ghanem

    Abstract: Embodied AI focuses on the study and development of intelligent systems that possess a physical or virtual embodiment (i.e. robots) and are able to dynamically interact with their environment. Memory and control are the two essential parts of an embodied system and usually require separate frameworks to model each of them. In this paper, we propose a novel and generalizable framework called LLM-Br…

    Submitted 12 June, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: This early project is now integrated to: Mindstorms in Natural Language-Based Societies of Mind, arXiv:2305.17066

  12. arXiv:2304.08439  [pdf, other]

    eess.IV cs.CV

    Morph-SSL: Self-Supervision with Longitudinal Morphing to Predict AMD Progression from OCT

    Authors: Arunava Chakravarty, Taha Emre, Oliver Leingang, Sophie Riedl, Julia Mai, Hendrik P. N. Scholl, Sobha Sivaprasad, Daniel Rueckert, Andrew Lotery, Ursula Schmidt-Erfurth, Hrvoje Bogunović

    Abstract: The lack of reliable biomarkers makes predicting the conversion from intermediate to neovascular age-related macular degeneration (iAMD, nAMD) a challenging task. We develop a Deep Learning (DL) model to predict the future risk of conversion of an eye from iAMD to nAMD from its current OCT scan. Although eye clinics generate vast amounts of longitudinal OCT scans to monitor AMD progression, only a…

    Submitted 17 April, 2023; originally announced April 2023.

  13. arXiv:2303.09305  [pdf, other]

    cs.AR

    Multi-Electrostatic FPGA Placement Considering SLICEL-SLICEM Heterogeneity, Clock Feasibility, and Timing Optimization

    Authors: Jing Mai, Jiarui Wang, Zhixiong Di, Yibo Lin

    Abstract: When modern FPGA architecture becomes increasingly complicated, modern FPGA placement is a mixed optimization problem with multiple objectives, including wirelength, routability, timing closure, and clock feasibility. Typical FPGA devices nowadays consist of heterogeneous SLICEs like SLICEL and SLICEM. The resources of a SLICE can be configured to {LUT, FF, distributed RAM, SHIFT, CARRY}. Besides…

    Submitted 16 March, 2023; originally announced March 2023.

  14. arXiv:2303.02449  [pdf, other]

    cs.CV

    Exploit CAM by itself: Complementary Learning System for Weakly Supervised Semantic Segmentation

    Authors: Jiren Mai, Fei Zhang, Junjie Ye, Marcus Kalander, Xian Zhang, WanKou Yang, Tongliang Liu, Bo Han

    Abstract: Weakly Supervised Semantic Segmentation (WSSS) with image-level labels has long been suffering from fragmentary object regions led by Class Activation Map (CAM), which is incapable of generating fine-grained masks for semantic segmentation. To guide CAM to find more non-discriminating object patterns, this paper turns to an interesting working mechanism in agent learning named Complementary Learni…

    Submitted 4 March, 2023; originally announced March 2023.

  15. arXiv:2212.06969  [pdf, other]

    cs.CV

    EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

    Authors: Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem

    Abstract: With the recent advances in video and 3D understanding, novel 4D spatio-temporal methods fusing both concepts have emerged. Towards this direction, the Ego4D Episodic Memory Benchmark proposed a task for Visual Queries with 3D Localization (VQ3D). Given an egocentric video clip and an image crop depicting a query object, the goal is to localize the 3D position of the center of that query object wi…

    Submitted 28 August, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: ICCV 2023

  16. arXiv:2211.10284  [pdf, other]

    cs.CV

    Estimating more camera poses for ego-centric videos is essential for VQ3D

    Authors: Jinjie Mai, Chen Zhao, Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

    Abstract: Visual queries 3D localization (VQ3D) is a task in the Ego4D Episodic Memory Benchmark. Given an egocentric video, the goal is to answer queries of the form "Where did I last see object X?", where the query object X is specified as a static image, and the answer should be a 3D displacement vector pointing to object X. However, current techniques use naive ways to estimate the camera poses of video…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Second International Ego4D Workshop at ECCV 2022

  17. Segmentation of Bruch's Membrane in retinal OCT with AMD using anatomical priors and uncertainty quantification

    Authors: Botond Fazekas, Dmitrii Lachinov, Guilherme Aresta, Julia Mai, Ursula Schmidt-Erfurth, Hrvoje Bogunovic

    Abstract: Bruch's membrane (BM) segmentation on optical coherence tomography (OCT) is a pivotal step for the diagnosis and follow-up of age-related macular degeneration (AMD), one of the leading causes of blindness in the developed world. Automated BM segmentation methods exist, but they usually do not account for the anatomical coherence of the results, neither provide feedback on the confidence of the pre…

    Submitted 30 October, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

  18. SD-LayerNet: Semi-supervised retinal layer segmentation in OCT using disentangled representation with anatomical priors

    Authors: Botond Fazekas, Guilherme Aresta, Dmitrii Lachinov, Sophie Riedl, Julia Mai, Ursula Schmidt-Erfurth, Hrvoje Bogunovic

    Abstract: Optical coherence tomography (OCT) is a non-invasive 3D modality widely used in ophthalmology for imaging the retina. Achieving automated, anatomically coherent retinal layer segmentation on OCT is important for the detection and monitoring of different retinal diseases, like Age-related Macular Disease (AMD) or Diabetic Retinopathy. However, the majority of state-of-the-art layer segmentation met…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted at MICCAI 2022

    Journal ref: MICCAI 2022. Lecture Notes in Computer Science, vol 13438. Springer, Cham

  19. arXiv:2206.04670  [pdf, other]

    cs.CV cs.AI

    PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies

    Authors: Guocheng Qian, Yuchen Li, Houwen Peng, Jinjie Mai, Hasan Abed Al Kader Hammoud, Mohamed Elhoseiny, Bernard Ghanem

    Abstract: PointNet++ is one of the most influential neural architectures for point cloud understanding. Although the accuracy of PointNet++ has been largely surpassed by recent networks such as PointMLP and Point Transformer, we find that a large portion of the performance gain is due to improved training strategies, i.e. data augmentation and optimization techniques, and increased model sizes rather than a…

    Submitted 12 October, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS'22. Code and models are available at https://github.com/guochengqian/pointnext

  20. arXiv:2205.01355  [pdf, other]

    cs.GR cs.CV cs.LG

    Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks

    Authors: Xiaoyu Pan, Jiaming Mai, Xinwei Jiang, Dongxue Tang, Jingxiang Li, Tianjia Shao, Kun Zhou, Xiaogang Jin, Dinesh Manocha

    Abstract: We present a learning algorithm that uses bone-driven motion networks to predict the deformation of loose-fitting garment meshes at interactive rates. Given a garment, we generate a simulation database and extract virtual bones from simulated mesh sequences using skin decomposition. At runtime, we separately compute low- and high-frequency deformations in a sequential manner. The low-frequency def…

    Submitted 27 May, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

    Comments: SIGGRAPH 22 Conference Paper

  21. arXiv:2204.06455  [pdf, other]

    eess.IV cs.CV

    WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma

    Authors: Chu Han, Xipeng Pan, Lixu Yan, Huan Lin, Bingbing Li, Su Yao, Shanshan Lv, Zhenwei Shi, Jinhai Mai, Jiatai Lin, Bingchao Zhao, Zeyan Xu, Zhizhen Wang, Yumeng Wang, Yuan Zhang, Huihui Wang, Chao Zhu, Chunhui Lin, Lijian Mao, Min Wu, Luwen Duan, Jingsong Zhu, Dong Hu, Zijie Fang, Yang Chen, et al. (18 additional authors not shown)

    Abstract: Lung cancer is the leading cause of cancer death worldwide, and adenocarcinoma (LUAD) is the most common subtype. Exploiting the potential value of the histopathology images can promote precision medicine in oncology. Tissue segmentation is the basic upstream task of histopathology image analysis. Existing deep learning models have achieved superior segmentation performance but require sufficient…

    Submitted 13 April, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

  22. arXiv:2202.03433  [pdf, other]

    eess.IV cs.CV

    A Coarse-to-fine Morphological Approach With Knowledge-based Rules and Self-adapting Correction for Lung Nodules Segmentation

    Authors: Xinliang Fu, Jiayin Zheng, Juanyun Mai, Yanbo Shao, Minghao Wang, Linyu Li, Zhaoqi Diao, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, Jinsheng Tao, Bo Wang, Hua Ji

    Abstract: The segmentation module which precisely outlines the nodules is a crucial step in a computer-aided diagnosis(CAD) system. The most challenging part of such a module is how to achieve high accuracy of the segmentation, especially for the juxtapleural, non-solid and small nodules. In this research, we present a coarse-to-fine methodology that greatly improves the thresholding method performance with…

    Submitted 7 February, 2022; originally announced February 2022.

  23. arXiv:2201.13392   

    eess.IV cs.CV

    MHSnet: Multi-head and Spatial Attention Network with False-Positive Reduction for Pulmonary Nodules Detection

    Authors: Juanyun Mai, Minghao Wang, Jiayin Zheng, Yanbo Shao, Zhaoqi Diao, Xinliang Fu, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, Jinsheng Tao, Bo Wang, Hua Ji

    Abstract: The mortality of lung cancer has ranked high among cancers for many years. Early detection of lung cancer is critical for disease prevention, cure, and mortality rate reduction. However, existing detection methods on pulmonary nodules introduce an excessive number of false positive proposals in order to achieve high sensitivity, which is not practical in clinical situations. In this paper, we prop…

    Submitted 12 May, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: We have to revise the experiment results and conclusions

  24. arXiv:2110.08048  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Multi-Layer Pseudo-Supervision for Histopathology Tissue Semantic Segmentation using Patch-level Classification Labels

    Authors: Chu Han, Jiatai Lin, Jinhai Mai, Yi Wang, Qingling Zhang, Bingchao Zhao, Xin Chen, Xipeng Pan, Zhenwei Shi, Xiaowei Xu, Su Yao, Lixu Yan, Huan Lin, Zeyan Xu, Xiaomei Huang, Guoqiang Han, Changhong Liang, Zaiyi Liu

    Abstract: Tissue-level semantic segmentation is a vital step in computational pathology. Fully-supervised models have already achieved outstanding performance with dense pixel-level annotations. However, drawing such labels on the giga-pixel whole slide images is extremely expensive and time-consuming. In this paper, we use only patch-level classification labels to achieve tissue semantic segmentation on hi…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 15 pages, 10 figures, journal

    MSC Class: 68U10 ACM Class: I.4.6

  25. arXiv:2108.00831  [pdf, other]

    eess.IV cs.CV

    Projective Skip-Connections for Segmentation Along a Subset of Dimensions in Retinal OCT

    Authors: Dmitrii Lachinov, Philipp Seeboeck, Julia Mai, Ursula Schmidt-Erfurth, Hrvoje Bogunovic

    Abstract: In medical imaging, there are clinically relevant segmentation tasks where the output mask is a projection to a subset of input image dimensions. In this work, we propose a novel convolutional neural network architecture that can effectively learn to produce a lower-dimensional segmentation mask than the input image. The network restores encoded representation only in a subset of input spatial dim…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: Submitted to MICCAI 2021

  26. HeterSkinNet: A Heterogeneous Network for Skin Weights Prediction

    Authors: Xiaoyu Pan, Jiancong Huang, Jiaming Mai, He Wang, Honglin Li, Tongkui Su, Wenjun Wang, Xiaogang Jin

    Abstract: Character rigging is universally needed in computer graphics but notoriously laborious. We present a new method, HeterSkinNet, aiming to fully automate such processes and significantly boost productivity. Given a character mesh and skeleton as input, our method builds a heterogeneous graph that treats the mesh vertices and the skeletal bones as nodes of different types and uses graph convolutions…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: I3D 2021

  27. The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction

    Authors: Martin Gauch, Juliane Mai, Jimmy Lin

    Abstract: Accurate streamflow prediction largely relies on historical meteorological records and streamflow measurements. For many regions, however, such data are only scarcely available. Facing this problem, many studies simply trained their machine learning models on the region's available data, leaving possible repercussions of this strategy unclear. In this study, we evaluate the sensitivity of tree- an…

    Submitted 11 August, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 13 pages, 3 figures

    Journal ref: Environmental Modelling & Software, Volume 135, 2021, 104926

  28. arXiv:1609.02035  [pdf]

    cs.DC cs.CV

    Component-Based Distributed Framework for Coherent and Real-Time Video Dehazing

    Authors: Meihua Wang, Jiaming Mai, Yun Liang, Tom Z. J. Fu, Zhenjie Zhang, Ruichu Cai

    Abstract: Traditional dehazing techniques, as a well studied topic in image processing, are now widely used to eliminate the haze effects from individual images. However, even the state-of-the-art dehazing algorithms may not provide sufficient support to video analytics, as a crucial pre-processing step for video-based decision making systems (e.g., robot navigation), due to the limitations of these algorit…

    Submitted 9 September, 2016; v1 submitted 7 September, 2016; originally announced September 2016.