Showing 1–44 of 44 results for author: Xie, Z

Searching in archive eess.

  1. arXiv:2406.08052  [pdf, other]

    cs.SD eess.AS

    FakeSound: Deepfake General Audio Detection

    Authors: Zeyu Xie, Baihan Li, Xuenan Xu, Zheng Liang, Kai Yu, Mengyue Wu

    Abstract: With the advancement of audio generation, generative models can produce highly realistic audios. However, the proliferation of deepfake general audio can pose negative consequences. Therefore, we propose a new task, deepfake general audio detection, which aims to identify whether audio content is manipulated and to locate deepfake regions. Leveraging an automated manipulation pipeline, a dataset n…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

    MSC Class: 68Txx ACM Class: I.2

  2. arXiv:2406.00626  [pdf, other]

    cs.MM cs.SD eess.AS

    Intelligent Text-Conditioned Music Generation

    Authors: Zhouyao Xie, Nikhil Yadala, Xinyi Chen, Jing Xi Liu

    Abstract: CLIP (Contrastive Language-Image Pre-Training) is a multimodal neural network trained on (text, image) pairs to predict the most relevant text caption given an image. It has been used extensively in image generation by connecting its output with a generative model such as VQGAN, with the most notable example being OpenAI's DALLE-2. In this project, we apply a similar approach to bridge the gap bet…

    Submitted 2 June, 2024; originally announced June 2024.

  3. arXiv:2405.14029  [pdf, ps, other]

    cs.IT eess.SP

    Analog Beamforming Enabled Multicasting: Finite-Alphabet Inputs and Statistical CSI

    Authors: Yanjun Wu, Zhong Xie, Zhuochen Xie, Chongjun Ouyang, Xuwen Liang

    Abstract: The average multicast rate (AMR) is analyzed in a multicast channel utilizing analog beamforming with finite-alphabet inputs, considering statistical channel state information (CSI). New expressions for the AMR are derived for non-cooperative and cooperative multicasting scenarios. Asymptotic analyses are conducted in the high signal-to-noise ratio regime to derive the array gain and diversity ord…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 5 pages

  4. Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

    Authors: Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei Zhang

    Abstract: One of the primary challenges impeding the progress of Neural Architecture Search (NAS) is its extensive reliance on exorbitant computational resources. NAS benchmarks aim to simulate runs of NAS experiments at zero cost, remediating the need for extensive compute. However, existing NAS benchmarks use synthetic datasets and model proxies that make simplified assumptions about the characteristics o…

    Submitted 18 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted at Design Automation Conference DAC'24

  5. arXiv:2403.17615  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Grad-CAMO: Learning Interpretable Single-Cell Morphological Profiles from 3D Cell Painting Images

    Authors: Vivek Gopalakrishnan, Jingzhe Ma, Zhiyong Xie

    Abstract: Despite their black-box nature, deep learning models are extensively used in image-based drug discovery to extract feature vectors from single cells in microscopy images. To better understand how these networks perform representation learning, we employ visual explainability techniques (e.g., Grad-CAM). Our analyses reveal several mechanisms by which supervised models cheat, exploiting biologicall…

    Submitted 26 March, 2024; originally announced March 2024.

  6. arXiv:2403.15072  [pdf, other]

    eess.SY

    Direct and Indirect Hydrogen Storage: Dynamics and Interactions in the Transition to a Renewable Energy Based System for Europe

    Authors: Zhiyuan Xie, Gorm Bruun Andresen

    Abstract: To move towards a low-carbon society by 2050, understanding the intricate dynamics of energy systems is critical. Our study examines these interactions through the lens of hydrogen storage, dividing it into 'direct' and 'indirect' hydrogen storage. Direct hydrogen storage involves electrolysis-produced hydrogen being stored before use, while indirect storage first transforms hydrogen into gas via…

    Submitted 22 March, 2024; originally announced March 2024.

  7. arXiv:2403.04594  [pdf, other]

    cs.SD eess.AS

    A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds

    Authors: Xuenan Xu, Xiaohang Xu, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu

    Abstract: Recently, there has been an increasing focus on audio-text cross-modal learning. However, most of the existing audio-text datasets contain only simple descriptions of sound events. Compared with classification labels, the advantages of such descriptions are significantly limited. In this paper, we first analyze the detailed information that human descriptions of audio may contain beyond sound even…

    Submitted 7 March, 2024; originally announced March 2024.

  8. arXiv:2403.01278  [pdf, other]

    cs.SD eess.AS

    Enhancing Audio Generation Diversity with Visual Information

    Authors: Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu

    Abstract: Audio and sound generation has garnered significant attention in recent years, with a primary focus on improving the quality of generated audios. However, there has been limited research on enhancing the diversity of generated audio, particularly when it comes to audio generation within specific categories. Current models tend to produce homogeneous audio samples within a category. This work aims…

    Submitted 2 March, 2024; originally announced March 2024.

    ACM Class: I.2

  9. arXiv:2402.15985  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Phonetic and Lexical Discovery of a Canine Language using HuBERT

    Authors: Xingyuan Li, Sinong Wang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: This paper delves into the pioneering exploration of potential communication patterns within dog vocalizations and transcends traditional linguistic analysis barriers, which heavily relies on human priori knowledge on limited datasets to find sound units in dog vocalization. We present a self-supervised approach with HuBERT, enabling the accurate classification of phoneme labels and the identifica…

    Submitted 24 February, 2024; originally announced February 2024.

  10. arXiv:2402.04753  [pdf, other]

    eess.IV cs.CV

    Cortical Surface Diffusion Generative Models

    Authors: Zhenshan Xie, Simon Dahan, Logan Z. J. Williams, M. Jorge Cardoso, Emma C. Robinson

    Abstract: Cortical surface analysis has gained increased prominence, given its potential implications for neurological and developmental disorders. Traditional vision diffusion models, while effective in generating natural images, present limitations in capturing intricate development patterns in neuroimaging due to limited datasets. This is particularly true for generating cortical surfaces where individua…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 4 pages

  11. arXiv:2401.13051  [pdf, other]

    cs.CV eess.IV

    PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation

    Authors: Zhaozhi Xie, Bochen Guan, Weihao Jiang, Muyang Yi, Yue Ding, Hongtao Lu, Lei Zhang

    Abstract: The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained with over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enha…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Code is available at https://github.com/xzz2/pa-sam

  12. arXiv:2401.04394  [pdf, other]

    cs.MM cs.SD eess.AS

    SonicVisionLM: Playing Sound with Vision Language Models

    Authors: Zhifeng Xie, Shengye Yu, Qile He, Mengtian Li

    Abstract: There has been a growing interest in the task of generating sound for silent videos, primarily because of its practicality in streamlining video post-production. However, existing methods for video-sound generation attempt to directly create sound from visual representations, which can be challenging due to the difficulty of aligning visual representations with audio representations. In this paper…

    Submitted 3 April, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: CVPR 2024

  13. arXiv:2311.07081  [pdf, other]

    cs.IT eess.SP

    Sensing Mutual Information with Random Signals in Gaussian Channels

    Authors: Lei Xie, Fan Liu, Zhanyuan Xie, Zheng Jiang, Shenghui Song

    Abstract: Sensing performance is typically evaluated by classical metrics, such as Cramer-Rao bound and signal-to-clutter-plus-noise ratio. The recent development of the integrated sensing and communication (ISAC) framework motivated the efforts to unify the metric for sensing and communication, where researchers have proposed to utilize mutual information (MI) to measure the sensing performance with determ…

    Submitted 13 November, 2023; originally announced November 2023.

  14. arXiv:2307.14697  [pdf, ps, other]

    cs.NI eess.SP

    Space-Air-Ground Integrated Network (SAGIN): A Survey

    Authors: Jiming Chen, Han Zhang, Zhe Xie

    Abstract: Since existing mobile communication networks may not be able to meet the low latency and high-efficiency requirements of emerging technologies and applications, novel network architectures need to be investigated to support these new requirements. As a new network architecture that integrates satellite systems, air networks and ground communication, Space-Air-Ground Integrated Network (SAGIN) has…

    Submitted 27 July, 2023; originally announced July 2023.

  15. arXiv:2307.05138  [pdf]

    physics.optics eess.IV physics.med-ph

    Super-resolution imaging through a multimode fiber: the physical upsampling of speckle-driven

    Authors: Chuncheng Zhang, Tingting Liu, Zhihua Xie, Yu Wang, Tong Liu, Qian Chen, Xiubao Sui

    Abstract: Following recent advancements in multimode fiber (MMF), miniaturization of imaging endoscopes has proven crucial for minimally invasive surgery in vivo. Recent progress enabled by super-resolution imaging methods with a data-driven deep learning (DL) framework has balanced the relationship between the core size and resolution. However, most of the DL approaches lack attention to the physical prope…

    Submitted 11 July, 2023; originally announced July 2023.

  16. arXiv:2307.04133  [pdf, other]

    eess.IV cs.CV

    Ultrasonic Image's Annotation Removal: A Self-supervised Noise2Noise Approach

    Authors: Yuanheng Zhang, Nan Jiang, Zhaoheng Xie, Junying Cao, Yueyang Teng

    Abstract: Accurately annotated ultrasonic images are vital components of a high-quality medical report. Hospitals often have strict guidelines on the types of annotations that should appear on imaging results. However, manually inspecting these images can be a cumbersome task. While a neural network could potentially automate the process, training such a model typically requires a dataset of paired input an…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 10 pages, 7 figures

  17. arXiv:2307.00174  [pdf, other]

    eess.IV cs.CV

    Multiscale Progressive Text Prompt Network for Medical Image Segmentation

    Authors: Xianjun Han, Qianqian Chen, Zhaoyang Xie, Xuejun Li, Hongyu Yang

    Abstract: The accurate segmentation of medical images is a crucial step in obtaining reliable morphological statistics. However, training a deep neural network for this task requires a large amount of labeled data to ensure high-accuracy results. To address this issue, we propose using progressive text prompts as prior knowledge to guide the segmentation process. Our model consists of two stages. In the fir…

    Submitted 30 June, 2023; originally announced July 2023.

  18. arXiv:2306.10090  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Improving Audio Caption Fluency with Automatic Error Correction

    Authors: Hanxue Zhang, Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu

    Abstract: Automated audio captioning (AAC) is an important cross-modality translation task, aiming at generating descriptions for audio clips. However, captions generated by previous AAC models have faced "false-repetition" errors due to the training objective. In such scenarios, we propose a new task of AAC error correction and hope to reduce such errors by post-processing AAC outputs. To tackle this pro…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by NCMMSC 2022

  19. arXiv:2306.01533  [pdf, other]

    cs.SD eess.AS

    Enhance Temporal Relations in Audio Captioning with Sound Event Detection

    Authors: Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu

    Abstract: Automated audio captioning aims at generating natural language descriptions for given audio clips, not only detecting and classifying sounds, but also summarizing the relationships between audio events. Recent research advances in audio captioning have introduced additional guidance to improve the accuracy of audio events in generated sentences. However, temporal relations between audio events hav…

    Submitted 2 June, 2023; originally announced June 2023.

  20. arXiv:2304.06141  [pdf, ps, other]

    cs.IT eess.SP

    Performance Analysis for Near-Field MIMO: Discrete and Continuous Aperture Antennas

    Authors: Ziyi Xie, Yuanwei Liu, Jiaqi Xu, Xuanli Wu, Arumugam Nallanathan

    Abstract: Performance analysis is carried out in a near-field multiple-input multiple-output (MIMO) system for both discrete and continuous aperture antennas. The effective degrees of freedom (EDoF) is first derived. It is shown that near-field MIMO systems have a higher EDoF than free-space far-field ones. Additionally, the near-field EDoF further depends on the communication distance. Based on the derived…

    Submitted 1 October, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: 6 pages, 4 figures. This is the long version of the paper published in IEEE Wireless Communications Letters with the same title

  21. arXiv:2303.07902  [pdf, other]

    cs.SD eess.AS

    BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data

    Authors: Xuenan Xu, Zhiling Zhang, Zelin Zhou, Pingyue Zhang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: Compared with ample visual-text pre-training research, few works explore audio-text pre-training, mostly due to the lack of sufficient parallel audio-text data. Most existing methods incorporate the visual modality as a pivot for audio-text pre-training, which inevitably induces data noise. In this paper, we propose to utilize audio captioning to generate text directly from audio, without the aid…

    Submitted 5 March, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

  22. arXiv:2303.00232  [pdf, other]

    eess.IV cs.CV

    Towards more precise automatic analysis: a comprehensive survey of deep learning-based multi-organ segmentation

    Authors: Xiaoyu Liu, Linhao Qu, Ziyue Xie, Jiayue Zhao, Yonghong Shi, Zhijian Song

    Abstract: Accurate segmentation of multiple organs of the head, neck, chest, and abdomen from medical images is an essential step in computer-aided diagnosis, surgical navigation, and radiation therapy. In the past few years, with a data-driven feature extraction approach and end-to-end training, automatic deep learning-based multi-organ segmentation method has far outperformed traditional methods and becom…

    Submitted 2 March, 2023; v1 submitted 28 February, 2023; originally announced March 2023.

    Comments: 25 pages, 9 figures, 16 tables

  23. arXiv:2302.14751  [pdf]

    eess.SP physics.optics

    High speed free-space optical communication using standard fiber communication component without optical amplification

    Authors: Yao Zhang, Hua-Ying Liu, Xiaoyi Liu, Peng Xu, Xiang Dong, Pengfei Fan, Xiaohui Tian, Hua Yu, Dong Pan, Zhijun Yin, Guilu Long, Shi-Ning Zhu, Zhenda Xie

    Abstract: Free-space optical communication (FSO) can achieve fast, secure and license-free communication without need for physical cables, making it a cost-effective, energy-efficient and flexible solution when the fiber connection is unavailable. To establish FSO connection on-demand, it is essential to build portable FSO devices with compact structure and light weight. Here, we develop a miniaturized FSO…

    Submitted 16 April, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 7 pages, 5 figures

  24. arXiv:2212.14354  [pdf]

    eess.SP cs.CE

    A Fault Location Method Based on Electromagnetic Transient Convolution Considering Frequency-Dependent Parameters and Lossy Ground

    Authors: Guanbo Wang, Chijie Zhuang, Jun Deng, Zhicheng Xie

    Abstract: As the capacity of power systems grows, the need for quick and precise short-circuit fault location becomes increasingly vital for ensuring the safe and continuous supply of power. In this paper, we propose a fault location method that utilizes electromagnetic transient convolution (EMTC). We assess the performance of a naive EMTC implementation in multi-phase power lines by using frequency-depend…

    Submitted 31 December, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

  25. arXiv:2212.07009  [pdf]

    physics.optics eess.IV

    Piston sensing for sparse aperture systems via all-optical diffractive neural network

    Authors: Xiafei Ma, Zongliang Xie, Haotong Ma, Ge Ren

    Abstract: It is a crucial issue to realize real-time piston correction in the area of sparse aperture imaging. This paper introduces an optical diffractive neural network-based piston sensing method, which can achieve light-speed sensing. By using detectable intensity to represent pistons, the proposed method is capable of converting complex amplitude distribution of the imaging optical field into piston va…

    Submitted 29 June, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: 5 pages, 6 figures

  26. arXiv:2210.13060  [pdf, ps, other]

    cs.IT eess.SP

    Is the Envelope Beneficial to Non-Orthogonal Multiple Access?

    Authors: Ziyi Xie, Wenqiang Yi, Xuanli Wu, Yuanwei Liu, Arumugam Nallanathan

    Abstract: Non-orthogonal multiple access (NOMA) is capable of serving different numbers of users in the same time-frequency resource element, and this feature can be leveraged to carry additional information. In the orthogonal frequency division multiplexing (OFDM) system, we propose a novel enhanced NOMA scheme, called NOMA with informative envelope (NOMA-IE), to explore the flexibility of the envelope of…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 30 pages, 9 figures

  27. arXiv:2207.13334  [pdf]

    physics.optics eess.IV

    Fast optical refocusing through multimode fiber bend using Cake-Cutting Hadamard encoding algorithm to improve robustness

    Authors: Chuncheng Zhang, Zheyi Yao, Zhengyue Qin, Guohua Gu, Qian Chen, Zhihua Xie, Guodong Liu, Xiubao Sui

    Abstract: Multimode fibres offer the advantages of high resolution and miniaturization over single mode fibers in the field of optical imaging. However, multimode fibre's imaging is susceptible to perturbations of MMF that can lead to secondary spatial distortions in the transmitted image. Perturbations include random disturbances in the fiber as well as environmental noise. Here, we exploit the fast focusi…

    Submitted 27 July, 2022; originally announced July 2022.

  28. arXiv:2205.05357  [pdf, other]

    cs.SD eess.AS

    Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning

    Authors: Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu

    Abstract: Automated audio captioning (AAC), a task that mimics human perception as well as innovatively links audio processing and natural language processing, has overseen much progress over the last few years. AAC requires recognizing contents such as the environment, sound events and the temporal relationships between sound events and describing these elements with a fluent sentence. Currently, an encode…

    Submitted 15 November, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

  29. GlacierNet2: A Hybrid Multi-Model Learning Architecture for Alpine Glacier Mapping

    Authors: Zhiyuan Xie, Umesh K. Haritashya, Vijayan K. Asari, Michael P. Bishop, Jeffrey S. Kargel, Theus H. Aspiras

    Abstract: In recent decades, climate change has significantly affected glacier dynamics, resulting in mass loss and an increased risk of glacier-related hazards including supraglacial and proglacial lake development, as well as catastrophic outburst flooding. Rapidly changing conditions dictate the need for continuous and detailed observations and analysis of climate-glacier dynamics. Thematic and quantitat…

    Submitted 29 July, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2022

  30. arXiv:2203.10597  [pdf, other]

    cs.CR cs.LG eess.SY

    The Dark Side: Security Concerns in Machine Learning for EDA

    Authors: Zhiyao Xie, Jingyu Pan, Chen-Chia Chang, Yiran Chen

    Abstract: The growing IC complexity has led to a compelling need for design efficiency improvement through new electronic design automation (EDA) methodologies. In recent years, many unprecedented efficient EDA methods have been enabled by machine learning (ML) techniques. While ML demonstrates its great potential in circuit design, however, the dark side about security problems, is seldomly discussed. This…

    Submitted 20 March, 2022; originally announced March 2022.

  31. arXiv:2112.08232  [pdf]

    eess.IV cs.CV cs.LG

    RA V-Net: Deep learning network for automated liver segmentation

    Authors: Zhiqi Lee, Sumin Qi, Chongchong Fan, Ziwei Xie

    Abstract: Accurate segmentation of the liver is a prerequisite for the diagnosis of disease. Automated segmentation is an important application of computer-aided detection and diagnosis of liver disease. In recent years, automated processing of medical images has gained breakthroughs. However, the low contrast of abdominal scan CT images and the complexity of liver morphology make accurate automatic segment…

    Submitted 15 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

  32. arXiv:2110.04684  [pdf, other]

    cs.SD cs.CL eess.AS

    Can Audio Captions Be Evaluated with Image Caption Metrics?

    Authors: Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: Automated audio captioning aims at generating textual descriptions for an audio clip. To evaluate the quality of generated audio captions, previous works directly adopt image captioning metrics like SPICE and CIDEr, without justifying their suitability in this new domain, which may mislead the development of advanced models. This problem is still unstudied due to the lack of human judgment dataset…

    Submitted 27 January, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: ICASSP 2022

  33. arXiv:2104.13182  [pdf, ps, other]

    cs.IT eess.SP

    Modeling and Coverage Analysis for RIS-aided NOMA Transmissions in Heterogeneous Networks

    Authors: Ziyi Xie, Wenqiang Yi, Xuanli Wu, Yuanwei Liu, Arumugam Nallanathan

    Abstract: Reconfigurable intelligent surface (RIS) has been regarded as a promising tool to strengthen the quality of signal transmissions in non-orthogonal multiple access (NOMA) networks. This article introduces a heterogeneous network (HetNet) structure into RIS-aided NOMA multi-cell networks. A practical user equipment (UE) association scheme for maximizing the average received power is adopted. To eval…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: 30 pages, 7 figures, 2 tables

  34. arXiv:2102.11457  [pdf, other]

    cs.SD eess.AS

    Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

    Authors: Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, Kai Yu

    Abstract: Automated audio captioning (AAC) aims at generating summarizing descriptions for audio clips. Multitudinous concepts are described in an audio caption, ranging from local information such as sound events to global information like acoustic scenery. Currently, the mainstream paradigm for AAC is the end-to-end encoder-decoder architecture, expecting the encoder to learn all levels of concepts embedd…

    Submitted 22 February, 2021; originally announced February 2021.

  35. arXiv:2011.13491  [pdf, other]

    cs.LG eess.SP

    Fast IR Drop Estimation with Machine Learning

    Authors: Zhiyao Xie, Hai Li, Xiaoqing Xu, Jiang Hu, Yiran Chen

    Abstract: IR drop constraint is a fundamental requirement enforced in almost all chip designs. However, its evaluation takes a long time, and mitigation techniques for fixing violations may require numerous iterations. As such, fast and accurate IR drop prediction becomes critical for reducing design turnaround time. Recently, machine learning (ML) techniques have been actively studied for fast IR drop esti…

    Submitted 26 November, 2020; originally announced November 2020.

    Journal ref: 2020 International Conference On Computer Aided Design (ICCAD 2020)

  36. arXiv:2008.11827  [pdf, other]

    eess.SP cs.LG cs.PF

    Smart-PGSim: Using Neural Network to Accelerate AC-OPF Power Grid Simulation

    Authors: Wenqian Dong, Zhen Xie, Gokcen Kestor, Dong Li

    Abstract: The optimal power flow (OPF) problem is one of the most important optimization problems for the operation of the power grid. It calculates the optimum scheduling of the committed generation units. In this paper, we develop a neural network approach to the problem of accelerating the current optimal power flow (AC-OPF) by generating an intelligent initial solution. The high quality of the initial s…

    Submitted 26 August, 2020; originally announced August 2020.

  37. arXiv:2008.03942  [pdf, other]

    eess.SP

    Joint Bandwidth Allocation and Path Selection in WANs with Path Cardinality Constraints

    Authors: Jinxin Wang, Fan Zhang, Zhonglin Xie, Gong Zhang, Zaiwen Wen

    Abstract: In this paper, we study a joint bandwidth allocation and path selection problem via solving a multi-objective minimization problem under the path cardinality constraints, namely MOPC. Our problem formulation captures various types of objectives including the proportional fairness, the total completion time, as well as the worst-case link utilization ratio. Such an optimization problem is very chal…

    Submitted 10 August, 2020; originally announced August 2020.

    Comments: Submitted to IEEE TSP and being under review

  38. arXiv:2007.04574  [pdf, other]

    eess.IV cs.CV

    Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model

    Authors: Haojie Liu, Ming Lu, Zhan Ma, Fan Wang, Zhihuang Xie, Xun Cao, Yao Wang

    Abstract: Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H.264/AVC and H.265/HEVC. On the other hand, deep neural networks (DNNs) have shown their powerful capacity for visual content understanding, feature extraction and compact representation. Some previous works have explored the learnt video coding…

    Submitted 9 July, 2020; originally announced July 2020.

  39. arXiv:1910.09405  [pdf, other]

    eess.IV cs.CV

    Hyperspectral Image Classification Based on Adaptive Sparse Deep Network

    Authors: Jingwen Yan, Zixin Xie, Jingyao Chen, Yinan Liu, Lei Liu

    Abstract: Sparse model is widely used in hyperspectral image classification. However, different sparsity and regularization parameters have great influence on the classification results. In this paper, a novel adaptive sparse deep network based on deep architecture is proposed, which can construct the optimal sparse representation and regularization parameters by deep network. Firstly, a data flow graph is d…

    Submitted 21 October, 2019; originally announced October 2019.

  40. arXiv:1910.08397  [pdf]

    eess.IV physics.optics

    Translation position extracting in incoherent Fourier ptychography

    Authors: Zongliang Xie, Haotong Ma, Yihan Luo, Bo Qi, Ge Ren

    Abstract: Incoherent Fourier ptychography (IFP) is a newly developed super-resolution method, where accurate knowledge of translation positions is essential for image reconstruction. To release this limitation, we propose a preprocessing algorithm capable of extracting translation positions of the structure light directly from raw images of IFP, termed translation position extracting (TPE). TPE mainly involv…

    Submitted 17 October, 2019; originally announced October 2019.

  41. arXiv:1907.08983  [pdf, other]

    eess.SP

    Physical-Layer Network Coding: An Efficient Technique for Wireless Communications

    Authors: Pingping Chen, Zhaopeng Xie, Yi Fang, Zhifeng Chen, Shahid Mumtaz, Joel J. P. C. Rodrigues

    Abstract: As a subfield of network coding, physical-layer network coding (PNC) can effectively enhance the throughput of wireless networks by mapping superimposed signals at receiver to other forms of user messages. Over the past twenty years, PNC has received significant research attention and has been widely studied in various communication scenarios, e.g., two-way relay communications (TWRC), nonorthogon…

    Submitted 23 July, 2019; v1 submitted 21 July, 2019; originally announced July 2019.

  42. arXiv:1906.00884  [pdf, other]

    cs.CV eess.IV

    Fashion Editing with Adversarial Parsing Learning

    Authors: Haoye Dong, Xiaodan Liang, Yixuan Zhang, Xujie Zhang, Zhenyu Xie, Bowen Wu, Ziqi Zhang, Xiaohui Shen, Jian Yin

    Abstract: Interactive fashion image manipulation, which enables users to edit images with sketches and color strokes, is an interesting research problem with great application value. Existing works often treat it as a general inpainting task and do not fully leverage the semantic structural information in fashion images. Moreover, they directly utilize conventional convolution and normalization layers to re…

    Submitted 28 September, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: 22 pages, 18 figures

  43. arXiv:1905.13594  [pdf]

    eess.IV cs.CR cs.CV

    Known-plaintext attack and ciphertext-only attack for encrypted single-pixel imaging

    Authors: Shuming Jiao, Yang Gao, Ting Lei, Zhenwei Xie, Xiaocong Yuan

    Abstract: In many previous works, a single-pixel imaging (SPI) system is constructed as an optical image encryption system. Unauthorized users are not able to reconstruct the plaintext image from the ciphertext intensity sequence without knowing the illumination pattern key. However, little cryptanalysis about encrypted SPI has been investigated in the past. In this work, we propose a known-plaintext attack…

    Submitted 31 May, 2019; originally announced May 2019.

  44. Multiple-image encryption and hiding with an optical diffractive neural network

    Authors: Yang Gao, Shuming Jiao, Juncheng Fang, Ting Lei, Zhenwei Xie, Xiaocong Yuan

    Abstract: A cascaded phase-only mask architecture (or an optical diffractive neural network) can be employed for different optical information processing tasks such as pattern recognition, orbital angular momentum (OAM) mode conversion, image salience detection and image encryption. However, for optical encryption and watermarking applications, such a system usually cannot process multiple pairs of input im…

    Submitted 10 February, 2020; v1 submitted 21 February, 2019; originally announced February 2019.