Showing 1–50 of 67 results for author: Su, B

  1. arXiv:2406.18443  [pdf, other]

    cs.CV

    Unveiling the Unknown: Conditional Evidence Decoupling for Unknown Rejection

    Authors: Zhaowei Wu, Binyi Su, Hua Zhang, Zhong Zhou

    Abstract: In this paper, we focus on training an open-set object detector under the condition of scarce training samples, which should distinguish the known and unknown categories. Under this challenging scenario, the decision boundaries of unknowns are difficult to learn and often ambiguous. To mitigate this issue, we develop a novel open-set object detection framework, which delves into conditional eviden…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.11092  [pdf, other]

    cs.LG math.NA stat.ML

    Guaranteed Sampling Flexibility for Low-tubal-rank Tensor Completion

    Authors: Bowen Su, Juntao You, HanQin Cai, Longxiu Huang

    Abstract: While Bernoulli sampling is extensively studied in tensor completion, t-CUR sampling approximates low-tubal-rank tensors via lateral and horizontal subtensors. However, both methods lack sufficient flexibility for diverse practical applications. To address this, we introduce Tensor Cross-Concentrated Sampling (t-CCS), a novel and straightforward sampling model that advances the matrix cross-concen…

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.04519  [pdf, other]

    cs.LG

    Multifidelity digital twin for real-time monitoring of structural dynamics in aquaculture net cages

    Authors: Eirini Katsidoniotaki, Biao Su, Eleni Kelasidi, Themistoklis P. Sapsis

    Abstract: As the global population grows and climate change intensifies, sustainable food production is critical. Marine aquaculture offers a viable solution, providing a sustainable protein source. However, the industry's expansion requires novel technologies for remote management and autonomous operations. Digital twin technology can advance the aquaculture industry, but its adoption has been limited. Fis…

    Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2405.19654  [pdf, other]

    cs.AI

    Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training

    Authors: Jinxia Yang, Bing Su, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Medical vision-language pre-training methods mainly leverage the correspondence between paired medical images and radiological reports. Although multi-view spatial images and temporal sequences of image-report pairs are available in off-the-shelf multi-modal medical datasets, most existing methods have not thoroughly tapped into such extensive supervision signals. In this paper, we introduce the M…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024

  5. arXiv:2405.01053  [pdf, other]

    cs.LG cs.AI

    Explicitly Modeling Universality into Self-Supervised Learning

    Authors: Jingyao Wang, Wenwen Qiang, Zeen Song, Lingyu Si, Jiangmeng Li, Changwen Zheng, Bing Su

    Abstract: The goal of universality in self-supervised learning (SSL) is to learn universal representations from unlabeled data and achieve excellent performance on all samples and tasks. However, these methods lack explicit modeling of the universality in the learning objective, and the related theoretical understanding remains limited. This may cause models to overfit in data-scarce situations and generali…

    Submitted 23 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 28 pages, submitted to ICML24 with 7766

  6. arXiv:2404.15131  [pdf, other]

    cs.RO

    Optimizing Multi-Touch Textile and Tactile Skin Sensing Through Circuit Parameter Estimation

    Authors: Bo Ying Su, Yuchen Wu, Chengtao Wen, Changliu Liu

    Abstract: Tactile and textile skin technologies have become increasingly important for enhancing human-robot interaction and allowing robots to adapt to different environments. Despite notable advancements, there are ongoing challenges in skin signal processing, particularly in achieving both accuracy and speed in dynamic touch sensing. This paper introduces a new framework that poses the touch sensing prob…

    Submitted 23 April, 2024; originally announced April 2024.

  7. arXiv:2404.04095  [pdf, other]

    cs.CV cs.AI

    Dynamic Prompt Optimizing for Text-to-Image Generation

    Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

    Abstract: Text-to-image generative models, specifically those based on diffusion models like Imagen and Stable Diffusion, have made substantial advancements. Recently, there has been a surge of interest in the delicate refinement of text prompts. Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images. However, the success of fin…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  8. arXiv:2404.00340  [pdf, other]

    cs.RO eess.SY

    Deep Reinforcement Learning in Autonomous Car Path Planning and Control: A Survey

    Authors: Yiyang Chen, Chao Ji, Yunrui Cai, Tong Yan, Bo Su

    Abstract: Combining data-driven applications with control systems plays a key role in recent Autonomous Car research. This thesis offers a structured review of the latest literature on Deep Reinforcement Learning (DRL) within the realm of autonomous vehicle Path Planning and Control. It collects a series of DRL methodologies and algorithms and their applications in the field, focusing notably on their roles…

    Submitted 30 March, 2024; originally announced April 2024.

  9. arXiv:2402.09204  [pdf, other]

    cs.CV

    Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration

    Authors: Jiexin Wang, Jiahao Chen, Bing Su

    Abstract: Although deep neural networks yield high classification accuracy given sufficient training data, their predictions are typically overconfident or under-confident, i.e., the prediction confidences cannot truly reflect the accuracy. Post-hoc calibration tackles this problem by calibrating the prediction confidences without re-training the classification model. However, current approaches assume cong…

    Submitted 14 February, 2024; originally announced February 2024.

    Journal ref: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Korea

  10. arXiv:2401.15566  [pdf, other]

    stat.ML cs.IT cs.LG math.OC

    On the Robustness of Cross-Concentrated Sampling for Matrix Completion

    Authors: HanQin Cai, Longxiu Huang, Chandra Kundu, Bowen Su

    Abstract: Matrix completion is one of the crucial tools in modern data science research. Recently, a novel sampling model for matrix completion coined cross-concentrated sampling (CCS) has caught much attention. However, the robustness of the CCS model against sparse outliers remains unclear in the existing studies. In this paper, we aim to answer this question by exploring a novel Robust CCS Completion pro…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 58th Annual Conference of Information Sciences and Systems

  11. arXiv:2310.19973  [pdf, other]

    stat.ML cs.CR cs.LG math.ST stat.ME

    Unified Enhancement of Privacy Bounds for Mixture Mechanisms via $f$-Differential Privacy

    Authors: Chendi Wang, Buxin Su, Jiayuan Ye, Reza Shokri, Weijie J. Su

    Abstract: Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy…

    Submitted 1 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

  12. arXiv:2308.06358  [pdf, other]

    cs.HC

    CA2: Cyber Attacks Analytics

    Authors: Luyu Cheng, Bairui Su, Yumeng Xue, Xiaoyu Liu, Yunhai Wang

    Abstract: The VAST Challenge 2020 Mini-Challenge 1 requires participants to identify the responsible white hat groups behind a fictional Internet outage. To address this task, we have created a visual analytics system named CA2: Cyber Attacks Analytics. This system is designed to efficiently compare and match subgraphs within an extensive graph containing anonymized profiles. Additionally, we showcase an it…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: IEEE Conference on Visual Analytics Science and Technology (VAST) Challenge Workshop 2020

  13. arXiv:2308.05648  [pdf, other]

    cs.CV

    Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

    Authors: Zezhong Lv, Bing Su, Ji-Rong Wen

    Abstract: Video moment localization aims to retrieve the target segment of an untrimmed video according to the natural language query. Weakly supervised methods gains attention recently, as the precise temporal location of the target segment is not always available. However, one of the greatest challenges encountered by the weakly supervised method is implied in the mismatch between the video and language i…

    Submitted 14 October, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  14. Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization

    Authors: Yujie Zhou, Wenwen Qiang, Anyi Rao, Ning Lin, Bing Su, Jiaqi Wang

    Abstract: Zero-shot skeleton-based action recognition aims to recognize actions of unseen categories after training on data of seen categories. The key is to build the connection between visual and semantic space from seen to unseen classes. Previous studies have primarily focused on encoding sequences into a singular feature vector, with subsequent mapping the features to an identical anchor point within t…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  15. arXiv:2308.03072  [pdf, other]

    cs.RO

    Customizing Textile and Tactile Skins for Interactive Industrial Robots

    Authors: Bo Ying Su, Zhongqi Wei, James McCann, Wenzhen Yuan, Changliu Liu

    Abstract: Tactile skins made from textiles enhance robot-human interaction by localizing contact points and measuring contact forces. This paper presents a solution for rapidly fabricating, calibrating, and deploying these skins on industrial robot arms. The novel automated skin calibration procedure maps skin locations to robot geometry and calibrates contact force. Through experiments on a FANUC LR Mate 2…

    Submitted 6 August, 2023; originally announced August 2023.

  16. arXiv:2308.01850  [pdf, other]

    cs.CV cs.AI

    Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling

    Authors: Zhao Yang, Bing Su, Ji-Rong Wen

    Abstract: Text-to-motion generation has gained increasing attention, but most existing methods are limited to generating short-term motions that correspond to a single sentence describing a single action. However, when a text stream describes a sequence of continuous motions, the generated motions corresponding to each sentence may not be coherently linked. Existing long-term motion generation methods face…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: Accepted at ACM MM 2023

  17. Spatio-Temporal Branching for Motion Prediction using Motion Increments

    Authors: Jiexin Wang, Yujie Zhou, Wenwen Qiang, Ying Ba, Bing Su, Ji-Rong Wen

    Abstract: Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications, but it remains a challenging task due to the stochastic and aperiodic nature of future poses. Traditional methods rely on hand-crafted features and machine learning techniques, which often struggle to model the complex dynamics of human motion. Recent deep learning-based methods have achieved suc…

    Submitted 11 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Journal ref: ACM MM 2023

  18. arXiv:2305.12618  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning

    Authors: Jiahao Chen, Yurou Liu, Jiangmeng Li, Bing Su, Jirong Wen

    Abstract: Molecular representation learning is a crucial task in predicting molecular properties. Molecules are often modeled as graphs where atoms and chemical bonds are represented as nodes and edges, respectively, and Graph Neural Networks (GNNs) have been commonly utilized to predict atom-related properties, such as reactivity and solubility. However, functional groups (subgraphs) are closely related to…

    Submitted 21 May, 2023; originally announced May 2023.

  19. Toward Auto-evaluation with Confidence-based Category Relation-aware Regression

    Authors: Jiexin Wang, Jiahao Chen, Bing Su

    Abstract: Auto-evaluation aims to automatically evaluate a trained model on any test dataset without human annotations. Most existing methods utilize global statistics of features extracted by the model as the representation of a dataset. This ignores the influence of the classification head and loses category-wise confusion information of the model. However, ratios of instances assigned to different catego…

    Submitted 9 May, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Journal ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5

  20. arXiv:2304.06537  [pdf, other]

    cs.CV

    Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution

    Authors: Jiahao Chen, Bing Su

    Abstract: How to estimate the uncertainty of a given model is a crucial problem. Current calibration techniques treat different classes equally and thus implicitly assume that the distribution of training data is balanced, but ignore the fact that real-world data often follows a long-tailed distribution. In this paper, we explore the problem of calibrating the model trained from a long-tailed distribution.…

    Submitted 13 April, 2023; originally announced April 2023.

  21. Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences

    Authors: Yujie Zhou, Haodong Duan, Anyi Rao, Bing Su, Jiaqi Wang

    Abstract: Self-supervised learning has demonstrated remarkable capability in representation learning for skeleton-based action recognition. Existing methods mainly focus on applying global data augmentation to generate different views of the skeleton sequence for contrastive learning. However, due to the rich action clues in the skeleton sequences, existing methods may only take a global perspective to lear…

    Submitted 22 February, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted by AAAI 2023 (Oral)

  22. arXiv:2302.05787  [pdf, other]

    stat.ML cs.CR cs.LG stat.AP

    Differentially Private Normalizing Flows for Density Estimation, Data Synthesis, and Variational Inference with Application to Electronic Health Records

    Authors: Bingyue Su, Yu Wang, Daniele E. Schiavazzi, Fang Liu

    Abstract: Electronic health records (EHR) often contain sensitive medical information about individual patients, posing significant limitations to sharing or releasing EHR data for downstream learning and inferential tasks. We use normalizing flows (NF), a family of deep generative models, to estimate the probability density of a dataset with differential privacy (DP) guarantees, from which privacy-preservi…

    Submitted 11 February, 2023; originally announced February 2023.

  23. arXiv:2210.15996  [pdf, other]

    cs.CV

    Towards Generalized Few-Shot Open-Set Object Detection

    Authors: Binyi Su, Hua Zhang, Jingzhi Li, Zhong Zhou

    Abstract: Open-set object detection (OSOD) aims to detect the known categories and reject unknown objects in a dynamic world, which has achieved significant attention. However, previous approaches only consider this problem in data-abundant conditions, while neglecting the few-shot scenes. In this paper, we seek a solution for the generalized few-shot open-set object detection (G-FOOD), which aims to avoid…

    Submitted 21 February, 2024; v1 submitted 28 October, 2022; originally announced October 2022.

  24. arXiv:2210.13446  [pdf, other]

    cs.RO

    Flying Trot Control Method for Quadruped Robot Based on Trajectory Planning

    Authors: Hongge Wang, Hui Chai, Bin Chen, Aizhen Xie, Rui Song, Bo Su

    Abstract: An intuitive control method for the flying trot, which combines offline trajectory planning with real-time balance control, is presented. The motion features of running animals in the vertical direction were analysed using the spring-load-inverted-pendulum (SLIP) model, and the foot trajectory of the robot was planned, so the robot could run similar to an animal capable of vertical flight, accordi…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 30 pages, 20 figures, journal

  25. arXiv:2209.07902  [pdf, other]

    cs.LG cs.CV

    MetaMask: Revisiting Dimensional Confounder for Self-Supervised Learning

    Authors: Jiangmeng Li, Wenwen Qiang, Yanan Zhang, Wenyi Mo, Changwen Zheng, Bing Su, Hui Xiong

    Abstract: As a successful approach to self-supervised learning, contrastive learning aims to learn invariant information shared among distortions of the input sample. While contrastive learning has yielded continuous advancements in sampling strategy and architecture design, it still remains two persistent defects: the interference of task-irrelevant information and sample inefficiency, which are related to…

    Submitted 9 August, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: Accepted by NeurIPS 2022 as Spotlight

  26. Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity

    Authors: Jiangmeng Li, Wenwen Qiang, Changwen Zheng, Bing Su, Farid Razzak, Ji-Rong Wen, Hui Xiong

    Abstract: While self-supervised learning techniques are often used to mining implicit knowledge from unlabeled data via modeling multiple views, it is unclear how to perform effective representation learning in a complex and inconsistent context. To this end, we propose a methodology, specifically consistency and complementarity network (CoCoNet), which avails of strict global inter-view consistency and loc…

    Submitted 9 August, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE) 2022; Refer to https://ieeexplore.ieee.org/document/9857632

  27. arXiv:2209.06419  [pdf, other]

    cs.IT eess.SP

    Frequency Reversal Alamouti Code-Based FBMC with Resilience to Inter-Antenna Frequency Offsets

    Authors: Cheng-Yu Lin, Borching Su, Kwonhue Choi

    Abstract: Transmit diversity schemes for filter bank multicarrier (FBMC) are known to be challenging. No existing schemes have considered the presence of inter-antenna frequency offset (IAFO), which will result in performance degradation. In this letter, a new transmit scheme based on the frequency reversal Alamouti code (FRAC)-based structure to address the issue of IAFO is proposed and is proven to inhere…

    Submitted 14 September, 2022; originally announced September 2022.

  28. arXiv:2209.05481  [pdf, other]

    cs.LG cs.AI

    A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language

    Authors: Bing Su, Dazhao Du, Zhao Yang, Yujie Zhou, Jiangmeng Li, Anyi Rao, Hao Sun, Zhiwu Lu, Ji-Rong Wen

    Abstract: Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire the single cognitive ability from the single molecular modality. Since the hierarchy of molecular knowledge is profound, even humans learn from different modalities including both intuitive diagrams and professional texts to assist their unders…

    Submitted 11 September, 2022; originally announced September 2022.

  29. arXiv:2207.02454  [pdf, other]

    cs.LG

    Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives

    Authors: Bin Su, Shaoguang Mao, Frank Soong, Zhiyong Wu

    Abstract: Ordinal regression with anchored reference samples (ORARS) has been proposed for predicting the subjective Mean Opinion Score (MOS) of input stimuli automatically. The ORARS addresses the MOS prediction problem by pairing a test sample with each of the pre-scored anchored reference samples. A trained binary classifier is then used to predict which sample, test or anchor, is better statistically. P…

    Submitted 6 July, 2022; originally announced July 2022.

  30. arXiv:2206.14702  [pdf, other]

    cs.CV

    Interventional Contrastive Learning with Meta Semantic Regularizer

    Authors: Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Bing Su, Hui Xiong

    Abstract: Contrastive learning (CL)-based self-supervised learning models learn visual representations in a pairwise manner. Although the prevailing CL model has achieved great progress, in this paper, we uncover an ever-overlooked phenomenon: When the CL model is trained with full images, the performance tested in full images is better than that in foreground areas; when the CL model is trained with foregr…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted by ICML 2022

  31. arXiv:2206.10207  [pdf, other]

    cs.CV

    SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders

    Authors: Gang Li, Heliang Zheng, Daqing Liu, Chaoyue Wang, Bing Su, Changwen Zheng

    Abstract: Recently, significant progress has been made in masked image modeling to catch up to masked language modeling. However, unlike words in NLP, the lack of semantic decomposition of images still makes masked autoencoding (MAE) different between vision and language. In this paper, we explore a potential visual analogue of words, i.e., semantic parts, and we integrate semantic information into the trai…

    Submitted 5 October, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS 2022

  32. arXiv:2205.14407  [pdf, ps, other]

    cs.DS

    An efficient polynomial-time approximation scheme for parallel multi-stage open shops

    Authors: Jianming Dong, Ruyan Jin, Guohui Lin, Bing Su, Weitian Tong, Yao Xu

    Abstract: Various new scheduling problems have been arising from practical production processes and spawning new research areas in the scheduling field. We study the parallel multi-stage open shops problem, which generalizes the classic open shop scheduling and parallel machine scheduling problems. Given m identical k-stage open shops and a set of n jobs, we aim to process all jobs on these open shops with…

    Submitted 28 May, 2022; originally announced May 2022.

  33. arXiv:2205.13425  [pdf, other]

    cs.CV

    Do we really need temporal convolutions in action segmentation?

    Authors: Dazhao Du, Bing Su, Yu Li, Zhongang Qi, Lingyu Si, Ying Shan

    Abstract: Action classification has made great progress, but segmenting and recognizing actions from long untrimmed videos remains a challenging problem. Most state-of-the-art methods focus on designing temporal convolution-based models, but the inflexibility of temporal convolutions and the difficulties in modeling long-term temporal dependencies restrict the potential of these models. Transformer-based mo…

    Submitted 22 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  34. arXiv:2205.11100  [pdf, other]

    cs.CV

    Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt

    Authors: Jiangmeng Li, Wenyi Mo, Wenwen Qiang, Bing Su, Changwen Zheng, Hui Xiong, Ji-Rong Wen

    Abstract: Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts. To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts, i.e., classification weights are synthesized from natural language describing task-relevant categories, to reduce the gap between tasks in the training and test phases. How…

    Submitted 23 March, 2024; v1 submitted 23 May, 2022; originally announced May 2022.

  35. arXiv:2205.09669  [pdf, other]

    cs.CR cs.LG

    Semi-WTC: A Practical Semi-supervised Framework for Attack Categorization through Weight-Task Consistency

    Authors: Zihan Li, Wentao Chen, Zhiqing Wei, Xingqi Luo, Bing Su

    Abstract: Supervised learning has been widely used for attack categorization, requiring high-quality data and labels. However, the data is often imbalanced and it is difficult to obtain sufficient annotations. Moreover, supervised models are subject to real-world deployment issues, such as defending against unseen artificial attacks. To tackle the challenges, we propose a semi-supervised fine-grained attack…

    Submitted 2 September, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Tech report

  36. arXiv:2203.05119  [pdf, other]

    cs.CV

    MetAug: Contrastive Learning via Meta Feature Augmentation

    Authors: Jiangmeng Li, Wenwen Qiang, Changwen Zheng, Bing Su, Hui Xiong

    Abstract: What matters for contrastive learning? We argue that contrastive learning heavily relies on informative features, or "hard" (positive or negative) features. Early works include more informative features by applying complex data augmentations and large batch size or memory bank, and recent works design elaborate sampling approaches to explore informative features. The key challenge toward exploring…

    Submitted 9 August, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted by ICML 2022

  37. arXiv:2203.04951  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method

    Authors: Alvin Shek, Bo Ying Su, Rui Chen, Changliu Liu

    Abstract: For robots to be effectively deployed in novel environments and tasks, they must be able to understand the feedback expressed by humans during intervention. This can either correct undesirable behavior or indicate additional preferences. Existing methods either require repeated episodes of interactions or assume prior known reward features, which is data-inefficient and can hardly transfer to new…

    Submitted 2 June, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted to ICRA 2023

  38. Robust Local Preserving and Global Aligning Network for Adversarial Domain Adaptation

    Authors: Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Bing Su, Hui Xiong

    Abstract: Unsupervised domain adaptation (UDA) requires source domain samples with clean ground truth labels during training. Accurately labeling a large number of source domain samples is time-consuming and laborious. An alternative is to utilize samples with noisy labels for training. However, training with noisy labels can greatly reduce the performance of UDA. In this paper, we address the problem that…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE) 2022; Refer to https://ieeexplore.ieee.org/document/9540279

  39. arXiv:2202.11356  [pdf, other]

    cs.LG stat.ML

    Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting

    Authors: Dazhao Du, Bing Su, Zhewei Wei

    Abstract: Transformer-based methods have shown great potential in long-term time series forecasting. However, most of these methods adopt the standard point-wise self-attention mechanism, which not only becomes intractable for long-term forecasting since its complexity increases quadratically with the length of time series, but also cannot explicitly capture the predictive dependencies from contexts since t…

    Submitted 23 February, 2022; originally announced February 2022.

  40. arXiv:2201.10471  [pdf, other]

    cs.LG

    GIU-GANs: Global Information Utilization for Generative Adversarial Networks

    Authors: Yongqi Tian, Xueyuan Gong, Jialin Tang, Binghua Su, Xiaoxiang Liu, Xinyuan Zhang

    Abstract: In recent years, with the rapid development of artificial intelligence, image generation based on deep learning has dramatically advanced. Image generation based on Generative Adversarial Networks (GANs) is a promising study. However, since convolutions are limited by spatial-agnostic and channel-specific, features extracted by traditional GANs based on convolution are constrained. Therefore, GANs…

    Submitted 15 March, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

  41. arXiv:2201.06125  [pdf, other]

    cs.CL cs.LG

    Temporal Relation Extraction with a Graph-Based Deep Biaffine Attention Model

    Authors: Bo-Ying Su, Shang-Ling Hsu, Kuan-Yin Lai, Amarnath Gupta

    Abstract: Temporal information extraction plays a critical role in natural language understanding. Previous systems have incorporated advanced neural language models and have successfully enhanced the accuracy of temporal information extraction tasks. However, these systems have two major shortcomings. First, they fail to make use of the two-sided nature of temporal relations in prediction. Second, they inv…

    Submitted 16 January, 2022; originally announced January 2022.

  42. arXiv:2112.00894  [pdf, other]

    cs.CL

    Context-Dependent Semantic Parsing for Temporal Relation Extraction

    Authors: Bo-Ying Su, Shang-Ling Hsu, Kuan-Yin Lai, Jane Yung-jen Hsu

    Abstract: Extracting temporal relations among events from unstructured text has extensive applications, such as temporal reasoning and question answering. While it is difficult, recent development of Neural-symbolic methods has shown promising results on solving similar tasks. Current temporal relation extraction methods usually suffer from limited expressivity and inconsistent relation inference. For examp…

    Submitted 1 December, 2021; originally announced December 2021.

  43. arXiv:2110.09924  [pdf, ps, other]

    eess.AS cs.SD

    Speech Enhancement Based on Cyclegan with Noise-informed Training

    Authors: Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su, Yu Tsao

    Abstract: Cycle-consistent generative adversarial networks (CycleGAN) were successfully applied to speech enhancement (SE) tasks with unpaired noisy-clean training data. The CycleGAN SE system adopted two generators and two discriminators trained with losses from noisy-to-clean and clean-to-noisy conversions. CycleGAN showed promising results for numerous SE tasks. Herein, we investigate a potential limitat…

    Submitted 6 December, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Journal ref: ISCSLP 2022

  44. arXiv:2109.02344   

    cs.CV cs.AI cs.LG

    Information Theory-Guided Heuristic Progressive Multi-View Coding

    Authors: Jiangmeng Li, Wenwen Qiang, Hang Gao, Bing Su, Farid Razzak, Jie Hu, Changwen Zheng, Hui Xiong

    Abstract: Multi-view representation learning captures comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning (CL) to learn representations, regarded as a pairwise manner, which is still scalable: view-specific noise is not filtered in learning view-shared representations; the fake negative pairs, where the negative terms are actually within the…

    Submitted 21 August, 2023; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: We have uploaded a new version of this paper as arXiv:2308.10522, so we have to withdraw this paper

  45. arXiv:2107.11943  [pdf, other]

    cs.CV

    Log-Polar Space Convolution for Convolutional Neural Networks

    Authors: Bing Su, Ji-Rong Wen

    Abstract: Convolutional neural networks use regular quadrilateral convolution kernels to extract features. Since the number of parameters increases quadratically with the size of the convolution kernel, many popular models use small convolution kernels, resulting in small local receptive fields in lower layers. This paper proposes a novel log-polar space convolution (LPSC) method, where the convolution kern…

    Submitted 25 July, 2021; originally announced July 2021.

  46. arXiv:2107.08892  [pdf, other]

    cs.CV

    Unsupervised Embedding Learning from Uncertainty Momentum Modeling

    Authors: Jiahuan Zhou, Yansong Tang, Bing Su, Ying Wu

    Abstract: Existing popular unsupervised embedding learning methods focus on enhancing the instance-level local discrimination of the given unlabeled images by exploring various negative data. However, existing sample outliers, which exhibit large intra-class divergences or small inter-class variations, severely limit their learning performance. We justify that the performance limitation is caused by the gr…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: 14 pages, in submission

  47. An Attribute-Aligned Strategy for Learning Speech Representation

    Authors: Yu-Lin Huang, Bo-Hao Su, Y. -W. Peter Hong, Chi-Chun Lee

    Abstract: Advancement in speech technology has brought convenience to our life. However, the concern is on the rise as speech signal contains multiple personal attributes, which would lead to either sensitive information leakage or bias toward decision. In this work, we propose an attribute-aligned learning strategy to derive speech representation that can flexibly address these issues by attribute-selectio…

    Submitted 8 September, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

    Comments: 5 pages, 2 figures; Accepted in Interspeech 2021

    Journal ref: Proceedings of INTERSPEECH 2021

  48. arXiv:2104.04953  [pdf, other]

    cs.CV eess.IV

    SIGAN: A Novel Image Generation Method for Solar Cell Defect Segmentation and Augmentation

    Authors: Binyi Su, Zhong Zhou, Haiyong Chen, Xiaochun Cao

    Abstract: Solar cell electroluminescence (EL) defect segmentation is an interesting and challenging topic. Many methods have been proposed for EL defect detection, but these methods are still unsatisfactory due to the diversity of the defect and background. In this paper, we provide a new idea of using generative adversarial network (GAN) for defect segmentation. Firstly, the GAN-based method removes the de…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: 11 pages, 11 figures

  49. Downlink SCMA Codebook Design with Low Error Rate by Maximizing Minimum Euclidean Distance of Superimposed Codewords

    Authors: Chinwei Huang, Borching Su, Tingyi Lin, Yenming Huang

    Abstract: Sparse code multiple access (SCMA), as a codebook-based non-orthogonal multiple access (NOMA) technique, has received research attention in recent years. The codebook design problem for SCMA has also been studied to some extent since codebook choices are highly related to the system's error rate performance. In this paper, we approach the SCMA codebook design problem by formulating an optimization…

    Submitted 1 May, 2022; v1 submitted 9 January, 2021; originally announced January 2021.

    Comments: 15 pages, 12 figures. This version is accepted to IEEE Transactions on Vehicular Technology, and the copyright is transferred to IEEE

  50. arXiv:2012.11174  [pdf, other]

    eess.AS cs.AI

    Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

    Authors: Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng

    Abstract: By using deep learning approaches, Speech Emotion Recognition (SER) on a single domain has achieved many excellent results. However, cross-domain SER is still a challenging task due to the distribution shift between source and target domains. In this work, we propose a Domain Adversarial Neural Network (DANN) based approach to mitigate this distribution shift problem for cross-lingual SER. Specifica…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: This paper has been accepted by ISCSLP 2021

    ACM Class: I.2