Search | arXiv e-print repository

Aligning Human Motion Generation with Human Perceptions

Authors: Haoru Wang, Wentao Zhu, Luyi Miao, Yishu Xu, Feng Gao, Qi Tian, Yizhou Wang

Abstract: Human motion generation is a critical task with a wide range of applications. Achieving high realism in generated motions requires naturalness, smoothness, and plausibility. Despite rapid advancements in the field, current generation methods often fall short of these goals. Furthermore, existing evaluation metrics typically rely on ground-truth-based errors, simple heuristics, or distribution distances, which do not align well with human perceptions of motion quality. In this work, we propose a data-driven approach to bridge this gap by introducing a large-scale human perceptual evaluation dataset, MotionPercept, and a human motion critic model, MotionCritic, that capture human perceptual preferences. Our critic model offers a more accurate metric for assessing motion quality and could be readily integrated into the motion generation pipeline to enhance generation quality. Extensive experiments demonstrate the effectiveness of our approach in both evaluating and improving the quality of generated human motions by aligning with human perceptions. Code and data are publicly available at https://motioncritic.github.io/.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Project page: https://motioncritic.github.io/

arXiv:2406.00376 [pdf, other]

Approaching 100% Confidence in Stream Summary through ReliableSketch

Authors: Yuhan Wu, Hanbo Wu, Xilai Liu, Yikai Zhao, Tong Yang, Kaicheng Yang, Sha Wang, Lihua Miao, Gaogang Xie

Abstract: To approximate sums of values in key-value data streams, sketches are widely used in databases and networking systems. They offer high-confidence approximations for any given key while ensuring low time and space overhead. While existing sketches are proficient in estimating individual keys, they struggle to maintain this high confidence across all keys collectively, an objective that is critically important in both algorithm theory and its practical applications. We propose ReliableSketch, the first to control the error of all keys to less than $\Lambda$ with a small failure probability $\Delta$, requiring only $O(1 + \Delta \ln\ln(\frac{N}{\Lambda}))$ amortized time and $O(\frac{N}{\Lambda} + \ln(\frac{1}{\Delta}))$ space. Furthermore, its simplicity makes it hardware-friendly, and we implement it on CPU servers, FPGAs, and programmable switches. Our experiments show that under the same small space, ReliableSketch not only keeps all keys' errors below $\Lambda$ but also achieves near-optimal throughput, outperforming competitors with thousands of uncontrolled estimations. We have made our source code publicly available.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2402.16267 [pdf, other]

doi 10.48550/arXiv.2402.16267

Infrared and visible Image Fusion with Language-driven Loss in CLIP Embedding Space

Authors: Yuhao Wang, Lingjuan Miao, Zhiqiang Zhou, Lei Zhang, Yajun Qiao

Abstract: Infrared-visible image fusion (IVIF) has attracted much attention owing to the highly complementary properties of the two image modalities. Due to the lack of ground-truth fused images, the fusion output of current deep-learning-based methods depends heavily on mathematically defined loss functions. Because it is hard to define the fused image mathematically without ground truth, the performance of existing fusion methods is limited. In this paper, we first propose to use natural language to express the objective of IVIF, which avoids the explicit mathematical modeling of the fusion output in current losses and makes full use of the advantage of language expression to improve the fusion performance. For this purpose, we present a comprehensive language-expressed fusion objective, and encode relevant texts into the multi-modal embedding space using CLIP. A language-driven fusion model is then constructed in the embedding space, by establishing the relationship among the embedded vectors to represent the fusion objective and input image modalities. Finally, a language-driven loss is derived to make the actual IVIF aligned with the embedded language-driven fusion model via supervised training. Experiments show that our method can obtain much better fusion results than existing techniques.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.06951 [pdf, other]

Semantic Object-level Modeling for Robust Visual Camera Relocalization

Authors: Yifan Zhu, Lingjuan Miao, Haitao Wu, Zhiqiang Zhou, Weiyi Chen, Longwen Wu

Abstract: Visual relocalization is crucial for autonomous visual localization and navigation of mobile robotics. Due to the improvement of CNN-based object detection algorithms, the robustness of visual relocalization is greatly enhanced, especially in viewpoints where classical methods fail. However, ellipsoids (quadrics) generated by axis-aligned object detection may limit the accuracy of the object-level representation and degrade the performance of the visual relocalization system. In this paper, we propose a novel method of automatic object-level voxel modeling for accurate ellipsoidal representations of objects. As for visual relocalization, we design a better pose optimization strategy for camera pose recovery, to fully utilize the projection characteristics of 2D fitted ellipses and the 3D accurate ellipsoids. All of these modules are entirely integrated into a visual SLAM system. Experimental results show that our semantic object-level mapping and object-based visual relocalization methods significantly enhance the performance of visual relocalization in terms of robustness to new viewpoints.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2312.06868 [pdf, other]

RAFIC: Retrieval-Augmented Few-shot Image Classification

Authors: Hangfei Lin, Li Miao, Amir Ziai

Abstract: Few-shot image classification is the task of classifying unseen images to one of N mutually exclusive classes, using only a small number of training examples for each class. The limited availability of these examples (denoted as K) presents a significant challenge to classification accuracy in some cases. To address this, we have developed a method for augmenting the set of K with an additional set of A retrieved images. We call this system Retrieval-Augmented Few-shot Image Classification (RAFIC). Through a series of experiments, we demonstrate that RAFIC markedly improves performance of few-shot image classification across two challenging datasets. RAFIC consists of two main components: (a) a retrieval component which uses CLIP, LAION-5B, and faiss, in order to efficiently retrieve images similar to the supplied images, and (b) retrieval meta-learning, which learns to judiciously utilize the retrieved images. Code and data are available at github.com/amirziai/rafic.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2308.08630 [pdf]

Cooperation and interdependence in global science funding

Authors: Lili Miao, Vincent Larivière, Feifei Wang, Yong-Yeol Ahn, Cassidy R. Sugimoto

Abstract: Investments in research and development are key to scientific and economic growth and to the well-being of society. Scientific research demands significant resources, making national scientific investment a crucial driver of scientific production. As scientific production becomes increasingly multinational, it is critical to study how nations' scientific activities are funded both domestically and internationally. By tracing research grants acknowledged in scholarly publications, our study reveals a shifting duopoly of China and the United States in the global funding landscape, with a contrasting funding pattern; while China has surpassed the United States in publications with acknowledged domestic and international funding, the United States largely maintains its role as the most important global research partner. Our results also highlight the precarity of low- and middle-income countries to global funding disruptions. By revealing the complex interdependence and collaboration between countries in the global scientific enterprise, this work informs future studies investigating the national and global scientific enterprise and how funding leads to both productive cooperation and vulnerable dependencies.

Submitted 3 February, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: 48 pages, 13 figures

arXiv:2210.04976 [pdf, other]

doi 10.52953/ANSC4385

Optimal wireless rate and power control in the presence of jammers using reinforcement learning

Authors: Fadlullah Raji, Lei Miao

Abstract: Future wireless networks require high throughput and energy efficiency. This paper studies using Reinforcement Learning (RL) for transmission rate and power control to maximize a joint reward function consisting of both throughput and energy consumption. We design the system state to include factors that reflect packet queue length, interference from other nodes, quality of the wireless channel, battery status, etc. The reward function is normalized and does not involve unit conversion. It can be used to train three different types of agents: throughput-critical, energy-critical, and throughput and energy balanced. Using the NS-3 network simulation software, we implement and train these agents in an 802.11ac network in the presence of a jammer. We then test the agents with two jamming nodes interfering with the packets received at the receiver. We compare the performance of our RL optimal policies with the popular Minstrel rate adaptation algorithm: our approach can achieve (i) higher throughput when using the throughput-critical reward function; (ii) lower energy consumption when using the energy-critical reward function; and (iii) higher throughput and slightly higher energy when using the throughput and energy balanced reward function. Although our discussion is focused on 802.11ac networks, our method is readily applicable to other types of wireless networks.

Submitted 10 October, 2022; originally announced October 2022.

Comments: Published in International Telecommunication Union, Journal on Future and Evolving Technologies (ITU J-FET) 2022

Journal ref: ITU Journal on Future and Evolving Technologies. Volume 3, Issue 2, Pages 508-522 (2022)

arXiv:2209.04041 [pdf, other]

Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages

Authors: Li Miao, Jian Wu, Piyush Behre, Shuangyu Chang, Sarangarajan Parthasarathy

Abstract: It is challenging to train and deploy Transformer LMs for hybrid speech recognition 2nd pass re-ranking in low-resource languages due to (1) data scarcity in low-resource languages, (2) expensive computing costs for training and refreshing 100+ monolingual models, and (3) hosting inefficiency considering sparse traffic. In this study, we present a new way to group multiple low-resource locales together and optimize the performance of Multilingual Transformer LMs in ASR. Our Locale-group Multilingual Transformer LMs outperform traditional multilingual LMs along with reducing maintenance costs and operating expenses. Further, for low-resource but high-traffic locales where deploying monolingual models is feasible, we show that fine-tuning our locale-group multilingual LMs produces better monolingual LM candidates than baseline monolingual LMs.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2104.10812 [pdf]

doi 10.1038/s41562-022-01367-x

The latent structure of global scientific development

Authors: Lili Miao, Dakota Murray, Woo-Sung Jung, Vincent Larivière, Cassidy R. Sugimoto, Yong-Yeol Ahn

Abstract: Science is essential to innovation and economic prosperity. Although studies have shown that national scientific development is affected by geographic, historic, and economic factors, it remains unclear whether there are universal structures and trajectories of national scientific development that can inform forecasting and policymaking. Here, by examining countries' scientific 'exports' (publications that are indexed in international databases), we reveal a three-cluster structure in the relatedness network of disciplines that underpin national scientific development and the organization of global science. Tracing the evolution of national research portfolios reveals that while nations are proceeding to more diverse research profiles individually, scientific production has become increasingly specialized in global science over the past decades. By uncovering the underlying structure of scientific development and connecting it with economic development, our results may offer a new perspective on the evolution of global science.

Submitted 30 March, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

Comments: 30 pages (main text), 5 figures (main text), 3 tables (main text)

arXiv:2104.02570 [pdf, other]

Learning from Noisy Labels via Dynamic Loss Thresholding

Authors: Hao Yang, Youzhi **, Ziyin Li, Deng-Bao Wang, Lei Miao, Xin Geng, Min-Ling Zhang

Abstract: Numerous studies have shown that deep neural networks (DNNs) can eventually fit everything, even data with noisy labels, which results in poor generalization performance. However, recent studies suggest that DNNs tend to gradually memorize the data, moving from correct data to mislabeled data. Inspired by this finding, we propose a novel method named Dynamic Loss Thresholding (DLT). During the training process, DLT records the loss value of each sample and calculates dynamic loss thresholds. Specifically, DLT compares the loss value of each sample with the current loss threshold. Samples with smaller losses can be considered as clean samples with higher probability and vice versa. Then, DLT discards the potentially corrupted labels and further leverages supervised learning techniques. Experiments on CIFAR-10/100 and Clothing1M demonstrate substantial improvements over recent state-of-the-art methods. In addition, we investigate two real-world problems for the first time. Firstly, we propose a novel approach to estimate the noise rates of datasets based on the loss difference between the early and late training stages of DNNs. Secondly, we explore the effect of hard samples (which are difficult to distinguish) on the process of learning from noisy labels.

Submitted 1 April, 2021; originally announced April 2021.
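
The loss-comparison step described in the abstract above can be illustrated with a small sketch. The abstract does not say how the dynamic threshold is computed, so the percentile rule, the `keep_fraction` parameter, and the function names below are all hypothetical; this is not the authors' implementation.

```python
def dynamic_threshold(losses, keep_fraction):
    """Assumed rule: the threshold is the loss value below which roughly
    keep_fraction of the recorded per-sample losses fall."""
    ordered = sorted(losses)
    # Number of samples to keep, clamped so at least one sample is kept.
    k = max(1, min(len(ordered), round(keep_fraction * len(ordered))))
    return ordered[k - 1]


def split_by_loss(losses, keep_fraction):
    """Compare each recorded loss with the current threshold.

    Returns (clean_indices, suspect_indices, threshold). Samples at or
    below the threshold are treated as likely clean; the rest are treated
    as potentially mislabeled.
    """
    threshold = dynamic_threshold(losses, keep_fraction)
    clean = [i for i, loss in enumerate(losses) if loss <= threshold]
    suspect = [i for i, loss in enumerate(losses) if loss > threshold]
    return clean, suspect, threshold


# Per-sample losses recorded during one (hypothetical) training epoch.
epoch_losses = [0.1, 2.5, 0.3, 0.2, 3.0]
clean, suspect, threshold = split_by_loss(epoch_losses, keep_fraction=0.6)
```

Calling `split_by_loss` again with the next epoch's losses yields a new threshold, which is the sense in which the threshold is dynamic in this sketch.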

arXiv:2103.11636 [pdf, other]

doi 10.1109/LGRS.2021.3115110

Optimization for Arbitrary-Oriented Object Detection via Representation Invariance Loss

Authors: Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Xue Yang, Yunpeng Dong

Abstract: Arbitrary-oriented objects exist widely in natural scenes, and thus oriented object detection has received extensive attention in recent years. The mainstream rotation detectors use oriented bounding boxes (OBB) or quadrilateral bounding boxes (QBB) to represent the rotating objects. However, these methods suffer from the representation ambiguity for oriented object definition, which leads to suboptimal regression optimization and the inconsistency between the loss metric and the localization accuracy of the predictions. In this paper, we propose a Representation Invariance Loss (RIL) to optimize the bounding box regression for the rotating objects. Specifically, RIL treats multiple representations of an oriented object as multiple equivalent local minima, and hence transforms bounding box regression into an adaptive matching process with these local minima. Then, the Hungarian matching algorithm is adopted to obtain the optimal regression strategy. We also propose a normalized rotation loss to alleviate the weak correlation between different variables and their unbalanced loss contribution in OBB representation. Extensive experiments on remote sensing datasets and scene text datasets show that our method achieves consistent and substantial improvement. The source code and trained models are available at https://github.com/ming71/RIDet.

Submitted 6 October, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: Accepted by IEEE Geoscience and Remote Sensing Letters. The code is available at https://github.com/ming71/RIDet

arXiv:2101.06849 [pdf, other]

doi 10.1109/TGRS.2021.3095186

CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

Authors: Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Yunpeng Dong

Abstract: Object detection in optical remote sensing images is an important and challenging task. In recent years, the methods based on convolutional neural networks have made good progress. However, due to the large variation in object scale, aspect ratio, and arbitrary orientation, detection performance is difficult to improve further. In this paper, we discuss the role of discriminative features in object detection, and then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy from three aspects: building powerful feature representation, refining preset anchors, and optimizing label assignment. Specifically, we first decouple the classification and regression features, and then construct robust critical features adapted to the respective tasks through the Polarization Attention Module (PAM). With the extracted discriminative regression features, the Rotation Anchor Refinement Module (R-ARM) performs localization refinement on preset horizontal anchors to obtain superior rotation anchors. Next, the Dynamic Anchor Learning (DAL) strategy is given to adaptively select high-quality anchors based on their ability to capture critical features. The proposed framework creates more powerful semantic representations for objects in remote sensing images and achieves high-performance real-time object detection. Experimental results on three remote sensing datasets including HRSC2016, DOTA, and UCAS-AOD show that our method achieves superior detection performance compared with many state-of-the-art approaches. Code and models are available at https://github.com/ming71/CFC-Net.

Submitted 16 August, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

Comments: Accepted to IEEE Transactions on Geoscience and Remote Sensing

arXiv:2012.04150 [pdf, other]

Dynamic Anchor Learning for Arbitrary-Oriented Object Detection

Authors: Qi Ming, Zhiqiang Zhou, Lingjuan Miao, Hongwei Zhang, Linhao Li

Abstract: Arbitrary-oriented objects widely appear in natural scenes, aerial photographs, remote sensing images, etc., thus arbitrary-oriented object detection has received considerable attention. Many current rotation detectors use plenty of anchors with different orientations to achieve spatial alignment with ground truth boxes, then Intersection-over-Union (IoU) is applied to sample the positive and negative candidates for training. However, we observe that the selected positive anchors cannot always ensure accurate detections after regression, while some negative samples can achieve accurate localization. This indicates that the quality assessment of anchors through IoU is not appropriate, and this further leads to inconsistency between classification confidence and localization accuracy. In this paper, we propose a dynamic anchor learning (DAL) method, which utilizes the newly defined matching degree to comprehensively evaluate the localization potential of the anchors and carry out a more efficient label assignment process. In this way, the detector can dynamically select high-quality anchors to achieve accurate object detection, and the divergence between classification and regression will be alleviated. With the newly introduced DAL, we achieve superior detection performance for arbitrary-oriented objects with only a few horizontal preset anchors. Experimental results on three remote sensing datasets HRSC2016, DOTA, UCAS-AOD as well as a scene text dataset ICDAR 2015 show that our method achieves substantial improvement compared with the baseline model. Besides, our approach is also universal for object detection using horizontal bounding boxes. The code and models are available at https://github.com/ming71/DAL.

Submitted 15 December, 2020; v1 submitted 7 December, 2020; originally announced December 2020.

Comments: Accepted to AAAI 2021. The code and models are available at https://github.com/ming71/DAL

arXiv:2012.00796 [pdf, ps, other]

Wireless Secret Sharing Game between Two Legitimate Users and an Eavesdropper

Authors: Lei Miao, Dingde Jiang

Abstract: Wireless secret sharing is crucial to information security in the era of the Internet of Things. One method is to utilize the effect of the randomness of the wireless channel in the data link layer to generate the common secret between two legitimate users Alice and Bob. This paper studies this secret sharing mechanism from the perspective of game theory. In particular, we formulate a non-cooperative zero-sum game between the legitimate users and an eavesdropper Eve. In a symmetrical game where Eve has the same probability of successfully receiving a packet from Alice and Bob when the transmission distance is the same, we show that both pure and mixed strategy Nash equilibria exist. In an asymmetric game where Eve has different probabilities of successfully receiving a packet from Alice and Bob, a pure strategy Nash equilibrium may not exist; in this case, we show how a mixed strategy Nash equilibrium can be found.

Submitted 26 January, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

arXiv:2004.07124 [pdf, other]

doi 10.1109/TGRS.2020.2995477

A Novel CNN-based Method for Accurate Ship Detection in HR Optical Remote Sensing Images via Rotated Bounding Box

Authors: Linhao Li, Zhiqiang Zhou, Bo Wang, Lingjuan Miao, Hua Zong

Abstract: Currently, reliable and accurate ship detection in optical remote sensing images is still challenging. Even the state-of-the-art convolutional neural network (CNN) based methods cannot obtain very satisfactory results. To more accurately locate the ships in diverse orientations, some recent methods conduct the detection via the rotated bounding box. However, it further increases the difficulty of detection, because an additional variable of ship orientation must be accurately predicted in the algorithm. In this paper, a novel CNN-based ship detection method is proposed, by overcoming some common deficiencies of current CNN-based methods in ship detection. Specifically, to generate rotated region proposals, current methods have to predefine multi-oriented anchors, and predict all unknown variables together in one regression process, limiting the quality of overall prediction. By contrast, we are able to predict the orientation and other variables independently, and yet more effectively, with a novel dual-branch regression network, based on the observation that the ship targets are nearly rotation-invariant in remote sensing images. Next, a shape-adaptive pooling method is proposed, to overcome the limitation of typical regular ROI-pooling in extracting the features of the ships with various aspect ratios. Furthermore, we propose to incorporate multilevel features via the spatially-variant adaptive pooling. This novel approach, called multilevel adaptive pooling, leads to a compact feature representation more qualified for the simultaneous ship classification and localization. Finally, a detailed ablation study of the proposed approaches is provided, along with some useful insights. Experimental results demonstrate the great superiority of the proposed method in ship detection.

Submitted 7 May, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

Journal ref: IEEE Transactions on Geoscience and Remote Sensing, 2020, 59(1): 686-699

arXiv:2001.05621 [pdf, other]

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Authors: Yuan Liang, Hsuan-Wei Fan, Zhujun Fang, Leiying Miao, Wen Li, Xuan Zhang, Weibin Sun, Kun Wang, Lei He, Xiang Anthony Chen

Abstract: Due to a lack of medical resources or oral health awareness, oral diseases are often left unexamined and untreated, affecting a large population worldwide. With the advent of low-cost, sensor-equipped smartphones, mobile apps offer a promising possibility for promoting oral health. However, to the best of our knowledge, no mobile health (mHealth) solutions can directly support a user to self-examine their oral health condition. This paper presents OralCam, the first interactive app that enables end-users' self-examination of five common oral conditions (diseases or early disease signals) by taking smartphone photos of one's oral cavity. OralCam allows a user to annotate additional information (e.g. living habits, pain, and bleeding) to augment the input image, and presents the output hierarchically, probabilistically and with visual explanations to help a lay user understand examination results. Developed on our in-house dataset that consists of 3,182 oral photos annotated by dental experts, our deep learning based framework achieved an average detection sensitivity of 0.787 over five conditions with high localization accuracy. In a week-long in-the-wild user study (N=18), most participants had no trouble using OralCam and interpreting the examination results. Two expert interviews further validate the feasibility of OralCam for promoting users' awareness of oral health.

Submitted 22 January, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: 13 pages, CHI2020 accepted

arXiv:1906.05346

Optimal low rank tensor recovery

Authors: Jian-Feng Cai, Lizhang Miao, Yang Wang, Yin Xian

Abstract: We investigate the sample size requirement for exact recovery of a high order tensor of low rank from a subset of its entries. In the Tucker decomposition framework, we show that the Riemannian optimization algorithm with initial value obtained from a spectral method can reconstruct a tensor of size $n\times n \times\cdots \times n$ and ranks $(r,\cdots,r)$ with high probability from as few as $O((r^d+dnr)\log(d))$ entries. In the case of an order-3 tensor, the entries can be asymptotically as few as $O(nr)$ for a low rank large tensor. We show the theoretical guarantee condition for the recovery. The analysis relies on the tensor restricted isometry property (tensor RIP) and the curvature of the low rank tensor manifold. Our algorithm is computationally efficient and easy to implement. Numerical results verify that the algorithm is able to recover a low rank tensor from a minimal number of measurements. The experiments on hyperspectral image recovery also show that our algorithm is capable of handling real-world signal processing problems.

Submitted 11 November, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: There is an error in the paper that needs to be corrected

arXiv:1408.5240 [pdf, other]

doi 10.1016/j.physa.2014.10.021

Whether Information Network Supplements Friendship Network

Authors: Lili Miao, Qian-Ming Zhang, Da-Chen Nie, Shi-Min Cai

Abstract: Homophily is a significant mechanism for link prediction in complex networks; its principle is that people with similar profiles or experiences tend to tie with each other. In a multi-relationship network, friendship among people has been utilized to reinforce similarity of taste for recommendation systems, whose basic idea is similar to homophily, yet how taste inversely affects friendship prediction is little discussed. This paper addresses the issue by analyzing two benchmark datasets, both of which include users' behavioral information on taste and friendship, based on the principle of homophily. It is found that the creation of friendship is tightly associated with personal taste. In particular, the behavioral information of taste involving popular objects is much more effective at improving the performance of friendship prediction. However, this result seems to contradict the finding in [Q.M. Zhang, et al., PLoS ONE 8(2013)e62624] that the behavioral information of taste involving popular objects is redundant in recommendation systems. We thus discuss this inconsistency to comprehensively understand the correlation between them.

Submitted 22 August, 2014; originally announced August 2014.

Comments: 8 pages, 5 figures

Journal ref: Physica A 419, 301 (2015)

arXiv:1408.0845 [pdf, ps, other]

doi 10.1038/srep12261

Predicting missing links and their weights via reliable-route-based method

Authors: Jing Zhao, Lili Miao, Haiyang Fang, Qian-Ming Zhang, Min Nie, Tao Zhou

Abstract: Link prediction aims to uncover missing links or predict the emergence of future relationships according to the current network structure. Plenty of algorithms have been developed for link prediction in unweighted networks, with only a very few of them having been extended to weighted networks. Thus far, how to predict weights of links is important but rarely studied. In this Letter, we present a reliable-route-based method to extend unweighted local similarity indices to weighted indices and propose a method to predict both the link existence and link weights accordingly. Experiments on different real networks suggest that the weighted resource allocation index has the best performance to predict the existence of links, while the reliable-route-based weighted resource allocation index performs noticeably better on weight prediction. Further analysis shows a strong correlation for both link prediction and weight prediction: the larger the clustering coefficient, the higher the prediction accuracy.

Submitted 4 August, 2014; originally announced August 2014.

Comments: 5 pages, 4 tables

Journal ref: Scientific Reports 5 (2015) 12261

Showing 1–19 of 19 results for author: Miao, L