-
Recourse for reclamation: Chatting with generative language models
Authors:
Jennifer Chien,
Kevin R. McKee,
Jackie Kay,
William Isaac
Abstract:
Researchers and developers increasingly rely on toxicity scoring to moderate generative language model outputs, in settings such as customer service, information retrieval, and content generation. However, toxicity scoring may render pertinent information inaccessible, rigidify or "value-lock" cultural norms, and prevent language reclamation processes, particularly for marginalized people. In this work, we extend the concept of algorithmic recourse to generative language models: we provide users a novel mechanism to achieve their desired prediction by dynamically setting thresholds for toxicity filtering. Users thereby exercise increased agency relative to interactions with the baseline system. A pilot study ($n = 30$) supports the potential of our proposed recourse mechanism, indicating improvements in usability compared to fixed-threshold toxicity-filtering of model outputs. Future work should explore the intersection of toxicity scoring, model controllability, user agency, and language reclamation processes -- particularly with regard to the bias that many communities encounter when interacting with generative language models.
Submitted 21 April, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
Beyond Behaviorist Representational Harms: A Plan for Measurement and Mitigation
Authors:
Jennifer Chien,
David Danks
Abstract:
Algorithmic harms are commonly categorized as either allocative or representational. This study specifically addresses the latter, focusing on an examination of current definitions of representational harms to discern what is included and what is not. This analysis motivates our expansion beyond behavioral definitions to encompass harms to cognitive and affective states. The paper outlines high-level requirements for measurement: identifying the necessary expertise to implement this approach and illustrating it through a case study. Our work highlights the unique vulnerabilities of large language models to perpetrating representational harms, particularly when these harms go unmeasured and unmitigated. The work concludes by presenting proposed mitigations and delineating when to employ them. The overarching aim of this research is to establish a framework for broadening the definition of representational harms and to translate insights from fairness research into practical measurement and mitigation praxis.
Submitted 6 May, 2024; v1 submitted 24 January, 2024;
originally announced February 2024.
-
Generative Expressive Robot Behaviors using Large Language Models
Authors:
Karthik Mahadevan,
Jonathan Chien,
Noah Brown,
Zhuo Xu,
Carolina Parada,
Fei Xia,
Andy Zeng,
Leila Takayama,
Dorsa Sadigh
Abstract:
People employ expressive behaviors to effectively communicate and coordinate their actions with others, such as nodding to acknowledge a person glancing at them or saying "excuse me" to pass people in a busy corridor. We would like robots to also demonstrate expressive behaviors in human-robot interaction. Prior work proposes rule-based methods that struggle to scale to new communication modalities or social situations, while data-driven methods require specialized datasets for each social situation the robot is used in. We propose to leverage the rich social context available from large language models (LLMs) and their ability to generate motion based on instructions or user preferences, to generate expressive robot motion that is adaptable and composable, building upon each other. Our approach utilizes few-shot chain-of-thought prompting to translate human language instructions into parametrized control code using the robot's available and learned skills. Through user studies and simulation experiments, we demonstrate that our approach produces behaviors that users found to be competent and easy to understand. Supplementary material can be found at https://generative-expressive-motion.github.io/.
Submitted 30 January, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
The illusion of artificial inclusion
Authors:
William Agnew,
A. Stevie Bergman,
Jennifer Chien,
Mark Díaz,
Seliem El-Sayed,
Jaylen Pittman,
Shakir Mohamed,
Kevin R. McKee
Abstract:
Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest to the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and against substituting human participants with modern generative AI. Our scoping review indicates that the recent wave of these proposals is motivated by goals such as reducing the costs of research and development work and increasing the diversity of collected data. However, these proposals ignore and ultimately conflict with foundational values of work with human participants: representation, inclusion, and understanding. This paper critically examines the principles and goals underlying human participation to help chart out paths for future work that truly centers and empowers participants.
Submitted 5 February, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Attention-Guided Adaptation for Code-Switching Speech Recognition
Authors:
Bobbi Aditya,
Mahdin Rohmatillah,
Liang-Hsuan Tai,
Jen-Tzung Chien
Abstract:
The prevalence of powerful multilingual models, such as Whisper, has significantly advanced research on speech recognition. However, these models often struggle with the code-switching setting, which is essential in multilingual speech recognition. Recent studies have attempted to address this setting by separating the modules for different languages to ensure distinct latent representations for languages. Other methods consider a switching mechanism based on language identification. In this study, a new attention-guided adaptation is proposed to conduct parameter-efficient learning for bilingual ASR. This method selects the attention heads in a model that closely express language identities and then guides those heads to attend correctly to their corresponding languages. Experiments on the Mandarin-English code-switching speech corpus show that the proposed approach achieves a 14.2% mixed error rate, surpassing the state-of-the-art method, while training only 5.6% additional parameters over Whisper.
Submitted 12 January, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Contrastive Speaker Embedding With Sequential Disentanglement
Authors:
Youzhi Tu,
Man-Wai Mak,
Jen-Tzung Chien
Abstract:
Contrastive speaker embedding assumes that the contrast between the positive and negative pairs of speech segments is attributed to speaker identity only. However, this assumption is incorrect because speech signals contain not only speaker identity but also linguistic content. In this paper, we propose a contrastive learning framework with sequential disentanglement to remove linguistic content by incorporating a disentangled sequential variational autoencoder (DSVAE) into the conventional SimCLR framework. The DSVAE aims to disentangle speaker factors from content factors in an embedding space so that only the speaker factors are used for constructing a contrastive loss objective. Because content factors have been removed from the contrastive learning, the resulting speaker embeddings will be content-invariant. Experimental results on VoxCeleb1-test show that the proposed method consistently outperforms SimCLR. This suggests that applying sequential disentanglement is beneficial to learning speaker-discriminative embeddings.
Submitted 23 September, 2023;
originally announced September 2023.
-
Fairness Vs. Personalization: Towards Equity in Epistemic Utility
Authors:
Jennifer Chien,
David Danks
Abstract:
The applications of personalized recommender systems are rapidly expanding: encompassing social media, online shopping, search engine results, and more. These systems offer a more efficient way to navigate the vast array of items available. However, alongside this growth, there has been increased recognition of the potential for algorithmic systems to exhibit and perpetuate biases, risking unfairness in personalized domains. In this work, we explicate the inherent tension between personalization and conventional implementations of fairness. As an alternative, we propose equity to achieve fairness in the context of epistemic utility. We provide a mapping between goals and practical implementations and detail policy recommendations across key stakeholders to forge a path towards achieving fairness in personalized systems.
Submitted 5 September, 2023;
originally announced September 2023.
-
Asymmetric Clean Segments-Guided Self-Supervised Learning for Robust Speaker Verification
Authors:
Chong-Xin Gan,
Man-Wai Mak,
Weiwei Lin,
Jen-Tzung Chien
Abstract:
Contrastive self-supervised learning (CSL) for speaker verification (SV) has drawn increasing interest recently due to its ability to exploit unlabeled data. Performing data augmentation on raw waveforms, such as adding noise or reverberation, plays a pivotal role in achieving promising results in SV. Data augmentation, however, demands meticulous calibration to ensure intact speaker-specific information, which is difficult to achieve without speaker labels. To address this issue, we introduce a novel framework by incorporating clean and augmented segments into the contrastive training pipeline. The clean segments are repurposed to pair with noisy segments to form additional positive and negative pairs. Moreover, the contrastive loss is weighted to increase the difference between the clean and augmented embeddings of different speakers. Experimental results on Voxceleb1 suggest that the proposed framework can achieve a remarkable 19% improvement over the conventional methods, and it surpasses many existing state-of-the-art techniques.
Submitted 11 March, 2024; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Text2Layer: Layered Image Generation using Latent Diffusion Model
Authors:
Xinyang Zhang,
Wentian Zhao,
Xin Lu,
Jeff Chien
Abstract:
Layer compositing is one of the most popular image editing workflows among both amateurs and professionals. Motivated by the success of diffusion models, we explore layer compositing from a layered image generation perspective. Instead of generating an image, we propose to generate background, foreground, layer mask, and the composed image simultaneously. To achieve layered image generation, we train an autoencoder that is able to reconstruct layered images and train diffusion models on the latent representation. One benefit of the proposed problem is to enable better compositing workflows in addition to the high-quality image output. Another benefit is producing higher-quality layer masks compared to masks produced by a separate step of image segmentation. Experimental results show that the proposed method is able to generate high-quality layered images and initiates a benchmark for future work.
Submitted 19 July, 2023;
originally announced July 2023.
-
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation
Authors:
Li-Jen Yang,
Chao-Han Huck Yang,
Jen-Tzung Chien
Abstract:
This paper presents a parameter-efficient learning (PEL) approach to low-resource accent adaptation for text-to-speech (TTS). A resource-efficient adaptation from a frozen pre-trained TTS model is developed by using only 1.2% to 0.8% of original trainable parameters to achieve competitive performance in voice synthesis. Motivated by a theoretical foundation of optimal transport (OT), this study carries out PEL for TTS where an auxiliary unsupervised loss based on OT is introduced to maximize a difference between the pre-trained source domain and the (unseen) target domain, in addition to its supervised training loss. Further, we leverage this unsupervised loss refinement to boost system performance via either sliced Wasserstein distance or maximum mean discrepancy. The merit of this work is demonstrated by fulfilling PEL solutions based on residual adapter learning and model reprogramming when evaluating Mandarin accent adaptation. Experimental results show that the proposed methods can achieve competitive naturalness with parameter-efficient decoder fine-tuning, and the auxiliary unsupervised loss improves model performance empirically.
Submitted 18 May, 2023;
originally announced May 2023.
-
Algorithmic Censoring in Dynamic Learning Systems
Authors:
Jennifer Chien,
Margaret Roberts,
Berk Ustun
Abstract:
Dynamic learning systems subject to selective labeling exhibit censoring, i.e. persistent negative predictions assigned to one or more subgroups of points. In applications like consumer finance, this results in groups of applicants that are persistently denied and thus never enter into the training data. In this work, we formalize censoring, demonstrate how it can arise, and highlight difficulties in detection. We consider safeguards against censoring - recourse and randomized-exploration - both of which ensure we collect labels for points that would otherwise go unobserved. The resulting techniques allow examples from censored groups to enter into the training data and correct the model. Our results highlight the otherwise unmeasured harms of censoring and demonstrate the effectiveness of mitigation strategies across a range of data generating processes.
Submitted 29 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Dual-aptamer Drift Cancelling Techniques to Improve Long-term Stability of Real-Time Structure-Switching Aptasensors
Authors:
Ya-Chen Tsai,
Wei-Yang Weng,
Yu-Tong Yeh,
Jun-Chau Chien
Abstract:
This paper presents a dual-aptamer scheme to cancel the signal drifts from structure-switching aptamers during long-term monitoring. Electrochemical aptamer-based (E-AB) biosensors recently demonstrated their great potential for in vivo continuous monitoring. Nevertheless, the detection accuracy is often limited by signaling drifts. Conventionally, these drifts are removed by kinetic differential measurements (KDM) when coupled with square-wave voltammetry (SWV). Yet we discover that KDM does not apply to every aptamer, as the responses at different SWV frequencies heavily depend on its structure-switching characteristics and the redox reporters' electron transfer (ET) kinetics. To this end, we present a "dual-aptamer" scheme that uses two aptamers responding differentially to the same molecular target for drift cancellation. We identify these paired aptamers through (1) screening from the existing aptamer pool and (2) engineering the signaling behavior of the redox reporters. We demonstrate their differential signaling to ampicillin and ATP molecules and show that the aptamer pair bears common drifts in undiluted goat serum. Through cancellation, sensor drift is reduced by 370-fold. Benefiting from the "differential" signaling, the recording throughput is also doubled using differential readout electronics. The authors believe the proposed technique is beneficial for long-term in vivo monitoring.
Submitted 29 December, 2022;
originally announced December 2022.
-
Actionable Recourse via GANs for Mobile Health
Authors:
Jennifer Chien,
Anna Guitart,
Ana Fernandez del Rio,
Africa Perianez,
Lauren Bellhouse
Abstract:
Mobile health apps provide a unique means of collecting data that can be used to deliver adaptive interventions. The predicted outcomes considerably influence the selection of such interventions. Recourse via counterfactuals provides tangible mechanisms to modify user predictions. By identifying plausible actions that increase the likelihood of a desired prediction, stakeholders are afforded agency over their predictions. Furthermore, recourse mechanisms enable counterfactual reasoning that can help provide insights into candidates for causal interventional features. We demonstrate the feasibility of GAN-generated recourse for mobile health applications on ensemble-survival-analysis-based prediction of medium-term engagement in the Safe Delivery App, a digital training tool for skilled birth attendants.
Submitted 11 November, 2022;
originally announced November 2022.
-
Rydberg States of H$_3$ and HeH as Potential Coolants for Primordial Star Formation
Authors:
Gokul Kannan,
Jeremy R. Chien,
Anthony J. Benjamin,
Niranjan Bhatia,
Richard J. Saykally
Abstract:
Current theory and measurements establish the age of the universe as ca. 13.8 billion years. For the first several hundred million years of its existence, it was a dark, opaque void. After that, the hydrogen atoms comprising most of the "ordinary" matter began to condense and ionize, eventually forming the first stars that would illuminate the sky. Details of how these "primordial" stars formed have been widely debated, but remain elusive. A central issue in this process is the mechanism by which the primordial gas (mainly hydrogen and helium atoms) collected via the action of dark matter cools and further accretes to fusion densities. Current models invoke collisional excitation of H$_2$ molecular rotations and subsequent radiative rotational transitions allowed by the weak molecular quadrupole moment. In this article, we review the salient considerations, and present some new ideas, based on recent spectroscopic observations of neutral H$_3$ Rydberg electronic state emission in the mid-infrared.
Submitted 3 November, 2020;
originally announced November 2020.
-
Anticipation and Negative Group Delay in a Retina
Authors:
Po-Yu Chou,
Jo-Fan Chien,
Kevin Sean Chen,
Yu-Ting Huang,
Chun-Chung Chen,
C. K. Chan
Abstract:
The mechanism of negative group delay (NGD) is used to understand the anticipatory capability of a retina. Experiments with retinas from bullfrogs are performed to compare with the predictions of the NGD model. In particular, whole-field stochastic stimulation with various time correlations is used to probe anticipatory responses from the retina. We find that the NGD model can reproduce essential features of experimental observations characterized by the cross correlations between the stimulation and the retinal responses. The prediction horizon of a retina is found to depend on the correlation time of the stimulation as predicted by the NGD model. Experiments with dark and bright Gaussian light pulses further support the NGD mechanism, but only for the dark pulses, indicating that the NGD effect of a retina might originate from its OFF response. Our finding suggests that sensory systems capable of using negative feedback for adaptation can give rise to anticipation as a consequence of the delay in the system.
Submitted 10 November, 2020;
originally announced November 2020.
-
Transporter Networks: Rearranging the Visual World for Robotic Manipulation
Authors:
Andy Zeng,
Pete Florence,
Jonathan Tompson,
Stefan Welker,
Jonathan Chien,
Maria Attarian,
Travis Armstrong,
Ivan Krasin,
Dan Duong,
Ayzaan Wahid,
Vikas Sindhwani,
Johnny Lee
Abstract:
Robotic manipulation can be formulated as inducing a sequence of spatial displacements: where the space being moved can encompass an object, part of an object, or end effector. In this work, we propose the Transporter Network, a simple model architecture that rearranges deep features to infer spatial displacements from visual input - which can parameterize robot actions. It makes no assumptions of objectness (e.g. canonical poses, models, or keypoints), it exploits spatial symmetries, and is orders of magnitude more sample efficient than our benchmarked alternatives in learning vision-based manipulation tasks: from stacking a pyramid of blocks, to assembling kits with unseen objects; from manipulating deformable ropes, to pushing piles of small objects with closed-loop feedback. Our method can represent complex multi-modal policy distributions and generalizes to multi-step sequential tasks, as well as 6DoF pick-and-place. Experiments on 10 simulated tasks show that it learns faster and generalizes better than a variety of end-to-end baselines, including policies that use ground-truth object poses. We validate our methods with hardware in the real world. Experiment videos and code are available at https://transporternets.github.io
Submitted 5 January, 2022; v1 submitted 27 October, 2020;
originally announced October 2020.
-
Design and Analysis of a Sample-and-Hold CMOS Electrochemical Sensor for Aptamer-based Therapeutic Drug Monitoring
Authors:
Jun-Chau Chien,
Sam W. Baker,
H. Tom Soh,
Amin Arbabian
Abstract:
In this paper, we present the design and the analysis of an electrochemical circuit for measuring the concentrations of therapeutic drugs using structure-switching aptamers. Aptamers are single-stranded nucleic acids whose sequences are selected to exhibit high affinity and specificity toward a molecular target and to change conformation upon binding. This property, when coupled with a redox reporter and electrochemical detection, enables reagent-free biosensing with a sub-minute temporal resolution for in vivo therapeutic drug monitoring. Specifically, we design a chronoamperometry-based electrochemical circuit that measures the direct changes in the electron transfer (ET) kinetics of a methylene blue reporter conjugated at the distal end of the aptamer. To overcome the high-frequency noise amplification issue when interfacing with a large-size (> 0.25 mm$^2$) implantable electrode, we present a sample-and-hold (S/H) circuit technique in which the desired electrode potentials are held onto noiseless capacitors during the recording of the redox currents. This allows disconnecting the feedback amplifiers to avoid their noise injection while reducing the total power consumption. A prototype circuit implemented in 65-nm CMOS demonstrates a cell-capacitance-insensitive input-referred noise (IRN) current of 15.2 pA$_{\mathrm{rms}}$ at a 2.5-kHz filtering bandwidth. Tested in human whole blood samples, changes in the ET kinetics from the redox-labeled aminoglycoside aptamers at different kanamycin concentrations are measured from the recorded current waveforms. By employing principal component analysis (PCA) to compensate for the sampling errors, a detection limit (SNR = 1) of 3.1 µM under 1-sec acquisition is achieved at 0.22-mW power consumption.
Submitted 7 May, 2020;
originally announced May 2020.
-
Stationary Wave Profiles for Nonlocal Particle Models of Traffic Flow on Rough Roads
Authors:
Jereme Chien,
Wen Shen
Abstract:
We study a nonlocal particle model describing traffic flow on rough roads. In the model, each driver adjusts the speed of the car according to the condition over an interval in the front, leading to a system of nonlocal ODEs which we refer to as the FtLs (follow-the-leaders) model. Assuming that the road condition is discontinuous at the origin, we seek stationary wave profiles (see Definition 1.1) for the system of ODEs across this discontinuity. We derive a non-local delay differential equation with discontinuous coefficient, satisfied by the profiles, together with conditions on the asymptotic values as $x\to\pm\infty$. Results on existence, uniqueness, and local stability are proved, for all cases. We show that, depending on the case, there might exist a unique profile, infinitely many profiles, or no profiles. The stability result also depends on cases. Various numerical simulations are presented. Finally, we establish convergence of these profiles to those of a local particle model, as well as those of a nonlocal PDE model.
Submitted 9 November, 2019; v1 submitted 22 February, 2019;
originally announced February 2019.
-
Unsupervised Meta-learning of Figure-Ground Segmentation via Imitating Visual Effects
Authors:
Ding-Jie Chen,
Jui-Ting Chien,
Hwann-Tzong Chen,
Tyng-Luh Liu
Abstract:
This paper presents a "learning to learn" approach to figure-ground image segmentation. By exploring webly-abundant images of specific visual effects, our method can effectively learn the visual-effect internal representations in an unsupervised manner and uses this knowledge to differentiate the figure from the ground in an image. Specifically, we formulate the meta-learning process as a compositional image editing task that learns to imitate a certain visual effect and derive the corresponding internal representation. Such a generative process can help instantiate the underlying figure-ground notion and enables the system to accomplish the intended image segmentation. Whereas existing generative methods are mostly tailored to image synthesis or style transfer, our approach offers a flexible learning mechanism to model a general concept of figure-ground segmentation from unorganized images that have no explicit pixel-level annotations. We validate our approach via extensive experiments on six datasets to demonstrate that the proposed model can be end-to-end trained without ground-truth pixel labeling yet outperforms the existing methods of unsupervised segmentation tasks.
Submitted 20 December, 2018;
originally announced December 2018.
-
BALSON: Bayesian Least Squares Optimization with Nonnegative L1-Norm Constraint
Authors:
Jiyang Xie,
Zhanyu Ma,
Guoqiang Zhang,
Jing-Hao Xue,
Jen-Tzung Chien,
Zhiqing Lin,
Jun Guo
Abstract:
A Bayesian approach termed BAyesian Least Squares Optimization with Nonnegative L1-norm constraint (BALSON) is proposed. The error distribution of data fitting is described by Gaussian likelihood. The parameter distribution is assumed to be a Dirichlet distribution. With the Bayes rule, searching for the optimal parameters is equivalent to finding the mode of the posterior distribution. In order to explicitly characterize the nonnegative L1-norm constraint of the parameters, we further approximate the true posterior distribution by a Dirichlet distribution. We estimate the statistics of the approximating Dirichlet posterior distribution by sampling methods. Four sampling methods have been introduced. With the estimated posterior distributions, the original parameters can be effectively reconstructed in polynomial fitting problems, and the BALSON framework is found to perform better than conventional methods.
Submitted 8 July, 2018;
originally announced July 2018.
-
Context-aware Cascade Attention-based RNN for Video Emotion Recognition
Authors:
Man-Chin Sun,
Shih-Huan Hsu,
Min-Chun Yang,
Jen-Hsien Chien
Abstract:
Emotion recognition can provide crucial information about the user in many applications when building human-computer interaction (HCI) systems. Most current research on visual emotion recognition focuses on exploring facial features. However, context information, including the surrounding environment and the human body, can also provide extra clues to recognize emotion more accurately. Inspired by the "sequence to sequence" model for neural machine translation, which models input and output sequences with an encoder and a decoder, respectively, in a recurrent neural network (RNN) architecture, a novel architecture, "CACA-RNN", is proposed in this work. The proposed network consists of two RNNs in a cascaded architecture that process both context and facial information to perform video emotion classification. Results of the model were submitted to the video emotion recognition sub-challenge of the Multimodal Emotion Recognition Challenge (MEC2017). CACA-RNN outperforms the MEC2017 baseline (mAP of 21.7%): it achieved an mAP of 45.51% on the testing set in the video-only challenge.
Submitted 30 May, 2018;
originally announced May 2018.
-
Self Adversarial Training for Human Pose Estimation
Authors:
Chia-Jung Chou,
Jui-Ting Chien,
Hwann-Tzong Chen
Abstract:
This paper presents a deep learning based approach to the problem of human pose estimation. We employ generative adversarial networks as our learning paradigm in which we set up two stacked hourglass networks with the same architecture, one as the generator and the other as the discriminator. The generator is used as a human pose estimator after the training is done. The discriminator distinguishes ground-truth heatmaps from generated ones, and back-propagates the adversarial loss to the generator. This process enables the generator to learn plausible human body configurations and is shown to be useful for improving the prediction accuracy.
Submitted 15 August, 2017; v1 submitted 8 July, 2017;
originally announced July 2017.
-
Virtual Links with Finite Medial Bikei
Authors:
Julien Chien,
Sam Nelson
Abstract:
We consider the question of which virtual knots have finite fundamental medial bikei. We describe and implement an algorithm for completing a presentation matrix of a medial bikei to an operation table, determining both the cardinality and isomorphism class of the fundamental medial bikei, each of which are link invariants. As an example, we compute the fundamental medial bikei for all of the prime virtual knots with up to four classical crossings as listed in the knot atlas.
Submitted 3 April, 2017;
originally announced April 2017.
-
Predicting the Plant Root-Associated Ecological Niche of 21 Pseudomonas Species Using Machine Learning and Metabolic Modeling
Authors:
Jennifer Chien,
Peter Larsen
Abstract:
Plants rarely occur in isolated systems. Bacteria can inhabit either the endosphere, the region inside the plant root, or the rhizosphere, the soil region just outside the plant root. Our goal is to understand if using genomic data and media dependent metabolic model information is better for training machine learning of predicting bacterial ecological niche than media independent models or pure genome based species trees. We considered three machine learning techniques: support vector machine, non-negative matrix factorization, and artificial neural networks. In all three machine-learning approaches, the media-based metabolic models and flux balance analyses were more effective at predicting bacterial niche than the genome or PRMT models. Support Vector Machine trained on a minimal media base with Mannose, Proline and Valine was most predictive of all models and media types with an f-score of 0.8 for rhizosphere and 0.97 for endosphere. Thus we can conclude that media-based metabolic modeling provides a holistic view of the metabolome, allowing machine learning algorithms to highlight the differences between and categorize endosphere and rhizosphere bacteria. There was no single media type that best highlighted differences between endosphere and rhizosphere bacteria metabolism and therefore no single enzyme, reaction, or compound that defined whether a bacteria's origin was of the endosphere or rhizosphere.
Submitted 11 January, 2017;
originally announced January 2017.
-
Network-based Isoform Quantification with RNA-Seq Data for Cancer Transcriptome Analysis
Authors:
Wei Zhang,
Jae-Woong Chang,
Lilong Lin,
Kay Minn,
Baolin Wu,
Jeremy Chien,
Jeongsik Yong,
Hui Zheng,
Rui Kuang
Abstract:
High-throughput mRNA sequencing (RNA-Seq) is widely used for transcript quantification of gene isoforms. Since RNA-Seq data alone is often not sufficient to accurately identify the read origins from the isoforms for quantification, we propose to explore protein domain-domain interactions as prior knowledge for integrative analysis with RNA-seq data. We introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction network with short read alignments for transcript abundance estimation. Based on our observation that the abundances of the neighboring isoforms by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with alternating optimization of multiple EM problems. In simulation Net-RSTQ effectively improved isoform transcript quantifications when isoform co-expressions correlate with their interactions. qRT-PCR results on 25 multi-isoform genes in a stem cell line, an ovarian cancer cell line, and a breast cancer cell line also showed that Net-RSTQ estimated more consistent isoform proportions with RNA-Seq data. In the experiments on the RNA-Seq data in The Cancer Genome Atlas (TCGA), the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification of ovarian cancer, breast cancer and lung cancer. All experimental results collectively support that Net-RSTQ is a promising approach for isoform quantification.
Submitted 15 September, 2015; v1 submitted 19 March, 2014;
originally announced March 2014.
-
Free-carrier relaxation and lattice heating in photoexcited bismuth thin films
Authors:
Y. M. Sheu,
Y. J. Chien,
C. Uher,
S. Fahy,
D. A. Reis
Abstract:
We report ultrafast surface pump and interface probe experiments on photoexcited carrier transport across single crystal bismuth films on sapphire. The film thickness is sufficient to separate carrier dynamics from lattice heating and strain, allowing us to investigate the time-scales of momentum relaxation, heat transfer to the lattice and electron-hole recombination. The measured electron-hole ($e-h$) recombination time is 12--26 ps and ambipolar diffusivity is 18--40 cm$^{2}$/s for carrier excitation up to $\sim 10^{19} \text{cm}^{-3}$. By comparing the heating of the front and back sides of the film, we put lower limits on the rate of heat transfer to the lattice, and by observing the decay of the plasma at the back of the film, we estimate the timescale of electron-hole recombination. We interpret each of these timescales within a common framework of electron-phonon scattering and find qualitative agreement between the various relaxation times observed. We find that the carrier density is not determined by the $e-h$ plasma temperature after a few picoseconds. The diffusion and recombination become nonlinear with initial excitation $\gtrsim 10^{20} \text{cm}^{-3}$.
Submitted 16 November, 2012;
originally announced November 2012.