-
Improving Transformers using Faithful Positional Encoding
Authors:
Tsuyoshi Idé,
Jokin Labaien,
Pin-Yu Chen
Abstract:
We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing information about the positional order of the input sequence. We show that the new encoding approach systematically improves the prediction performance in the time-series classification task.
Submitted 16 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Decentralized Collaborative Learning Framework with External Privacy Leakage Analysis
Authors:
Tsuyoshi Idé,
Dzung T. Phan,
Rudy Raymond
Abstract:
This paper presents two methodological advancements in decentralized multi-task learning under privacy constraints, aiming to pave the way for future developments in next-generation Blockchain platforms. First, we expand the existing framework for collaborative dictionary learning (CollabDict), which has previously been limited to Gaussian mixture models, by incorporating deep variational autoencoders (VAEs) into the framework, with a particular focus on anomaly detection. We demonstrate that the VAE-based anomaly score function shares the same mathematical structure as the non-deep model, and provide comprehensive qualitative comparison. Second, considering the widespread use of "pre-trained models," we provide a mathematical analysis on data privacy leakage when models trained with CollabDict are shared externally. We show that the CollabDict approach, when applied to Gaussian mixtures, adheres to a Renyi differential privacy criterion. Additionally, we propose a practical metric for monitoring internal privacy breaches during the learning process.
Submitted 1 April, 2024;
originally announced April 2024.
-
A resource-constrained stochastic scheduling algorithm for homeless street outreach and gleaning edible food
Authors:
Conor M. Artman,
Aditya Mate,
Ezinne Nwankwo,
Aliza Heching,
Tsuyoshi Idé,
Jiří Navrátil,
Karthikeyan Shanmugam,
Wei Sun,
Kush R. Varshney,
Lauri Goldkind,
Gidi Kroch,
Jaclyn Sawyer,
Ian Watson
Abstract:
We developed a common algorithmic solution addressing the problem of resource-constrained outreach encountered by social change organizations with different missions and operations: Breaking Ground -- an organization that helps individuals experiencing homelessness in New York transition to permanent housing and Leket -- the national food bank of Israel that rescues food from farms and elsewhere to feed the hungry. Specifically, we developed an estimation and optimization approach for partially-observed episodic restless bandits under $k$-step transitions. The results show that our Thompson sampling with Markov chain recovery (via Stein variational gradient descent) algorithm significantly outperforms baselines for the problems of both organizations. We carried out this work in a prospective manner with the express goal of devising a flexible-enough but also useful-enough solution that can help overcome a lack of sustainable impact in data science for social good.
Submitted 15 March, 2024;
originally announced March 2024.
-
Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes
Authors:
Dongxia Wu,
Tsuyoshi Idé,
Aurélie Lozano,
Georgios Kollias,
Jiří Navrátil,
Naoki Abe,
Yi-An Ma,
Rose Yu
Abstract:
We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level causality identifies causal relationships among individual events, providing more fine-grained information for decision-making. Existing work in the literature either requires strong assumptions, such as linearity in the intensity function, or heuristically defined model parameters that do not necessarily meet the requirements of Granger causality. We propose Instance-wise Self-Attentive Hawkes Processes (ISAHP), a novel deep learning framework that can directly infer the Granger causality at the event instance level. ISAHP is the first neural point process model that meets the requirements of Granger causality. It leverages the self-attention mechanism of the transformer to align with the principles of Granger causality. We empirically demonstrate that ISAHP is capable of discovering complex instance-level causal structures that cannot be handled by classical models. We also show that ISAHP achieves state-of-the-art performance in proxy tasks involving type-level causal discovery and instance-level event type prediction.
Submitted 29 February, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Deep Reinforcement Learning for Multi-Truck Vehicle Routing Problems with Multi-Leg Demand Routes
Authors:
Joshua Levin,
Randall Correll,
Takanori Ide,
Takafumi Suzuki,
Takaho Saito,
Alan Arai
Abstract:
Deep reinforcement learning (RL) has been shown to be effective in producing approximate solutions to some vehicle routing problems (VRPs), especially when using policies generated by encoder-decoder attention mechanisms. While these techniques have been quite successful for relatively simple problem instances, there are still under-researched and highly complex VRP variants for which no effective RL method has been demonstrated. In this work we focus on one such VRP variant, which contains multiple trucks and multi-leg routing requirements. In these problems, demand is required to move along sequences of nodes, instead of just from a start node to an end node. With the goal of making deep RL a viable strategy for real-world industrial-scale supply chain logistics, we develop new extensions to existing encoder-decoder attention models which allow them to handle multiple trucks and multi-leg routing requirements. Our models have the advantage that they can be trained for a small number of trucks and nodes, and then embedded into a large supply chain to yield solutions for larger numbers of trucks and nodes. We test our approach on a real supply chain environment arising in the operations of Japanese automotive parts manufacturer Aisin Corporation, and find that our algorithm outperforms Aisin's previous best solution.
Submitted 8 January, 2024;
originally announced January 2024.
-
PHALM: Building a Knowledge Graph from Scratch by Prompting Humans and a Language Model
Authors:
Tatsuya Ide,
Eiki Murata,
Daisuke Kawahara,
Takato Yamazaki,
Shengzhe Li,
Kenta Shinzato,
Toshinori Sato
Abstract:
Despite the remarkable progress in natural language understanding with pretrained Transformers, neural language models often do not handle commonsense knowledge well. Toward commonsense-aware models, there have been attempts to obtain knowledge, ranging from automatic acquisition to crowdsourcing. However, it is difficult to obtain a high-quality knowledge base at a low cost, especially from scratch. In this paper, we propose PHALM, a method of building a knowledge graph from scratch, by prompting both crowdworkers and a large language model (LLM). We used this method to build a Japanese event knowledge graph and trained Japanese commonsense generation models. Experimental results revealed the acceptability of the built graph and inferences generated by the trained models. We also report the difference in prompting humans and an LLM. Our code, data, and models are available at github.com/nlp-waseda/comet-atomic-ja.
Submitted 10 October, 2023;
originally announced October 2023.
-
Generative Perturbation Analysis for Probabilistic Black-Box Anomaly Attribution
Authors:
Tsuyoshi Idé,
Naoki Abe
Abstract:
We address the task of probabilistic anomaly attribution in the black-box regression setting, where the goal is to compute the probability distribution of the attribution score of each input variable, given an observed anomaly. The training dataset is assumed to be unavailable. This task differs from the standard XAI (explainable AI) scenario, since we wish to explain the anomalous deviation from a black-box prediction rather than the black-box model itself.
We begin by showing that mainstream model-agnostic explanation methods, such as the Shapley values, are not suitable for this task because of their ``deviation-agnostic property.'' We then propose a novel framework for probabilistic anomaly attribution that allows us to not only compute attribution scores as the predictive mean but also quantify the uncertainty of those scores. This is done by considering a generative process for perturbations that counter-factually bring the observed anomalous observation back to normalcy. We introduce a variational Bayes algorithm for deriving the distributions of per variable attribution scores. To the best of our knowledge, this is the first probabilistic anomaly attribution framework that is free from being deviation-agnostic.
Submitted 9 August, 2023;
originally announced August 2023.
-
Black-Box Anomaly Attribution
Authors:
Tsuyoshi Idé,
Naoki Abe
Abstract:
When the prediction of a black-box machine learning model deviates from the true observation, what can be said about the reason behind that deviation? This is a fundamental and ubiquitous question that the end user in a business or industrial AI application often asks. The deviation may be due to a sub-optimal black-box model, or it may be simply because the sample in question is an outlier. In either case, one would ideally wish to obtain some form of attribution score -- a value indicative of the extent to which an input variable is responsible for the anomaly.
In the present paper we address this task of ``anomaly attribution,'' particularly in the setting in which the model is black-box and the training data are not available. Specifically, we propose a novel likelihood-based attribution framework we call the ``likelihood compensation (LC),'' in which the responsibility score is equated with the correction on each input variable needed to attain the highest possible likelihood. We begin by showing formally why mainstream model-agnostic explanation methods, such as the local linear surrogate modeling and Shapley values, are not designed to explain anomalies. In particular, we show that they are ``deviation-agnostic,'' namely, that their explanations are blind to the fact that there is a deviation in the model prediction for the sample of interest. We do this by positioning these existing methods under the unified umbrella of a function family we call the ``integrated gradient family.'' We validate the effectiveness of the proposed LC approach using publicly available data sets. We also conduct a case study with a real-world building energy prediction task and confirm its usefulness in practice based on expert feedback.
Submitted 28 May, 2023;
originally announced May 2023.
-
Diagnostic Spatio-temporal Transformer with Faithful Encoding
Authors:
Jokin Labaien,
Tsuyoshi Idé,
Pin-Yu Chen,
Ekhi Zugasti,
Xabier De Carlos
Abstract:
This paper addresses the task of anomaly diagnosis when the underlying data generation process has a complex spatio-temporal (ST) dependency. The key technical challenge is to extract actionable insights from the dependency tensor characterizing high-order interactions among temporal and spatial indices. We formalize the problem as supervised dependency discovery, where the ST dependency is learned as a side product of multivariate time-series classification. We show that temporal positional encoding used in existing ST transformer works has a serious limitation in capturing higher frequencies (short time scales). We propose a new positional encoding with a theoretical guarantee, based on discrete Fourier transform. We also propose a new ST dependency discovery framework, which can provide readily consumable diagnostic information in both spatial and temporal directions. Finally, we demonstrate the utility of the proposed model, DFStrans (Diagnostic Fourier-based Spatio-temporal Transformer), in a real industrial application of building elevator control.
Submitted 26 May, 2023;
originally announced May 2023.
-
Quantum Neural Networks for a Supply Chain Logistics Application
Authors:
Randall Correll,
Sean J. Weinberg,
Fabio Sanches,
Takanori Ide,
Takafumi Suzuki
Abstract:
Problem instances of a size suitable for practical applications are not likely to be addressed during the noisy intermediate-scale quantum (NISQ) period with (almost) pure quantum algorithms. Hybrid classical-quantum algorithms have potential, however, to achieve good performance on much larger problem instances. We investigate one such hybrid algorithm on a problem of substantial importance: vehicle routing for supply chain logistics with multiple trucks and complex demand structure. We use reinforcement learning with neural networks with embedded quantum circuits. In such neural networks, projecting high-dimensional feature vectors down to smaller vectors is necessary to accommodate restrictions on the number of qubits of NISQ hardware. However, we use a multi-head attention mechanism where, even in classical machine learning, such projections are natural and desirable. We consider data from the truck routing logistics of a company in the automotive sector, and apply our methodology by decomposing into small teams of trucks, and we find results comparable to human truck assignment.
Submitted 2 December, 2022; v1 submitted 30 November, 2022;
originally announced December 2022.
-
Reinforcement Learning for Multi-Truck Vehicle Routing Problems
Authors:
Randall Correll,
Sean J. Weinberg,
Fabio Sanches,
Takanori Ide,
Takafumi Suzuki
Abstract:
Vehicle routing problems and other combinatorial optimization problems have been approximately solved by reinforcement learning agents with policies based on encoder-decoder models with attention mechanisms. These techniques are of substantial interest but still cannot solve the complex routing problems that arise in a realistic setting which can have many trucks and complex requirements. With the aim of making reinforcement learning a viable technique for supply chain optimization, we develop new extensions to encoder-decoder models for vehicle routing that allow for complex supply chains using classical computing today and quantum computing in the future. We make two major generalizations. First, our model allows for routing problems with multiple trucks. Second, we move away from the simple requirement of having a truck deliver items from nodes to one special depot node, and instead allow for a complex tensor demand structure. We show how our model, even if trained only for a small number of trucks, can be embedded into a large supply chain to yield viable solutions.
Submitted 10 December, 2022; v1 submitted 30 November, 2022;
originally announced November 2022.
-
Anomaly Attribution with Likelihood Compensation
Authors:
Tsuyoshi Idé,
Amit Dhurandhar,
Jiří Navrátil,
Moninder Singh,
Naoki Abe
Abstract:
This paper addresses the task of explaining anomalous predictions of a black-box regression model. When using a black-box model, such as one to predict building energy consumption from many sensor measurements, we often have a situation where some observed samples may significantly deviate from their prediction. It may be due to a sub-optimal black-box model, or simply because those samples are outliers. In either case, one would ideally want to compute a ``responsibility score'' indicative of the extent to which an input variable is responsible for the anomalous output. In this work, we formalize this task as a statistical inverse problem: Given model deviation from the expected value, infer the responsibility score of each of the input variables. We propose a new method called likelihood compensation (LC), which is founded on the likelihood principle and computes a correction to each input variable. To the best of our knowledge, this is the first principled framework that computes a responsibility score for real valued anomalous model deviations. We apply our approach to a real-world building energy prediction task and confirm its utility based on expert feedback.
Submitted 22 August, 2022;
originally announced August 2022.
-
Decentralized Collaborative Learning with Probabilistic Data Protection
Authors:
Tsuyoshi Idé,
Rudy Raymond
Abstract:
We discuss future directions of Blockchain as a collaborative value co-creation platform, in which network participants can gain extra insights that cannot be accessed when disconnected from the others. As such, we propose a decentralized machine learning framework that is carefully designed to respect the values of democracy, diversity, and privacy. Specifically, we propose a federated multi-task learning framework that integrates a privacy-preserving dynamic consensus algorithm. We show that a specific network topology called the expander graph dramatically improves the scalability of global consensus building. We conclude the paper by making some remarks on open problems.
Submitted 23 August, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Cardinality-Regularized Hawkes-Granger Model
Authors:
Tsuyoshi Idé,
Georgios Kollias,
Dzung T. Phan,
Naoki Abe
Abstract:
We propose a new sparse Granger-causal learning framework for temporal event data. We focus on a specific class of point processes called the Hawkes process. We begin by pointing out that most of the existing sparse causal learning algorithms for the Hawkes process suffer from a singularity in maximum likelihood estimation. As a result, their sparse solutions can appear only as numerical artifacts. In this paper, we propose a mathematically well-defined sparse causal learning framework based on a cardinality-regularized Hawkes process, which remedies the pathological issues of existing approaches. We leverage the proposed algorithm for the task of instance-wise causal event analysis, where sparsity plays a critical role. We validate the proposed framework with two real use-cases, one from the power grid and the other from the cloud data center management domain.
Submitted 22 August, 2022;
originally announced August 2022.
-
Targeted Advertising on Social Networks Using Online Variational Tensor Regression
Authors:
Tsuyoshi Idé,
Keerthiram Murugesan,
Djallel Bouneffouf,
Naoki Abe
Abstract:
This paper is concerned with online targeted advertising on social networks. The main technical task we address is to estimate the activation probability for user pairs, which quantifies the influence one user may have on another towards purchasing decisions. This is a challenging task because one marketing episode typically involves a multitude of marketing campaigns/strategies of different products for highly diverse customers. In this paper, we propose what we believe is the first tensor-based contextual bandit framework for online targeted advertising. The proposed framework is designed to accommodate any number of feature vectors in the form of a multi-mode tensor, thereby making it possible to capture the heterogeneity that may exist over user preferences, products, and campaign strategies in a unified manner. To handle the inter-dependency of tensor modes, we introduce an online variational algorithm with a mean-field approximation. We empirically confirm that the proposed TensorUCB algorithm achieves a significant improvement in influence maximization tasks over the benchmarks, which is attributable to its capability of capturing the user-product heterogeneity.
Submitted 9 October, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Building a Dialogue Corpus Annotated with Expressed and Experienced Emotions
Authors:
Tatsuya Ide,
Daisuke Kawahara
Abstract:
In communication, a human would recognize the emotion of an interlocutor and respond with an appropriate emotion, such as empathy and comfort. Toward developing a dialogue system with such a human-like ability, we propose a method to build a dialogue corpus annotated with two kinds of emotions. We collect dialogues from Twitter and annotate each utterance with the emotion that a speaker put into the utterance (expressed emotion) and the emotion that a listener felt after listening to the utterance (experienced emotion). We built a dialogue corpus in Japanese using this method, and its statistical analysis revealed the differences between expressed and experienced emotions. We conducted experiments on recognition of the two kinds of emotions. The experimental results indicated the difficulty in recognizing experienced emotions and the effectiveness of multi-task learning of the two kinds of emotions. We hope that the constructed corpus will facilitate the study on emotion recognition in a dialogue and emotion-aware dialogue response generation.
Submitted 24 May, 2022;
originally announced May 2022.
-
Supply Chain Logistics with Quantum and Classical Annealing Algorithms
Authors:
Sean J. Weinberg,
Fabio Sanches,
Takanori Ide,
Kazumitzu Kamiya,
Randall Correll
Abstract:
Noisy intermediate-scale quantum (NISQ) hardware is almost universally incompatible with full-scale optimization problems of practical importance which can have many variables and unwieldy objective functions. As a consequence, there is a growing body of literature that tests quantum algorithms on miniaturized versions of problems that arise in an operations research setting. Rather than taking this approach, we investigate a problem of substantial commercial value, multi-truck vehicle routing for supply chain logistics, at the scale used by a corporation in their operations. Such a problem is too complex to be fully embedded on any near-term quantum hardware or simulator; we avoid confronting this challenge by taking a hybrid workflow approach: we iteratively assign routes for trucks by generating a new binary optimization problem instance one truck at a time. Each instance has $\sim 2500$ quadratic binary variables, putting it in a range that is feasible for NISQ quantum computing, especially quantum annealing hardware. We test our methods using simulated annealing and the D-Wave Hybrid solver as a place-holder in wait of quantum hardware developments. After feeding the vehicle routes suggested by these runs into a highly realistic classical supply chain simulation, we find excellent performance for the full supply chain. Our work gives a set of techniques that can be adopted in contexts beyond vehicle routing to apply NISQ devices in a hybrid fashion to large-scale problems of commercial interest.
Submitted 9 May, 2022;
originally announced May 2022.
-
Directed Graph Auto-Encoders
Authors:
Georgios Kollias,
Vasileios Kalantzis,
Tsuyoshi Idé,
Aurélie Lozano,
Naoki Abe
Abstract:
We introduce a new class of auto-encoders for directed graphs, motivated by a direct extension of the Weisfeiler-Leman algorithm to pairs of node labels. The proposed model learns pairs of interpretable latent representations for the nodes of directed graphs, and uses parameterized graph convolutional network (GCN) layers for its encoder and an asymmetric inner product decoder. Parameters in the encoder control the weighting of representations exchanged between neighboring nodes. We demonstrate the ability of the proposed model to learn meaningful latent embeddings and achieve superior performance on the directed link prediction task on several popular network datasets.
Submitted 24 February, 2022;
originally announced February 2022.
-
Short Quantum Circuits in Reinforcement Learning Policies for the Vehicle Routing Problem
Authors:
Fabio Sanches,
Sean Weinberg,
Takanori Ide,
Kazumitsu Kamiya
Abstract:
Quantum computing and machine learning have potential for symbiosis. However, in addition to the hardware limitations from current devices, there are still basic issues that must be addressed before quantum circuits can be usefully incorporated into current machine learning tasks. We report a new strategy for such an integration in the context of attention models used for reinforcement learning. Agents that implement attention mechanisms have successfully been applied to certain cases of combinatorial routing problems by first encoding nodes on a graph and then sequentially decoding nodes until a route is selected. We demonstrate that simple quantum circuits can be used in place of classical attention head layers while maintaining performance. Our method modifies the networks used in [1] by replacing key and query vectors for every node with quantum states that are entangled before being measured. The resulting hybrid classical-quantum agent is tested in the context of vehicle routing problems where its performance is competitive with the original classical approach. We regard our model as a prototype that can be scaled up and as an avenue for further study on the role of quantum computing in reinforcement learning.
Submitted 15 September, 2021;
originally announced September 2021.
-
Multi-Task Learning of Generation and Classification for Emotion-Aware Dialogue Response Generation
Authors:
Tatsuya Ide,
Daisuke Kawahara
Abstract:
For a computer to naturally interact with a human, it needs to be human-like. In this paper, we propose a neural response generation model with multi-task learning of generation and classification, focusing on emotion. Our model, based on BART (Lewis et al., 2020), a pre-trained transformer encoder-decoder model, is trained to generate responses and recognize emotions simultaneously. Furthermore, we weight the losses for the tasks to control the update of parameters. Automatic evaluations and crowdsourced manual evaluations show that the proposed model makes generated responses more emotionally aware.
Submitted 25 May, 2021;
originally announced May 2021.
-
Influence of interface dipole layers on the performance of graphene field effect transistors
Authors:
Naoka Nagamura,
Hirokazu Fukidome,
Kosuke Nagashio,
Koji Horiba,
Takayuki Ide,
Kazutoshi Funakubo,
Keiichiro Tashima,
Akira Toriumi,
Maki Suemitsu,
Karsten Horn,
Masaharu Oshima
Abstract:
The linear dispersion of graphene's bands near the Fermi level gives rise to its unique electronic properties, such as a giant carrier mobility, and this has triggered extensive research in applications, such as graphene field-effect transistors (GFETs). However, GFETs generally exhibit a device performance far inferior to the expected one. This has been attributed to a strong dependence of the electronic properties of graphene on the surrounding interfaces. Here we study the interface between a graphene channel and SiO$_{2}$, and by means of photoelectron spectromicroscopy achieve a detailed determination of the course of band alignment at the interface. Our results show that the electronic properties of graphene are modulated by a hydrophilic SiO$_{2}$ surface, but not by a hydrophobic one. By combining photoelectron spectromicroscopy with GFET transport property characterization, we demonstrate that the presence of electrical dipoles in the interface, which reflects the SiO$_{2}$ surface electrochemistry, determines the GFET device performance. A hysteresis in the resistance vs. gate voltage as a function of polarity is ascribed to a reversal of the dipole layer by the gate voltage. These data pave the way for GFET device optimization.
Submitted 7 July, 2019;
originally announced July 2019.
-
Experiments to investigate the effects of radiative cooling on plasma jet collimation
Authors:
C. D. Gregory,
A. Diziere,
H. Aoki,
M. Besio,
S. Bouquet,
E. Falize,
T. Ide,
B. Loupias,
C. Michaut,
T. Morita,
S. A. Pikuz Jr.,
A. Ravasio,
Y. Kuramtisu,
Y. Sakawa,
H. Takabe,
H. Tanji,
N. C. Woolsey,
M. Koenig
Abstract:
Preliminary experiments have been performed to investigate the effects of radiative cooling on plasma jets. Thin (3 um - 5 um) conical shells were irradiated with an intense laser, driving jets with velocities > 100 km/s. Through use of different target materials - aluminium, copper and gold - the degree of radiative losses was altered, and their importance for jet collimation investigated. A number of temporally resolved optical diagnostics were used, providing information about the jet evolution. Gold jets were seen to be narrower than those from copper targets, while aluminium targets produced the least collimated flows.
Submitted 3 November, 2010;
originally announced November 2010.
-
Successive phase transitions to antiferromagnetic and weak-ferromagnetic long-range orders in quasi-one-dimensional antiferromagnet Cu$_3$Mo$_2$O$_9$
Authors:
Tomoaki Hamasaki,
Tomoyuki Ide,
Haruhiko Kuroe,
Tomoyuki Sekine,
Masashi Hase,
Ichiro Tsukada,
Toshiro Sakakibara
Abstract:
Investigation of the magnetism of a Cu$_3$Mo$_2$O$_9$ single crystal, which has antiferromagnetic (AF) linear chains interacting with AF dimers, reveals an AF second-order phase transition at $T_{\rm N} = 7.9$ K. Although weak ferromagnetic-like behavior appears at lower temperatures in low magnetic fields, complete remanent magnetization cannot be detected down to 0.5 K. However, a jump is observed in the magnetization below the weak ferromagnetic (WF) phase transition at $T_{\rm c} \simeq 2.5$ K when a tiny magnetic field along the a axis is reversed, suggesting that the coercive force is very weak. A component of the magnetic moment parallel to the chain forms AF long-range order (LRO) below $T_{\rm N}$, while a perpendicular component is disordered above $T_{\rm c}$ at zero magnetic field and forms WF-LRO below $T_{\rm c}$. Moreover, the WF-LRO is also realized by applying magnetic fields even between $T_{\rm c}$ and $T_{\rm N}$. These results are explainable by both magnetic frustration among symmetric exchange interactions and competition between symmetric and asymmetric Dzyaloshinskii-Moriya exchange interactions.
Submitted 19 March, 2008;
originally announced March 2008.
-
Accidental cloning of a single-photon qubit in two-channel continuous-variable quantum teleportation
Authors:
Toshiki Ide,
Holger F. Hofmann
Abstract:
The information encoded in the polarization of a single photon can be transferred to a remote location by two-channel continuous-variable quantum teleportation. However, the finite entanglement used in the teleportation causes random changes in photon number. If more than one photon appears in the output, the continuous-variable teleportation accidentally produces clones of the original input photon. In this paper, we derive the polarization statistics of the $N$-photon output components and show that they can be decomposed into an optimal cloning term and completely unpolarized noise. We find that the accidental cloning of the input photon is nearly optimal at experimentally feasible squeezing levels, indicating that the loss of polarization information is partially compensated by the availability of clones.
Submitted 5 April, 2007; v1 submitted 22 February, 2007;
originally announced February 2007.
-
Optimal cloning of single photon polarization by coherent feedback of beam splitter losses
Authors:
Holger F. Hofmann,
Toshiki Ide
Abstract:
Light fields can be amplified by measuring the field amplitude reflected at a beam splitter of reflectivity R and adding a coherent amplitude proportional to the measurement result to the transmitted field. By applying the quantum optical realization of this amplification scheme to single photon inputs, it is possible to clone the polarization states of photons. We show that optimal cloning of single photon polarization is possible when the gain factor of the amplification is equal to the inverse square root of 1-R.
Submitted 30 July, 2006; v1 submitted 30 November, 2005;
originally announced December 2005.
-
Transfer of single photon polarization states by two-channel continuous variable teleportation
Authors:
Toshiki Ide,
Holger F. Hofmann
Abstract:
Superpositions of two orthogonal single-photon polarization states are commonly used as optical qubits. If such qubits are sent by continuous variable quantum teleportation, the modifications of the qubit states due to imperfect entanglement cause an increase in the average photon number of the output state. This effect can be interpreted as an accidental quantum cloning of the single photon input. We analyze the output statistics of the single photon teleportation and derive the transfer and cloning fidelities from the equations of the polarization qubit.
Submitted 22 November, 2005;
originally announced November 2005.
-
Continuous variable teleportation of single photon states (Proceedings version)
Authors:
Toshiki Ide,
Holger F. Hofmann,
Takayoshi Kobayashi,
Akira Furusawa
Abstract:
We investigate the changes to a single photon state caused by the non-maximal entanglement in continuous variable quantum teleportation. It is shown that the teleportation measurement introduces field coherence in the output.
Submitted 3 December, 2001;
originally announced December 2001.
-
Gain tuning and fidelity in continuous variable quantum teleportation
Authors:
Toshiki Ide,
Holger F. Hofmann,
Akira Furusawa,
Takayoshi Kobayashi
Abstract:
The fidelity of continuous variable teleportation can be optimized by changing the gain in the modulation of the output field. We discuss the gain dependence of fidelity for coherent, vacuum and one photon inputs and propose optimal gain tuning strategies for corresponding input selections.
Submitted 28 March, 2002; v1 submitted 24 November, 2001;
originally announced November 2001.
-
Information extraction and quantum state distortions in continuous variable quantum teleportation
Authors:
Holger F. Hofmann,
Toshiki Ide,
Takayoshi Kobayashi,
Akira Furusawa
Abstract:
We analyze the loss of fidelity in continuous variable teleportation due to non-maximal entanglement. It is shown that the quantum state distortions correspond to the measurement back-action of a field amplitude measurement. Results for coherent states and for photon number states are presented.
Submitted 22 October, 2001;
originally announced October 2001.
-
Continuous variable teleportation of single photon states
Authors:
Toshiki Ide,
Holger F. Hofmann,
Takayoshi Kobayashi,
Akira Furusawa
Abstract:
The properties of continuous variable teleportation of single photon states are investigated. The output state is different from the input state due to the non-maximal entanglement in the EPR beams. The photon statistics of the teleportation output are determined and the correlation between the field information beta obtained in the teleportation process and the change in photon number is discussed. The results of the output photon statistics are applied to the transmission of a qubit encoded in the polarization of a single photon.
Submitted 25 July, 2001; v1 submitted 3 April, 2001;
originally announced April 2001.
-
Information losses in continuous variable quantum teleportation
Authors:
Holger F. Hofmann,
Toshiki Ide,
Takayoshi Kobayashi,
Akira Furusawa
Abstract:
It is shown that the information losses due to the limited fidelity of continuous variable quantum teleportation are equivalent to the losses induced by a beam splitter of appropriate reflectivity.
Submitted 5 July, 2001; v1 submitted 20 February, 2001;
originally announced February 2001.
-
Fidelity and information in the quantum teleportation of continuous variables
Authors:
Holger F. Hofmann,
Toshiki Ide,
Takayoshi Kobayashi,
Akira Furusawa
Abstract:
Ideally, quantum teleportation should transfer a quantum state without distortion and without providing any information about that state. However, quantum teleportation of continuous electromagnetic field variables introduces additional noise, limiting the fidelity of the quantum state transfer. In this article, the operator describing the quantum state transfer is derived. The transfer operator modifies the probability amplitudes of the quantum state in a shifted photon number base by enhancing low photon numbers and suppressing high photon numbers. This modification of the statistical weight corresponds to a measurement of finite resolution performed on the original quantum state. The limited fidelity of quantum teleportation is thus shown to be a direct consequence of the information obtained in the measurement.
Submitted 6 June, 2000; v1 submitted 15 March, 2000;
originally announced March 2000.