Search | arXiv e-print repository

Improving Transformers using Faithful Positional Encoding

Authors: Tsuyoshi Idé, Jokin Labaien, Pin-Yu Chen

Abstract: We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing information about the positional order of the input sequence. We show that the new encoding approach systematically improves the prediction performance in the time-series classification task.

Submitted 16 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2305.17149

arXiv:2404.01270 [pdf, other]

Decentralized Collaborative Learning Framework with External Privacy Leakage Analysis

Authors: Tsuyoshi Idé, Dzung T. Phan, Rudy Raymond

Abstract: This paper presents two methodological advancements in decentralized multi-task learning under privacy constraints, aiming to pave the way for future developments in next-generation Blockchain platforms. First, we expand the existing framework for collaborative dictionary learning (CollabDict), which has previously been limited to Gaussian mixture models, by incorporating deep variational autoencoders (VAEs) into the framework, with a particular focus on anomaly detection. We demonstrate that the VAE-based anomaly score function shares the same mathematical structure as the non-deep model, and provide comprehensive qualitative comparison. Second, considering the widespread use of "pre-trained models," we provide a mathematical analysis on data privacy leakage when models trained with CollabDict are shared externally. We show that the CollabDict approach, when applied to Gaussian mixtures, adheres to a Renyi differential privacy criterion. Additionally, we propose a practical metric for monitoring internal privacy breaches during the learning process.

Submitted 1 April, 2024; originally announced April 2024.

Comments: To appear in Proceeding of 2023 International workshop Blockchain Kaigi (BCK 23), JPS Conference Proceedings, 2024

arXiv:2403.10638 [pdf, other]

A resource-constrained stochastic scheduling algorithm for homeless street outreach and gleaning edible food

Authors: Conor M. Artman, Aditya Mate, Ezinne Nwankwo, Aliza Heching, Tsuyoshi Idé, Jiří Navrátil, Karthikeyan Shanmugam, Wei Sun, Kush R. Varshney, Lauri Goldkind, Gidi Kroch, Jaclyn Sawyer, Ian Watson

Abstract: We developed a common algorithmic solution addressing the problem of resource-constrained outreach encountered by social change organizations with different missions and operations: Breaking Ground -- an organization that helps individuals experiencing homelessness in New York transition to permanent housing and Leket -- the national food bank of Israel that rescues food from farms and elsewhere to feed the hungry. Specifically, we developed an estimation and optimization approach for partially-observed episodic restless bandits under $k$-step transitions. The results show that our Thompson sampling with Markov chain recovery (via Stein variational gradient descent) algorithm significantly outperforms baselines for the problems of both organizations. We carried out this work in a prospective manner with the express goal of devising a flexible-enough but also useful-enough solution that can help overcome a lack of sustainable impact in data science for social good.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2402.03726 [pdf, other]

Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes

Authors: Dongxia Wu, Tsuyoshi Idé, Aurélie Lozano, Georgios Kollias, Jiří Navrátil, Naoki Abe, Yi-An Ma, Rose Yu

Abstract: We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level causality identifies causal relationships among individual events, providing more fine-grained information for decision-making. Existing work in the literature either requires strong assumptions, such as linearity in the intensity function, or heuristically defined model parameters that do not necessarily meet the requirements of Granger causality. We propose Instance-wise Self-Attentive Hawkes Processes (ISAHP), a novel deep learning framework that can directly infer the Granger causality at the event instance level. ISAHP is the first neural point process model that meets the requirements of Granger causality. It leverages the self-attention mechanism of the transformer to align with the principles of Granger causality. We empirically demonstrate that ISAHP is capable of discovering complex instance-level causal structures that cannot be handled by classical models. We also show that ISAHP achieves state-of-the-art performance in proxy tasks involving type-level causal discovery and instance-level event type prediction.

Submitted 29 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.08669 [pdf, other]

Deep Reinforcement Learning for Multi-Truck Vehicle Routing Problems with Multi-Leg Demand Routes

Authors: Joshua Levin, Randall Correll, Takanori Ide, Takafumi Suzuki, Takaho Saito, Alan Arai

Abstract: Deep reinforcement learning (RL) has been shown to be effective in producing approximate solutions to some vehicle routing problems (VRPs), especially when using policies generated by encoder-decoder attention mechanisms. While these techniques have been quite successful for relatively simple problem instances, there are still under-researched and highly complex VRP variants for which no effective RL method has been demonstrated. In this work we focus on one such VRP variant, which contains multiple trucks and multi-leg routing requirements. In these problems, demand is required to move along sequences of nodes, instead of just from a start node to an end node. With the goal of making deep RL a viable strategy for real-world industrial-scale supply chain logistics, we develop new extensions to existing encoder-decoder attention models which allow them to handle multiple trucks and multi-leg routing requirements. Our models have the advantage that they can be trained for a small number of trucks and nodes, and then embedded into a large supply chain to yield solutions for larger numbers of trucks and nodes. We test our approach on a real supply chain environment arising in the operations of Japanese automotive parts manufacturer Aisin Corporation, and find that our algorithm outperforms Aisin's previous best solution.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 13 pages, 4 figures

arXiv:2310.07170 [pdf, other]

PHALM: Building a Knowledge Graph from Scratch by Prompting Humans and a Language Model

Authors: Tatsuya Ide, Eiki Murata, Daisuke Kawahara, Takato Yamazaki, Shengzhe Li, Kenta Shinzato, Toshinori Sato

Abstract: Despite the remarkable progress in natural language understanding with pretrained Transformers, neural language models often do not handle commonsense knowledge well. Toward commonsense-aware models, there have been attempts to obtain knowledge, ranging from automatic acquisition to crowdsourcing. However, it is difficult to obtain a high-quality knowledge base at a low cost, especially from scratch. In this paper, we propose PHALM, a method of building a knowledge graph from scratch, by prompting both crowdworkers and a large language model (LLM). We used this method to build a Japanese event knowledge graph and trained Japanese commonsense generation models. Experimental results revealed the acceptability of the built graph and inferences generated by the trained models. We also report the difference in prompting humans and an LLM. Our code, data, and models are available at github.com/nlp-waseda/comet-atomic-ja.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2308.04708 [pdf, other]

doi 10.1145/3580305.3599365

Generative Perturbation Analysis for Probabilistic Black-Box Anomaly Attribution

Authors: Tsuyoshi Idé, Naoki Abe

Abstract: We address the task of probabilistic anomaly attribution in the black-box regression setting, where the goal is to compute the probability distribution of the attribution score of each input variable, given an observed anomaly. The training dataset is assumed to be unavailable. This task differs from the standard XAI (explainable AI) scenario, since we wish to explain the anomalous deviation from a black-box prediction rather than the black-box model itself. We begin by showing that mainstream model-agnostic explanation methods, such as the Shapley values, are not suitable for this task because of their ``deviation-agnostic property.'' We then propose a novel framework for probabilistic anomaly attribution that allows us to not only compute attribution scores as the predictive mean but also quantify the uncertainty of those scores. This is done by considering a generative process for perturbations that counter-factually bring the observed anomalous observation back to normalcy. We introduce a variational Bayes algorithm for deriving the distributions of per variable attribution scores. To the best of our knowledge, this is the first probabilistic anomaly attribution framework that is free from being deviation-agnostic.

Submitted 9 August, 2023; originally announced August 2023.

Journal ref: KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2023, pp.845-856

arXiv:2305.18440 [pdf, other]

Black-Box Anomaly Attribution

Authors: Tsuyoshi Idé, Naoki Abe

Abstract: When the prediction of a black-box machine learning model deviates from the true observation, what can be said about the reason behind that deviation? This is a fundamental and ubiquitous question that the end user in a business or industrial AI application often asks. The deviation may be due to a sub-optimal black-box model, or it may be simply because the sample in question is an outlier. In either case, one would ideally wish to obtain some form of attribution score -- a value indicative of the extent to which an input variable is responsible for the anomaly. In the present paper we address this task of ``anomaly attribution,'' particularly in the setting in which the model is black-box and the training data are not available. Specifically, we propose a novel likelihood-based attribution framework we call the ``likelihood compensation (LC),'' in which the responsibility score is equated with the correction on each input variable needed to attain the highest possible likelihood. We begin by showing formally why mainstream model-agnostic explanation methods, such as the local linear surrogate modeling and Shapley values, are not designed to explain anomalies. In particular, we show that they are ``deviation-agnostic,'' namely, that their explanations are blind to the fact that there is a deviation in the model prediction for the sample of interest. We do this by positioning these existing methods under the unified umbrella of a function family we call the ``integrated gradient family.'' We validate the effectiveness of the proposed LC approach using publicly available data sets. We also conduct a case study with a real-world building energy prediction task and confirm its usefulness in practice based on expert feedback.

Submitted 28 May, 2023; originally announced May 2023.

arXiv:2305.17149 [pdf, other]

doi 10.1016/j.knosys.2023.110639

Diagnostic Spatio-temporal Transformer with Faithful Encoding

Authors: Jokin Labaien, Tsuyoshi Idé, Pin-Yu Chen, Ekhi Zugasti, Xabier De Carlos

Abstract: This paper addresses the task of anomaly diagnosis when the underlying data generation process has a complex spatio-temporal (ST) dependency. The key technical challenge is to extract actionable insights from the dependency tensor characterizing high-order interactions among temporal and spatial indices. We formalize the problem as supervised dependency discovery, where the ST dependency is learned as a side product of multivariate time-series classification. We show that temporal positional encoding used in existing ST transformer works has a serious limitation in capturing higher frequencies (short time scales). We propose a new positional encoding with a theoretical guarantee, based on discrete Fourier transform. We also propose a new ST dependency discovery framework, which can provide readily consumable diagnostic information in both spatial and temporal directions. Finally, we demonstrate the utility of the proposed model, DFStrans (Diagnostic Fourier-based Spatio-temporal Transformer), in a real industrial application of building elevator control.

Submitted 26 May, 2023; originally announced May 2023.

arXiv:2212.00576 [pdf, other]

Quantum Neural Networks for a Supply Chain Logistics Application

Authors: Randall Correll, Sean J. Weinberg, Fabio Sanches, Takanori Ide, Takafumi Suzuki

Abstract: Problem instances of a size suitable for practical applications are not likely to be addressed during the noisy intermediate-scale quantum (NISQ) period with (almost) pure quantum algorithms. Hybrid classical-quantum algorithms have potential, however, to achieve good performance on much larger problem instances. We investigate one such hybrid algorithm on a problem of substantial importance: vehicle routing for supply chain logistics with multiple trucks and complex demand structure. We use reinforcement learning with neural networks with embedded quantum circuits. In such neural networks, projecting high-dimensional feature vectors down to smaller vectors is necessary to accommodate restrictions on the number of qubits of NISQ hardware. However, we use a multi-head attention mechanism where, even in classical machine learning, such projections are natural and desirable. We consider data from the truck routing logistics of a company in the automotive sector, and apply our methodology by decomposing into small teams of trucks, and we find results comparable to human truck assignment.

Submitted 2 December, 2022; v1 submitted 30 November, 2022; originally announced December 2022.

Comments: 14 pages, 11 figures. arXiv admin note: text overlap with arXiv:2211.17078 - updated citation [3] to reference arXiv:2211.17078

arXiv:2211.17078 [pdf, other]

Reinforcement Learning for Multi-Truck Vehicle Routing Problems

Authors: Randall Correll, Sean J. Weinberg, Fabio Sanches, Takanori Ide, Takafumi Suzuki

Abstract: Vehicle routing problems and other combinatorial optimization problems have been approximately solved by reinforcement learning agents with policies based on encoder-decoder models with attention mechanisms. These techniques are of substantial interest but still cannot solve the complex routing problems that arise in a realistic setting which can have many trucks and complex requirements. With the aim of making reinforcement learning a viable technique for supply chain optimization, we develop new extensions to encoder-decoder models for vehicle routing that allow for complex supply chains using classical computing today and quantum computing in the future. We make two major generalizations. First, our model allows for routing problems with multiple trucks. Second, we move away from the simple requirement of having a truck deliver items from nodes to one special depot node, and instead allow for a complex tensor demand structure. We show how our model, even if trained only for a small number of trucks, can be embedded into a large supply chain to yield viable solutions.

Submitted 10 December, 2022; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: 18 pages, 7 figures v2 updates citations [13] and [14]

arXiv:2208.10679 [pdf, other]

doi 10.1609/aaai.v35i5.16535

Anomaly Attribution with Likelihood Compensation

Authors: Tsuyoshi Idé, Amit Dhurandhar, Jiří Navrátil, Moninder Singh, Naoki Abe

Abstract: This paper addresses the task of explaining anomalous predictions of a black-box regression model. When using a black-box model, such as one to predict building energy consumption from many sensor measurements, we often have a situation where some observed samples may significantly deviate from their prediction. It may be due to a sub-optimal black-box model, or simply because those samples are outliers. In either case, one would ideally want to compute a ``responsibility score'' indicative of the extent to which an input variable is responsible for the anomalous output. In this work, we formalize this task as a statistical inverse problem: Given model deviation from the expected value, infer the responsibility score of each of the input variables. We propose a new method called likelihood compensation (LC), which is founded on the likelihood principle and computes a correction to each input variable. To the best of our knowledge, this is the first principled framework that computes a responsibility score for real valued anomalous model deviations. We apply our approach to a real-world building energy prediction task and confirm its utility based on expert feedback.

Submitted 22 August, 2022; originally announced August 2022.

Comments: 8 pages, 7 figures

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4131-4138, 2021

arXiv:2208.10674 [pdf, other]

Decentralized Collaborative Learning with Probabilistic Data Protection

Authors: Tsuyoshi Idé, Rudy Raymond

Abstract: We discuss future directions of Blockchain as a collaborative value co-creation platform, in which network participants can gain extra insights that cannot be accessed when disconnected from the others. As such, we propose a decentralized machine learning framework that is carefully designed to respect the values of democracy, diversity, and privacy. Specifically, we propose a federated multi-task learning framework that integrates a privacy-preserving dynamic consensus algorithm. We show that a specific network topology called the expander graph dramatically improves the scalability of global consensus building. We conclude the paper by making some remarks on open problems.

Submitted 23 August, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

Comments: Tsuyoshi Idé and Rudy Raymond, "Decentralized Collaborative Learning with Probabilistic Data Protection," In Proceedings of the 2021 IEEE International Conference on Smart Data Services (SMDS 21, September 5-10, 2021, virtual), pp.234-243

arXiv:2208.10671 [pdf, other]

Cardinality-Regularized Hawkes-Granger Model

Authors: Tsuyoshi Idé, Georgios Kollias, Dzung T. Phan, Naoki Abe

Abstract: We propose a new sparse Granger-causal learning framework for temporal event data. We focus on a specific class of point processes called the Hawkes process. We begin by pointing out that most of the existing sparse causal learning algorithms for the Hawkes process suffer from a singularity in maximum likelihood estimation. As a result, their sparse solutions can appear only as numerical artifacts. In this paper, we propose a mathematically well-defined sparse causal learning framework based on a cardinality-regularized Hawkes process, which remedies the pathological issues of existing approaches. We leverage the proposed algorithm for the task of instance-wise causal event analysis, where sparsity plays a critical role. We validate the proposed framework with two real use-cases, one from the power grid and the other from the cloud data center management domain.

Submitted 22 August, 2022; originally announced August 2022.

Comments: 17 pages, 9 figures

arXiv:2208.10627 [pdf, other]

Targeted Advertising on Social Networks Using Online Variational Tensor Regression

Authors: Tsuyoshi Idé, Keerthiram Murugesan, Djallel Bouneffouf, Naoki Abe

Abstract: This paper is concerned with online targeted advertising on social networks. The main technical task we address is to estimate the activation probability for user pairs, which quantifies the influence one user may have on another towards purchasing decisions. This is a challenging task because one marketing episode typically involves a multitude of marketing campaigns/strategies of different products for highly diverse customers. In this paper, we propose what we believe is the first tensor-based contextual bandit framework for online targeted advertising. The proposed framework is designed to accommodate any number of feature vectors in the form of multi-mode tensor, thereby enabling to capture the heterogeneity that may exist over user preferences, products, and campaign strategies in a unified manner. To handle inter-dependency of tensor modes, we introduce an online variational algorithm with a mean-field approximation. We empirically confirm that the proposed TensorUCB algorithm achieves a significant improvement in influence maximization tasks over the benchmarks, which is attributable to its capability of capturing the user-product heterogeneity.

Submitted 9 October, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

Comments: 18 pages, 7 figures

MSC Class: 68T05

arXiv:2205.11867 [pdf, other]

Building a Dialogue Corpus Annotated with Expressed and Experienced Emotions

Authors: Tatsuya Ide, Daisuke Kawahara

Abstract: In communication, a human would recognize the emotion of an interlocutor and respond with an appropriate emotion, such as empathy and comfort. Toward developing a dialogue system with such a human-like ability, we propose a method to build a dialogue corpus annotated with two kinds of emotions. We collect dialogues from Twitter and annotate each utterance with the emotion that a speaker put into the utterance (expressed emotion) and the emotion that a listener felt after listening to the utterance (experienced emotion). We built a dialogue corpus in Japanese using this method, and its statistical analysis revealed the differences between expressed and experienced emotions. We conducted experiments on recognition of the two kinds of emotions. The experimental results indicated the difficulty in recognizing experienced emotions and the effectiveness of multi-task learning of the two kinds of emotions. We hope that the constructed corpus will facilitate the study on emotion recognition in a dialogue and emotion-aware dialogue response generation.

Submitted 24 May, 2022; originally announced May 2022.

Comments: ACL Student Research Workshop (SRW) 2022

arXiv:2205.04435 [pdf, other]

Supply Chain Logistics with Quantum and Classical Annealing Algorithms

Authors: Sean J. Weinberg, Fabio Sanches, Takanori Ide, Kazumitzu Kamiya, Randall Correll

Abstract: Noisy intermediate-scale quantum (NISQ) hardware is almost universally incompatible with full-scale optimization problems of practical importance which can have many variables and unwieldy objective functions. As a consequence, there is a growing body of literature that tests quantum algorithms on miniaturized versions of problems that arise in an operations research setting. Rather than taking this approach, we investigate a problem of substantial commercial value, multi-truck vehicle routing for supply chain logistics, at the scale used by a corporation in their operations. Such a problem is too complex to be fully embedded on any near-term quantum hardware or simulator; we avoid confronting this challenge by taking a hybrid workflow approach: we iteratively assign routes for trucks by generating a new binary optimization problem instance one truck at a time. Each instance has $\sim 2500$ quadratic binary variables, putting it in a range that is feasible for NISQ quantum computing, especially quantum annealing hardware. We test our methods using simulated annealing and the D-Wave Hybrid solver as a place-holder in wait of quantum hardware developments. After feeding the vehicle routes suggested by these runs into a highly realistic classical supply chain simulation, we find excellent performance for the full supply chain. Our work gives a set of techniques that can be adopted in contexts beyond vehicle routing to apply NISQ devices in a hybrid fashion to large-scale problems of commercial interest.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 16 pages, 8 figures

arXiv:2202.12449 [pdf, other]

Directed Graph Auto-Encoders

Authors: Georgios Kollias, Vasileios Kalantzis, Tsuyoshi Idé, Aurélie Lozano, Naoki Abe

Abstract: We introduce a new class of auto-encoders for directed graphs, motivated by a direct extension of the Weisfeiler-Leman algorithm to pairs of node labels. The proposed model learns pairs of interpretable latent representations for the nodes of directed graphs, and uses parameterized graph convolutional network (GCN) layers for its encoder and an asymmetric inner product decoder. Parameters in the encoder control the weighting of representations exchanged between neighboring nodes. We demonstrate the ability of the proposed model to learn meaningful latent embeddings and achieve superior performance on the directed link prediction task on several popular network datasets.

Submitted 24 February, 2022; originally announced February 2022.

Comments: AAAI 2022

arXiv:2109.07498 [pdf, other]

doi 10.1103/PhysRevA.105.062403

Short Quantum Circuits in Reinforcement Learning Policies for the Vehicle Routing Problem

Authors: Fabio Sanches, Sean Weinberg, Takanori Ide, Kazumitsu Kamiya

Abstract: Quantum computing and machine learning have potential for symbiosis. However, in addition to the hardware limitations from current devices, there are still basic issues that must be addressed before quantum circuits can be usefully incorporated into current machine learning tasks. We report a new strategy for such an integration in the context of attention models used for reinforcement learning. Agents that implement attention mechanisms have successfully been applied to certain cases of combinatorial routing problems by first encoding nodes on a graph and then sequentially decoding nodes until a route is selected. We demonstrate that simple quantum circuits can be used in place of classical attention head layers while maintaining performance. Our method modifies the networks used in [1] by replacing key and query vectors for every node with quantum states that are entangled before being measured. The resulting hybrid classical-quantum agent is tested in the context of vehicle routing problems where its performance is competitive with the original classical approach. We regard our model as a prototype that can be scaled up and as an avenue for further study on the role of quantum computing in reinforcement learning.

Submitted 15 September, 2021; originally announced September 2021.

Comments: 15 pages, 9 figures

arXiv:2105.11696 [pdf, other]

Multi-Task Learning of Generation and Classification for Emotion-Aware Dialogue Response Generation

Authors: Tatsuya Ide, Daisuke Kawahara

Abstract: For a computer to naturally interact with a human, it needs to be human-like. In this paper, we propose a neural response generation model with multi-task learning of generation and classification, focusing on emotion. Our model based on BART (Lewis et al., 2020), a pre-trained transformer encoder-decoder model, is trained to generate responses and recognize emotions simultaneously. Furthermore, we weight the losses for the tasks to control the update of parameters. Automatic evaluations and crowdsourced manual evaluations show that the proposed model makes generated responses more emotionally aware.

Submitted 25 May, 2021; originally announced May 2021.

Comments: NAACL Student Research Workshop (SRW) 2021

arXiv:1907.03310 [pdf]

doi 10.1016/j.carbon.2019.06.038

Influence of interface dipole layers on the performance of graphene field effect transistors

Authors: Naoka Nagamura, Hirokazu Fukidome, Kosuke Nagashio, Koji Horiba, Takayuki Ide, Kazutoshi Funakubo, Keiichiro Tashima, Akira Toriumi, Maki Suemitsu, Karsten Horn, Masaharu Oshima

Abstract: The linear band dispersion of graphene's bands near the Fermi level gives rise to its unique electronic properties, such as a giant carrier mobility, and this has triggered extensive research in applications, such as graphene field-effect transistors (GFETs). However, GFETs generally exhibit a device performance much inferior compared to the expected one. This has been attributed to a strong dependence of the electronic properties of graphene on the surrounding interfaces. Here we study the interface between a graphene channel and SiO$_{2}$, and by means of photoelectron spectromicroscopy achieve a detailed determination of the course of band alignment at the interface. Our results show that the electronic properties of graphene are modulated by a hydrophilic SiO$_{2}$ surface, but not by a hydrophobic one. By combining photoelectron spectromicroscopy with GFET transport property characterization, we demonstrate that the presence of electrical dipoles in the interface, which reflects the SiO$_{2}$ surface electrochemistry, determines the GFET device performance. A hysteresis in the resistance vs. gate voltage as a function of polarity is ascribed to a reversal of the dipole layer by the gate voltage. These data pave the way for GFET device optimization.

Submitted 7 July, 2019; originally announced July 2019.

Comments: 29 pages, 7 figures

Journal ref: Carbon 152 680-687 (2019)

arXiv:1011.0891 [pdf, ps, other]

Experiments to investigate the effects of radiative cooling on plasma jet collimation

Authors: C. D. Gregory, A. Diziere, H. Aoki, M. Besio, S. Bouquet, E. Falize, T. Ide, B. Loupias, C. Michaut, T. Morita, S. A. Pikuz Jr., A. Ravasio, Y. Kuramtisu, Y. Sakawa, H. Takabe, H. Tanji, N. C. Woolsey, M. Koenig

Abstract: Preliminary experiments have been performed to investigate the effects of radiative cooling on plasma jets. Thin (3 um - 5 um) conical shells were irradiated with an intense laser, driving jets with velocities > 100 km/s. Through use of different target materials - aluminium, copper and gold - the degree of radiative losses was altered, and their importance for jet collimation investigated. A number of temporally resolved optical diagnostics were used, providing information about the jet evolution. Gold jets were seen to be narrower than those from copper targets, while aluminium targets produced the least collimated flows.

Submitted 3 November, 2010; originally announced November 2010.

Comments: Presented at the 8th High Energy Density Laboratory Astrophysics conference, March 15th - 18th, Caltech, CA, USA

arXiv:0803.2787 [pdf, ps, other]

doi 10.1103/PhysRevB.77.134419

Successive phase transitions to antiferromagnetic and weak-ferromagnetic long-range orders in quasi-one-dimensional antiferromagnet Cu$_3$Mo$_2$O$_9$

Authors: Tomoaki Hamasaki, Tomoyuki Ide, Haruhiko Kuroe, Tomoyuki Sekine, Masashi Hase, Ichiro Tsukada, Toshiro Sakakibara

Abstract: Investigation of the magnetism of Cu$_3$Mo$_2$O$_9$ single crystal, which has antiferromagnetic (AF) linear chains interacting with AF dimers, reveals an AF second-order phase transition at $T_{\rm N} = 7.9$ K. Although weak ferromagnetic-like behavior appears at lower temperatures in low magnetic fields, complete remanent magnetization cannot be detected down to 0.5 K. However, a jump is observed in the magnetization below the weak ferromagnetic (WF) phase transition at $T_{\rm c} \simeq 2.5$ K when a tiny magnetic field along the a axis is reversed, suggesting that the coercive force is very weak. A component of magnetic moment parallel to the chain forms AF long-range order (LRO) below $T_{\rm N}$, while a perpendicular component is disordered above $T_{\rm c}$ at zero magnetic field and forms WF-LRO below $T_{\rm c}$. Moreover, the WF-LRO is also realized by applying magnetic fields even between $T_{\rm c}$ and $T_{\rm N}$. These results are explainable by both magnetic frustration among symmetric exchange interactions and competition between symmetric and asymmetric Dzyaloshinskii-Moriya exchange interactions.

Submitted 19 March, 2008; originally announced March 2008.

Comments: 7 pages, 7 figures

arXiv:quant-ph/0702204 [pdf, ps, other]

doi 10.1103/PhysRevA.75.062311

Accidental cloning of a single-photon qubit in two-channel continuous-variable quantum teleportation

Authors: Toshiki Ide, Holger F. Hofmann

Abstract: The information encoded in the polarization of a single photon can be transferred to a remote location by two-channel continuous-variable quantum teleportation. However, the finite entanglement used in the teleportation causes random changes in photon number. If more than one photon appears in the output, the continuous-variable teleportation accidentally produces clones of the original input photon. In this paper, we derive the polarization statistics of the $N$-photon output components and show that they can be decomposed into an optimal cloning term and completely unpolarized noise. We find that the accidental cloning of the input photon is nearly optimal at experimentally feasible squeezing levels, indicating that the loss of polarization information is partially compensated by the availability of clones.

Submitted 5 April, 2007; v1 submitted 22 February, 2007; originally announced February 2007.

Comments: 9 pages, 4 figures, improved explanation of cloning fidelity

Journal ref: Phys. Rev. A 75, 062311 (2007)

arXiv:quant-ph/0512002 [pdf, ps, other]

doi 10.1088/1367-2630/8/8/130

Optimal cloning of single photon polarization by coherent feedback of beam splitter losses

Authors: Holger F. Hofmann, Toshiki Ide

Abstract: Light fields can be amplified by measuring the field amplitude reflected at a beam splitter of reflectivity R and adding a coherent amplitude proportional to the measurement result to the transmitted field. By applying the quantum optical realization of this amplification scheme to single photon inputs, it is possible to clone the polarization states of photons. We show that optimal cloning of single photon polarization is possible when the gain factor of the amplification is equal to the inverse square root of 1-R.

Submitted 30 July, 2006; v1 submitted 30 November, 2005; originally announced December 2005.

Comments: 10 pages, including 1 figure, extended from letter to full paper, to be published in New Journal of Physics

Journal ref: New J. Phys. 8 (2006) 130

arXiv:quant-ph/0511220 [pdf, ps, other]

Transfer of single photon polarization states by two-channel continuous variable teleportation

Authors: Toshiki Ide, Holger F. Hofmann

Abstract: Superpositions of two orthogonal single-photon polarization states are commonly used as optical qubits. If such qubits are sent by continuous variable quantum teleportation, the modifications of the qubit states due to imperfect entanglement cause an increase in the average photon number of the output state. This effect can be interpreted as an accidental quantum cloning of the single photon input. We analyze the output statistics of the single photon teleportation and derive the transfer and cloning fidelities from the equations of the polarization qubit.

Submitted 22 November, 2005; originally announced November 2005.

Comments: 4 pages, 4 figures, The 13th Quantum Information Technology Symposium (QIT13)

arXiv:quant-ph/0112018 [pdf, ps, other]

Continuous variable teleportation of single photon states (Proceedings version)

Authors: Toshiki Ide, Holger F. Hofmann, Takayoshi Kobayashi, Akira Furusawa

Abstract: We investigate the changes to a single photon state caused by the non-maximal entanglement in continuous variable quantum teleportation. It is shown that the teleportation measurement introduces field coherence in the output.

Submitted 3 December, 2001; originally announced December 2001.

Comments: 5 pages, 2 figures, Proceedings for ISQM-Tokyo'01

arXiv:quant-ph/0111127 [pdf, ps, other]

doi 10.1103/PhysRevA.65.062303

Gain tuning and fidelity in continuous variable quantum teleportation

Authors: Toshiki Ide, Holger F. Hofmann, Akira Furusawa, Takayoshi Kobayashi

Abstract: The fidelity of continuous variable teleportation can be optimized by changing the gain in the modulation of the output field. We discuss the gain dependence of fidelity for coherent, vacuum and one photon inputs and propose optimal gain tuning strategies for corresponding input selections.

Submitted 28 March, 2002; v1 submitted 24 November, 2001; originally announced November 2001.

Comments: 23 pages, 10 figures

Journal ref: Phys. Rev. A 65, 062303 (2002)

arXiv:quant-ph/0110127 [pdf, ps, other]

doi 10.1142/9789812776716_0013

Information extraction and quantum state distortions in continuous variable quantum teleportation

Authors: Holger F. Hofmann, Toshiki Ide, Takayoshi Kobayashi, Akira Furusawa

Abstract: We analyze the loss of fidelity in continuous variable teleportation due to non-maximal entanglement. It is shown that the quantum state distortions correspond to the measurement back-action of a field amplitude measurement. Results for coherent states and for photon number states are presented.

Submitted 22 October, 2001; originally announced October 2001.

Comments: 4 pages Latex, contribution to the proceedings of the ISQM'01 conference held August 27th to 30th in Tokyo

arXiv:quant-ph/0104014 [pdf, ps, other]

doi 10.1103/PhysRevA.65.012313

Continuous variable teleportation of single photon states

Authors: Toshiki Ide, Holger F. Hofmann, Takayoshi Kobayashi, Akira Furusawa

Abstract: The properties of continuous variable teleportation of single photon states are investigated. The output state is different from the input state due to the non-maximal entanglement in the EPR beams. The photon statistics of the teleportation output are determined and the correlation between the field information beta obtained in the teleportation process and the change in photon number is discussed. The results of the output photon statistics are applied to the transmission of a qubit encoded in the polarization of a single photon.

Submitted 25 July, 2001; v1 submitted 3 April, 2001; originally announced April 2001.

Comments: 14 pages, including 6 figures

Journal ref: Phys. Rev. A 65, 012313 (2002)

arXiv:quant-ph/0102097 [pdf, ps, other]

doi 10.1103/PhysRevA.64.040301

Information losses in continuous variable quantum teleportation

Authors: Holger F. Hofmann, Toshiki Ide, Takayoshi Kobayashi, Akira Furusawa

Abstract: It is shown that the information losses due to the limited fidelity of continuous variable quantum teleportation are equivalent to the losses induced by a beam splitter of appropriate reflectivity.

Submitted 5 July, 2001; v1 submitted 20 February, 2001; originally announced February 2001.

Comments: 7 pages, including one figure, added references and clarifications

Journal ref: Phys. Rev. A 64, 040301 (2001)

arXiv:quant-ph/0003053 [pdf, ps, other]

doi 10.1103/PhysRevA.62.062304

Fidelity and information in the quantum teleportation of continuous variables

Authors: Holger F. Hofmann, Toshiki Ide, Takayoshi Kobayashi, Akira Furusawa

Abstract: Ideally, quantum teleportation should transfer a quantum state without distortion and without providing any information about that state. However, quantum teleportation of continuous electromagnetic field variables introduces additional noise, limiting the fidelity of the quantum state transfer. In this article, the operator describing the quantum state transfer is derived. The transfer operator modifies the probability amplitudes of the quantum state in a shifted photon number base by enhancing low photon numbers and suppressing high photon numbers. This modification of the statistical weight corresponds to a measurement of finite resolution performed on the original quantum state. The limited fidelity of quantum teleportation is thus shown to be a direct consequence of the information obtained in the measurement.

Submitted 6 June, 2000; v1 submitted 15 March, 2000; originally announced March 2000.

Comments: 10 pages, including one figure, minor clarifications and added reference

Journal ref: Phys. Rev. A 62, 062304 (2000)

Showing 1–32 of 32 results for author: Ide, T