One Stone, Four Birds: A Comprehensive Solution for QA System Using Supervised Contrastive Learning

Abstract: This paper presents a novel and comprehensive solution to enhance both the robustness and efficiency of question answering (QA) systems through supervised contrastive learning (SCL). Training a high-performance QA system has become straightforward with pre-trained language models, requiring only a small amount of data and simple fine-tuning. However, despite recent advances, existing QA systems still exhibit significant deficiencies in functionality and training efficiency. We address the functionality issue by defining four key tasks: user input intent classification, out-of-domain input detection, new intent discovery, and continual learning. We then leverage a unified SCL-based representation learning method to efficiently build an intra-class compact and inter-class scattered feature space, facilitating both known intent classification and unknown intent detection and discovery. Consequently, with minimal additional tuning on downstream tasks, our approach significantly improves model efficiency and achieves new state-of-the-art performance across all tasks.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 14 pages, under review

arXiv:2406.12032 [pdf, other]

Balancing Embedding Spectrum for Recommendation

Authors: Shaowen Peng, Kazunari Sugiyama, Xin Liu, Tsunenori Mine

Abstract: Modern recommender systems heavily rely on high-quality representations learned from high-dimensional sparse data. While significant efforts have been invested in designing powerful algorithms for extracting user preferences, the factors contributing to good representations have remained relatively unexplored. In this work, we shed light on an issue in the existing pair-wise learning paradigm (i.e., the embedding collapse problem), that the representations tend to span a subspace of the whole embedding space, leading to a suboptimal solution and reducing the model capacity. Specifically, optimization on observed interactions is equivalent to a low pass filter causing users/items to have the same representations and resulting in a complete collapse. While negative sampling acts as a high pass filter to alleviate the collapse by balancing the embedding spectrum, its effectiveness is only limited to certain losses, which still leads to an incomplete collapse. To tackle this issue, we propose a novel method called DirectSpec, acting as a reliable all pass filter to balance the spectrum distribution of the embeddings during training, ensuring that users/items effectively span the entire embedding space. Additionally, we provide a thorough analysis of DirectSpec from a decorrelation perspective and propose an enhanced variant, DirectSpec+, which employs self-paced gradients to optimize irrelevant samples more effectively. Moreover, we establish a close connection between DirectSpec+ and uniformity, demonstrating that contrastive learning (CL) can alleviate the collapse issue by indirectly balancing the spectrum. Finally, we implement DirectSpec and DirectSpec+ on two popular recommender models: MF and LightGCN. Our experimental results demonstrate its effectiveness and efficiency over competitive baselines.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.08827 [pdf, other]

How Powerful is Graph Filtering for Recommendation

Authors: Shaowen Peng, Xin Liu, Kazunari Sugiyama, Tsunenori Mine

Abstract: It has been shown that the effectiveness of graph convolutional network (GCN) for recommendation is attributed to the spectral graph filtering. Most GCN-based methods consist of a graph filter or followed by a low-rank mapping optimized based on supervised training. However, we show two limitations suppressing the power of graph filtering: (1) Lack of generality. Due to the varied noise distribution, graph filters fail to denoise sparse data where noise is scattered across all frequencies, while supervised training results in worse performance on dense data where noise is concentrated in middle frequencies that can be removed by graph filters without training. (2) Lack of expressive power. We theoretically show that linear GCN (LGCN) that is effective on collaborative filtering (CF) cannot generate arbitrary embeddings, implying the possibility that optimal data representation might be unreachable. To tackle the first limitation, we show close relation between noise distribution and the sharpness of spectrum where a sharper spectral distribution is more desirable causing data noise to be separable from important features without training. Based on this observation, we propose a generalized graph normalization G^2N to adjust the sharpness of spectral distribution in order to redistribute data noise to assure that it can be removed by graph filtering without training. As for the second limitation, we propose an individualized graph filter (IGF) adapting to the different confidence levels of the user preference that interactions can reflect, which is proved to be able to generate arbitrary embeddings. By simplifying LGCN, we further propose a simplified graph filtering (SGFCF) which only requires the top-K singular values for recommendation. Finally, experimental results on four datasets with different density settings demonstrate the effectiveness and efficiency of our proposed methods.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted to KDD'24

arXiv:2406.02256 [pdf, ps, other]

Integrated density of states for the Poisson point interactions on $\mathbf{R}^3$

Authors: Masahiro Kaminaga, Takuya Mine, Fumihiko Nakano

Abstract: We determine the principal term of the asymptotics of the integrated density of states (IDS) $N(λ)$ for the Schrödinger operator with point interactions on $\mathbf{R}^3$ as $λ\to -\infty$, provided that the set of positions of the point obstacles is the Poisson configuration, and the interaction parameters are bounded i.i.d.\ random variables. In particular, we prove $N(λ) =O(|λ|^{-3/2})$ as $λ\to -\infty$. In the case that all interaction parameters are equal to a constant, we give a more detailed asymptotics of $N(λ)$, and verify the result by a numerical method using R.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 30 pages, 4 figures

arXiv:2403.07345 [pdf, ps, other]

The transition operator of a random walk perturbated by sparse potentials

Authors: Takuya Mine, Nobuo Yoshida

Abstract: We consider an operator $P_V=(1+V)P$ on $\ell^2(Z^d)$, where $P$ is the transition operator of a symmetric irreducible random walk, and $V$ is a ``sparse'' potential. We first characterize the essential spectra of this operator. Secondly, we prove that all the eigenfunctions which correspond to discrete spectra decay exponentially fast. Thirdly, we give a sufficient condition for this operator to have an absolute spectral gap at the right edge of the spectra. Finally, as an application of the absolute spectral gap and the exponential decay of the eigenfunctions, we prove a limit theorem for the random walk under the Gibbs measure associated to the potential $V$.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2401.04423 [pdf, other]

Privacy-Preserving Sequential Recommendation with Collaborative Confusion

Authors: Wei Wang, Yujie Lin, Pengjie Ren, Zhumin Chen, Tsunenori Mine, Jianli Zhao, Qiang Zhao, Moyan Zhang, Xianye Ben, Yujun Li

Abstract: Sequential recommendation has attracted a lot of attention from both academia and industry; however, the privacy risks associated with gathering and transferring users' personal interaction data are often underestimated or ignored. Existing privacy-preserving studies are mainly applied to traditional collaborative filtering or matrix factorization rather than sequential recommendation. Moreover, these studies are mostly based on differential privacy or federated learning, which often leads to significant performance degradation, or has high requirements for communication. In this work, we address privacy-preserving from a different perspective. Unlike existing research, we capture collaborative signals of neighbor interaction sequences and directly inject indistinguishable items into the target sequence before the recommendation process begins, thereby increasing the perplexity of the target sequence. Even if the target interaction sequence is obtained by attackers, it is difficult to discern which ones are the actual user interaction records. To achieve this goal, we propose a CoLlaborative-cOnfusion seqUential recommenDer, namely CLOUD, which incorporates a collaborative confusion mechanism to edit the raw interaction sequences before conducting recommendation. Specifically, CLOUD first calculates the similarity between the target interaction sequence and other neighbor sequences to find similar sequences. Then, CLOUD considers the shared representation of the target sequence and similar sequences to determine the operation to be performed: keep, delete, or insert. We design a copy mechanism to make items from similar sequences have a higher probability to be inserted into the target sequence. Finally, the modified sequence is used to train the recommender and predict the next item.

Submitted 9 January, 2024; originally announced January 2024.

arXiv:2308.05271 [pdf, other]

Single electron routing in a silicon quantum-dot array

Authors: Takeru Utsugi, Takuma Kuno, Noriyuki Lee, Ryuta Tsuchiya, Toshiyuki Mine, Digh Hisamoto, Shinichi Saito, Hiroyuki Mizuno

Abstract: The ability to transport single electrons on a quantum dot array dramatically increases the freedom in designing quantum computation schemes that can be implemented on solid-state devices. So far, however, routing schemes to precisely control the transport paths of single electrons have yet to be established. Here, we propose a silicon single-electron router that transports pumped electrons along the desired route on the branches of a T-shaped quantum dot array by inputting a synchronous phase-controlled signal to multiple gates. Notably, we show that it is possible to achieve a routing accuracy above 99% by implementing assist gates in front of the branching paths. The results suggest new possibilities for fast and accurate transport of single electrons on quantum dot arrays.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 10 pages, 14 figures

arXiv:2302.06135 [pdf]

doi 10.35848/1882-0786/acc3dc

Electron Charge Sensor with Hole Current Operating at Cryogenic Temperature

Authors: Digh Hisamoto, Noriyuki Lee, Ryuta Tsuchiya, Toshiyuki Mine, Takeru Utsugi, Shinichi Saito, Hiroyuki Mizuno

Abstract: When SOI-PMOS functions like a capacitor-less 1T-DRAM cell, it is possible for the number of electrons to be sensed at cryogenic temperatures (5K). We developed a structure that combines SOI-NMOS and SOI-PMOS with multiple gates to form a silicon quantum-dot array. In this structure, a variable number of electrons is injected into the SOI-PMOS body transporting them by means of the bucket-brigade operation of SOI-NMOS connected in series. The channel-hole current was changed by the injected electrons due to the body bias effect in SOI-PMOS, and the change appeared to be step-like, suggesting a dependence on the elementary charge.

Submitted 13 February, 2023; originally announced February 2023.

Comments: 16 pages, 4 figures

arXiv:2208.12689 [pdf, other]

SVD-GCN: A Simplified Graph Convolution Paradigm for Recommendation

Authors: Shaowen Peng, Kazunari Sugiyama, Tsunenori Mine

Abstract: With the tremendous success of Graph Convolutional Networks (GCNs), they have been widely applied to recommender systems and have shown promising performance. However, most GCN-based methods rigorously stick to a common GCN learning paradigm and suffer from two limitations: (1) the limited scalability due to the high computational cost and slow training convergence; (2) the notorious over-smoothing issue which reduces performance as stacking graph convolution layers. We argue that the above limitations are due to the lack of a deep understanding of GCN-based methods. To this end, we first investigate what design makes GCN effective for recommendation. By simplifying LightGCN, we show the close connection between GCN-based and low-rank methods such as Singular Value Decomposition (SVD) and Matrix Factorization (MF), where stacking graph convolution layers is to learn a low-rank representation by emphasizing (suppressing) components with larger (smaller) singular values. Based on this observation, we replace the core design of GCN-based methods with a flexible truncated SVD and propose a simplified GCN learning paradigm dubbed SVD-GCN, which only exploits $K$-largest singular vectors for recommendation. To alleviate the over-smoothing issue, we propose a renormalization trick to adjust the singular value gap, resulting in significant improvement. Extensive experiments on three real-world datasets show that our proposed SVD-GCN not only significantly outperforms state-of-the-arts but also achieves over 100x and 10x speedups over LightGCN and MF, respectively.

Submitted 3 September, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

Comments: Accepted by CIKM'22

arXiv:2204.11346 [pdf, other]

Less is More: Reweighting Important Spectral Graph Features for Recommendation

Authors: Shaowen Peng, Kazunari Sugiyama, Tsunenori Mine

Abstract: As much as Graph Convolutional Networks (GCNs) have shown tremendous success in recommender systems and collaborative filtering (CF), the mechanism of how they, especially the core components (i.e., neighborhood aggregation) contribute to recommendation has not been well studied. To unveil the effectiveness of GCNs for recommendation, we first analyze them in a spectral perspective and discover two important findings: (1) only a small portion of spectral graph features that emphasize the neighborhood smoothness and difference contribute to the recommendation accuracy, whereas most graph information can be considered as noise that even reduces the performance, and (2) repetition of the neighborhood aggregation emphasizes smoothed features and filters out noise information in an ineffective way. Based on the two findings above, we propose a new GCN learning scheme for recommendation by replacing neighborhood aggregation with a simple yet effective Graph Denoising Encoder (GDE), which acts as a band pass filter to capture important graph features. We show that our proposed method alleviates the over-smoothing and is comparable to an indefinite-layer GCN that can take any-hop neighborhood into consideration. Finally, we dynamically adjust the gradients over the negative samples to expedite model training without introducing additional complexity. Extensive experiments on five real-world datasets show that our proposed method not only outperforms state-of-the-arts but also achieves 12x speedup over LightGCN.

Submitted 24 April, 2022; originally announced April 2022.

Comments: Accepted by SIGIR '22

arXiv:2004.14734 [pdf, other]

A Robust Hierarchical Graph Convolutional Network Model for Collaborative Filtering

Authors: Shaowen Peng, Tsunenori Mine

Abstract: Graph Convolutional Network (GCN) has achieved great success and has been applied in various fields including recommender systems. However, GCN still suffers from many issues such as training difficulties, over-smoothing, vulnerability to adversarial attacks, etc. Distinct from current GCN-based methods which simply employ GCN for recommendation, in this paper we are committed to build a robust GCN model for collaborative filtering. Firstly, we argue that recursively incorporating messages from different order neighborhood mixes distinct node messages indistinguishably, which increases the training difficulty; instead we choose to separately aggregate different order neighbor messages with a simple GCN model which has been shown effective; then we accumulate them together in a hierarchical way without introducing additional model parameters. Secondly, we propose a solution to alleviate over-smoothing by randomly dropping out neighbor messages at each layer, which also well prevents over-fitting and enhances the robustness. Extensive experiments on three real-world datasets demonstrate the effectiveness and robustness of our model.

Submitted 30 April, 2020; originally announced April 2020.

arXiv:1906.00206 [pdf, ps, other]

doi 10.1007/s00023-019-00869-1

A self-adjointness criterion for the Schrödinger operator with infinitely many point interactions and its application to random operators

Authors: Masahiro Kaminaga, Takuya Mine, Fumihiko Nakano

Abstract: We prove the Schrödinger operator with infinitely many point interactions in $\mathbb{R}^d$ $(d=1,2,3)$ is self-adjoint if the support of the interactions is decomposed into uniformly discrete clusters. Using this fact, we prove the self-adjointness of the Schrödinger operator with point interactions on a random perturbation of a lattice or on the Poisson configuration. We also determine the spectrum of the Schrödinger operators with random point interactions of Poisson--Anderson type.

Submitted 1 June, 2019; originally announced June 2019.

Comments: 32 pages, 11 figures

arXiv:1604.01573 [pdf, ps, other]

doi 10.1007/s00023-017-0559-0

Schrödinger operators with random $δ$ magnetic fields

Authors: Takuya Mine, Yuji Nomura

Abstract: We shall consider the Schrödinger operators on $\mathbf{R}^2$ with random $δ$ magnetic fields. Under some mild conditions on the positions and the fluxes of the $δ$-fields, we prove the spectrum coincides with $[0,\infty)$ and the integrated density of states (IDS) decays exponentially at the bottom of the spectrum (Lifshitz tail), by using the Hardy type inequality by Laptev-Weidl. We also give a lower bound for IDS at the bottom of the spectrum.

Submitted 19 January, 2017; v1 submitted 6 April, 2016; originally announced April 2016.

arXiv:1603.00084 [pdf, ps, other]

Quantum diffusion in the Kronig-Penney model

Authors: Masahiro Kaminaga, Takuya Mine

Abstract: In this paper we consider the 1D Schrödinger operator $H$ with periodic point interactions. We show an $L^1-L^\infty$ bound for the time evolution operator $e^{-itH}$ restricted to each energy band with decay order $O(t^{-1/3})$ as $t\to \infty$, which comes from some kind of resonant state. The order $O(t^{-1/3})$ is optimal for our model. We also give an asymptotic bound for the coefficient in the high energy limit. For the proof, we give an asymptotic analysis for the band functions and the Bloch waves in the high energy limit. Especially we give the asymptotics for the inflection points in the graphs of band functions, which is crucial for the asymptotics of the coefficient in our estimate.

Submitted 2 March, 2016; v1 submitted 29 February, 2016; originally announced March 2016.

Comments: 31 pages, 7 figures

MSC Class: 35J10

arXiv:0805.4305 [pdf]

doi 10.1016/j.ssc.2008.05.010

Nickel-based phosphide superconductor with infinite-layer structure, BaNi2P2

Authors: Takashi Mine, Hiroshi Yanagi, Toshio Kamiya, Yoichi Kamihara, Masahiro Hirano, Hideo Hosono

Abstract: Analogous to cuprate high-Tc superconductors, a NiP-based compound system has several crystals in which the Ni-P layers have different stacking structures. Herein, the properties of BaNi2P2 are reported. BaNi2P2 has an infinite-layer structure, and shows a superconducting transition at ~3 K. Moreover, it exhibits metallic conduction and Pauli paramagnetism in the temperature range of 4-300 K. Below 3 K, the resistivity sharply drops to zero, and the magnetic susceptibility becomes negative, while the volume fraction of the superconducting phase estimated from the diamagnetic susceptibility reaches ~100 vol.% at 1.9 K. These observations substantiate that BaNi2P2 is a bulk superconductor.

Submitted 28 May, 2008; originally announced May 2008.

Comments: 9 pages, 4 figures, Solid State Communications, in press. Received 4 March 2008. Accepted 2 May 2008. Available online 14 May 2008

Journal ref: Solid State Comm. 147, 111 (2008)

Showing 1–15 of 15 results for author: Mine, T