Boosting the Performance of Transformer Architectures for Semantic Textual Similarity

Abstract: Semantic textual similarity is the task of estimating the similarity between the meaning of two texts. In this paper, we fine-tune transformer architectures for semantic textual similarity on the Semantic Textual Similarity Benchmark by tuning the model partially and then end-to-end. We experiment with BERT, RoBERTa, and DeBERTaV3 cross-encoders by approaching the problem as a binary classification task or a regression task. We combine the outputs of the transformer models and use handmade features as inputs for boosting algorithms. Due to worse test set results coupled with improvements on the validation set, we experiment with different dataset splits to further investigate this occurrence. We also provide an error analysis, focused on the edges of the prediction range.

Submitted 1 June, 2023; originally announced June 2023.

Comments: 6 pages, 2 figures, 12 tables

ACM Class: I.2.7

arXiv:1912.04825 [pdf, other]

doi 10.1109/TNNLS.2020.3017010

Integration of Neural Network-Based Symbolic Regression in Deep Learning for Scientific Discovery

Authors: Samuel Kim, Peter Y. Lu, Srijon Mukherjee, Michael Gilbert, Li Jing, Vladimir Čeperić, Marin Soljačić

Abstract: Symbolic regression is a powerful technique that can discover analytical equations that describe data, which can lead to explainable models and generalizability outside of the training data set. In contrast, neural networks have achieved amazing levels of accuracy on image recognition and natural language processing tasks, but are often seen as black-box models that are difficult to interpret and typically extrapolate poorly. Here we use a neural network-based architecture for symbolic regression called the Equation Learner (EQL) network and integrate it with other deep learning architectures such that the whole system can be trained end-to-end through backpropagation. To demonstrate the power of such systems, we study their performance on several substantially different tasks. First, we show that the neural network can perform symbolic regression and learn the form of several functions. Next, we present an MNIST arithmetic task where a separate part of the neural network extracts the digits. Finally, we demonstrate prediction of dynamical systems where an unknown parameter is extracted through an encoder. We find that the EQL-based architecture can extrapolate quite well outside of the training data set compared to a standard neural network-based architecture, paving the way for deep learning to be applied in scientific exploration and discovery.

Submitted 13 August, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

Comments: 12 pages, 10 figures

Journal ref: IEEE.Trans.Neural.Netw.Learn.Syst. 32 (2021) 4166-4177

arXiv:1909.13877 [pdf, other]

A Recurrent Ising Machine in a Photonic Integrated Circuit

Authors: Mihika Prabhu, Charles Roques-Carmes, Yichen Shen, Nicholas Harris, Li Jing, Jacques Carolan, Ryan Hamerly, Tom Baehr-Jones, Michael Hochberg, Vladimir Čeperić, John D. Joannopoulos, Dirk R. Englund, Marin Soljačić

Abstract: Conventional computing architectures have no known efficient algorithms for combinatorial optimization tasks, which are encountered in fundamental areas and real-world practical problems including logistics, social networks, and cryptography. Physical machines have recently been proposed and implemented as an alternative to conventional exact and heuristic solvers for the Ising problem, one such optimization task that requires finding the ground state spin configuration of an arbitrary Ising graph. However, these physical approaches usually suffer from decreased ground state convergence probability or universality for high edge-density graphs or arbitrary graph weights, respectively. We experimentally demonstrate a proof-of-principle integrated nanophotonic recurrent Ising sampler (INPRIS) capable of converging to the ground state of various 4-spin graphs with high probability. The INPRIS exploits experimental physical noise as a resource to speed up the ground state search. By injecting additional extrinsic noise during the algorithm iterations, the INPRIS explores larger regions of the phase space, thus allowing one to probe noise-dependent physical observables. Since the recurrent photonic transformation that our machine imparts is a fixed function of the graph problem, and could thus be implemented with optoelectronic architectures that enable GHz clock rates (such as passive or non-volatile photonic circuits that do not require reprogramming at each iteration), our work paves a way for orders-of-magnitude speedups in exploring the solution space of combinatorially hard problems.

Submitted 30 September, 2019; originally announced September 2019.

arXiv:1811.02705 [pdf, other]

doi 10.1038/s41467-019-14096-z

Heuristic Recurrent Algorithms for Photonic Ising Machines

Authors: Charles Roques-Carmes, Yichen Shen, Cristian Zanoci, Mihika Prabhu, Fadi Atieh, Li Jing, Tena Dubcek, Chenkai Mao, Miles R. Johnson, Vladimir Ceperic, John D. Joannopoulos, Dirk Englund, Marin Soljacic

Abstract: The inability of conventional electronic architectures to efficiently solve large combinatorial problems motivates the development of novel computational hardware. There has been much effort recently toward developing novel, application-specific hardware, across many different fields of engineering, such as integrated circuits, memristors, and photonics. However, unleashing the true potential of such novel architectures requires the development of featured algorithms which optimally exploit their fundamental properties. We here present the Photonic Recurrent Ising Sampler (PRIS), a heuristic method tailored for parallel architectures that allows for fast and efficient sampling from distributions of combinatorially hard Ising problems. Since the PRIS relies essentially on vector-to-fixed matrix multiplications, we suggest the implementation of the PRIS in photonic parallel networks, which realize these operations at an unprecedented speed. The PRIS provides sample solutions to the ground state of arbitrary Ising models, by converging in probability to their associated Gibbs distribution. By running the PRIS at various noise levels, we probe the critical behavior of universality classes and their critical exponents. In addition to the attractive features of photonic networks, the PRIS relies on intrinsic dynamic noise and eigenvalue dropout to find ground states more efficiently. Our work suggests speedups in heuristic methods via photonic implementations of the PRIS. We also hint at a broader class of (meta)heuristic algorithms derived from the PRIS, such as combined simulated annealing on the noise and eigenvalue dropout levels. Our algorithm can also be implemented in a competitive manner on fast parallel electronic hardware, such as FPGAs and ASICs.

Submitted 19 November, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

Comments: Main text: 10 pages, 4 figures; Supplementary Information: 33 pages, 16 figures

Journal ref: Nature Communications 11, 249 (2020)

arXiv:1808.03303 [pdf, other]

On-Chip Optical Convolutional Neural Networks

Authors: Hengameh Bagherian, Scott Skirlo, Yichen Shen, Huaiyu Meng, Vladimir Ceperic, Marin Soljacic

Abstract: Convolutional Neural Networks (CNNs) are a class of Artificial Neural Networks (ANNs) that employ the method of convolving input images with filter-kernels for object recognition and classification purposes. In this paper, we propose a photonics circuit architecture which could consume a fraction of energy per inference compared with state-of-the-art electronics.

Submitted 16 August, 2018; v1 submitted 9 August, 2018; originally announced August 2018.

Comments: 18 pages, 7 figures

Showing 1–5 of 5 results for author: Ceperic, V