Showing 1–39 of 39 results for author: Soljačić, M

Searching in archive cs.
  1. arXiv:2406.06576  [pdf, other]

    cs.CL cs.AI cs.LG

    OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step

    Authors: Owen Dugan, Donato Manuel Jimenez Beneto, Charlotte Loh, Zhuo Chen, Rumen Dangovski, Marin Soljačić

    Abstract: Despite significant advancements in text generation and reasoning, Large Language Models (LLMs) still face challenges in accurately performing complex arithmetic operations. To achieve accurate calculations, language model systems often enable LLMs to generate code for arithmetic operations. However, this approach compromises speed and security and, if finetuning is involved, risks the language mo…

    Submitted 29 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2406.04000  [pdf, other]

    physics.optics cs.ET

    Stochastic logic in biased coupled photonic probabilistic bits

    Authors: Michael Horodynski, Charles Roques-Carmes, Yannick Salamin, Seou Choi, Jamison Sloan, Di Luo, Marin Soljačić

    Abstract: Optical computing often employs tailor-made hardware to implement specific algorithms, trading generality for improved performance in key aspects like speed and power efficiency. An important computing approach that is still missing its corresponding optical hardware is probabilistic computing, used e.g. for solving difficult combinatorial optimization problems. In this study, we propose an experi…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2406.00132  [pdf, other]

    cs.LG quant-ph

    QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation

    Authors: Zhuo Chen, Rumen Dangovski, Charlotte Loh, Owen Dugan, Di Luo, Marin Soljačić

    Abstract: We propose Quantum-informed Tensor Adaptation (QuanTA), a novel, easy-to-implement, fine-tuning method with no inference overhead for large-scale pre-trained language models. By leveraging quantum-inspired methods derived from quantum circuit structures, QuanTA enables efficient high-rank fine-tuning, surpassing the limitations of Low-Rank Adaptation (LoRA)--low-rank approximation may fail for com…

    Submitted 31 May, 2024; originally announced June 2024.

  4. arXiv:2404.19756  [pdf, other]

    cs.LG cond-mat.dis-nn cs.AI stat.ML

    KAN: Kolmogorov-Arnold Networks

    Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark

    Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametriz…

    Submitted 16 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 48 pages, 20 figures. Codes are available at https://github.com/KindXiaoming/pykan

  5. arXiv:2404.10771  [pdf, other]

    cs.LG math.NA physics.comp-ph

    TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision

    Authors: Zhuo Chen, Jacob McCarran, Esteban Vizcaino, Marin Soljačić, Di Luo

    Abstract: Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the $\textit{Time-Evolving Natural Gradient (TENG)}$, generalizing time-dependent var…

    Submitted 3 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Report number: MIT-CTP/5706

  6. arXiv:2312.00111  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Multimodal Learning for Materials

    Authors: Viggo Moro, Charlotte Loh, Rumen Dangovski, Ali Ghorashi, Andrew Ma, Zhuo Chen, Samuel Kim, Peter Y. Lu, Thomas Christensen, Marin Soljačić

    Abstract: Artificial intelligence is transforming computational materials science, improving the prediction of material properties, and accelerating the discovery of novel materials. Recently, publicly available material data repositories have grown rapidly. This growth encompasses not only more materials, but also a greater variety and quantity of their associated properties. Existing machine learning effo…

    Submitted 12 April, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

    Comments: 11 pages, 4 figures

  7. arXiv:2304.01996  [pdf, other]

    quant-ph cond-mat.str-el cs.LG physics.comp-ph

    ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation

    Authors: Zhuo Chen, Laker Newhouse, Eddie Chen, Di Luo, Marin Soljačić

    Abstract: Quantum many-body physics simulation has important impacts on understanding fundamental science and has applications to quantum materials design and quantum technology. However, due to the exponentially growing size of the Hilbert space with respect to the particle number, a direct simulation is intractable. While representing quantum states with tensor networks and neural networks are the two sta…

    Submitted 16 April, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Report number: MIT-CTP/5549

  8. arXiv:2303.11277  [pdf, other]

    cs.LG

    Model Stitching: Looking For Functional Similarity Between Representations

    Authors: Adriano Hernandez, Rumen Dangovski, Peter Y. Lu, Marin Soljacic

    Abstract: Model stitching (Lenc & Vedaldi 2015) is a compelling methodology to compare different neural network representations, because it allows us to measure to what degree they may be interchanged. We expand on a previous work from Bansal, Nakkiran & Barak which used model stitching to compare representations of the same shapes learned by differently seeded and/or trained neural networks of the same arc…

    Submitted 31 August, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 5 pages, 2 figures

  9. arXiv:2303.02484  [pdf, other]

    cs.LG cs.AI cs.CV

    Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

    Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski, Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava

    Abstract: Deep ensembles (DE) have been successful in improving model performance by learning diverse members via the stochasticity of random initialization. While recent works have attempted to promote further diversity in DE via hyperparameters or regularizing loss functions, these methods primarily still rely on a stochastic approach to explore the hypothesis space. In this work, we present Multi-Symmetr…

    Submitted 19 June, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Camera Ready Revision. ICML 2023

  10. arXiv:2302.12235  [pdf, other]

    quant-ph cond-mat.dis-nn cond-mat.quant-gas cs.LG physics.comp-ph

    Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows

    Authors: Owen Dugan, Peter Y. Lu, Rumen Dangovski, Di Luo, Marin Soljačić

    Abstract: Studying the dynamics of open quantum systems can enable breakthroughs both in fundamental physics and applications to quantum engineering and quantum computation. Since the density matrix $ρ$, which is the fundamental description for the dynamics of such systems, is high-dimensional, customized deep generative neural networks have been instrumental in modeling $ρ$. However, the complex-valued nat…

    Submitted 6 June, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Report number: MIT-CTP/5533

  11. arXiv:2302.03019  [pdf, other]

    cs.RO cond-mat.stat-mech physics.bio-ph physics.comp-ph

    Geometry of contact: contact planning for multi-legged robots via spin models duality

    Authors: Baxi Chong, Di Luo, Tianyu Wang, Gabriel Margolis, Juntao He, Pulkit Agrawal, Marin Soljačić, Daniel I. Goldman

    Abstract: Contact planning is crucial in locomoting systems. Specifically, appropriate contact planning can enable versatile behaviors (e.g., sidewinding in limbless locomotors) and facilitate speed-dependent gait transitions (e.g., walk-trot-gallop in quadrupedal locomotors). The challenges of contact planning include determining not only the sequence by which contact is made and broken between the locomoto…

    Submitted 7 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: SI video: https://doi.org/10.5281/zenodo.7608693

    Report number: MIT-CTP/5526

  12. arXiv:2211.01365  [pdf, other]

    quant-ph cs.AI cs.LG math.OC physics.comp-ph

    QuACK: Accelerating Gradient-Based Quantum Optimization with Koopman Operator Learning

    Authors: Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić

    Abstract: Quantum optimization, a key application of quantum computing, has traditionally been stymied by the linearly increasing complexity of gradient calculations with an increasing number of parameters. This work bridges the gap between Koopman operator theory, which has found utility in applications because it allows for a linear representation of nonlinear dynamical systems, and natural gradient metho…

    Submitted 4 May, 2024; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Advances in Neural Information Processing Systems 36 (NeurIPS 2023) spotlight

    Report number: MIT-CTP/5488

  13. arXiv:2210.06171  [pdf, other]

    cs.LG

    Learning to Optimize Quasi-Newton Methods

    Authors: Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

    Abstract: Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step, but it is unclear what the best preconditioner matrix is. This paper introduces a novel machine learning optimizer called LODO, which tries to online meta-learn t…

    Submitted 11 September, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    ACM Class: I.2.6

  14. arXiv:2210.04783  [pdf, other]

    cs.LG cs.CV physics.app-ph

    On the Importance of Calibration in Semi-supervised Learning

    Authors: Charlotte Loh, Rumen Dangovski, Shivchander Sudalairaj, Seungwook Han, Ligong Han, Leonid Karlinsky, Marin Soljacic, Akash Srivastava

    Abstract: State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data by combining techniques of consistency regularization and pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and thus, model calibration is important in mitigating confirmation bias. Yet, many SOTA methods…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: 24 pages

  15. arXiv:2210.00563  [pdf, other]

    cs.SC cs.LG econ.EM

    AI-Assisted Discovery of Quantitative and Formal Models in Social Science

    Authors: Julia Balla, Sihao Huang, Owen Dugan, Rumen Dangovski, Marin Soljacic

    Abstract: In social science, formal and quantitative models, such as ones describing economic growth and collective action, are used to formulate mechanistic explanations, provide predictions, and uncover questions about observed phenomena. Here, we demonstrate the use of a machine learning system to aid the discovery of symbolic models that capture nonlinear and dynamical relationships in social science da…

    Submitted 16 August, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: 19 pages, 4 figures

  16. arXiv:2208.14995  [pdf, other]

    physics.comp-ph cs.LG nlin.CD nlin.SI physics.data-an

    Discovering Conservation Laws using Optimal Transport and Manifold Learning

    Authors: Peter Y. Lu, Rumen Dangovski, Marin Soljačić

    Abstract: Conservation laws are key theoretical and practical tools for understanding, characterizing, and modeling nonlinear dynamical systems. However, for many complex systems, the corresponding conserved quantities are difficult to identify, making it hard to analyze their dynamics and build stable predictive models. Current approaches for discovering conservation laws often depend on detailed dynamical…

    Submitted 22 August, 2023; v1 submitted 31 August, 2022; originally announced August 2022.

    Comments: 30 pages, 15 figures (7 main text, 8 supplemental), 3 tables (supplemental)

    Journal ref: Nat. Commun. 14, 4744 (2023)

  17. arXiv:2207.00529  [pdf, other]

    cs.LG cs.SC physics.comp-ph physics.data-an

    Deep Learning and Symbolic Regression for Discovering Parametric Equations

    Authors: Michael Zhang, Samuel Kim, Peter Y. Lu, Marin Soljačić

    Abstract: Symbolic regression is a machine learning technique that can learn the governing formulas of data and thus has the potential to transform scientific discovery. However, symbolic regression is still limited in the complexity and dimensionality of the systems that it can analyze. Deep learning on the other hand has transformed machine learning in its ability to analyze extremely complex and high-dim…

    Submitted 28 May, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: Michael Zhang and Samuel Kim contributed equally to this work. 13 pages, 7 figures

  18. arXiv:2204.10298  [pdf, other]

    cs.CL

    DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings

    Authors: Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljačić, Shang-Wen Li, Wen-tau Yih, Yoon Kim, James Glass

    Abstract: We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings. DiffCSE learns sentence embeddings that are sensitive to the difference between the original sentence and an edited sentence, where the edited sentence is obtained by stochastically masking out the original sentence and then sampling from a masked language model. We show that DiffCSE is an instance…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 main conference (Long paper). Pretrained models and code are available at https://github.com/voidism/DiffCSE

  19. arXiv:2202.05255  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall cs.LG

    Topogivity: A Machine-Learned Chemical Rule for Discovering Topological Materials

    Authors: Andrew Ma, Yang Zhang, Thomas Christensen, Hoi Chun Po, Li Jing, Liang Fu, Marin Soljačić

    Abstract: Topological materials present unconventional electronic properties that make them attractive for both basic science and next-generation technological applications. The majority of currently known topological materials have been discovered using methods that involve symmetry-based analysis of the quantum wavefunction. Here we use machine learning to develop a simple-to-use heuristic chemical rule t…

    Submitted 23 January, 2023; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: Main text: 6 pages, 3 figures; supplementary materials: 43 pages, 62 figures, 5 tables

    Journal ref: Nano Lett. 2023, 23, 3, 772-778

  20. arXiv:2112.11929  [pdf, other]

    cs.CV cs.LG

    Meta-Learning and Self-Supervised Pretraining for Real World Image Translation

    Authors: Ileana Rugina, Rumen Dangovski, Mark Veillette, Pooya Khorrami, Brian Cheung, Olga Simek, Marin Soljačić

    Abstract: Recent advances in deep learning, in particular enabled by hardware advances and big data, have provided impressive results across a wide range of computational problems such as computer vision, natural language, or reinforcement learning. Many of these improvements are however constrained to problems with large-scale curated data-sets which require a lot of human labor to gather. Additionally, th…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 10 pages, 8 figures, 2 tables

  21. arXiv:2111.00899  [pdf, other]

    cs.CV cs.LG eess.IV physics.app-ph

    Equivariant Contrastive Learning

    Authors: Rumen Dangovski, Li Jing, Charlotte Loh, Seungwook Han, Akash Srivastava, Brian Cheung, Pulkit Agrawal, Marin Soljačić

    Abstract: In state-of-the-art self-supervised learning (SSL) pre-training produces semantically good representations by encouraging them to be invariant under meaningful transformations prescribed from human knowledge. In fact, the property of invariance is a trivial instance of a broader class called equivariance, which can be intuitively understood as the property that representations transform according…

    Submitted 14 March, 2022; v1 submitted 28 October, 2021; originally announced November 2021.

    Comments: Camera Ready Revision. ICLR 2022. Discussion: https://openreview.net/forum?id=gKLAAfiytI Code: https://github.com/rdangovs/essl

  22. arXiv:2110.08406  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.app-ph physics.optics

    Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science

    Authors: Charlotte Loh, Thomas Christensen, Rumen Dangovski, Samuel Kim, Marin Soljacic

    Abstract: Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labelled data needed to train the model; this poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Here, we…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: 21 pages, 10 figures

  23. arXiv:2107.10879  [pdf, other]

    cs.LG physics.comp-ph physics.data-an

    Discovering Sparse Interpretable Dynamics from Partial Observations

    Authors: Peter Y. Lu, Joan Ariño, Marin Soljačić

    Abstract: Identifying the governing equations of a nonlinear dynamical system is key to both understanding the physical features of the system and constructing an accurate model of the dynamics that generalizes well beyond the available data. We propose a machine learning framework for discovering these governing equations using only partial observations, combining an encoder for state reconstruction with a…

    Submitted 15 December, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: 10 pages, 6 figures (4 main text, 2 supplemental)

    Journal ref: Commun. Phys. 5, 206 (2022)

  24. arXiv:2104.11667  [pdf, other]

    cs.LG physics.app-ph physics.chem-ph physics.comp-ph physics.optics

    Deep Learning for Bayesian Optimization of Scientific Problems with High-Dimensional Structure

    Authors: Samuel Kim, Peter Y. Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, Marin Soljačić

    Abstract: Bayesian optimization (BO) is a popular paradigm for global optimization of expensive black-box functions, but there are many domains where the function is not completely a black-box. The data may have some known structure (e.g. symmetries) and/or the data generation process may be a composite process that yields useful intermediate or auxiliary information in addition to the value of the optimiza…

    Submitted 6 December, 2022; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: 32 pages, 16 figures; published in TMLR

    Journal ref: Transactions on Machine Learning Research (TMLR) September 2022

  25. arXiv:2012.02030  [pdf, other]

    cs.CL

    Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks

    Authors: Ileana Rugina, Rumen Dangovski, Li Jing, Preslav Nakov, Marin Soljačić

    Abstract: Attention mechanisms play a crucial role in the neural revolution of Natural Language Processing (NLP). With the growth of attention-based models, several pruning techniques have been developed to identify and exploit sparseness, making these models more efficient. Most efforts focus on hard-coding attention patterns or pruning attention weights based on training data. We propose Attention Pruning…

    Submitted 17 May, 2024; v1 submitted 20 November, 2020; originally announced December 2020.

    Comments: Presented at LREC-COLING 2024: 12 pages, 4 figures, 11 tables

  26. arXiv:2007.10784  [pdf, other]

    cs.LG cs.NE stat.ML

    OccamNet: A Fast Neural Model for Symbolic Regression at Scale

    Authors: Owen Dugan, Rumen Dangovski, Allan Costa, Samuel Kim, Pawan Goyal, Joseph Jacobson, Marin Soljačić

    Abstract: Neural networks' expressiveness comes at the cost of complex, black-box models that often extrapolate poorly beyond the domain of the training dataset, conflicting with the goal of finding compact analytic expressions to describe scientific data. We introduce OccamNet, a neural network model that finds interpretable, compact, and sparse symbolic fits to data, à la Occam's razor. Our model defines…

    Submitted 27 November, 2023; v1 submitted 16 July, 2020; originally announced July 2020.

  27. arXiv:2007.10143  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Contextualizing Enhances Gradient Based Meta Learning

    Authors: Evan Vogelbaum, Rumen Dangovski, Li Jing, Marin Soljačić

    Abstract: Meta learning methods have found success when applied to few shot classification problems, in which they quickly adapt to a small number of labeled examples. Prototypical representations, each representing a particular class, have been of particular importance in this setting, as they provide a compact form to convey information learned from the labeled examples. However, these prototypes are just…

    Submitted 17 July, 2020; originally announced July 2020.

  28. arXiv:2007.09456  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    On a Novel Application of Wasserstein-Procrustes for Unsupervised Cross-Lingual Learning

    Authors: Guillem Ramírez, Rumen Dangovski, Preslav Nakov, Marin Soljačić

    Abstract: The emergence of unsupervised word embeddings, pre-trained on very large monolingual text corpora, is at the core of the ongoing neural revolution in Natural Language Processing (NLP). Initially introduced for English, such pre-trained word embeddings quickly emerged for a number of other languages. Subsequently, there have been a number of attempts to align the embedding spaces across languages,…

    Submitted 16 June, 2024; v1 submitted 18 July, 2020; originally announced July 2020.

    Journal ref: Proceedings of the 17th Workshop on Building and Using Comparable Corpora (BUCC) at LREC-COLING 2024

  29. arXiv:1912.04825  [pdf, other]

    cs.LG cs.NE physics.data-an stat.ML

    Integration of Neural Network-Based Symbolic Regression in Deep Learning for Scientific Discovery

    Authors: Samuel Kim, Peter Y. Lu, Srijon Mukherjee, Michael Gilbert, Li Jing, Vladimir Čeperić, Marin Soljačić

    Abstract: Symbolic regression is a powerful technique that can discover analytical equations that describe data, which can lead to explainable models and generalizability outside of the training data set. In contrast, neural networks have achieved amazing levels of accuracy on image recognition and natural language processing tasks, but are often seen as black-box models that are difficult to interpret and…

    Submitted 13 August, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: 12 pages, 10 figures

    Journal ref: IEEE.Trans.Neural.Netw.Learn.Syst. 32 (2021) 4166-4177

  30. arXiv:1907.06011  [pdf, ps, other]

    physics.comp-ph cs.LG physics.data-an stat.ML

    Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning

    Authors: Peter Y. Lu, Samuel Kim, Marin Soljačić

    Abstract: Experimental data is often affected by uncontrolled variables that make analysis and interpretation difficult. For spatiotemporal systems, this problem is further exacerbated by their intricate dynamics. Modern machine learning methods are particularly well-suited for analyzing and modeling complex datasets, but to be effective in science, the result needs to be interpretable. We demonstrate an un…

    Submitted 14 September, 2020; v1 submitted 13 July, 2019; originally announced July 2019.

    Comments: 19 pages, 9 figures, 2 tables

    Journal ref: Phys. Rev. X 10, 031056 (2020)

  31. arXiv:1812.07614  [pdf, other]

    cs.ET physics.optics

    Large-Scale Optical Neural Networks based on Photoelectric Multiplication

    Authors: Ryan Hamerly, Liane Bernstein, Alexander Sludds, Marin Soljačić, Dirk Englund

    Abstract: Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large ($N \gtrsim 10^6$) networks and can be operated at high (GHz) speeds and very low (sub-aJ) energies per multiply-and-accumulate (MAC), using the massive spa…

    Submitted 18 May, 2019; v1 submitted 12 November, 2018; originally announced December 2018.

    Comments: Text: 10 pages, 5 figures, 1 table. Supplementary: 8 pages, 5 figures, 2 tables

    Journal ref: Phys. Rev. X 9, 021032 (2019)

  32. arXiv:1811.11644  [pdf, other]

    cs.LG cs.CV stat.ML

    WaveletNet: Logarithmic Scale Efficient Convolutional Neural Networks for Edge Devices

    Authors: Li Jing, Rumen Dangovski, Marin Soljacic

    Abstract: We present a logarithmic-scale efficient convolutional neural network architecture for edge devices, named WaveletNet. Our model is based on the well-known depthwise convolution, and on two new layers, which we introduce in this work: a wavelet convolution and a depthwise fast wavelet transform. By breaking the symmetry in channel dimensions and applying a fast algorithm, WaveletNet shrinks the co…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: 10 pages, 5 figures

  33. arXiv:1811.02705  [pdf, other]

    physics.app-ph cs.ET physics.optics

    Heuristic Recurrent Algorithms for Photonic Ising Machines

    Authors: Charles Roques-Carmes, Yichen Shen, Cristian Zanoci, Mihika Prabhu, Fadi Atieh, Li Jing, Tena Dubcek, Chenkai Mao, Miles R. Johnson, Vladimir Ceperic, John D. Joannopoulos, Dirk Englund, Marin Soljacic

    Abstract: The inability of conventional electronic architectures to efficiently solve large combinatorial problems motivates the development of novel computational hardware. There has been much effort recently toward developing novel, application-specific hardware, across many different fields of engineering, such as integrated circuits, memristors, and photonics. However, unleashing the true potential of s…

    Submitted 19 November, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: Main text: 10 pages, 4 figures; Supplementary Information: 33 pages, 16 figures

    Journal ref: Nature Communications 11, 249 (2020)

  34. arXiv:1809.00972  [pdf, other]

    cs.CV cs.LG physics.comp-ph

    Migrating Knowledge between Physical Scenarios based on Artificial Neural Networks

    Authors: Yurui Qu, Li Jing, Yichen Shen, Min Qiu, Marin Soljacic

    Abstract: Deep learning is known to be data-hungry, which hinders its application in many areas of science when datasets are small. Here, we propose to use transfer learning methods to migrate knowledge between different physical scenarios and significantly improve the prediction accuracy of artificial neural networks trained on a small dataset. This method can help reduce the demand for expensive data by m…

    Submitted 2 May, 2019; v1 submitted 27 August, 2018; originally announced September 2018.

  35. arXiv:1808.03303  [pdf, other]

    cs.ET

    On-Chip Optical Convolutional Neural Networks

    Authors: Hengameh Bagherian, Scott Skirlo, Yichen Shen, Huaiyu Meng, Vladimir Ceperic, Marin Soljacic

    Abstract: Convolutional Neural Networks (CNNs) are a class of Artificial Neural Networks (ANNs) that employ the method of convolving input images with filter-kernels for object recognition and classification purposes. In this paper, we propose a photonics circuit architecture which could consume a fraction of energy per inference compared with state-of-the-art electronics.

    Submitted 16 August, 2018; v1 submitted 9 August, 2018; originally announced August 2018.

    Comments: 18 pages, 7 figures

  36. arXiv:1710.09537  [pdf, other]

    cs.LG cs.NE stat.ML

    Rotational Unit of Memory

    Authors: Rumen Dangovski, Li Jing, Marin Soljacic

    Abstract: The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNN) to state-of-the-art performance in a variety of sequential tasks. However, RNN still have a limited capacity to manipulate long-term memory. To bypass this weakness the most successful applications of RNN use external techniques such as attention mechanisms. In this paper we…

    Submitted 26 October, 2017; originally announced October 2017.

  37. arXiv:1706.02761  [pdf, other]

    cs.LG cs.NE stat.ML

    Gated Orthogonal Recurrent Units: On Learning to Forget

    Authors: Li Jing, Caglar Gulcehre, John Peurifoy, Yichen Shen, Max Tegmark, Marin Soljačić, Yoshua Bengio

    Abstract: We present a novel recurrent neural network (RNN) based model that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant/irrelevant information in its memory. We achieve this by extending unitary RNNs with a gating mechanism. Our model is able to outperform LSTMs, GRUs and Unitary RNNs on several long-term dependency benchmark tasks. We emp…

    Submitted 25 October, 2017; v1 submitted 8 June, 2017; originally announced June 2017.

  38. arXiv:1612.05231  [pdf, other]

    cs.LG cs.NE stat.ML

    Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

    Authors: Li Jing, Yichen Shen, Tena Dubček, John Peurifoy, Scott Skirlo, Yann LeCun, Max Tegmark, Marin Soljačić

    Abstract: Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Networ…

    Submitted 3 April, 2017; v1 submitted 15 December, 2016; originally announced December 2016.

    Comments: 9 pages, 4 figures

  39. arXiv:1408.6915  [pdf, ps, other]

    cs.CV cs.IT

    Binary matrices of optimal autocorrelations as alignment marks

    Authors: Scott A. Skirlo, Ling Lu, Marin Soljačić

    Abstract: We define a new class of binary matrices by maximizing the peak-sidelobe distances in the aperiodic autocorrelations. These matrices can be used as robust position marks for in-plane spatial alignment. The optimal square matrices of dimensions up to 7 by 7 and optimal diagonally-symmetric matrices of 8 by 8 and 9 by 9 were found by exhaustive searches.

    Submitted 28 August, 2014; originally announced August 2014.

    Comments: 8 pages, 6 figures and 1 table