-
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Authors:
Taishi Nakamura,
Mayank Mishra,
Simone Tedeschi,
Yekun Chai,
Jason T Stillerman,
Felix Friedrich,
Prateek Yadav,
Tanmay Laud,
Vu Minh Chien,
Terry Yue Zhuo,
Diganta Misra,
Ben Bogin,
Xuan-Son Vu,
Marzena Karpinska,
Arnav Varma Dantuluri,
Wojciech Kusa,
Tommaso Furlanello,
Rio Yokota,
Niklas Muennighoff,
Suhas Pai,
Tosin Adewumi,
Veronika Laippala,
Xiaozhe Yao,
Adalberto Junior,
Alpay Ariyak
, et al. (20 additional authors not shown)
Abstract:
Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, whereas pretraining from scratch is computationally expensive, and compliance with AI safety and development laws. This paper presents Aurora-M, a 15B parameter multilingual open-source model trained on English, Finnish, Hindi, Japanese, Vietnamese, and code. Continually pretrained from StarCoderPlus on 435 billion additional tokens, Aurora-M surpasses 2 trillion tokens in total training token count. It is the first open-source multilingual model fine-tuned on human-reviewed safety instructions, thus aligning its development not only with conventional red-teaming considerations, but also with the specific concerns articulated in the Biden-Harris Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Aurora-M is rigorously evaluated across various tasks and languages, demonstrating robustness against catastrophic forgetting and outperforming alternatives in multilingual settings, particularly in safety evaluations. To promote responsible open-source LLM development, Aurora-M and its variants are released at https://huggingface.co/collections/aurora-m/aurora-m-models-65fdfdff62471e09812f5407 .
Submitted 23 April, 2024; v1 submitted 30 March, 2024;
originally announced April 2024.
-
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Authors:
Shivalika Singh,
Freddie Vargus,
Daniel Dsouza,
Börje F. Karlsson,
Abinaya Mahendiran,
Wei-Yin Ko,
Herumb Shandilya,
Jay Patel,
Deividas Mataciunas,
Laura OMahony,
Mike Zhang,
Ramith Hettiarachchi,
Joseph Wilson,
Marina Machado,
Luisa Souza Moura,
Dominik Krzemiński,
Hakimeh Fadaei,
Irem Ergün,
Ifeoma Okoh,
Aisha Alaagib,
Oshan Mudannayake,
Zaid Alyafeai,
Vu Minh Chien,
Sebastian Ruder,
Surya Guthikonda
, et al. (8 additional authors not shown)
Abstract:
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many recent achievements in the space of natural language processing (NLP) can be attributed to the finetuning of pre-trained models on a diverse set of tasks that enables a large language model (LLM) to respond to instructions. Instruction fine-tuning (IFT) requires specifically constructed and annotated datasets. However, existing datasets are almost all in the English language. In this work, our primary goal is to bridge the language gap by building a human-curated instruction-following dataset spanning 65 languages. We worked with fluent speakers of languages from around the world to collect natural instances of instructions and completions. Furthermore, we create the most extensive multilingual collection to date, comprising 513 million instances through templating and translating existing datasets across 114 languages. In total, we contribute four key resources: we develop and open-source the Aya Annotation Platform, the Aya Dataset, the Aya Collection, and the Aya Evaluation Suite. The Aya initiative also serves as a valuable case study in participatory research, involving collaborators from 119 countries. We see this as a valuable framework for future research collaborations that aim to bridge gaps in resources.
Submitted 9 February, 2024;
originally announced February 2024.
-
Numerical ranges of cyclic shift matrices
Authors:
Mao-Ting Chien,
Steve Kirkland,
Chi-Kwong Li,
Hiroshi Nakazato
Abstract:
We study the numerical range of an $n\times n$ cyclic shift matrix, which can be viewed as the adjacency matrix of a directed cycle with $n$ weighted arcs. In particular, we consider the change in the numerical range if the weights are rearranged or perturbed. In addition to obtaining some general results on the problem, a permutation of the given weights is identified such that the corresponding matrix yields the largest numerical range (in terms of set inclusion), for $n \le 6$. We conjecture that the maximizing pattern extends to general $n\times n$ cyclic shift matrices. For $n \le 5$, we also determine permutations such that the corresponding cyclic shift matrix yields the smallest numerical range.
Submitted 29 August, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Analysis of carbon content in direct-write plasmonic Au structures by nanomechanical scanning absorption microscopy
Authors:
Miao-Hsuan Chien,
Mostafa M. Shawrav,
Kurt Hingerl,
Philipp Taus,
Markus Schinnerl,
Heinz D. Wanzenboeck,
Silvan Schmid
Abstract:
The determination of the chemical content is crucial for quality control in high-precision device fabrication and advanced process development. For reliable chemical composition characterization, a certain interaction volume of the target material is necessary for conventional techniques such as energy-dispersive X-ray spectroscopy (EDX) and electron energy loss spectroscopy (EELS). This, however, remains a challenge for nanostructures. We hereby propose an alternative technique for measuring the chemical composition of nanostructures with limited volume. By measuring the differences in the optical absorption of the nanostructure due to the differences in the chemical composition, using the resonance frequency detuning of a nanomechanical resonator together with analytical optical modelling, we demonstrate the possibility of characterizing the carbon content in direct-write focused electron beam induced deposition (FEBID) gold nanostructures.
Submitted 22 October, 2020;
originally announced October 2020.
-
Thermal transport and frequency response of localized modes on low-stress nanomechanical silicon nitride drums featuring a phononic bandgap structure
Authors:
Pedram Sadeghi,
Manuel Tanzer,
Niklas Luhmann,
Markus Piller,
Miao-Hsuan Chien,
Silvan Schmid
Abstract:
Development of broadband thermal sensors for the detection of, among others, radiation, single nanoparticles, or single molecules is of great interest. In recent years, photothermal spectroscopy based on the shift of the resonance frequency of stressed nanomechanical resonators has been successfully demonstrated. Here, we show the application of soft-clamped phononic crystal membranes made of silicon nitride as thermal sensors. It is experimentally demonstrated how a quasi-bandgap remains even at very low tensile stress, in agreement with finite element method simulations. An increase of the relative responsivity of the fundamental defect mode is found when compared to that of uniform square membranes of equal size, with enhancement factors as large as an order of magnitude. We then show phononic crystals engineered inside nanomechanical trampolines, which results in additional reduction of the tensile stress and increased thermal isolation, resulting in further enhancement of the responsivity. Finally, defect mode and bandgap tuning is shown by laser heating of the defect to the point where the fundamental defect mode completely leaves the bandgap.
Submitted 21 May, 2020;
originally announced May 2020.
-
Towards nanoelectromechanical photothermal localization microscopy
Authors:
Miao-Hsuan Chien,
Silvan Schmid
Abstract:
Single-molecule microscopy has become an indispensable tool for biochemical analysis. The capability of characterizing distinct properties of individual molecules without averaging has provided us with a different perspective for the existing scientific issues and phenomena. Recently, the super-resolution fluorescence microscopy techniques have overcome the optical diffraction limit by the localization of molecule positions. However, the labelling process can potentially modify the intermolecular dynamics. Based on the highly-sensitive nanomechanical photothermal microscopy reported previously, we propose optimizations on this label-free microscopy technique towards localization microscopy. A localization precision of 3 Å is achieved with gold nanoparticles, and the detection of polarization-dependent absorption is demonstrated, which opens the door for further improvement with polarization modulation imaging.
Submitted 20 May, 2020;
originally announced May 2020.
-
A nanoelectromechanical position-sensitive detector with picometer resolution
Authors:
Miao-Hsuan Chien,
Johannes Steurer,
Pedram Sadeghi,
Nicolas Cazier,
Silvan Schmid
Abstract:
Sub-nanometer displacement detection lays the solid foundation for critical applications in modern metrology. In-plane displacement sensing, however, is mainly dominated by the detection of differential photocurrent signals from photodiodes, with resolution in the nanometer range. Here, we present an integrated in-plane displacement sensor based on a nanoelectromechanical trampoline resonator. With a position resolution of 4 pm/Hz$^{1/2}$ for a low laser power of 85 μW and a repeatability of 2 nm after 5 cycles of operation as well as good long-term stability, this new detection principle provides a reliable alternative for overcoming the current position detection limit in a wide variety of research and application fields.
Submitted 22 April, 2020;
originally announced April 2020.
-
Spectrally Broadband Electro-Optic Modulation with Nanoelectromechanical String Resonators
Authors:
Nicolas Cazier,
Pedram Sadeghi,
Miao-Hsuan Chien,
Mostafa Moonir Shawrav,
Silvan Schmid
Abstract:
In this paper, we present an electro-optical modulator made of two parallel nanoelectromechanical silicon nitride string resonators. These strings are covered with electrically connected gold electrodes and actuated either by Lorentz or electrostatic forces. The in-plane string vibrations modulate the width of the gap between the strings. The gold electrodes on both sides of the gap act as a mobile mirror that modulates the laser light that is focused in the middle of this gap. These electro-optical modulators can achieve an optical modulation depth of almost 100% for a driving voltage lower than 1 mV at a frequency of 314 kHz. The frequency range is determined by the string resonance frequency, which can take values of the order of a few hundred kilohertz to several megahertz. The strings are driven in the strongly nonlinear regime, which allows a frequency tuning of several kilohertz without significant effect on the optical modulation depth.
Submitted 19 December, 2019;
originally announced December 2019.
-
Ultrathin 2 nm gold as ideal impedance-matched absorber for infrared light
Authors:
Niklas Luhmann,
Dennis Høj,
Markus Piller,
Hendrik Kähler,
Miao-Hsuan Chien,
Robert G. West,
Ulrik Lund Andersen,
Silvan Schmid
Abstract:
Thermal detectors are a cornerstone of infrared (IR) and terahertz (THz) technology due to their broad spectral range. These detectors call for suitable broad spectral absorbers with minimal thermal mass. Often this is realized by plasmonic absorbers, which ensure a high absorptivity but only for a narrow spectral band. Alternatively, a common approach is based on impedance-matching the sheet resistance of a thin metallic film to half the free-space impedance. Thereby, it is possible to achieve a wavelength-independent absorptivity of up to 50 %, depending on the dielectric properties of the underlying substrate. However, existing absorber films typically require a thickness of the order of tens of nanometers, such as titanium nitride (14 nm), which can significantly deteriorate the response of a thermal transducer. Here, we present the application of ultrathin gold (2 nm) on top of a 1.2 nm copper oxide seed layer as an effective IR absorber. An almost wavelength-independent and long-time stable absorptivity of 47(3) %, ranging from 2 μm to 20 μm, could be obtained and is further discussed. The presented gold thin-film represents an almost ideal impedance-matched IR absorber that allows a significant improvement of state-of-the-art thermal detector technology.
Submitted 29 November, 2019;
originally announced November 2019.
-
Norm-parallelism and the Davis--Wielandt radius of Hilbert space operators
Authors:
A. Zamani,
M. S. Moslehian,
M. T. Chien,
H. Nakazato
Abstract:
We present a necessary and sufficient condition for the norm-parallelism of bounded linear operators on a Hilbert space. We also give a characterization of the Birkhoff--James orthogonality for Hilbert space operators. Moreover, we discuss the connection between norm-parallelism to the identity operator and an equality condition for the Davis--Wielandt radius. Some other related results are also discussed.
Submitted 1 November, 2018;
originally announced November 2018.
-
Learning Robotic Assembly from CAD
Authors:
Garrett Thomas,
Melissa Chien,
Aviv Tamar,
Juan Aparicio Ojea,
Pieter Abbeel
Abstract:
In this work, motivated by recent manufacturing trends, we investigate autonomous robotic assembly. Industrial assembly tasks require contact-rich manipulation skills, which are challenging to acquire using classical control and motion planning approaches. Consequently, robot controllers for assembly domains are presently engineered to solve a particular task, and cannot easily handle variations in the product or environment. Reinforcement learning (RL) is a promising approach for autonomously acquiring robot skills that involve contact-rich dynamics. However, RL relies on random exploration for learning a control policy, which requires many robot executions, and often gets trapped in locally suboptimal solutions. Instead, we posit that prior knowledge, when available, can improve RL performance. We exploit the fact that in modern assembly domains, geometric information about the task is readily available via the CAD design files. We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data. We show that our approach effectively improves over traditional control approaches for tracking the motion plan, and can solve assembly tasks that require high precision, even without accurate state estimation. In addition, we propose a neural network architecture that can learn to track the motion plan, and generalize the assembly controller to changes in the object positions.
Submitted 24 July, 2018; v1 submitted 20 March, 2018;
originally announced March 2018.
-
Single-molecule optical absorption imaging by nanomechanical photothermal sensing at room temperature
Authors:
Miao-Hsuan Chien,
Mario Brameshuber,
Gerhard J. Schütz,
Silvan Schmid
Abstract:
Absorption microscopy is a powerful technique, enabling the detection of single non-fluorescent molecules at room temperature. So far, the molecular absorption has been probed optically via the attenuation of a probing laser. The sensitivity of optical probing is not only restricted by background scattering, but it is fundamentally limited by laser shot noise. Here, we present nanomechanical photothermal microscopy, which overcomes the scattering and shot noise limit by detecting the sample absorption directly with a temperature sensitive substrate. We use nanomechanical silicon nitride drums, whose resonant frequency detunes with local heating. Individual Au nanoparticles with diameters from 10 nm to 200 nm and single molecules (Atto 633) are scanned with a 305 μW heating laser with a peak irradiance of 330 μW/μm². Using stress-optimized drums, we achieve a sensitivity of 45 fW/Hz$^{1/2}$, which results in a signal-to-noise ratio of >60 for a single molecule. Our method has important consequences for a wide range of applications, such as imaging, absorption analysis and spectrochemical analysis of non-fluorescent samples.
Submitted 13 December, 2017;
originally announced December 2017.
-
Two-photon photoactivated voltage imaging in tissue with an Archaerhodopsin-derived reporter
Authors:
Miao-Ping Chien,
Daan Brinks,
Yoav Adam,
William Bloxham,
Simon Kheifets,
Adam E. Cohen
Abstract:
Robust voltage imaging in tissue remains a technical challenge. Existing combinations of genetically encoded voltage indicators (GEVIs) and microscopy techniques cannot simultaneously achieve sufficiently high voltage sensitivity, background rejection, and time resolution for high-resolution mapping of sub-cellular voltage dynamics in intact brain tissue. We developed a pooled high-throughput screening approach to identify Archaerhodopsin mutants with unusual photophysical properties. After screening ~$10^5$ cells, we identified a novel GEVI, NovArch, whose 1-photon near-infrared fluorescence is reversibly enhanced by weak 2-photon excitation. Because the 2-photon excitation acts catalytically rather than stoichiometrically, high fluorescence signals, optical sectioning, and high time resolution are achieved simultaneously, at modest 2-photon laser power. We developed a microscopy system optimized for NovArch imaging in tissue. The combination of protein and optical engineering enhanced signal contrast sufficiently to enable optical mapping of back-propagating action potentials in dendrites in acute mouse brain slice.
Submitted 27 October, 2017;
originally announced October 2017.