-
Pretraining and the Lasso
Authors:
Erin Craig,
Mert Pilanci,
Thomas Le Menestrel,
Balasubramanian Narasimhan,
Manuel Rivas,
Roozbeh Dehghannasiri,
Julia Salzman,
Jonathan Taylor,
Robert Tibshirani
Abstract:
Pretraining is a popular and powerful paradigm in machine learning. As an example, suppose one has a modest-sized dataset of images of cats and dogs, and plans to fit a deep neural network to classify them from the pixel features. With pretraining, we start with a neural network trained on a large corpus of images, consisting of not just cats and dogs but hundreds of other image types. Then we fix all of the network weights except for the top layer (which makes the final classification) and train (or "fine tune") those weights on our dataset. This often results in dramatically better performance than the network trained solely on our smaller dataset.
In this paper, we ask the question "Can pretraining help the lasso?" We develop a framework for the lasso in which an overall model is fit to a large set of data, and then fine-tuned to a specific task on a smaller dataset. This latter dataset can be a subset of the original dataset, but does not need to be. We find that this framework has a wide variety of applications, including stratified models, multinomial targets, multi-response models, conditional average treatment effect estimation and even gradient boosting.
In the stratified model setting, the pretrained lasso pipeline estimates the coefficients common to all groups at the first stage, and then group specific coefficients at the second "fine-tuning" stage. We show that under appropriate assumptions, the support recovery rate of the common coefficients is superior to that of the usual lasso trained only on individual groups. This separate identification of common and individual coefficients can also be useful for scientific understanding.
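The two-stage idea in the stratified setting can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: it assumes a squared-error loss, no intercept, a single penalty `lam` shared by both stages, and a second stage that simply fits each group's residuals from the pooled model. The function names `lasso_cd` and `pretrained_lasso_sketch` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent.

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1, with no intercept.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)  # equals y - X @ beta while beta is zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def pretrained_lasso_sketch(X, y, groups, lam):
    """Stage 1: pooled lasso on all data. Stage 2: per-group lasso on residuals.

    Assumption: the second stage here is a plain residual fit; the paper's
    actual fine-tuning step may weight or penalize features differently.
    """
    beta_common = lasso_cd(X, y, lam)
    group_betas = {}
    for g in np.unique(groups):
        idx = groups == g
        resid = y[idx] - X[idx] @ beta_common
        group_betas[g.item()] = lasso_cd(X[idx], resid, lam)
    return beta_common, group_betas
```

In this sketch, `beta_common` plays the role of the coefficients shared across groups, and each entry of `group_betas` holds a group-specific correction on top of them.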
Submitted 18 April, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Enhanced Poling and Infiltration for Highly-Efficient Electro-Optic Polymer-Based Mach-Zehnder Modulators
Authors:
Iman Taghavi,
Razi Dehghannasiri,
Tianren Fan,
Alexander Tofini,
Hesam Moradinejad,
Ali. A. Efterkhar,
Sudip Shekhar,
Lukas Chrostowski,
Nicolas A. F. Jaeger,
Ali Adibi
Abstract:
An ultra-narrow slot waveguide is fabricated for use in highly-efficient, electro-optic-polymer-based, integrated-optic modulators. Measurement results indicate that $V_\pi L$ values below 1.2 V·mm are possible for balanced Mach-Zehnder modulators using this ultra-narrow slot waveguide on a silicon-organic hybrid platform. Simulated $V_\pi L$ values of 0.35 V·mm have also been obtained. In addition to adapting standard recipes, we developed two novel fabrication processes for achieving miniaturized devices with high modulation sensitivity. To boost compactness and decrease the overall footprint, we use a fabrication approach based on air bridge interconnects on thick, thermally-reflowed, MaN 2410 E-beam resist protected by an alumina layer. To overcome the challenges of high currents and imperfect infiltration of polymers into ultra-narrow slots, we use a carefully designed, atomically-thin layer of TiO$_2$ as a carrier-barrier to enhance the poling efficiency of our electro-optic polymers. Additionally, finite-difference time-domain simulations are employed to optimize the effect of the thin layer of TiO$_2$. Compared to other, non-optimized cases, our peak measured current is reduced by a factor of 3; scanning electron microscopy images also demonstrate that we achieve almost perfect infiltration. The anticipated increase in total capacitance due to the TiO$_2$ layer is shown to be negligible. In fact, applying our TiO$_2$ surface treatment to our ultra-narrow slot allows us to obtain an improved phase shift efficiency ($\partial n / \partial V$) of $\sim$94% for a 10 nm TiO$_2$ layer.
Submitted 10 March, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Sequential Experimental Design for Optimal Structural Intervention in Gene Regulatory Networks Based on the Mean Objective Cost of Uncertainty
Authors:
Mahdi Imani,
Roozbeh Dehghannasiri,
Ulisses M. Braga-Neto,
Edward R. Dougherty
Abstract:
Scientists are attempting to use models of ever-increasing complexity, especially in medicine, where gene-based diseases such as cancer require better modeling of cell regulation. Complex models suffer from uncertainty, and experiments are needed to reduce this uncertainty. Because experiments can be costly and time-consuming, it is desirable to determine which experiments provide the most useful information. If a sequence of experiments is to be performed, experimental design is needed to determine the order. A classical approach is to maximally reduce the overall uncertainty in the model, meaning maximal entropy reduction. A recently proposed method takes into account both model uncertainty and the translational objective, for instance, optimal structural intervention in gene regulatory networks, where the aim is to alter the regulatory logic to maximally reduce the long-run likelihood of being in a cancerous state. The mean objective cost of uncertainty (MOCU) quantifies uncertainty based on the degree to which model uncertainty affects the objective. Experimental design involves choosing the experiment that yields the greatest reduction in MOCU. This paper introduces finite-horizon dynamic programming for MOCU-based sequential experimental design and compares it to the greedy approach, which selects one experiment at a time without consideration of the full horizon of experiments. A salient aspect of the paper is that it demonstrates the advantage of MOCU-based design over the widely used entropy-based design for both greedy and dynamic-programming strategies and investigates the effect of model conditions on the comparative performances.
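As a rough illustration of the greedy strategy described above, the sketch below computes MOCU for a finite set of candidate models and picks the single experiment with the smallest expected remaining MOCU. It covers only the one-step greedy choice, not the finite-horizon dynamic program. The discrete setup is an assumption made for illustration: `cost[k, a]` is the cost of action `a` under model `k`, `prior[k]` is the model probability, and each experiment is a hypothetical likelihood matrix `lik[k, o]` giving the probability of outcome `o` under model `k`.

```python
import numpy as np

def mocu(prior, cost):
    """Expected extra cost of the robust action relative to each model's optimum."""
    robust = np.argmin(prior @ cost)      # action with the lowest expected cost
    optimal = cost.min(axis=1)            # best achievable cost under each model
    return float(prior @ (cost[:, robust] - optimal))

def expected_remaining_mocu(prior, cost, lik):
    """Average MOCU of the posterior over the possible outcomes of one experiment."""
    total = 0.0
    for o in range(lik.shape[1]):
        joint = prior * lik[:, o]
        p_o = joint.sum()                 # marginal probability of outcome o
        if p_o > 0:
            total += p_o * mocu(joint / p_o, cost)
    return total

def greedy_experiment(prior, cost, experiments):
    """Return the experiment name with the smallest expected remaining MOCU."""
    scores = {name: expected_remaining_mocu(prior, cost, lik)
              for name, lik in experiments.items()}
    return min(scores, key=scores.get), scores

# Toy usage: two models, two actions, two candidate experiments.
prior = np.array([0.5, 0.5])
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
experiments = {
    "informative": np.eye(2),             # outcome reveals the true model
    "uninformative": np.full((2, 2), 0.5) # outcome is independent of the model
}
best, scores = greedy_experiment(prior, cost, experiments)
```

An entropy-based design would instead score experiments by expected entropy reduction of `prior`, which ignores `cost` entirely; the MOCU score changes only when the uncertainty being resolved affects the choice of action.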
Submitted 30 May, 2018;
originally announced May 2018.