-
Comparison of adaptive mesh refinement techniques for numerical weather prediction
Authors:
Daniel S. Abdi,
Ann Almgren,
Francis X. Giraldo,
Isidora Jankov
Abstract:
This paper examines the application of adaptive mesh refinement (AMR) in the field of numerical weather prediction (NWP). We implement and assess two distinct AMR approaches and evaluate their performance through standard NWP benchmarks. In both cases, we solve the fully compressible Euler equations, fundamental to many non-hydrostatic weather models.
The first approach utilizes oct-tree cell-based mesh refinement coupled with a high-order discontinuous Galerkin method for spatial discretization. In the second approach, we employ level-based AMR with the finite difference method. Our study provides insights into the accuracy and benefits of employing these AMR methodologies for the multi-scale problem of NWP. Additionally, we explore essential properties including their impact on mass and energy conservation. Moreover, we present and evaluate an AMR solution transfer strategy for the tree-based AMR approach that is simple to implement, memory-efficient, and ensures conservation for flow both in the box and on the sphere.
Furthermore, we discuss scalability, performance portability, and the practical utility of the AMR methodology within an NWP framework -- crucial considerations in selecting an AMR approach. The current de facto standard for mesh refinement in NWP employs a relatively simplistic approach of static nested grids, either within a general circulation model or a separately operated regional model with loose one-way synchronization. It is our hope that this study will stimulate further interest in the adoption of AMR frameworks like AMReX in NWP. These frameworks offer a triple advantage: a robust dynamic AMR for tracking localized and consequential features such as tropical cyclones, extreme scalability, and performance portability.
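The conservation property mentioned for the solution transfer can be illustrated with a toy example. The sketch below is not the authors' discontinuous Galerkin transfer; it assumes a 1D piecewise-constant field, and the helper names `refine`, `coarsen`, and `total_mass` are hypothetical. It only shows why copying the parent value on refinement and taking a width-weighted average on coarsening leaves the integrated mass unchanged.

```python
# Toy 1D cell-based refinement with a conservative solution transfer.
# Assumption: one piecewise-constant value per cell (not the paper's DG scheme).

def total_mass(values, widths):
    """Integrated quantity: sum of cell value times cell width."""
    return sum(v * w for v, w in zip(values, widths))

def refine(values, widths, i):
    """Split cell i into two equal children that inherit the parent value."""
    half = widths[i] / 2.0
    new_values = values[:i] + [values[i], values[i]] + values[i + 1:]
    new_widths = widths[:i] + [half, half] + widths[i + 1:]
    return new_values, new_widths

def coarsen(values, widths, i):
    """Merge cells i and i+1 into one parent using a width-weighted average."""
    merged_width = widths[i] + widths[i + 1]
    merged_value = (values[i] * widths[i] + values[i + 1] * widths[i + 1]) / merged_width
    new_values = values[:i] + [merged_value] + values[i + 2:]
    new_widths = widths[:i] + [merged_width] + widths[i + 2:]
    return new_values, new_widths

density = [1.0, 2.0, 4.0]
width = [0.5, 0.25, 0.25]
mass_before = total_mass(density, width)

fine_density, fine_width = refine(density, width, 0)               # refine the first cell
coarse_density, coarse_width = coarsen(fine_density, fine_width, 2)  # merge the last two cells
mass_after = total_mass(coarse_density, coarse_width)
```

In this sketch `mass_before` and `mass_after` agree, because each operation preserves value times width over the cells it touches.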
Submitted 25 April, 2024;
originally announced April 2024.
-
Large Language Models Meet User Interfaces: The Case of Provisioning Feedback
Authors:
Stanislav Pozdniakov,
Jonathan Brazil,
Solmaz Abdi,
Aneesha Bakharia,
Shazia Sadiq,
Dragan Gasevic,
Paul Denny,
Hassan Khosravi
Abstract:
Incorporating Generative AI (GenAI) and Large Language Models (LLMs) in education can enhance teaching efficiency and enrich student learning. Current LLM usage involves conversational user interfaces (CUIs) for tasks like generating materials or providing feedback. However, this presents challenges including the need for educator expertise in AI and CUIs, ethical concerns with high-stakes decisions, and privacy risks. CUIs also struggle with complex tasks. To address these, we propose transitioning from CUIs to user-friendly applications leveraging LLMs via API calls. We present a framework for ethically incorporating GenAI into educational tools and demonstrate its application in our tool, Feedback Copilot, which provides personalized feedback on student assignments. Our evaluation shows the effectiveness of this approach, with implications for GenAI researchers, educators, and technologists. This work charts a course for the future of GenAI in education.
Submitted 17 April, 2024;
originally announced April 2024.
-
Low Polarization Sensitive O-band SOA on InP Membrane for Advanced Photonic Integration
Authors:
Desalegn Wolde Feyisa,
Salim Abdi,
Rene van Veldhoven,
Nicola Calabretta,
Yuqing Jiao,
Ripalta Stabile
Abstract:
Managing insertion losses, polarizations and device footprint is crucial in developing large-scale photonic integrated circuits (PICs). This paper presents a solution to these critical challenges by designing a semiconductor optical amplifier (SOA) in the O-band with reduced polarization sensitivity, leveraging the ultra-compact InP Membrane on Silicon (IMOS) platform. The platform is compatible with close integration atop electronics, via densely populated vertical interconnects. The SOA incorporates a thin tensile-strained bulk active layer to mitigate polarization sensitivity. The developed 500 µm long SOA has a peak gain of 11.5 dB at 1350 nm and an optimal polarization dependency of less than 1 dB across a 25 nm bandwidth, ranging from 1312 nm to 1337 nm. The device is practical for integrated circuits where multiple amplifiers work in cascades with a minimal 6.5 dB noise figure (NF) measured at the gain peak. The designed vertical active-passive transition, achieved through inverse tapering, allows for effective field coupling in the vertical direction resulting in a transmission efficiency of over 95% at the transition and minimal polarization sensitivity of less than 3%. The device yields significant gain at a small current density of less than 3 kA/cm$^2$ as the result of minimalist gain medium structure, reducing Joule heating and improving energy efficiency. This is especially relevant in applications such as optical switching, where multiple SOAs populate the PIC within a small area. Consequently, the simulated and fabricated low polarization sensitive O-band SOA is a suitable candidate for integration into large-scale, ultra-compact photonic integrated circuits.
Submitted 9 April, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Rapid design space exploration of multi-clock domain MPSoCs with hybrid prototyping
Authors:
Ehsan Saboori,
Samar Abdi
Abstract:
This paper presents novel techniques of using hybrid prototyping for early power-performance analysis of MPSoC designs with multiple clock domains. The fundamental idea of hybrid prototyping is to simulate a design with multiple cores by creating an emulation kernel in software on top of a single physical instance of the core. However, so far hybrid prototyping has been limited to homogeneous multicores running at the same clock frequency. Moreover, hybrid prototyping has not yet been demonstrated for efficient design space exploration. Our work focuses on enhancing the capabilities of hybrid prototyping, such that it can be applied to realistic multi-clock MPSoC designs as well as to perform early power-performance evaluation of MPSoC designs. Our experiments using industrial-strength applications such as JPEG, MP3 and Packet Processing demonstrate the high accuracy of our hybrid prototypes, and over two orders of magnitude improvement over software simulation speed. We also demonstrate that exploring over 150 design options using hybrid prototyping can be done with high reliability in the order of minutes compared to multiple days using conventional FPGA prototyping.
Submitted 9 August, 2022;
originally announced August 2022.
-
Recrystallization and Interdiffusion Processes in Laser-Annealed Strain-Relaxed Metastable Ge$_{0.89}$Sn$_{0.11}$
Authors:
Salim Abdi,
Simone Assali,
Mahmoud R. M. Atalla,
Sebastian Koelling,
Jeffrey M. Warrender,
Oussama Moutanabbir
Abstract:
The prospect of GeSn semiconductors for silicon-integrated infrared optoelectronics brings new challenges related to the metastability of this class of materials. As a matter of fact, maintaining a reduced thermal budget throughout all processing steps of GeSn devices is essential to avoid possible material degradation. This constraint is exacerbated by the need for higher Sn contents along with an enhanced strain relaxation to achieve efficient mid-infrared devices. Herein, as a low thermal budget solution for post-epitaxy processing, we elucidate the effects of laser thermal annealing (LTA) on strain-relaxed Ge$_{0.89}$Sn$_{0.11}$ layers and Ni-Ge$_{0.89}$Sn$_{0.11}$ contacts. Key diffusion and recrystallization processes are proposed and discussed in the light of systematic microstructural studies. LTA treatment at a fluence of 0.40 J/cm$^2$ results in a 200-300 nm-thick layer where Sn atoms segregate toward the surface and in the formation of Sn-rich columnar structures in the LTA-affected region. These structures are reminiscent of those observed in the dislocation-assisted pipe-diffusion mechanism, while the buried GeSn layers remain intact. Moreover, by tailoring the LTA fluence, the contact resistance can be reduced without triggering phase separation across the whole GeSn multi-layer stacking. Indeed, a one-order-of-magnitude decrease in the Ni-based specific contact resistance was obtained at the highest LTA fluence, thus confirming the potential of this method for the functionalization of direct bandgap GeSn materials.
Submitted 12 November, 2021;
originally announced November 2021.
-
All-Group IV membrane room-temperature mid-infrared photodetector
Authors:
Mahmoud R. M. Atalla,
Simone Assali,
Anis Attiaoui,
Cedric Lemieux-Leduc,
Aashish Kumar,
Salim Abdi,
Oussama Moutanabbir
Abstract:
Strain engineering has been a ubiquitous paradigm to tailor the electronic band structure and harness the associated new or enhanced fundamental properties in semiconductors. In this regard, semiconductor membranes emerged as a versatile class of nanoscale materials to control lattice strain and engineer complex heterostructures leading to the development of a variety of innovative applications. Herein we exploit this quasi-two-dimensional platform to tune simultaneously the lattice parameter and bandgap energy in group IV GeSn semiconductor alloys. As Sn content is increased to reach a direct band gap, these semiconductors become metastable and typically compressively strained. We show that the release and transfer of GeSn membranes lead to a significant relaxation thus extending the absorption wavelength range deeper in the mid-infrared. Fully released Ge$_{0.83}$Sn$_{0.17}$ membranes were integrated on silicon and used in the fabrication of broadband photodetectors operating at room temperature with a record wavelength cutoff of 4.6 $μ$m, without compromising the performance at shorter wavelengths down to 2.3 $μ$m. These membrane devices are characterized by two orders of magnitude reduction in dark current as compared to devices processed from as-grown strained epitaxial layers. The latter exhibit a content-dependent, shorter wavelength cutoff in the 2.6-3.5 $μ$m range, thus highlighting the role of lattice strain relaxation in shaping the spectral response of membrane photodetectors. This ability to engineer all-group IV transferable mid-infrared photodetectors lays the groundwork to implement scalable and flexible sensing and imaging technologies exploiting these integrative, silicon-compatible strained-relaxed GeSn membranes.
Submitted 28 August, 2020; v1 submitted 23 July, 2020;
originally announced July 2020.
-
A Multivariate Elo-based Learner Model for Adaptive Educational Systems
Authors:
Solmaz Abdi,
Hassan Khosravi,
Shazia Sadiq,
Dragan Gasevic
Abstract:
The Elo rating system has been recognised as an effective method for modelling students and items within adaptive educational systems. The existing Elo-based models have the limiting assumption that items are only tagged with a single concept and are mainly studied in the context of adaptive testing systems. In this paper, we introduce a multivariate Elo-based learner model that is suitable for the domains where learning items can be tagged with multiple concepts, and investigate its fit in the context of adaptive learning. To evaluate the model, we first compare the predictive performance of the proposed model against the standard Elo-based model using synthetic and public data sets. Our results from this study indicate that our proposed model has superior predictive performance compared to the standard Elo-based model, but the difference is rather small. We then investigate the fit of the proposed multivariate Elo-based model by integrating it into an adaptive learning system which incorporates the principles of open learner models (OLMs). The results from this study suggest that the availability of additional parameters derived from multivariate Elo-based models has two further advantages: guiding adaptive behaviour for the system and providing additional insight for students and instructors.
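To make the idea of an item tagged with several concepts concrete, here is a minimal sketch of an Elo-style update. It is an illustration under stated assumptions, not the model from the paper: the logistic prediction, the use of the mean of the tagged concept ratings as the student's ability, the shared update applied to every tagged concept, and the constant `k` are all assumptions, and the names `expected_correct` and `update` are hypothetical.

```python
import math

def expected_correct(ability, difficulty):
    """Logistic probability of a correct answer, as in Elo-style learner models."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def update(student, item_difficulty, concepts, correct, k=0.4):
    """Return updated concept ratings, updated item difficulty, and the prediction.

    Assumption: the ability for a multi-concept item is the mean of the
    student's ratings on the tagged concepts, and each tagged concept
    receives the same correction.
    """
    ability = sum(student[c] for c in concepts) / len(concepts)
    p = expected_correct(ability, item_difficulty)
    delta = k * ((1.0 if correct else 0.0) - p)
    new_student = dict(student)
    for c in concepts:
        new_student[c] = student[c] + delta
    # The item moves in the opposite direction to the student.
    return new_student, item_difficulty - delta, p

student = {"loops": 0.0, "recursion": 0.0, "pointers": 0.0}
new_student, new_difficulty, p = update(student, 0.0, ["loops", "recursion"], correct=True)
```

With equal starting ratings the prediction is 0.5, so a correct answer raises both tagged concepts by `k / 2` and lowers the item difficulty by the same amount, while the untagged concept is left unchanged.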
Submitted 14 October, 2019;
originally announced October 2019.
-
New deep coronal spectra from the 2017 total solar eclipse
Authors:
S. Koutchmy,
F. Baudin,
Sh. Abdi,
L. Golub,
F. Sèvre
Abstract:
Total eclipses permit a deep analysis of both the inner and the outer parts of the corona using the continuum White-Light (W-L) radiations from electrons (K-corona), the superposed spectrum of forbidden emission lines from ions (E-corona) and the dust component with F-lines (F-corona). By sufficiently dispersing the W-L spectrum, the Fraunhofer (F) spectrum of the dust component of the corona appears and the continuum Thomson radiation can be evaluated. The superposed emission lines of ions with different degrees of ionization are studied to allow the measurement of temperatures, non-thermal velocities, Doppler shifts and abundances. We describe a slit spectroscopic experiment of high spectral resolution for providing an analysis of the most typical parts of the quasi-minimum type corona during the total solar eclipse of Aug. 21, 2017, as observed from Idaho, USA. Streamers, active region enhancements and polar coronal holes (CHs) are well measured using deep spectra. 60 spectra are obtained during the totality with a long slit, covering +/-3 solar radii in the range of 510 to 590 nm. The K+F continuum corona is well exposed up to 2 solar radii. The F-corona can be measured even at the solar limb. New weak emission lines were discovered or confirmed. The rarely observed high FIP ArX line is recorded almost everywhere; the FeXIV and NiXIII lines are well recorded everywhere. For the first time hot lines are also measured inside the CH regions. The radial variations of the non-thermal turbulent velocities of the lines do not show a great departure from the average values. No significantly large Doppler shifts are seen anywhere in the inner and the middle corona. The wings of the FeXIV line show some non-Gaussianity.
Submitted 5 November, 2019; v1 submitted 3 October, 2019;
originally announced October 2019.
-
Acceleration of the Implicit-Explicit Non-hydrostatic Unified Model of the Atmosphere (NUMA) on Manycore Processors
Authors:
Daniel S. Abdi,
Francis X. Giraldo,
Emil M. Constantinescu,
Lester E. Carr III,
Lucas C. Wilcox,
Timothy C. Warburton
Abstract:
We present the acceleration of an IMplicit-EXplicit (IMEX) non-hydrostatic atmospheric model on manycore processors such as GPUs and Intel's MIC architecture. IMEX time integration methods sidestep the constraint imposed by the Courant-Friedrichs-Lewy condition on explicit methods through corrective implicit solves within each time step. In this work, we implement and evaluate the performance of IMEX on manycore processors relative to explicit methods. Using 3D-IMEX at Courant number C=15, we obtained a speedup of about 4X relative to an explicit time stepping method run with the maximum allowable C=1. In addition, we demonstrate a much larger speedup of 100X at C=150 using 1D-IMEX due to the unconditional stability of the method in the vertical direction. Several improvements on the IMEX procedure were necessary in order to outperform our results with explicit methods: a) reducing the number of degrees of freedom of the IMEX formulation by forming the Schur complement; b) formulating a horizontally-explicit vertically-implicit (HEVI) 1D-IMEX scheme that has a lower workload and potentially better scalability than 3D-IMEX; c) using high-order polynomial preconditioners to reduce the condition number of the resulting system; d) using a direct solver for the 1D-IMEX method by performing and storing LU factorizations once to obtain a constant cost for any Courant number. Without all of these improvements, explicit time integration methods turned out to be difficult to beat. We discuss in detail the IMEX infrastructure required for formulating and implementing efficient methods on manycore processors. Finally, we validate our results with standard benchmark problems in NWP and evaluate the performance and scalability of the IMEX method using up to 4192 GPUs and 16 Knights Landing processors.
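The way an implicit solve relaxes the explicit step-size limit can be seen on a scalar toy problem. The sketch below is an assumption-laden illustration, not the NUMA scheme: it applies first-order IMEX Euler to u' = (stiff + nonstiff) u, treating the stiff term implicitly and the other explicitly, and the function names and coefficient values are hypothetical.

```python
# First-order IMEX Euler on the linear test equation u' = (stiff + nonstiff) * u.
# Assumption: a scalar toy problem standing in for fast and slow dynamics.

def explicit_euler(u0, stiff, nonstiff, dt, steps):
    """Forward Euler on the full right-hand side."""
    u = u0
    for _ in range(steps):
        u = u + dt * (stiff + nonstiff) * u
    return u

def imex_euler(u0, stiff, nonstiff, dt, steps):
    """Explicit update for the non-stiff term, implicit solve for the stiff term.

    Solves u_new = u + dt * nonstiff * u + dt * stiff * u_new for u_new.
    """
    u = u0
    for _ in range(steps):
        u = (u + dt * nonstiff * u) / (1.0 - dt * stiff)
    return u

stiff, nonstiff = -1000.0, -1.0
dt = 0.01  # about five times the forward Euler stability limit of 2 / 1001

u_explicit = explicit_euler(1.0, stiff, nonstiff, dt, 10)
u_imex = imex_euler(1.0, stiff, nonstiff, dt, 10)
```

At this step size the forward Euler amplification factor is -9.01 per step, so `u_explicit` grows without bound, while the IMEX factor is 0.99 / 11 and `u_imex` decays toward zero like the exact solution.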
Submitted 13 February, 2017;
originally announced February 2017.
-
The Impact of Data Replication on Job Scheduling Performance in Hierarchical Data Grid
Authors:
Somayeh Abdi,
Hossein Pedram,
Somayeh Mohamadi
Abstract:
In data-intensive applications, data transfer is a primary cause of job execution delay. Data access time depends on bandwidth. The major bottleneck to supporting fast data access in Grids is the high latency of Wide Area Networks and the Internet. Effective scheduling can reduce the amount of data transferred across the Internet by dispatching a job to where the needed data are present. Another solution is to use a data replication mechanism. The objective of dynamic replica strategies is to reduce file access time, which in turn reduces job runtime. In this paper we develop a job scheduling policy and a dynamic data replication strategy, called HRS (Hierarchical Replication Strategy), to improve data access efficiency. We study our approach and evaluate it through simulation. The results show that our algorithm improves on current strategies by 12%.
Submitted 4 October, 2010;
originally announced October 2010.