-
Thermalization in Trapped Bosonic Systems With Disorder
Authors:
Javier de la Cruz,
Carlos Diaz-Mejia,
Sergio Lerma-Hernandez,
Jorge G. Hirsch
Abstract:
A detailed study of thermalization is conducted on experimentally accessible states in a system of bosonic atoms trapped in an open linear chain with disorder. When the disorder parameter is large, the system exhibits regularity and localization. In contrast, weak disorder introduces chaos and raises questions about the validity of the Eigenstate Thermalization Hypothesis (ETH), especially for states at the extremes of the energy spectrum which remain regular and non-thermalizing. The validity of ETH is assessed by examining the dispersion of entanglement entropy and the number of bosons on the first site across various dimensions, while maintaining a constant particle density of one. Experimentally accessible states in the occupation basis are categorized using a crowding parameter that linearly correlates with their mean energy. Using full exact diagonalization to simulate temporal evolution, we study the equilibration of entanglement entropy, the number of bosons, and the reduced density matrix of the first site for all states in the occupation basis. Comparing equilibrium values of these observables with those predicted by microcanonical ensembles, we find that, within certain tolerances, most states in the chaotic region thermalize. However, states with low participation ratios in the energy eigenstate basis show greater deviations from thermal equilibrium values.
Submitted 5 July, 2024;
originally announced July 2024.
-
On conceptualisation and an overview of learning path recommender systems in e-learning
Authors:
A. Fuster-López,
J. M. Cruz,
P. Guerrero-García,
E. M. T. Hendrix,
A. Košir,
I. Nowak,
L. Oneto,
S. Sirmakessis,
M. F. Pacheco,
F. P. Fernandes,
A. I. Pereira
Abstract:
The use of e-learning systems has a long tradition, where students can study online helped by a system. In this context, the use of recommender systems is relatively new. In our research project, we investigated various ways to create a recommender system. They all aim at facilitating the learning and understanding of a student. We present a common concept of the learning path and its learning indicators and embed 5 different recommenders in this context.
Submitted 7 June, 2024;
originally announced June 2024.
-
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Authors:
Holy Lovenia,
Rahmad Mahendra,
Salsabil Maulana Akbar,
Lester James V. Miranda,
Jennifer Santoso,
Elyanah Aco,
Akhdan Fadhilah,
Jonibek Mansurov,
Joseph Marvin Imperial,
Onno P. Kampman,
Joel Ruben Antony Moniz,
Muhammad Ravi Shulthan Habibi,
Frederikus Hudi,
Railey Montalan,
Ryan Ignatius,
Joanito Agili Lopo,
William Nixon,
Börje F. Karlsson,
James Jaya,
Ryandito Diandaru,
Yuze Gao,
Patrick Amadeus,
Bin Wang,
Jan Christian Blaise Cruz,
Chenxi Whitehouse
, et al. (36 additional authors not shown)
Abstract:
Southeast Asia (SEA) is a region rich in linguistic diversity and cultural variety, with over 1,300 indigenous languages and a population of 671 million people. However, prevailing AI models suffer from a significant lack of representation of texts, images, and audio datasets from SEA, compromising the quality of AI models for SEA languages. Evaluating models for SEA languages is challenging due to the scarcity of high-quality datasets, compounded by the dominance of English training data, raising concerns about potential cultural misrepresentation. To address these challenges, we introduce SEACrowd, a collaborative initiative that consolidates a comprehensive resource hub that fills the resource gap by providing standardized corpora in nearly 1,000 SEA languages across three modalities. Through our SEACrowd benchmarks, we assess the quality of AI models on 36 indigenous languages across 13 tasks, offering valuable insights into the current AI landscape in SEA. Furthermore, we propose strategies to facilitate greater AI advancements, maximizing potential utility and resource equity for the future of AI in SEA.
Submitted 8 July, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
Authors:
David Romero,
Chenyang Lyu,
Haryo Akbarianto Wibowo,
Teresa Lynn,
Injy Hamed,
Aditya Nanda Kishore,
Aishik Mandal,
Alina Dragonetti,
Artem Abzaliev,
Atnafu Lambebo Tonja,
Bontu Fufa Balcha,
Chenxi Whitehouse,
Christian Salamea,
Dan John Velasco,
David Ifeoluwa Adelani,
David Le Meur,
Emilio Villa-Cueva,
Fajri Koto,
Fauzan Farooqui,
Frederico Belcavello,
Ganzorig Batnasan,
Gisela Vallejo,
Grainne Caulfield,
Guido Ivetta,
Haiyue Song
, et al. (50 additional authors not shown)
Abstract:
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data. However, most of the current VQA models use datasets that are primarily focused on English and a few major world languages, with images that are typically Western-centric. While recent efforts have tried to increase the number of languages covered on VQA datasets, they still lack diversity in low-resource languages. More importantly, although these datasets often extend their linguistic range via translation or some other approaches, they usually keep images the same, resulting in narrow cultural representation. To address these limitations, we construct CVQA, a new Culturally-diverse multilingual Visual Question Answering benchmark, designed to cover a rich set of languages and cultures, where we engage native speakers and cultural experts in the data collection process. As a result, CVQA includes culturally-driven images and questions from across 28 countries on four continents, covering 26 languages with 11 scripts, providing a total of 9k questions. We then benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models. This benchmark can serve as a probing evaluation suite for assessing the cultural capability and bias of multimodal models and hopefully encourage more research efforts toward increasing cultural awareness and linguistic diversity in this field.
Submitted 9 June, 2024;
originally announced June 2024.
-
Measuring the circular polarization of gravitational waves with pulsar timing arrays
Authors:
N. M. Jiménez Cruz,
Ameek Malhotra,
Gianmassimo Tasinato,
Ivonne Zavala
Abstract:
The circular polarization of the stochastic gravitational wave background (SGWB) is a key observable for characterising the origin of the signal detected by Pulsar Timing Array (PTA) collaborations. Both the astrophysical and the cosmological SGWB can have a sizeable amount of circular polarization, due to Poisson fluctuations in the source properties for the former, and to parity violating processes in the early universe for the latter. Its measurement is challenging since PTAs are blind to the circular polarization monopole, forcing us to turn to anisotropies for detection. We study the sensitivity of current and future PTA datasets to circular polarization anisotropies, focusing on realistic modelling of intrinsic and kinematic anisotropies for astrophysical and cosmological scenarios respectively. Our results indicate that the expected level of circular polarization for the astrophysical SGWB should be within the reach of near future datasets, while for cosmological SGWB circular polarization is a viable target for more advanced SKA-type experiments.
Submitted 7 June, 2024;
originally announced June 2024.
-
Quasiparticle and Excitonic Structures of Few-layer and Bulk GaSe: Interlayer Coupling, Self-energy, and Electron-hole Interaction
Authors:
Fanhao Jia,
Zhao Tang,
Greis J. Cruz,
Weiwei Gao,
Shaowen Xu,
Wei Ren,
Peihong Zhang
Abstract:
Metal monochalcogenide GaSe is a classic layered semiconductor that has received increasing research interest due to its highly tunable electronic and optical properties for ultrathin electronics applications. Despite intense research efforts, a systematic understanding of the layer-dependent electronic and optical properties of GaSe remains to be established, and there appear significant discrepancies between different experiments. We have performed GW plus Bethe-Salpeter equation (BSE) calculations for few-layer and bulk GaSe, aiming at understanding the effects of interlayer coupling and dielectric screening on excited state properties of GaSe, and how the electronic and optical properties evolve from strongly two-dimensional (2D) like to intermediate thick layers, and to three-dimensional (3D) bulk character. Using a new definition of the exciton binding energy, we are able to calculate the binding energies of all excitonic states. Our results reveal an interesting correlation between the binding energy of an exciton and the spread of its wave function in the real and momentum spaces. We find that the existence of (nearly) parallel valence and conduction bands facilitates the formation of excitonic states that spread out in the momentum space. Thus, these excitons tend to be more localized in real space and have large exciton binding energies. The interlayer coupling substantially suppresses the Mexican-hat-like dispersion of the top valence band seen in the monolayer system, explaining the greatly enhanced photoluminescence (PL) as layer thickness increases. Our results also help resolve apparent discrepancies between different experiments. After including the quasiparticle and excitonic effects as well as the optical activities of excitons, our results compare well with available experimental results.
Submitted 11 May, 2024;
originally announced May 2024.
-
Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering
Authors:
Zhentao Xu,
Mark Jerome Cruz,
Matthew Guevara,
Tie Wang,
Manasi Deshpande,
Xiaofeng Wang,
Zheng Li
Abstract:
In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval-augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations, which limits performance. We introduce a novel customer service question-answering method that amalgamates RAG with a knowledge graph (KG). Our method constructs a KG from historical issues for use in retrieval, retaining the intra-issue structure and inter-issue relations. During the question-answering phase, our method parses consumer queries and retrieves related sub-graphs from the KG to generate answers. This integration of a KG not only improves retrieval accuracy by preserving customer service structure information but also enhances answering quality by mitigating the effects of text segmentation. Empirical assessments on our benchmark datasets, utilizing key retrieval (MRR, Recall@K, NDCG@K) and text generation (BLEU, ROUGE, METEOR) metrics, reveal that our method outperforms the baseline by 77.6% in MRR and by 0.32 in BLEU. Our method has been deployed within LinkedIn's customer service team for approximately six months and has reduced the median per-issue resolution time by 28.6%.
Submitted 6 May, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
Singular parametric oscillators from the one-parameter Darboux transformation of the classical harmonic oscillator
Authors:
H. C. Rosu,
J. de la Cruz
Abstract:
The singular parametric oscillators obtained from the one-parameter Darboux deformation/transformation effected upon the classical harmonic oscillator are introduced and discussed in some detail using $\sin(\omega_0 t)$ and $\cos(\omega_0 t)$ as seed solutions. The corresponding Ermakov-Lewis integrability problem of these parametric oscillators is also studied. It is shown that the Ermakov-Lewis invariants do not depend on the deformation parameter.
Submitted 7 March, 2024;
originally announced March 2024.
-
Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations
Authors:
Carlos Jose Xavier Cruz
Abstract:
This article explores the dynamic influence of computational entities based on multi-agent systems theory (SMA) combined with large language models (LLM), which are characterized by their ability to simulate complex human interactions, as a possibility to revolutionize human user interaction from the use of specialized artificial agents to support everything from operational organizational processes to strategic decision making based on applied knowledge and human orchestration. Previous investigations reveal that there are limitations, particularly in the autonomous approach of artificial agents, especially when dealing with new challenges and pragmatic tasks such as inducing logical reasoning and problem solving. It is also considered that traditional techniques, such as the stimulation of chains of thoughts, require explicit human guidance. In our approach we employ agents developed from large language models (LLM), each with distinct prototyping that considers behavioral elements, driven by strategies that stimulate the generation of knowledge based on the use case proposed in the scenario (role-play) business, using a discussion approach between agents (guided conversation). We demonstrate the potential of developing agents useful for organizational strategies, based on multi-agent system theories (SMA) and innovative uses based on large language models (LLM based), offering a differentiated and adaptable experiment to different applications, complexities, domains, and capabilities from LLM.
Submitted 15 March, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Measuring kinematic anisotropies with pulsar timing arrays
Authors:
N. M. Jiménez Cruz,
Ameek Malhotra,
Gianmassimo Tasinato,
Ivonne Zavala
Abstract:
Recent Pulsar Timing Array (PTA) collaborations show strong evidence for a stochastic gravitational wave background (SGWB) with the characteristic Hellings-Downs inter-pulsar correlations. The signal may stem from supermassive black hole binary mergers, or early universe phenomena. The former is expected to be strongly anisotropic while primordial backgrounds are likely to be predominantly isotropic with small fluctuations. In case the observed SGWB is of cosmological origin, our relative motion with respect to the SGWB rest frame is a guaranteed source of anisotropy, leading to $\textit{O}(10^{-3})$ energy density fluctuations of the SGWB. For such cosmological SGWB, kinematic anisotropies are likely to be larger than the intrinsic anisotropies, akin to the cosmic microwave background (CMB) dipole anisotropy. We assess the sensitivity of current PTA data to the kinematic dipole anisotropy, and we also forecast to what extent the magnitude and direction of the kinematic dipole can be measured in the future with an SKA-like experiment. We also discuss how the spectral shape of the SGWB and the location of the pulsars to monitor affect the prospects of detecting the kinematic dipole with PTA. In the future, a detection of this anisotropy may even help resolve the discrepancy in the magnitude of the kinematic dipole as measured by CMB and large-scale structure observations.
Submitted 7 June, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Seismic structure of the southern Rivera plate and Jalisco block subduction zone
Authors:
D Nunez,
F J Nunez Cornu,
F de J Escalona-Alcazar,
D Cordoba,
J Y Lopez Ortiz,
J L Carrillo de la Cruz,
J J Danobeitia
Abstract:
Structural and tectonic features in the Pacific Coast of Mexico generate a high level of seismic activity in the Jalisco block (JB) region, making it one of the most attractive areas of the world for geophysical investigations. The Rivera-North America contact zone has been the object of different tectonic studies in recent years framed within the TsuJal project. To this day, this project is generating numerous crucial geophysical results, which significantly improve our understanding of the region. Our study is focused on the interaction between the south of the JB and Rivera plate (RP), which crosses the Middle America trench. We also cover an offshore-onshore transect of 130 km length between the eastern Rivera fracture zone and La Huerta region, in the Jalisco state. To characterize this region, we interpreted wide-angle seismic, multichannel seismic, and multibeam bathymetry data. The integration of these results, with the local and regional seismicity recorded by the Jalisco Seismic Accelerometric Telemetric Network and by the Mapping the Rivera Subduction Zone experiment, provides new insights into the geometry of the southern RP, which is dipping 12 to 14 degrees under the JB in the northeast-southwest direction. Moreover, our results provide new seismic images of the accretionary wedge, the shallow crust, the deep crust, and the upper-mantle structure along this profile.
Submitted 9 February, 2024;
originally announced February 2024.
-
INRISCO: INcident monitoRing In Smart COmmunities
Authors:
Mónica Aguilar Igartua,
Florina Almenares,
Rebeca P. Díaz Redondo,
Manuela I. Martín,
Jordi Forné,
Celeste Campo,
Ana Fernández,
Luis J. de la Cruz,
Carlos García-Rubio,
Andrés Marínn,
Ahmad Mohamad Mezher,
Daniel Díaz,
Héctor Cerezo,
David Rebollo-Monedero,
Patricia Arias,
Francisco Rico
Abstract:
Major advances in information and communication technologies (ICTs) allow citizens to be considered sensors in motion. Carrying their mobile devices, moving in their connected vehicles or actively participating in social networks, citizens provide a wealth of information that, after proper processing, can support numerous applications for the benefit of the community. In the context of smart communities, the INRISCO proposal aims at (i) the early detection of abnormal situations in cities (i.e., incidents), (ii) the analysis of whether, according to their impact, those incidents are really adverse for the community; and (iii) the automatic actuation by dissemination of appropriate information to citizens and authorities. Thus, INRISCO will identify and report on incidents in traffic (jam, accident) or public infrastructure (e.g., works, street cut), the occurrence of specific events that affect other citizens' lives (e.g., demonstrations, concerts), or environmental problems (e.g., pollution, bad weather). Of particular interest to this proposal is the identification of incidents with a social and economic impact, which affect the quality of life of citizens.
Submitted 12 December, 2023;
originally announced December 2023.
-
Variational Autoencoders for Noise Reduction in Industrial LLRF Systems
Authors:
J. P. Edelen,
M. J. Henderson,
J. Einstein-Curtis,
C. C. Hall,
J. A. Diaz Cruz,
A. L. Edelen
Abstract:
Industrial particle accelerators inherently operate in much dirtier environments than typical research accelerators. This leads to an increase in noise both in the RF system and in other electronic systems. Combined with the fact that industrial accelerators are mass produced, there is less attention given to optimizing the performance of an individual system. As a result, industrial systems tend to underperform considering their hardware capabilities. With the growing demand for accelerators for medical sterilization, food irradiation, cancer treatment, and imaging, improving the signal processing of these machines will increase the margin for the deployment of these systems. Our work is focusing on using machine learning techniques to reduce the noise of RF signals used for pulse-to-pulse feedback in industrial accelerators. We will review our algorithms, simulation results, and results working with measured data. We will then discuss next steps for deployment and testing on an industrial system.
Submitted 7 November, 2023; v1 submitted 29 October, 2023;
originally announced November 2023.
-
Samsung R&D Institute Philippines at WMT 2023
Authors:
Jan Christian Blaise Cruz
Abstract:
In this paper, we describe the constrained MT systems submitted by Samsung R&D Institute Philippines to the WMT 2023 General Translation Task for two directions: en$\rightarrow$he and he$\rightarrow$en. Our systems consist of Transformer-based sequence-to-sequence models that are trained with a mix of best practices: comprehensive data preprocessing pipelines, synthetic backtranslated data, and the use of noisy channel reranking during online decoding. Our models perform comparably to, and sometimes outperform, strong baseline unconstrained systems such as mBART50 M2M and NLLB 200 MoE despite having significantly fewer parameters on two public benchmarks: FLORES-200 and NTREX-128.
Submitted 24 October, 2023;
originally announced October 2023.
-
On LCP codes over a mixed ring alphabet
Authors:
Maryam Bajalan,
Javier de la Cruz,
Alexandre Fotue-Tabue,
Edgar Martínez-Moro
Abstract:
In this paper, we introduce a standard generator matrix for mixed-alphabet linear codes over finite chain rings. Furthermore, we show that, when one has a linear complementary pair (LCP) of mixed-alphabet linear codes, both codes are weakly-free. Additionally, we establish that any mixed-alphabet product group code is separable. Thus, if one has a pair $\{C, D\}$ of mixed-alphabet product group codes over a finite chain ring that forms an LCP, it follows that $C$ and the Euclidean dual of $D$ are permutation equivalent.
Submitted 25 September, 2023;
originally announced September 2023.
-
Level of Awareness of PSU Bayambang Campus Students towards E learning Technologies
Authors:
Matthew John F. Sino Cruz,
Kim Eric B. Nanlabi,
Michael Ryan C. Peoro
Abstract:
The study assesses the awareness of PSU Bayambang Campus students regarding e-learning technologies. A Quantitative Research Approach was used, gathering data through a demographic questionnaire and ICT Resources assessment. The survey measured students' familiarity and knowledge of existing e-learning technologies. Around 52.50% of respondents were familiar with e-learning concepts, but their exposure and utilization levels need consideration. Technology, Support, and Users were identified as key factors influencing student awareness. Implementation can be improved through policies and resource provision. The researchers recommend integrating e-learning policies, providing ICT Resources and Infrastructure, and offering training for students and teachers. This research serves as a guide for policy design, enhancing the University's learning process and facilitating better learning and interaction.
Submitted 6 August, 2023;
originally announced August 2023.
-
RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023
Authors:
Aline Lima de Oliveira,
Cauê Addae da Silva Gomes,
Cecília Virginia Santos da Silva,
Charles Matheus de Sousa Alves,
Danilo Andrade Martins de Souza,
Driele Pires Ferreira Araújo Xavier,
Edgleyson Pereira da Silva,
Felipe Bezerra Martins,
Lucas Henrique Cavalcanti Santos,
Lucas Dias Maciel,
Matheus Paixão Gumercindo dos Santos,
Matheus Lafayette Vasconcelos,
Matheus Vinícius Teotonio do Nascimento Andrade,
João Guilherme Oliveira Carvalho de Melo,
João Pedro Souza Pereira de Moura,
José Ronald da Silva,
José Victor Silva Cruz,
Pedro Henrique Santana de Morais,
Pedro Paulo Salman de Oliveira,
Riei Joaquim Matos Rodrigues,
Roberto Costa Fernandes,
Ryan Vinicius Santos Morais,
Tamara Mayara Ramos Teobaldo,
Washington Igor dos Santos Silva,
Edna Natividade Silva Barros
Abstract:
RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-time Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) Division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.
Submitted 19 July, 2023;
originally announced July 2023.
-
NANOGrav meets Hot New Early Dark Energy and the origin of neutrino mass
Authors:
Juan S. Cruz,
Florian Niedermann,
Martin S. Sloth
Abstract:
It has recently been speculated that the NANOGrav observations point towards a first-order phase transition in the dark sector at the GeV scale [1]. Here, we show that such a phase transition might already have been predicted in the Hot New Early Dark Energy model (Hot NEDE) [2],[3]. There, it was argued that two dark sector phase transitions are the signature of neutrino mass generation through the inverse seesaw mechanism. In particular, an IR phase transition serves a double purpose by resolving the Hubble tension through an energy injection and generating the Majorana mass entry in the inverse seesaw mixing matrix. This usual NEDE phase transition is then accompanied by a UV counterpart, which generates the heavy Dirac mass entry in the inverse seesaw mass matrix of a right-handed neutrino. Here, we investigate if the UV phase transition of the Hot NEDE model can occur at the GeV scale in view of the recent NANOGrav observations.
Submitted 4 August, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
An Empirical Study of Impact of Solidity Compiler Updates on Vulnerabilities in Ethereum Smart Contracts
Authors:
Chihiro Kado,
Naoto Yanai,
Jason Paul Cruz,
Kyosuke Yamashita,
Shingo Okamura
Abstract:
Vulnerabilities of Ethereum smart contracts often cause serious financial damage. Whereas the Solidity compiler has been updated to prevent vulnerabilities, its effectiveness has not been revealed so far, to the best of our knowledge. In this paper, we shed light on the impact of compiler versions on vulnerabilities of Ethereum smart contracts. To this end, we collected 503,572 contracts with Solidity source code in the Ethereum blockchain and then analyzed their vulnerabilities. For three vulnerabilities with high severity, i.e., Locked Money, Using tx.origin, and Unchecked Call, we show that their appearance rates are decreased by virtue of major updates of the Solidity compiler. We then found the following four key insights. First, after the release of version 0.6, the appearance rate for Locked Money has decreased. Second, regardless of compiler updates, the appearance rate for Using tx.origin is significantly low. Third, although the appearance rate for Unchecked Call has decreased in version 0.8, it still remains high due to various factors, including code clones. Fourth, through analysis of code clones, our promising results show that the appearance rate for Unchecked Call can be further decreased by removing the code clones.
Submitted 7 June, 2023;
originally announced June 2023.
-
Multilingual Large Language Models Are Not (Yet) Code-Switchers
Authors:
Ruochen Zhang,
Samuel Cahyawijaya,
Jan Christian Blaise Cruz,
Genta Indra Winata,
Alham Fikri Aji
Abstract:
Multilingual Large Language Models (LLMs) have recently shown great capabilities in a wide range of tasks, exhibiting state-of-the-art performance through zero-shot or few-shot prompting methods. While there have been extensive studies on their abilities in monolingual tasks, the investigation of their potential in the context of code-switching (CSW), the practice of alternating languages within an utterance, remains relatively uncharted. In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification. Our results indicate that despite multilingual LLMs exhibiting promising outcomes in certain tasks using zero or few-shot prompting, they still underperform in comparison to fine-tuned models of much smaller scales. We argue that current "multilingualism" in LLMs does not inherently imply proficiency with code-switching texts, calling for future research to bridge this discrepancy.
Submitted 23 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Cold New Early Dark Energy pulls the trigger on the $H_0$ and $S_8$ tensions: a simultaneous solution to both tensions without new ingredients
Authors:
Juan S. Cruz,
Florian Niedermann,
Martin S. Sloth
Abstract:
In this work, we show that the Cold New Early Dark Energy (Cold NEDE) model in its original form can solve both the Hubble tension and the $S_8$ tension without adding any new ingredients at the fundamental level. So far, it was assumed that the trigger field in the Cold NEDE model is completely subdominant. However, relaxing this assumption and letting the trigger field contribute a mere $0.5\%$ of the total energy density leads to a resolution of the $S_8$ tension while simultaneously improving it as a solution to the $H_0$ tension. Fitting this model to baryonic acoustic oscillations, large-scale-structure, supernovae (including a SH0ES prior), and cosmic microwave background data, we report a preferred NEDE fraction of $f_\mathrm{NEDE}= 0.134^{+0.032}_{-0.025}$ ($68\%$ C.L.), lifting its Gaussian evidence for the first time above $5σ$ (up from $4 σ$ when the trigger contribution to dark matter is negligible). At the same time, we find the new concordance values $H_0 = 71.71 \pm 0.88 \,\mathrm{km}\, \mathrm{sec}^{-1}\, \mathrm{Mpc}^{-1}$ and $S_8 = 0.793 \pm 0.018$. Excluding large-scale structure data and the SH$_0$ES prior, both Gaussian tensions are reduced below the $2 σ$ level.
Submitted 15 May, 2023;
originally announced May 2023.
-
On LCP and checkable group codes over finite non-commutative Frobenius rings
Authors:
Sanjit Bhowmick,
Javier de la Cruz,
Edgar Martínez-Moro,
Anuradha Sharma
Abstract:
We provide a simple proof of the fact that, in a complementary pair of group codes over a finite non-commutative Frobenius ring, one code is equivalent to the other. We also explore this fact for checkable codes over the same type of alphabet.
Submitted 13 April, 2023; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages
Authors:
Zheng-Xin Yong,
Ruochen Zhang,
Jessica Zosa Forde,
Skyler Wang,
Arjun Subramonian,
Holy Lovenia,
Samuel Cahyawijaya,
Genta Indra Winata,
Lintang Sutawika,
Jan Christian Blaise Cruz,
Yin Lin Tan,
Long Phan,
Rowena Garcia,
Thamar Solorio,
Alham Fikri Aji
Abstract:
While code-mixing is a common linguistic practice in many parts of the world, collecting high-quality and low-cost code-mixed data remains a challenge for natural language processing (NLP) research. The recent proliferation of Large Language Models (LLMs) compels one to ask: how capable are these systems in generating code-mixed data? In this paper, we explore prompting multilingual LLMs in a zero-shot manner to generate code-mixed data for seven languages in South East Asia (SEA), namely Indonesian, Malay, Chinese, Tagalog, Vietnamese, Tamil, and Singlish. We find that publicly available multilingual instruction-tuned models such as BLOOMZ and Flan-T5-XXL are incapable of producing texts with phrases or clauses from different languages. ChatGPT exhibits inconsistent capabilities in generating code-mixed texts, wherein its performance varies depending on the prompt template and language pairing. For instance, ChatGPT generates fluent and natural Singlish texts (an English-based creole spoken in Singapore), but for the English-Tamil language pair, the system mostly produces grammatically incorrect or semantically meaningless utterances. Furthermore, it may erroneously introduce languages not specified in the prompt. Based on our investigation, existing multilingual LLMs exhibit a wide range of proficiency in code-mixed data generation for SEA languages. As such, we advise against using LLMs in this context without extensive human checks.
Submitted 12 September, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Profiling Cold New Early Dark Energy
Authors:
Juan S. Cruz,
Steen Hannestad,
Emil Brinch Holm,
Florian Niedermann,
Martin S. Sloth,
Thomas Tram
Abstract:
Recent interest in New Early Dark Energy (NEDE), a cosmological model with a vacuum energy component decaying in a triggered phase transition around recombination, has been sparked by its impact on the Hubble tension. Previous constraints on the model parameters were derived in a Bayesian framework with Markov-chain Monte Carlo (MCMC) methods. In this work, we instead perform a frequentist analysis using the profile likelihood in order to assess the impact of prior volume effects on the constraints. We constrain the maximal fraction of NEDE $f_\mathrm{NEDE}$, finding $f_\mathrm{NEDE}=0.076^{+0.040}_{-0.035}$ at $68 \%$ CL with our baseline dataset and similar constraints using either data from SPT-3G, ACT or full-shape large-scale structure, showing a preference over $Λ$CDM even in the absence of a SH0ES prior on $H_0$. While this is stronger evidence for NEDE than obtained with the corresponding Bayesian analysis, our constraints broadly match those obtained by fixing the NEDE trigger mass. Including the SH0ES prior on $H_0$, we obtain $f_\mathrm{NEDE}=0.136^{+0.024}_{-0.026}$ at $68 \%$ CL. Furthermore, we compare NEDE with the Early Dark Energy (EDE) model, finding similar constraints on the maximal energy density fractions and $H_0$ in the two models. At $68 \%$ CL in the NEDE model, we find $H_0 = 69.56^{+1.16}_{-1.29} \text{ km s}^{-1}\text{ Mpc}^{-1}$ with our baseline and $H_0 = 71.62^{+0.78}_{-0.76} \text{ km s}^{-1}\text{ Mpc}^{-1}$ when including the SH0ES measurement of $H_0$, thus corroborating previous conclusions that the NEDE model provides a considerable alleviation of the $H_0$ tension.
Submitted 15 February, 2023;
originally announced February 2023.
-
Ontology-based Context Aware Recommender System Application for Tourism
Authors:
Vitor T. Camacho,
José Cruz
Abstract:
In this work, a novel recommender system (RS) for tourism is presented. The RS is context-aware, as is now the rule in state-of-the-art recommender systems, and works on top of a tourism ontology that is used to group the different items being offered. The presented RS mixes different types of recommenders, creating an ensemble that changes on the basis of the RS's maturity: it starts from simple content-based recommendations and iteratively adds popularity, demographic, and collaborative filtering methods as rating density and user cardinality increase. The result is an RS that mutates during its lifetime and uses a tourism ontology and natural language processing (NLP) to correctly bin the items into specific item categories and meta categories in the ontology. This item classification facilitates the association between user preferences and items and allows the items being offered to be better classified and grouped, which in turn is particularly useful for context-aware filtering.
Submitted 29 December, 2022;
originally announced January 2023.
-
Abelian and consta-Abelian polyadic codes over affine algebras with a finite commutative chain coefficient ring
Authors:
Gülsüm Gözde Yılmazgüç,
Javier de la Cruz,
Edgar Martínez-Moro
Abstract:
In this paper, we define Abelian and consta-Abelian polyadic codes over rings defined as affine algebras over chain rings. For that aim, we use the classical construction via splittings and multipliers of the underlying Abelian group. We also derive some results on the structure of the associated polyadic codes and the number of codes under these conditions.
Submitted 29 December, 2022;
originally announced December 2022.
-
Twisted skew $G$-codes
Authors:
Angelot Behajaina,
Martino Borello,
Javier de la Cruz,
Wolfgang Willems
Abstract:
In this paper we investigate left ideals as codes in twisted skew group rings. The considered rings, which are often algebras over a finite field, allow us to detect many of the well-known codes. The presentation given here unifies the concepts of group codes, twisted group codes and skew group codes.
Submitted 26 December, 2022;
originally announced December 2022.
-
Stacking up electron-rich and electron-deficient monolayers to achieve extraordinary mid- to far-infrared excitonic absorption: Interlayer excitons in the C3B/C3N bilayer
Authors:
Zhao Tang,
Greis J. Cruz,
Fanhao Jia,
Yabei Wu,
Weiyi Xia,
Peihong Zhang
Abstract:
Our ability to efficiently detect and generate far-infrared (i.e., terahertz) radiation is vital in areas spanning from biomedical imaging to interstellar spectroscopy. Despite decades of intense research, bridging the terahertz gap between electronics and optics remains a major challenge due to the lack of robust materials that can efficiently operate in this frequency range, and two-dimensional (2D) type-II heterostructures may be ideal candidates to fill this gap. Herein, using highly accurate many-body perturbation theory within the GW plus Bethe-Salpeter equation approach, we predict that a type-II heterostructure consisting of an electron-rich C3N monolayer and an electron-deficient C3B monolayer can give rise to extraordinary optical activities in the mid- to far-infrared range. C3N and C3B are two graphene-derived 2D materials that have attracted increasing research attention. Although both C3N and C3B monolayers are moderate-gap 2D materials, and they only couple through the rather weak van der Waals interactions, the bilayer heterostructure surprisingly supports extremely bright, low-energy interlayer excitons with large binding energies of 0.2 ~ 0.4 eV, offering an ideal material with interlayer excitonic states for mid- to far-infrared applications at room temperature. We also investigate in detail the properties and formation mechanism of the inter- and intra-layer excitons.
Submitted 25 November, 2022;
originally announced November 2022.
-
Giant excitonic effects in bulk vacancy-ordered double perovskites
Authors:
Fan Zhang,
Weiwei Gao,
Greis J. Cruz,
Yi-yang Sun,
Peihong Zhang,
Jijun Zhao
Abstract:
Using first-principles GW plus Bethe-Salpeter equation calculations, we identify anomalously strong excitonic effects in several vacancy-ordered double perovskites Cs2MX6 (M = Ti, Zr; X = I, Br). Giant exciton binding energies about 1 eV are found in these moderate-gap, inorganic bulk semiconductors, pushing the limit of our understanding of electron-hole (e-h) interaction and exciton formation in solids. Not only are the exciton binding energies extremely large compared with any other moderate-gap bulk semiconductors, but they are also larger than typical 2D semiconductors with comparable quasiparticle gaps. Our calculated lowest bright exciton energy agrees well with the experimental optical band gap. The low-energy excitons closely resemble the Frenkel excitons in molecular crystals, as they are highly localized in a single [MX6]2- octahedron and extended in the reciprocal space. The weak dielectric screening effects and the nearly flat frontier electronic bands, which are derived from the weakly bonded [MX6]2- units, together explain the significant excitonic effects. Spin-orbit coupling effects play a crucial role in red-shifting the lowest bright exciton by mixing up spin-singlet and spin-triplet excitons, while exciton-phonon coupling effects have minor impacts on the strong exciton binding energies.
Submitted 9 November, 2022;
originally announced November 2022.
-
Narrow bandwidth active noise control for microphonics rejection in superconducting cavities at LCLS-II
Authors:
Andrea Bellandi,
Julien Branlard,
Jorge Diaz Cruz,
Sebastian Aderhold,
Andrew Benwell,
Axel Brachmann,
Sonya Hoobler,
Alessandro Ratti,
Dan Gonnella,
Janice Nelson,
Ryan Douglas Porter,
Lisa Zacarias
Abstract:
LCLS-II is an X-Ray Free Electron Laser (XFEL) commissioned in 2022, being the first Continuous Wave (CW) hard XFEL in the world to come into operation. To accelerate the electron beam to an energy of $\SI{4}{\giga \eV}$, 280 TESLA type superconducting RF (SRF) cavities are used. A loaded quality factor ($Q_L$) of $4 \times 10^7$ is used to drive the cavities at a power level of a few kilowatts. For this $Q_L$, the RF cavity bandwidth is 32 Hz. Therefore, keeping the cavity resonance frequency within such bandwidth is imperative to avoid a significant increase in the required drive power. In superconducting accelerators, resonance frequency variations are produced by mechanical microphonic vibrations of the cavities. One source of microphonic noise is rotary machinery such as vacuum pumps or HVAC equipment. A possible method to reject these disturbances is to use Narrowband Active Noise Control (NANC) techniques. These techniques were already tested at DESY/CMTB and Cornell/CBETA. This proceeding presents the implementation of a NANC controller adapted to the LCLS-II Low Level RF (LLRF) control system. Tests showing the rejection of LCLS-II microphonic disturbances are also presented.
Submitted 11 October, 2022; v1 submitted 28 September, 2022;
originally announced September 2022.
-
A grounded perspective on New Early Dark Energy using ACT, SPT, and BICEP/Keck
Authors:
Juan S. Cruz,
Florian Niedermann,
Martin S. Sloth
Abstract:
We examine further the ability of the New Early Dark Energy model (NEDE) to resolve the current tension between the Cosmic Microwave Background (CMB) and local measurements of $H_0$ and the consequences for inflation. We perform new Bayesian analyses, including the current datasets from the ground-based CMB telescopes Atacama Cosmology Telescope (ACT), the South Pole Telescope (SPT), and the BICEP/Keck telescopes, employing an updated likelihood for the local measurements coming from the S$H_0$ES collaboration. Using the S$H_0$ES prior on $H_0$, the combined analysis with Baryonic Acoustic Oscillations (BAO), Pantheon, Planck and ACT improves the best-fit by $Δχ^2 = -15.9$ with respect to $Λ$CDM, favors a non-zero fractional contribution of NEDE, $f_{\rm NEDE} > 0$, by $4.8σ$, and gives a best-fit value for the Hubble constant of $H_0 = 72.09$ km/s/Mpc (mean $71.48_{-0.81}^{+0.79}$ with $68\%$ C.L.). A similar analysis using SPT instead of ACT yields consistent results with a $Δχ^2 = - 23.1$ over $Λ$CDM, a preference for non-zero $f_{\rm NEDE}$ of $4.7σ$ and a best-fit value of $H_0=71.77$ km/s/Mpc (mean $71.43_{-0.84}^{+0.84}$ with $68\%$ C.L.). We also provide the constraints on the inflation parameters $r$ and $n_s$ coming from NEDE, including the BICEP/Keck 2018 data, and show that the allowed upper value on the tensor-scalar ratio is consistent with the $Λ$CDM bound, but, as also originally found, with a more blue scalar spectrum implying that the simplest curvaton model is now favored over the Starobinsky inflation model.
Submitted 24 February, 2023; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Instability of bubble expansion at zero temperature
Authors:
Wen-Yuan Ai,
Juan S. Cruz,
Bjorn Garbrecht,
Carlos Tamarit
Abstract:
In the context of false vacuum decay at zero temperature, it is well known that bubbles expand with uniform proper acceleration. We show that this uniformly accelerating expansion suffers from an instability related to the bubble size. This can be observed in Minkowski spacetime as a tachyonic mode in the spectrum of fluctuations for the energy functional in the reference frame in which the uniformly accelerating bubble wall appears static. In such a frame, arbitrarily small perturbations cause an amplifying departure from the static wall solution. This implies that the nucleated bubble is not a critical point of the energy functional in the rest frame of nucleation but becomes one in the accelerating frame. The aforementioned instability for vacuum bubbles can be related to the well-known instability for the nucleated critical static bubbles during finite-temperature phase transitions in the rest frame of the plasma. It is therefore proposed that zero-temperature vacuum decays as seen from accelerating frames have a dual description in terms of finite-temperature phase transitions.
Submitted 15 February, 2023; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Linear and Nonlinear Partial Integro-Differential Equations arising from Finance
Authors:
Jose Cruz,
Maria Grossinho,
Daniel Sevcovic,
Cyril Izuchukwu Udeani
Abstract:
The purpose of this review paper is to present our recent results on nonlinear and nonlocal mathematical models arising from modern financial mathematics. It is based on four papers written jointly by J. Cruz, M. Grossinho, D. Sevcovic, and C. Udeani, as well as parts of the PhD thesis by J. Cruz. We investigated linear and nonlinear partial integro-differential equations (PIDEs) arising from option pricing and portfolio selection problems and studied the systematic relationships between the PIDEs and option pricing theory and Black--Scholes models. First, we relax the liquid and complete market assumptions and extend the models that study the market's illiquidity to the case where the underlying asset price follows a Lévy stochastic process with jumps. Then, we establish the corresponding PIDE for option pricing under suitable assumptions. The qualitative properties of solutions to nonlocal linear and nonlinear PIDEs are presented using the theory of abstract semilinear parabolic equations in the scale of Bessel potential spaces. The existence and uniqueness of solutions to the PIDE for a general class of the so-called admissible Lévy measures satisfying suitable growth conditions at infinity and origin are also established in the multidimensional space.
Submitted 23 July, 2022;
originally announced July 2022.
-
The limits of the strong $CP$ problem
Authors:
Wen-Yuan Ai,
Juan S. Cruz,
Bjorn Garbrecht,
Carlos Tamarit
Abstract:
While $CP$ violation has never been observed in the strong interactions, the QCD Lagrangian admits a $CP$-odd topological interaction proportional to the so-called $θ$ angle, which weighs the contributions to the partition function from different topological sectors. The observational bounds are usually interpreted as demanding a severe tuning of $θ$ against the phases of the quark masses, which constitutes the strong $CP$ problem. Here we report on recent challenges to this view based on a careful treatment of boundary conditions in the path integral and of the limit of infinite spacetime volume, which leads to $θ$ dropping out of fermion correlation functions and becoming unobservable, implying that $CP$ is preserved in QCD.
Submitted 30 May, 2022;
originally announced May 2022.
-
Quantum and Gradient Corrections to False Vacuum Decay on a de Sitter Background
Authors:
Juan S. Cruz,
Stephan Brandt,
Maximilian Urban
Abstract:
We study the effects of a fixed de Sitter geometry background in scenarios of false vacuum decay. It is currently understood that bubble nucleation processes associated with first order phase transitions are particularly important in cosmology. The geometry of spacetime complicates the interpretation of the decay rate of a metastable vacuum. However, the effects of curvature can still be studied in the particular case where backreaction is neglected. We compute the imaginary part of the action in de Sitter space, including the one-loop and the gradient corrections. We use two independent methodologies and quantify the size of the corrections without any assumptions on the thickness of the wall of the scalar background configuration.
Submitted 9 September, 2022; v1 submitted 20 May, 2022;
originally announced May 2022.
-
OMU: A Probabilistic 3D Occupancy Mapping Accelerator for Real-time OctoMap at the Edge
Authors:
Tianyu Jia,
En-Yu Yang,
Yu-Shun Hsiao,
Jonathan Cruz,
David Brooks,
Gu-Yeon Wei,
Vijay Janapa Reddi
Abstract:
Autonomous machines (e.g., vehicles, mobile robots, drones) require sophisticated 3D mapping to perceive the dynamic environment. However, maintaining a real-time 3D map is expensive both in terms of compute and memory requirements, especially for resource-constrained edge machines. Probabilistic OctoMap is a reliable and memory-efficient 3D dense map model to represent the full environment, with dynamic voxel node pruning and expansion capacity. This paper presents the first efficient accelerator solution, i.e. OMU, to enable real-time probabilistic 3D mapping at the edge. To improve the performance, the input map voxels are updated via parallel PE units for data parallelism. Within each PE, the voxels are stored using a specially developed data structure in parallel memory banks. In addition, a pruning address manager is designed within each PE unit to reuse the pruned memory addresses. The proposed 3D mapping accelerator is implemented and evaluated using a commercial 12 nm technology. Compared to the ARM Cortex-A57 CPU in the Nvidia Jetson TX2 platform, the proposed accelerator achieves up to 62$\times$ performance and 708$\times$ energy efficiency improvement. Furthermore, the accelerator provides 63 FPS throughput, more than 2$\times$ higher than a real-time requirement, enabling real-time perception for 3D mapping.
Submitted 6 May, 2022;
originally announced May 2022.
-
Public key cryptography based on skew dihedral group rings
Authors:
Javier de la Cruz,
Edgar Martínez-Moro,
Ricardo Villanueva-Polanco
Abstract:
In this paper, we propose to use a skew dihedral group ring given by the group $D_{2n}$ and the finite field $\mathbb{F}_{q^2}$ for public-key cryptography. Using the ambient space $\mathbb{F}_{q^{2}}^θ D_{2n}$ and a group homomorphism $θ: D_{2n} \rightarrow \mathrm{Aut}(\mathbb{F}_{q^2})$, we introduce a key exchange protocol and present an analysis of its security. Moreover, we explore the properties of the resulting skew group ring $\mathbb{F}_{q^{2}}^θ D_{2n}$, exploiting them to enhance our key exchange protocol. We also introduce a probabilistic public-key scheme derived from our key exchange protocol and obtain a key encapsulation mechanism (KEM) by applying a well-known generic transformation to our public-key scheme. Finally, we present a proof-of-concept implementation of our cryptographic constructions. To the best of our knowledge, this is the first paper that proposes a skew dihedral group ring for public-key cryptography.
Submitted 5 May, 2022;
originally announced May 2022.
-
Automatic Hardware Trojan Insertion using Machine Learning
Authors:
Jonathan Cruz,
Pravin Gaikwad,
Abhishek Nair,
Prabuddha Chakraborty,
Swarup Bhunia
Abstract:
Due to the current horizontal business model that promotes increasing reliance on untrusted third-party Intellectual Properties (IPs), CAD tools, and design facilities, hardware Trojan attacks have become a serious threat to the semiconductor industry. Development of effective countermeasures against hardware Trojan attacks requires: (1) fast and reliable exploration of the viable Trojan attack space for a given design and (2) a suite of high-quality Trojan-inserted benchmarks that meet specific standards. The latter has become essential for the development and evaluation of design/verification solutions to achieve quantifiable assurance against Trojan attacks. While existing static benchmarks provide a baseline for comparing different countermeasures, they only enumerate a limited number of handcrafted Trojans from the complete Trojan design space. To accomplish these dual objectives, in this paper, we present MIMIC, a novel AI-guided framework for automatic Trojan insertion, which can create a large population of valid Trojans for a given design by mimicking the properties of a small set of known Trojans. While there exist tools to automatically insert Trojan instances using fixed Trojan templates, they cannot analyze known Trojan attacks for creating new instances that accurately capture the threat model. MIMIC works in two major steps: (1) it analyzes structural and functional features of existing Trojan populations in a multi-dimensional space to train machine learning models and generate a large number of "virtual Trojans" of the given design, (2) next, it binds them into the design by matching their functional/structural properties with suitable nets of the internal logic structure. We have developed a complete tool flow for MIMIC, extensively evaluated the framework by exploring several use-cases, and quantified its effectiveness to demonstrate highly promising results.
Submitted 18 April, 2022;
originally announced April 2022.
-
A review of path following control strategies for autonomous robotic vehicles: theory, simulations, and experiments
Authors:
Nguyen Hung,
Francisco Rego,
Joao Quintas,
Joao Cruz,
Marcelo Jacinto,
David Souto,
Andre Potes,
Luis Sebastiao,
Antonio Pascoal
Abstract:
This article presents an in-depth review of the topic of path following for autonomous robotic vehicles, with a specific focus on vehicle motion in two-dimensional space (2D). From a control system standpoint, path following can be formulated as the problem of stabilizing a path following error system that describes the dynamics of position and possibly orientation errors of a vehicle with respect to a path, with the errors defined in an appropriate reference frame. In spite of the large variety of path following methods described in the literature, we show that, in principle, most of them can be categorized in two groups: stabilization of the path following error system expressed either in the vehicle's body frame or in a frame attached to a "reference point" moving along the path, such as a Frenet-Serret (F-S) frame or a Parallel Transport (P-T) frame. With this observation, we provide a unified formulation that is simple but general enough to cover many methods available in the literature. We then discuss the advantages and disadvantages of each method, comparing them from the design and implementation standpoint. We further show experimental results of the path following methods obtained from field trials with under-actuated and fully-actuated autonomous marine vehicles. In addition, we introduce open-source Matlab and Gazebo/ROS simulation toolboxes that are helpful in testing path following methods prior to their integration in the combined guidance, navigation, and control systems of autonomous vehicles.
Submitted 14 April, 2022;
originally announced April 2022.
-
Towards Automatic Construction of Filipino WordNet: Word Sense Induction and Synset Induction Using Sentence Embeddings
Authors:
Dan John Velasco,
Axel Alba,
Trisha Gail Pelagio,
Bryce Anthony Ramirez,
Unisse Chua,
Briane Paul Samson,
Jan Christian Blaise Cruz,
Charibeth Cheng
Abstract:
Wordnets are indispensable tools for various natural language processing applications. Unfortunately, wordnets get outdated, and producing or updating wordnets can be slow and costly in terms of time and resources. This problem intensifies for low-resource languages. This study proposes a method for word sense induction and synset induction using only two linguistic resources, namely, an unlabeled corpus and a sentence embeddings-based language model. The resulting sense inventory and synonym sets can be used in automatically creating a wordnet. We applied this method to a corpus of Filipino text. The sense inventory and synsets were evaluated by matching them with the sense inventory of the machine-translated Princeton WordNet, as well as comparing the synsets to the Filipino WordNet. This study empirically shows that 30% of the induced word senses are valid and 40% of the induced synsets are valid, of which 20% are novel synsets.
Submitted 19 October, 2023; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Using Synthetic Data for Conversational Response Generation in Low-resource Settings
Authors:
Gabriel Louis Tan,
Adrian Paule Ty,
Schuyler Ng,
Denzel Adrian Co,
Jan Christian Blaise Cruz,
Charibeth Cheng
Abstract:
Response generation is a task in natural language processing (NLP) where a model is trained to respond to human statements. Conversational response generators take this one step further with the ability to respond within the context of previous responses. While there are existing techniques for training such models, they all require an abundance of conversational data which are not always available for low-resource languages. In this research, we make three contributions. First, we released the first Filipino conversational dataset collected from a popular Philippine online forum, which we named the PEx Conversations Dataset. Second, we introduce a data augmentation (DA) methodology for Filipino data by employing a Tagalog RoBERTa model to increase the size of the existing corpora. Lastly, we published the first Filipino conversational response generator capable of generating responses related to the previous 3 responses. With the supplementary synthetic data, we were able to improve the performance of the response generator by up to 12.2% in BERTScore, 10.7% in perplexity, and 11.7% in content word usage as compared to training with zero synthetic data.
Submitted 6 April, 2022;
originally announced April 2022.
-
Persistent revivals in a system of trapped bosonic atoms
Authors:
Carlos Diaz Mejia,
Javier de la Cruz,
Sergio Lerma-Hernandez,
Jorge G. Hirsch
Abstract:
Dynamical signatures of quantum chaos are observed in the survival probability of different initial states, in a system of cold atoms trapped in a linear chain with site noise and open boundary conditions. It is shown that chaos is present in the region of small disorder, at intermediate energies. The study is performed with different numbers of sites and atoms (7, 8, and 9), focusing on the case where the particle density is one. States of the occupation basis with energies in the chaotic region are evolved to long times.
Remarkable differences in the behaviour of the survival probability are found for states with different energy-eigenbasis participation ratio (PR). Whereas those with large PR clearly exhibit the characteristic random-matrix correlation hole before equilibration, those with small PR present a marginal or even no correlation hole which is replaced by revivals lasting up to the stage of equilibration, suggesting a connection with the quantum scarring phenomenon.
Submitted 12 December, 2023; v1 submitted 16 March, 2022;
originally announced March 2022.
-
Stochastic evaluation of four-component relativistic second-order many-body perturbation energies: A potentially quadratic-scaling correlation method
Authors:
J. César Cruz,
Jorge Garza,
Takeshi Yanai,
So Hirata
Abstract:
A second-order many-body perturbation correction to the relativistic Dirac-Hartree-Fock energy is evaluated stochastically by integrating 13-dimensional products of four-component spinors and Coulomb potentials. The integration in the real space of electron coordinates is carried out by the Monte Carlo (MC) method with the Metropolis sampling, whereas the MC integration in the imaginary-time domain is performed by the inverse-CDF (cumulative distribution function) method. The computational cost to reach a given relative statistical error for spatially compact but heavy molecules is observed to be no worse than cubic and possibly quadratic with the number of electrons or basis functions. This is a vast improvement over the quintic scaling of the conventional, deterministic second-order many-body perturbation method. The algorithm is also easily and efficiently parallelized with demonstrated 92% strong scalability going from 64 to 4096 processors for a fixed job size.
Submitted 26 April, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Cellulose-Based Reflective Liquid Crystal Films as Optical Filters and Solar Gain Regulators
Authors:
Joshua A. De La Cruz,
Qingkun Liu,
Bohdan Senyuk,
Allister W. Frazier,
Karthik Peddireddy,
Ivan I. Smalyukh
Abstract:
Many promising approaches for designing interactions of synthetic materials with light involve solid optical monocrystals and nanofabricated photonic crystal structures with spatially periodic variations of refractive index. Although their high costs limit current technological applications, remarkably, such photonic and optically anisotropic materials have also evolved throughout nature and enable narrow or broad-band spectral reflection of light. Here we use self-assembly of biomaterial cellulose nanocrystals to obtain three-layer films with helicoidal and nematic-like organization of the cellulose nanoparticles, which mimics naturally occurring polarization-insensitive reflectors found in the wings of Plusiotis resplendens beetles. These films were characterized with polarized optical microscopy and circular dichroism spectrometry, as well as scanning and transmission electron microscopies. These films exhibit high reflectivity tunable within the visible and near-infrared regions of the optical spectrum and may find applications ranging from color filters to smart cloth designs and in solar-gain-regulating building technologies.
Submitted 2 January, 2022;
originally announced January 2022.
-
Aerogel from sustainably grown bacterial cellulose pellicle as thermally insulative film for building envelope
Authors:
Blaise Fleury,
Eldho Abraham,
Joshua A. De La Cruz,
Varun S. Chandrasekar,
Bohdan Senyuk,
Qingkun Liu,
Vladyslav Cherpak,
Sungoh Park,
Jan Bart ten Hove,
Ivan I. Smalyukh
Abstract:
Improving building energy performance requires the development of new highly insulative materials. An affordable retrofitting solution comprising a thin film could improve the resistance to heat flow in both residential and commercial buildings and reduce overall energy consumption. Here we propose cellulose aerogel films formed from pellicles produced by the bacteria Gluconacetobacter hansenii as insulation materials. We studied the impact of density and nanostructure on the aerogels' thermal properties. Thermal conductivity as low as 13 mW/(K*m) was measured for native pellicle-based aerogels dried as-is with minimal post-treatment. The use of waste from the beer brewing industry as a solution to grow the pellicle maintained the cellulose yield obtained with standard Hestrin-Schramm medium, making our product more affordable and sustainable. In the future, our work can be extended through further diversification of the sources of substrate among food wastes, facilitating larger potential production and applications.
Submitted 2 January, 2022;
originally announced January 2022.
-
Public key cryptography based on twisted dihedral group algebras
Authors:
Javier de la Cruz,
Ricardo Villanueva-Polanco
Abstract:
In this paper, we propose to use a twisted dihedral group algebra for public-key cryptography. For this, we introduce a new $2$-cocycle $α_λ$ to twist the dihedral group algebra. Using the ambient space $\mathbb{F}^{α_λ} D_{2n}$, we then introduce a key exchange protocol and present an analysis of its security. Moreover, we explore the properties of the resulting twisted algebra $\mathbb{F}^{α_λ}D_{2n}$, exploiting them to enhance our key exchange protocol. We also introduce a probabilistic public-key scheme derived from our key-exchange protocol and obtain a key encapsulation mechanism (KEM) by applying a well-known generic transformation to our public-key scheme. Finally, we present a proof-of-concept implementation of the resulting key encapsulation mechanism.
Submitted 14 December, 2021;
originally announced December 2021.
-
Software Variants for Hardware Trojan Detection and Resilience in COTS Processors
Authors:
Mahmudul Hasan,
Jonathan Cruz,
Prabuddha Chakraborty,
Swarup Bhunia,
Tamzidul Hoque
Abstract:
The commercial off-the-shelf (COTS) component based ecosystem provides an attractive system design paradigm due to the drastic reduction in development time and cost compared to custom solutions. However, it brings in a growing concern of trustworthiness arising from the possibility of embedded malicious logic, or hardware Trojans in COTS components. Existing trust-verification approaches are typically not applicable to COTS hardware due to the absence of golden models and the lack of observability of internal signals. In this work, we propose a novel approach for runtime Trojan detection and resilience in untrusted COTS processors through judicious modifications in software. The proposed approach does not rely on any hardware redundancy or architectural modification and hence seamlessly integrates with the COTS-based system design process. Trojan resilience is achieved through the execution of multiple functionally equivalent software variants. We have developed and implemented a solution for compiler-based automatic generation of program variants, metric-guided selection of variants, and their integration in a single executable. To evaluate the proposed approach, we first analyzed the effectiveness of program variants in avoiding the activation of a random pool of Trojans. By implementing several Trojans in an OpenRISC 1000 processor, we analyzed the detectability and resilience during Trojan activation in both single and multiple variants. We also present delay and code size overhead for the automatically generated variants for several programs and discuss future research directions to reduce the overhead.
Submitted 1 December, 2021;
originally announced December 2021.
-
Third-Party Hardware IP Assurance against Trojans through Supervised Learning and Post-processing
Authors:
Pravin Gaikwad,
Jonathan Cruz,
Prabuddha Chakraborty,
Swarup Bhunia,
Tamzidul Hoque
Abstract:
System-on-chip (SoC) developers increasingly rely on pre-verified hardware intellectual property (IP) blocks acquired from untrusted third-party vendors. These IPs might contain hidden malicious functionalities or hardware Trojans to compromise the security of the fabricated SoCs. Recently, supervised machine learning (ML) techniques have shown promising capability in identifying nets of potential Trojans in third party IPs (3PIPs). However, they bring several major challenges. First, they do not guide us to an optimal choice of features that reliably covers diverse classes of Trojans. Second, they require multiple Trojan-free/trusted designs to insert known Trojans and generate a trained model. Even if a set of trusted designs are available for training, the suspect IP could be inherently very different from the set of trusted designs, which may negatively impact the verification outcome. Third, these techniques only identify a set of suspect Trojan nets that require manual intervention to understand the potential threat. In this paper, we present VIPR, a systematic machine learning (ML) based trust verification solution for 3PIPs that eliminates the need for trusted designs for training. We present a comprehensive framework, associated algorithms, and a tool flow for obtaining an optimal set of features, training a targeted machine learning model, detecting suspect nets, and identifying Trojan circuitry from the suspect nets. We evaluate the framework on several Trust-Hub Trojan benchmarks and provide a comparative analysis of detection performance across different trained models, selection of features, and post-processing techniques. The proposed post-processing algorithms reduce false positives by up to 92.85%.
Submitted 29 November, 2021;
originally announced November 2021.
-
Data Processing Matters: SRPH-Konvergen AI's Machine Translation System for WMT'21
Authors:
Lintang Sutawika,
Jan Christian Blaise Cruz
Abstract:
In this paper, we describe the submission of the joint Samsung Research Philippines-Konvergen AI team for the WMT'21 Large Scale Multilingual Translation Task - Small Track 2. We submit a standard Seq2Seq Transformer model to the shared task without any training or architecture tricks, relying mainly on the strength of our data preprocessing techniques to boost performance. Our final submission model scored 22.92 average BLEU on the FLORES-101 devtest set, and scored 22.97 average BLEU on the contest's hidden test set, ranking us sixth overall. Despite using only a standard Transformer, our model ranked first in Indonesian to Javanese, showing that data preprocessing matters as much as, if not more than, cutting-edge model architectures and training techniques.
Submitted 19 November, 2021;
originally announced November 2021.
-
Improving Large-scale Language Models and Resources for Filipino
Authors:
Jan Christian Blaise Cruz,
Charibeth Cheng
Abstract:
In this paper, we improve on existing language resources for the low-resource Filipino language in two ways. First, we outline the construction of the TLUnified dataset, a large-scale pretraining corpus that serves as an improvement over smaller existing pretraining datasets for the language in terms of scale and topic variety. Second, we pretrain new Transformer language models following the RoBERTa pretraining technique to supplant existing models trained with small corpora. Our new RoBERTa models show significant improvements over existing Filipino models in three benchmark datasets with an average gain of 4.47% test accuracy across the three classification tasks of varying difficulty.
Submitted 11 November, 2021;
originally announced November 2021.