Search | arXiv e-print repository

arXiv:2405.20014

subMFL: Compatiple subModel Generation for Federated Learning in Device Heterogenous Environment

Authors: Zeyneddin Oz, Ceylan Soygul Oz, Abdollah Malekjafarian, Nima Afraz, Fatemeh Golpayegani

Abstract: Federated Learning (FL) is commonly used in systems with distributed and heterogeneous devices that have access to varying amounts of data and diverse computing and storage capacities. The FL training process enables such devices to update the weights of a shared model locally using their local data; a trusted central server then combines all of those models to generate a global model. In this way, a global model is generated while the data remains local to devices to preserve privacy. However, training large models such as Deep Neural Networks (DNNs) on resource-constrained devices can take a prohibitively long time and consume a large amount of energy. In the current process, low-capacity devices are excluded from training, although they might have access to unseen data. To overcome this challenge, we propose a model compression approach that enables heterogeneous devices with varying computing capacities to participate in the FL process. In our approach, the server shares a dense model with all devices to train it. Afterwards, the trained model is gradually compressed to obtain submodels with varying levels of sparsity, to be used as suitable initial global models for resource-constrained devices that were not capable of training the first dense model. This results in an increased participation rate of resource-constrained devices while the transferred weights from the previous round of training are preserved. Our validation experiments show that, despite reaching about 50 per cent global sparsity, the generated submodels maintain their accuracy and can be shared to increase participation by around 50 per cent.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 12 pages, 7 figures, European Conference on Parallel Processing, pp. 52–64, Springer, 2023
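
The subMFL abstract above describes gradually compressing a trained dense model into submodels with increasing sparsity. A minimal sketch of that idea follows, assuming global magnitude pruning as the compression criterion; the abstract does not name the criterion, and the function names, layer names, and weights here are hypothetical illustrations rather than the authors' implementation.

```python
from typing import Dict, Iterable, List

Layers = Dict[str, List[float]]


def global_magnitude_prune(layers: Layers, sparsity: float) -> Layers:
    """Return a copy with the smallest-magnitude weights set to zero.

    Assumption: weights are ranked across ALL layers (global pruning),
    so `sparsity` is the fraction of zeros in the whole model.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    # Rank every weight in the model by absolute value.
    ranked = sorted(
        (abs(w), name, i)
        for name, ws in layers.items()
        for i, w in enumerate(ws)
    )
    n_prune = int(len(ranked) * sparsity)
    pruned = {name: list(ws) for name, ws in layers.items()}
    for _, name, i in ranked[:n_prune]:
        pruned[name][i] = 0.0
    return pruned


def sparsity_of(layers: Layers) -> float:
    """Fraction of weights that are exactly zero."""
    total = sum(len(ws) for ws in layers.values())
    zeros = sum(1 for ws in layers.values() for w in ws if w == 0.0)
    return zeros / total


def generate_submodels(dense: Layers, levels: Iterable[float]) -> Dict[float, Layers]:
    """Prune step by step so each submodel is nested in the previous one."""
    submodels: Dict[float, Layers] = {}
    current = dense
    for level in sorted(levels):
        current = global_magnitude_prune(current, level)
        submodels[level] = current
    return submodels


# Hypothetical two-layer model with eight weights in total.
dense_model = {
    "fc1": [0.9, -0.05, 0.4, -0.7],
    "fc2": [0.02, 0.6, -0.3, 0.1],
}
submodels = generate_submodels(dense_model, [0.25, 0.5])
```

In this sketch a server could hand the 0.5-sparsity submodel to devices that could not train the dense model, while the surviving weights keep their trained values.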

arXiv:2405.07778

doi 10.1016/j.eswa.2024.124123

A Comprehensive Analysis of Static Word Embeddings for Turkish

Authors: Karahan Sarıtaş, Cahid Arda Öz, Tunga Güngör

Abstract: Word embeddings are fixed-length, dense and distributed word representations that are used in natural language processing (NLP) applications. There are two basic types of word embedding models: non-contextual (static) models and contextual models. The former generate a single embedding for a word regardless of its context, while the latter produce distinct embeddings for a word based on the specific contexts in which it appears. Many works compare contextual and non-contextual embedding models within their respective groups in different languages. However, very few studies compare the models in these two groups with each other, and there is no such study for Turkish. Such a comparison necessitates converting contextual embeddings into static embeddings. In this paper, we compare and evaluate the performance of several contextual and non-contextual models in both intrinsic and extrinsic evaluation settings for Turkish. We make a fine-grained comparison by analyzing the syntactic and semantic capabilities of the models separately. The results of the analyses provide insights into the suitability of different embedding models for different types of NLP tasks. We also build a Turkish word embedding repository comprising the embedding models used in this work, which may serve as a valuable resource for researchers and practitioners in the field of Turkish NLP. We make the word embeddings, scripts, and evaluation datasets publicly available.

Submitted 13 May, 2024; originally announced May 2024.

Journal ref: Expert Systems with Applications Volume 252, Part A, 15 October 2024, 124123
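
The abstract above notes that comparing the two model groups requires converting contextual embeddings into static ones. One common way to do this is to average a word's contextual vectors over all of its occurrences. The sketch below assumes that pooling strategy; the abstract does not say which conversion the authors use, and the function name and toy vectors are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def pool_static_embeddings(
    occurrences: Iterable[Tuple[str, List[float]]],
) -> Dict[str, List[float]]:
    """Average each word's contextual vectors into one static vector.

    `occurrences` yields (word, contextual_vector) pairs, one per
    appearance of the word in a corpus. Mean pooling is an assumption
    made for illustration.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for word, vec in occurrences:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
            counts[word] = 0
        elif len(sums[word]) != len(vec):
            raise ValueError(f"inconsistent vector size for {word!r}")
        for i, x in enumerate(vec):
            sums[word][i] += x
        counts[word] += 1
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


# Toy 2-dimensional vectors: "yüz" appears in two different contexts.
toy_occurrences = [
    ("yüz", [1.0, 0.0]),
    ("yüz", [0.0, 1.0]),
    ("ev", [2.0, 4.0]),
]
static_vectors = pool_static_embeddings(toy_occurrences)
```

The pooled vectors can then be evaluated with the same intrinsic and extrinsic tasks as natively static embeddings.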

arXiv:2303.13070

Development of proton beam irradiation system for the NA65/DsTau experiment

Authors: Shigeki Aoki, Akitaka Ariga, Tomoko Ariga, Nikolaos Charitonidis, Sergey Dmitrievsky, Radu Dobre, Elena Firu, Yury Gornushkin, Ali Murat Guler, Daiki Hayakawa, Koichi Kodama, Masahiro Komatsu, Umut Kose, Madalina Mihaela Miloi, Manato Miura, Mitsuhiro Nakamura, Toshiyuki Nakano, Alina-Tania Neagu, Toranosuke Okumura, Canay Oz, Hiroki Rokujo, Osamu Sato, Svetlana Vasina, Junya Yoshida, Masahiro Yoshimoto, et al. (1 additional author not shown)

Abstract: The tau neutrino is the least studied lepton of the Standard Model (SM). The NA65/DsTau experiment aims to investigate $D_s$, the parent particle of the $ν_τ$, using a nuclear emulsion-based detector and to decrease the systematic uncertainty of the $ν_τ$ flux prediction from over 50% to 10% for future beam dump experiments. In the experiment, the emulsion detectors are exposed to the CERN SPS 400 GeV proton beam. To provide optimal conditions for the reconstruction of interactions, the protons are required to be uniformly distributed over the detector's surface with an average density of $10^5~\rm{cm^{-2}}$ and a fluctuation of less than 10%. To meet this requirement, we developed a new proton irradiation system called the target mover. The new target mover provided irradiation with a proton density of $0.98\times10^5~\rm{cm^{-2}}$ and a density fluctuation of $2.0\pm 0.3$% in the DsTau 2021 run.

Submitted 3 April, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: 9 pages, 16 figures
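
The DsTau abstract above states a uniformity requirement: an average proton density of $10^5~\rm{cm^{-2}}$ with a fluctuation below 10%. The sketch below shows one way such a check could be expressed, assuming the fluctuation is the standard deviation of per-cell densities divided by their mean; the abstract does not define the quantity, and the function names and sample values are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence


def relative_fluctuation(densities: Sequence[float]) -> float:
    """Population standard deviation divided by the mean.

    Assumption: `densities` holds proton counts per cm^2 measured in
    equal-sized cells of the detector surface.
    """
    mean = fmean(densities)
    if mean == 0:
        raise ValueError("mean density is zero")
    return pstdev(densities) / mean


def meets_requirement(densities: Sequence[float], max_fluctuation: float = 0.10) -> bool:
    """True if the relative fluctuation is below the stated 10% limit."""
    return relative_fluctuation(densities) < max_fluctuation


# Hypothetical per-cell densities in units of cm^-2.
sample_cells = [0.90e5, 1.00e5, 1.10e5]
sample_fluctuation = relative_fluctuation(sample_cells)
```

For these sample values the relative fluctuation is about 8%, so the check passes the 10% limit.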

Showing 1–3 of 3 results for author: Oz, C