-
The Mediating Effect of Blockchain Technology on the Cryptocurrency Purchase Intention
Authors:
İbrahim Halil Efendioğlu,
Gökhan Akel,
Bekir Değirmenci,
Dilek Aydoğdu,
Kamile Elmasoğlu,
Hande Begüm Bumin Doyduk,
Arzu Şeker,
Hatice Bahçe
Abstract:
Cryptocurrencies, which enable secure digital asset transfers without a central authority, are attracting increasing interest. With the growing number of global and Turkish investors, it is evident that interest in digital assets will continue to rise sustainably, even in the face of financial fluctuations. However, it remains uncertain whether consumers perceive blockchain technology's ease of use and usefulness when purchasing cryptocurrencies. This study aims to explain blockchain technology's perceived ease of use and usefulness in cryptocurrency purchases by considering factors such as quality customer service, reduced costs, efficiency, and reliability. To achieve this goal, data were obtained from 463 participants interested in cryptocurrencies in different regions of Turkey. The data were analyzed using SPSS Process Macro programs. The analysis results indicate that perceived ease of use and usefulness mediate the effects of customer service, reduced costs, efficiency, and security on purchase intention.
Submitted 28 September, 2023;
originally announced October 2023.
-
Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All
Authors:
Eylon Gueta,
Avi Shmidman,
Shaltiel Shmidman,
Cheyn Shmuel Shmidman,
Joshua Guedalia,
Moshe Koppel,
Dan Bareket,
Amit Seker,
Reut Tsarfaty
Abstract:
We present a new pre-trained language model (PLM) for modern Hebrew, termed AlephBERTGimmel, which employs a much larger vocabulary (128K items) than previous standard Hebrew PLMs. We perform a contrastive analysis of this model against all previous Hebrew PLMs (mBERT, heBERT, AlephBERT) and assess the effects of larger vocabularies on task performance. Our experiments show that larger vocabularies lead to fewer splits, and that reducing splits is better for model performance across different tasks. All in all, this new model achieves new SOTA on all available Hebrew benchmarks, including Morphological Segmentation, POS Tagging, Full Morphological Analysis, NER, and Sentiment Analysis. Subsequently, we advocate for PLMs that are larger not only in terms of number of layers or training data, but also in terms of their vocabulary. We release the new model publicly for unrestricted use.
Submitted 15 May, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
-
AlephBERT: A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With
Authors:
Amit Seker,
Elron Bandel,
Dan Bareket,
Idan Brusilovsky,
Refael Shaked Greenfeld,
Reut Tsarfaty
Abstract:
Large Pre-trained Language Models (PLMs) have become ubiquitous in the development of language understanding technology and lie at the heart of many artificial intelligence advances. While advances reported for English using PLMs are unprecedented, reported advances using PLMs in Hebrew are few and far between. The problem is twofold. First, Hebrew resources available for training NLP models are not of the same order of magnitude as their English counterparts. Second, there are no accepted tasks and benchmarks on which to evaluate the progress of Hebrew PLMs. In this work we aim to remedy both aspects. First, we present AlephBERT, a large pre-trained language model for Modern Hebrew, which is trained on a larger vocabulary and a larger dataset than any Hebrew PLM before. Second, using AlephBERT we present new state-of-the-art results on multiple Hebrew tasks and benchmarks, including Segmentation, Part-of-Speech Tagging, full Morphological Tagging, Named-Entity Recognition, and Sentiment Analysis. We make our AlephBERT model publicly available, providing a single point of entry for the development of Hebrew NLP applications.
Submitted 8 April, 2021;
originally announced April 2021.
-
New developer metrics: Are comments as crucial as code contributions?
Authors:
Abdulkadir Şeker,
Banu Diri,
Halil Arslan
Abstract:
Open-source code development has become widespread in recent years. As a result, open-source software platforms have also become popular, and millions of developers from diverse locations are able to contribute to the same projects. On these platforms, a variety of information about developers is obtained from user activity. This information is used in the form of developer metrics to solve a variety of challenges. In this study, we proposed new developer metrics, including commenting and issue-related activity, that require less information. We concluded that commenting on any feature of a project can be as valuable as code contribution. In addition to the quantitative metrics, metrics based only on the existence of an activity have also been shown to offer considerable results. We saw that issues were crucial in identifying user contributions. Even if a developer contributes to only one issue on a project, the relation between the developer and the project is tight. The hit scores are relatively low because of the sparsity problem of our dataset; even so, we believe that we have presented improvable and remarkable new developer metrics.
Submitted 27 July, 2020; v1 submitted 29 June, 2020;
originally announced June 2020.
-
Summarising Big Data: Common GitHub Dataset for Software Engineering Challenges
Authors:
Abdulkadir Şeker,
Banu Diri,
Halil Arslan
Abstract:
In open-source software development environments, the textual, numerical, and relationship-based data generated are of interest to researchers. Various data sets are available for this data, which is frequently used in areas such as software engineering and natural language processing. However, since these data sets contain all the data in the environment, processing terabytes of data becomes a problem. For this reason, almost all of the studies using GitHub data use data filtered according to certain criteria. In this context, using a different data set in each study makes comparing the accuracy of the studies quite difficult. To solve this problem, a common dataset was created and shared with researchers, allowing work on many software engineering problems.
Submitted 8 June, 2020;
originally announced June 2020.
-
From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)?
Authors:
Reut Tsarfaty,
Dan Bareket,
Stav Klein,
Amit Seker
Abstract:
It has been exactly a decade since the first establishment of SPMRL, a research initiative unifying multiple research efforts to address the peculiar challenges of Statistical Parsing for Morphologically-Rich Languages (MRLs). Here we reflect on parsing MRLs in that decade, highlight the solutions and lessons learned for the architectural, modeling and lexical challenges in the pre-neural era, and argue that similar challenges re-emerge in neural architectures for MRLs. We then aim to offer a climax, suggesting that incorporating symbolic ideas proposed in SPMRL terms into today's neural architectures has the potential to push NLP for MRLs to a new level. We sketch strategies for designing Neural Models for MRLs (NMRL), and showcase preliminary support for these strategies by investigating the task of multi-tagging in Hebrew, a morphologically-rich, high-fusion language.
Submitted 4 May, 2020;
originally announced May 2020.
-
Open Source Software Development Challenges: A Systematic Literature Review on GitHub
Authors:
Abdulkadir Şeker,
Banu Diri,
Halil Arslan,
Mehmet Fatih Amasyalı
Abstract:
Git is used as the distributed version control system for many open-source software projects. One Git-based service, GitHub, is the most common code hosting and repository service for open-source software projects. For researchers who study software engineering, the content hosted on these platforms provides much valuable data. There are several ways to obtain GitHub data, such as GitHub Archive, the GitHub API, or GHTorrent. Among these options, GHTorrent is the most widely known and used GitHub dataset in the literature. Although there are some review studies about software engineering challenges across the GitHub platform, no review of GHTorrent dataset-specific research is available. In this study, the 172 studies that use GHTorrent as a data source were categorized within the scope of open source software development challenges and a systematic literature review was carried out. Moreover, the pros and cons of the dataset have been indicated, and the issues on which the literature focuses and the open challenges have been noted.
Submitted 27 July, 2020; v1 submitted 24 March, 2020;
originally announced March 2020.
-
Determination of Photonuclear Reaction Cross-Sections on stable p-shell Nuclei by Using Deep Neural Networks
Authors:
Serkan Akkoyun,
Hüseyin Kaya,
Abdulkadir Şeker,
Saliha Yeşilyurt
Abstract:
Photonuclear reactions, which are induced by high-energy photons, are one of the important types of reactions in nuclear structure studies. In this reaction, a target material is bombarded by photons with energies in the gamma-ray range, and the photons can statistically be absorbed by a nucleus in the target material. To get rid of its excess energy, the excited target nucleus can first emit protons, neutrons, alphas, and light particles according to the separation energy thresholds. After this emission process, an unstable nucleus is generally formed. By investigating the products formed after photonuclear reactions, nuclear structure information can be obtained. In the present work, (γ, n) photonuclear reaction cross-sections on stable p-shell nuclei have been estimated by using the neural network method. The main purpose of this study is to find the neural network structures that give the best estimations of the cross-sections and to compare them with each other and with available literature data. According to the results, the method is convenient for this task.
Submitted 16 March, 2020;
originally announced March 2020.
-
What's Wrong with Hebrew NLP? And How to Make it Right
Authors:
Reut Tsarfaty,
Amit Seker,
Shoval Sadde,
Stav Klein
Abstract:
For languages with simple morphology, such as English, automatic annotation pipelines such as spaCy or Stanford's CoreNLP successfully serve projects in academia and the industry. For many morphologically-rich languages (MRLs), similar pipelines show sub-optimal performance that limits their applicability for text analysis in research and the industry. The sub-optimal performance is mainly due to errors in early morphological disambiguation decisions, which cannot be recovered later in the pipeline, yielding incoherent annotations on the whole. In this paper we describe the design and use of the Onlp suite, a joint morpho-syntactic parsing framework for processing Modern Hebrew texts. The joint inference over morphology and syntax substantially limits error propagation and leads to high accuracy. Onlp provides rich and expressive output which already serves diverse academic and commercial needs. Its accompanying online demo further serves educational activities, introducing Hebrew NLP intricacies to researchers and non-researchers alike.
Submitted 15 August, 2019;
originally announced August 2019.