Search | arXiv e-print repository

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra , et al. (90 additional authors not shown)

Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset… ▽ More We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered publicly available web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with a 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench). Moreover, we also introduce phi-3-vision, a 4.2 billion parameter model based on phi-3-mini with strong reasoning capabilities for image and text prompts. △ Less

Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 19 pages

arXiv:2306.11644 [pdf, other]

Textbooks Are All You Need

Authors: Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

Abstract: We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of ``textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accu… ▽ More We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of ``textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval. △ Less

Submitted 2 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: 26 pages; changed color scheme of plot. fixed minor typos and added couple clarifications

arXiv:2301.13671 [pdf, ps, other]

doi 10.1007/978-3-030-93420-0_11

Enhancing Hyper-To-Real Space Projections Through Euclidean Norm Meta-Heuristic Optimization

Authors: Luiz C. F. Ribeiro, Mateus Roder, Gustavo H. de Rosa, Leandro A. Passos, João P. Papa

Abstract: The continuous computational power growth in the last decades has made solving several optimization problems significant to humankind a tractable task; however, tackling some of them remains a challenge due to the overwhelming amount of candidate solutions to be evaluated, even by using sophisticated algorithms. In such a context, a set of nature-inspired stochastic methods, called meta-heuristic… ▽ More The continuous computational power growth in the last decades has made solving several optimization problems significant to humankind a tractable task; however, tackling some of them remains a challenge due to the overwhelming amount of candidate solutions to be evaluated, even by using sophisticated algorithms. In such a context, a set of nature-inspired stochastic methods, called meta-heuristic optimization, can provide robust approximate solutions to different kinds of problems with a small computational burden, such as derivative-free real function optimization. Nevertheless, these methods may converge to inadequate solutions if the function landscape is too harsh, e.g., enclosing too many local optima. Previous works addressed this issue by employing a hypercomplex representation of the search space, like quaternions, where the landscape becomes smoother and supposedly easier to optimize. Under this approach, meta-heuristic computations happen in the hypercomplex space, whereas variables are mapped back to the real domain before function evaluation. Despite this latter operation being performed by the Euclidean norm, we have found that after the optimization procedure has finished, it is usually possible to obtain even better solutions by employing the Minkowski $p$-norm instead and fine-tuning $p$ through an auxiliary sub-problem with neglecting additional cost and no hyperparameters. Such behavior was observed in eight well-established benchmarking functions, thus fostering a new research direction for hypercomplex meta-heuristic optimization. △ Less

Submitted 31 January, 2023; originally announced January 2023.

arXiv:2212.11119 [pdf, ps, other]

doi 10.1016/j.patcog.2021.108098

A survey on text generation using generative adversarial networks

Authors: Gustavo Henrique de Rosa, João Paulo Papa

Abstract: This work presents a thorough review concerning recent studies and text generation advancements using Generative Adversarial Networks. The usage of adversarial learning for text generation is promising as it provides alternatives to generate the so-called "natural" language. Nevertheless, adversarial text generation is not a simple task as its foremost architecture, the Generative Adversarial Netw… ▽ More This work presents a thorough review concerning recent studies and text generation advancements using Generative Adversarial Networks. The usage of adversarial learning for text generation is promising as it provides alternatives to generate the so-called "natural" language. Nevertheless, adversarial text generation is not a simple task as its foremost architecture, the Generative Adversarial Networks, were designed to cope with continuous information (image) instead of discrete data (text). Thus, most works are based on three possible options, i.e., Gumbel-Softmax differentiation, Reinforcement Learning, and modified training objectives. All alternatives are reviewed in this survey as they present the most recent approaches for generating text using adversarial-based techniques. The selected works were taken from renowned databases, such as Science Direct, IEEEXplore, Springer, Association for Computing Machinery, and arXiv, whereas each selected work has been critically analyzed and assessed to present its objective, methodology, and experimental results. △ Less

Submitted 20 December, 2022; originally announced December 2022.

Journal ref: Pattern Recognition 119 (2021): 108098

arXiv:2212.09447 [pdf, other]

Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning

Authors: Gustavo H. de Rosa, Mateus Roder, João Paulo Papa, Claudio F. G. dos Santos

Abstract: Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained via derivative-based optimizers, such as the Stochastic Gradient Descent, leading to possib… ▽ More Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained via derivative-based optimizers, such as the Stochastic Gradient Descent, leading to possible local optimum entrapments and inhibiting them from achieving proper performances. A bio-inspired alternative to traditional optimization techniques, denoted as meta-heuristic, has received significant attention due to its simplicity and ability to avoid local optimums imprisonment. In this work, we propose to use meta-heuristic techniques to fine-tune pre-trained weights, exploring additional regions of the search space, and improving their effectiveness. The experimental evaluation comprises two classification tasks (image and text) and is assessed under four literature datasets. Experimental results show nature-inspired algorithms' capacity in exploring the neighborhood of pre-trained weights, achieving superior results than their counterpart pre-trained architectures. Additionally, a thorough analysis of distinct architectures, such as Multi-Layer Perceptron and Recurrent Neural Networks, attempts to visualize and provide more precise insights into the most critical weights to be fine-tuned in the learning process. △ Less

Submitted 19 December, 2022; originally announced December 2022.

arXiv:2211.17045 [pdf, other]

doi 10.1109/SSCI50451.2021.9660128

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Authors: Mateus Roder, Jurandy Almeida, Gustavo H. de Rosa, Leandro A. Passos, André L. D. Rossi, João P. Papa

Abstract: In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks… ▽ More In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process as training complex models over large datasets are expensive and time-consuming. Such a problem is even more evident when dealing with video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map the knowledge from one domain to another, to ease the training burden, yet most of them operate over individual or small blocks of frames. This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model, denoted as Spectral Deep Belief Network. Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process. The experimental results conducted over two public video dataset, the HMDB-51 and the UCF-101, depict the effectiveness of the proposed model and its reduced computational burden when compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks. △ Less

Submitted 30 November, 2022; originally announced November 2022.

arXiv:2210.03251 [pdf, other]

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Authors: Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Caio Cesar Teodoro Mendes, Gustavo Henrique de Rosa, Shital Shah

Abstract: Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been quite effective. In this work, we study the more cha… ▽ More Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been quite effective. In this work, we study the more challenging open-domain setting consisting of low frequency user prompt patterns (or broad prompts, e.g., prompt about 93rd academy awards) and demonstrate the effectiveness of character-based language models. We study this problem under memory-constrained settings (e.g., edge devices and smartphones), where character-based representation is effective in reducing the overall model size (in terms of parameters). We use WikiText-103 benchmark to simulate broad prompts and demonstrate that character models rival word models in exact match accuracy for the autocomplete task, when controlled for the model size. For instance, we show that a 20M parameter character model performs similar to an 80M parameter word model in the vanilla setting. We further propose novel methods to improve character models by incorporating inductive bias in the form of compositional information and representation transfer from large word models. Datasets and code used in this work are available at https://github.com/UBC-NLP/char_autocomplete. △ Less

Submitted 7 June, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

Comments: SustaiNLP 2023

arXiv:2203.02094 [pdf, other]

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

Authors: Mojan Javaheripi, Gustavo H. de Rosa, Subhabrata Mukherjee, Shital Shah, Tomasz L. Religa, Caio C. T. Mendes, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

Abstract: The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak memory utilization and latency is non-trivial. This is exacerbated by the proliferation of various hardware. We leverage the somewhat surprising empir… ▽ More The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak memory utilization and latency is non-trivial. This is exacerbated by the proliferation of various hardware. We leverage the somewhat surprising empirical observation that the number of decoder parameters in autoregressive Transformers has a high rank correlation with task performance, irrespective of the architecture topology. This observation organically induces a simple Neural Architecture Search (NAS) algorithm that uses decoder parameters as a proxy for perplexity without need for any model training. The search phase of our training-free algorithm, dubbed Lightweight Transformer Search (LTS), can be run directly on target devices since it does not require GPUs. Using on-target-device measurements, LTS extracts the Pareto-frontier of perplexity versus any hardware performance cost. We evaluate LTS on diverse devices from ARM CPUs to NVIDIA GPUs and two popular autoregressive Transformer backbones: GPT-2 and Transformer-XL. Results show that the perplexity of 16-layer GPT-2 and Transformer-XL can be achieved with up to 1.5x, 2.5x faster runtime and 1.2x, 2.0x lower peak memory utilization. When evaluated in zero and one-shot settings, LTS Pareto-frontier models achieve higher average accuracy compared to the 350M parameter OPT across 14 tasks, with up to 1.6x lower latency. LTS extracts the Pareto-frontier in under 3 hours while running on a commodity laptop. We effectively remove the carbon footprint of hundreds of GPU hours of training during search, offering a strong simple baseline for future NAS methods in autoregressive language modeling. △ Less

Submitted 17 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

arXiv:2202.03854 [pdf, ps, other]

Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification

Authors: Gustavo Henrique de Rosa, Mateus Roder, João Paulo Papa

Abstract: Machine Learning has attracted considerable attention throughout the past decade due to its potential to solve far-reaching tasks, such as image classification, object recognition, anomaly detection, and data forecasting. A standard approach to tackle such applications is based on supervised learning, which is assisted by large sets of labeled data and is conducted by the so-called classifiers, su… ▽ More Machine Learning has attracted considerable attention throughout the past decade due to its potential to solve far-reaching tasks, such as image classification, object recognition, anomaly detection, and data forecasting. A standard approach to tackle such applications is based on supervised learning, which is assisted by large sets of labeled data and is conducted by the so-called classifiers, such as Logistic Regression, Decision Trees, Random Forests, and Support Vector Machines, among others. An alternative to traditional classifiers is the parameterless Optimum-Path Forest (OPF), which uses a graph-based methodology and a distance measure to create arcs between nodes and hence sets of trees, responsible for conquering the nodes, defining their labels, and sha** the forests. Nevertheless, its performance is strongly associated with an appropriate distance measure, which may vary according to the dataset's nature. Therefore, this work proposes a comparative study over a wide range of distance measures applied to the supervised Optimum-Path Forest classification. The experimental results are conducted using well-known literature datasets and compared across benchmarking classifiers, illustrating OPF's ability to adapt to distinct domains. △ Less

Submitted 8 February, 2022; originally announced February 2022.

Comments: 16 pages, 2 figures

MSC Class: 68T01 ACM Class: I.2.0

arXiv:2106.11828 [pdf, ps, other]

Speeding Up OPFython with Numba

Authors: Gustavo H. de Rosa, João Paulo Papa

Abstract: A graph-inspired classifier, known as Optimum-Path Forest (OPF), has proven to be a state-of-the-art algorithm comparable to Logistic Regressors, Support Vector Machines in a wide variety of tasks. Recently, its Python-based version, denoted as OPFython, has been proposed to provide a more friendly framework and a faster prototy** environment. Nevertheless, Python-based algorithms are slower tha… ▽ More A graph-inspired classifier, known as Optimum-Path Forest (OPF), has proven to be a state-of-the-art algorithm comparable to Logistic Regressors, Support Vector Machines in a wide variety of tasks. Recently, its Python-based version, denoted as OPFython, has been proposed to provide a more friendly framework and a faster prototy** environment. Nevertheless, Python-based algorithms are slower than their counterpart C-based algorithms, impacting their performance when confronted with large amounts of data. Therefore, this paper proposed a simple yet highly efficient speed up using the Numba package, which accelerates Numpy-based calculations and attempts to increase the algorithm's overall performance. Experimental results showed that the proposed approach achieved better results than the naïve Python-based OPF and speeded up its distance measurement calculation. △ Less

Submitted 22 June, 2021; originally announced June 2021.

Comments: 12 pages, 1 figure

MSC Class: 68T10 ACM Class: I.2.1; I.2.5

arXiv:2101.06741 [pdf, other]

doi 10.1109/TETCI.2020.3043764

Energy-based Dropout in Restricted Boltzmann Machines: Why not go random

Authors: Mateus Roder, Gustavo H. de Rosa, Victor Hugo C. de Albuquerque, André L. D. Rossi, João P. Papa

Abstract: Deep learning architectures have been widely fostered throughout the last years, being used in a wide range of applications, such as object recognition, image reconstruction, and signal processing. Nevertheless, such models suffer from a common problem known as overfitting, which limits the network from predicting unseen data effectively. Regularization approaches arise in an attempt to address su… ▽ More Deep learning architectures have been widely fostered throughout the last years, being used in a wide range of applications, such as object recognition, image reconstruction, and signal processing. Nevertheless, such models suffer from a common problem known as overfitting, which limits the network from predicting unseen data effectively. Regularization approaches arise in an attempt to address such a shortcoming. Among them, one can refer to the well-known Dropout, which tackles the problem by randomly shutting down a set of neurons and their connections according to a certain probability. Therefore, this approach does not consider any additional knowledge to decide which units should be disconnected. In this paper, we propose an energy-based Dropout (E-Dropout) that makes conscious decisions whether a neuron should be dropped or not. Specifically, we design this regularization method by correlating neurons and the model's energy as an importance level for further applying it to energy-based models, such as Restricted Boltzmann Machines (RBMs). The experimental results over several benchmark datasets revealed the proposed approach's suitability compared to the traditional Dropout and the standard RBMs. △ Less

Submitted 17 January, 2021; originally announced January 2021.

arXiv:2101.05652 [pdf, other]

doi 10.1016/j.asoc.2020.106453

A Nature-Inspired Feature Selection Approach based on Hypercomplex Information

Authors: Gustavo H. de Rosa, João Paulo Papa, Xin-She Yang

Abstract: Feature selection for a given model can be transformed into an optimization task. The essential idea behind it is to find the most suitable subset of features according to some criterion. Nature-inspired optimization can mitigate this problem by producing compelling yet straightforward solutions when dealing with complicated fitness functions. Additionally, new mathematical representations, such a… ▽ More Feature selection for a given model can be transformed into an optimization task. The essential idea behind it is to find the most suitable subset of features according to some criterion. Nature-inspired optimization can mitigate this problem by producing compelling yet straightforward solutions when dealing with complicated fitness functions. Additionally, new mathematical representations, such as quaternions and octonions, are being used to handle higher-dimensional spaces. In this context, we are introducing a meta-heuristic optimization framework in a hypercomplex-based feature selection, where hypercomplex numbers are mapped to real-valued solutions and then transferred onto a boolean hypercube by a sigmoid function. The intended hypercomplex feature selection is tested for several meta-heuristic algorithms and hypercomplex representations, achieving results comparable to some state-of-the-art approaches. The good results achieved by the proposed approach make it a promising tool amongst feature selection research. △ Less

Submitted 14 January, 2021; originally announced January 2021.

Comments: 17 pages, 7 figures

ACM Class: I.2.0

Journal ref: APPLIED SOFT COMPUTING; v. 94, SEP 2020

arXiv:2101.01042 [pdf, ps, other]

Fast Ensemble Learning Using Adversarially-Generated Restricted Boltzmann Machines

Authors: Gustavo H. de Rosa, Mateus Roder, João P. Papa

Abstract: Machine Learning has been applied in a wide range of tasks throughout the last years, ranging from image classification to autonomous driving and natural language processing. Restricted Boltzmann Machine (RBM) has received recent attention and relies on an energy-based structure to model data probability distributions. Notwithstanding, such a technique is susceptible to adversarial manipulation, i… ▽ More Machine Learning has been applied in a wide range of tasks throughout the last years, ranging from image classification to autonomous driving and natural language processing. Restricted Boltzmann Machine (RBM) has received recent attention and relies on an energy-based structure to model data probability distributions. Notwithstanding, such a technique is susceptible to adversarial manipulation, i.e., slightly or profoundly modified data. An alternative to overcome the adversarial problem lies in the Generative Adversarial Networks (GAN), capable of modeling data distributions and generating adversarial data that resemble the original ones. Therefore, this work proposes to artificially generate RBMs using Adversarial Learning, where pre-trained weight matrices serve as the GAN inputs. Furthermore, it proposes to sample copious amounts of matrices and combine them into ensembles, alleviating the burden of training new models'. Experimental results demonstrate the suitability of the proposed approach under image reconstruction and image classification tasks, and describe how artificial-based ensembles are alternatives to pre-training vast amounts of RBMs. △ Less

Submitted 4 January, 2021; originally announced January 2021.

Comments: 26 pages, 7 figures

ACM Class: I.2.0

arXiv:2003.07443 [pdf, other]

Learnergy: Energy-based Machine Learners

Authors: Mateus Roder, Gustavo Henrique de Rosa, João Paulo Papa

Abstract: Throughout the last years, machine learning techniques have been broadly encouraged in the context of deep learning architectures. An exciting algorithm denoted as Restricted Boltzmann Machine relies on energy- and probabilistic-based nature to tackle the most diverse applications, such as classification, reconstruction, and generation of images and signals. Nevertheless, one can see they are not… ▽ More Throughout the last years, machine learning techniques have been broadly encouraged in the context of deep learning architectures. An exciting algorithm denoted as Restricted Boltzmann Machine relies on energy- and probabilistic-based nature to tackle the most diverse applications, such as classification, reconstruction, and generation of images and signals. Nevertheless, one can see they are not adequately renowned compared to other well-known deep learning techniques, e.g., Convolutional Neural Networks. Such behavior promotes the lack of researches and implementations around the literature, co** with the challenge of sufficiently comprehending these energy-based systems. Therefore, in this paper, we propose a Python-inspired framework in the context of energy-based architectures, denoted as Learnergy. Essentially, Learnergy is built upon PyTorch to provide a more friendly environment and a faster prototy** workspace and possibly the usage of CUDA computations, speeding up their computational time. △ Less

Submitted 23 September, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

Comments: 12 pages, 12 figures

MSC Class: 68T07 ACM Class: I.2.0; I.5.0

arXiv:2001.10420 [pdf, other]

doi 10.1016/j.simpa.2021.100113

OPFython: A Python-Inspired Optimum-Path Forest Classifier

Authors: Gustavo Henrique de Rosa, João Paulo Papa, Alexandre Xavier Falcão

Abstract: Machine learning techniques have been paramount throughout the last years, being applied in a wide range of tasks, such as classification, object recognition, person identification, and image segmentation. Nevertheless, conventional classification algorithms, e.g., Logistic Regression, Decision Trees, and Bayesian classifiers, might lack complexity and diversity, not suitable when dealing with rea… ▽ More Machine learning techniques have been paramount throughout the last years, being applied in a wide range of tasks, such as classification, object recognition, person identification, and image segmentation. Nevertheless, conventional classification algorithms, e.g., Logistic Regression, Decision Trees, and Bayesian classifiers, might lack complexity and diversity, not suitable when dealing with real-world data. A recent graph-inspired classifier, known as the Optimum-Path Forest, has proven to be a state-of-the-art technique, comparable to Support Vector Machines and even surpassing it in some tasks. This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython, where all of its functions and classes are based upon the original C language implementation. Additionally, as OPFython is a Python-based library, it provides a more friendly environment and a faster prototy** workspace than the C language. △ Less

Submitted 30 July, 2021; v1 submitted 28 January, 2020; originally announced January 2020.

Comments: 14 pages, 11 figures

MSC Class: 68T01 ACM Class: I.2.0; I.5.0

Journal ref: Software Impacts, 1 (2021), Article 100113

arXiv:1912.13002 [pdf, other]

Opytimizer: A Nature-Inspired Python Optimizer

Authors: Gustavo H. de Rosa, Douglas Rodrigues, João P. Papa

Abstract: Optimization aims at selecting a feasible set of parameters in an attempt to solve a particular problem, being applied in a wide range of applications, such as operations research, machine learning fine-tuning, and control engineering, among others. Nevertheless, traditional iterative optimization methods use the evaluation of gradients and Hessians to find their solutions, not being practical due… ▽ More Optimization aims at selecting a feasible set of parameters in an attempt to solve a particular problem, being applied in a wide range of applications, such as operations research, machine learning fine-tuning, and control engineering, among others. Nevertheless, traditional iterative optimization methods use the evaluation of gradients and Hessians to find their solutions, not being practical due to their computational burden and when working with non-convex functions. Recent biological-inspired methods, known as meta-heuristics, have arisen in an attempt to fulfill these problems. Even though they do not guarantee to find optimal solutions, they usually find a suitable solution. In this paper, we proposed a Python-based meta-heuristic optimization framework denoted as Opytimizer. Several methods and classes are implemented to provide a user-friendly workspace among diverse meta-heuristics, ranging from evolutionary- to swarm-based techniques. △ Less

Submitted 2 December, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

Showing 1–16 of 16 results for author: de Rosa, G