Search | arXiv e-print repository

Is ChatGPT a game changer for geocoding -- a benchmark for geocoding address parsing techniques

Authors: Zhengcong Yin, Diya Li, Daniel W. Goldberg

Abstract: The remarkable success of GPT models across various tasks, including toponymy recognition, motivates us to assess the performance of the GPT-3 model in the geocoding address parsing task. To ensure that the evaluation more accurately mirrors performance in real-world scenarios with diverse user input qualities, and to resolve the pressing need for a 'gold standard' evaluation dataset for geocoding systems, we introduce a benchmark dataset of low-quality address descriptions synthesized from human input patterns mined from actual input logs of a geocoding system in production. This dataset has 21 different input errors and variations; contains over 239,000 address records that are uniquely selected from streets across all 50 U.S. states and D.C.; and consists of three subsets to be used as training, validation, and testing sets. Building on this, we train and gauge the performance of the GPT-3 model in extracting address components, contrasting its performance with transformer-based and LSTM-based models. The evaluation results indicate that the Bidirectional LSTM-CRF model achieves the best performance, ahead of the transformer-based models and the GPT-3 model. Transformer-based models demonstrate results very comparable to those of the Bidirectional LSTM-CRF model. The GPT-3 model, though trailing in performance, showcases potential in the address parsing task with few-shot examples, exhibiting room for improvement with additional fine-tuning. We open source the code and data of this benchmark so that researchers can utilize it for future model development or extend it to evaluate similar tasks, such as document geocoding.

Submitted 15 December, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

arXiv:2301.13631 [pdf]

TopoBERT: Plug and Play Toponym Recognition Module Harnessing Fine-tuned BERT

Authors: Bing Zhou, Lei Zou, Yingjie Hu, Yi Qiang, Daniel Goldberg

Abstract: Extracting precise geographical information from textual contents is crucial in a plethora of applications. For example, during hazardous events, a robust and unbiased toponym extraction framework can provide an avenue to tie the location concerned to the topic discussed by news media posts and pinpoint humanitarian help requests or damage reports from social media. Early studies have leveraged rule-based, gazetteer-based, deep learning, and hybrid approaches to address this problem. However, the performance of existing tools is deficient in supporting operations like emergency rescue, which relies on fine-grained, accurate geographic information. The emerging pretrained language models can better capture the underlying characteristics of text information, including place names, offering a promising pathway to optimize toponym recognition to underpin practical applications. In this paper, TopoBERT, a toponym recognition module based on a one-dimensional Convolutional Neural Network (CNN1D) and Bidirectional Encoder Representation from Transformers (BERT), is proposed and fine-tuned. Three datasets (CoNLL2003-Train, Wikipedia3000, WNUT2017) are leveraged to tune the hyperparameters, discover the best training strategy, and train the model. Another two datasets (CoNLL2003-Test and Harvey2017) are used to evaluate the performance. Three distinct classifiers (linear, multi-layer perceptron, and CNN1D) are benchmarked to determine the optimal model architecture. TopoBERT achieves state-of-the-art performance (f1-score=0.865) compared to the other five baseline models and can be applied to diverse toponym recognition tasks without additional training.

Submitted 3 February, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 8 Pages, 6 figures

arXiv:2012.00058 [pdf]

PMLB v1.0: An open source dataset collection for benchmarking machine learning methods

Authors: Joseph D. Romano, Trang T. Le, William La Cava, John T. Gregg, Daniel J. Goldberg, Natasha L. Ray, Praneel Chakraborty, Daniel Himmelstein, Weixuan Fu, Jason H. Moore

Abstract: Motivation: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows. Results: This release of PMLB provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community. Availability: PMLB is available at https://github.com/EpistasisLab/pmlb. Python and R interfaces for PMLB can be installed through the Python Package Index and Comprehensive R Archive Network, respectively.

Submitted 6 April, 2021; v1 submitted 30 November, 2020; originally announced December 2020.

Comments: 4 pages, 1 figure. *: These authors contributed equally

ACM Class: H.2.8

arXiv:1807.02227 [pdf, ps, other]

Polynomial time algorithm for optimal stopping with fixed accuracy

Authors: David A. Goldberg, Yilun Chen

Abstract: The problem of high-dimensional path-dependent optimal stopping (OS) is important to multiple academic communities and applications. Modern OS tasks often have a large number of decision epochs, and complicated non-Markovian dynamics, making them especially challenging. Standard approaches, often relying on ADP, duality, deep learning and other heuristics, have shown strong empirical performance, yet have limited rigorous guarantees (which may scale exponentially in the problem parameters and/or require previous knowledge of basis functions or additional continuity assumptions). Although past work has placed these problems in the framework of computational complexity and polynomial-time approximability, those analyses were limited to simple one-dimensional problems. For long-horizon complex OS problems, is a polynomial time solution even theoretically possible? We prove that given access to an efficient simulator of the underlying information process, and fixed accuracy epsilon, there exists an algorithm that returns an epsilon-optimal solution (both stopping policies and approximate optimal values) with computational complexity scaling polynomially in the time horizon and underlying dimension. Like the first polynomial-time (approximation) algorithms for several other well-studied problems, our theoretical guarantees are polynomial yet impractical. Our approach is based on a novel expansion for the optimal value which may be of independent interest.

Submitted 14 May, 2024; v1 submitted 5 July, 2018; originally announced July 2018.

arXiv:1301.1762 [pdf, ps, other]

Second-order Markov random fields for independent sets on the infinite Cayley tree

Authors: David A. Goldberg

Abstract: Recently, there has been significant interest in understanding the properties of Markov random fields (M.r.f.) defined on the independent sets of sparse graphs. When these M.r.f. are restricted to pairwise interactions (i.e. hardcore model), much progress has been made. However, considerably less is known in the presence of higher-order interactions, which arise e.g. in the analysis of independent sets with special properties and the study of resource-constrained communication networks. In this paper, we further our understanding of such models by analyzing M.r.f. with second-order interactions on the independent sets of the infinite Cayley tree. We prove that the associated Gibbsian specification satisfies the celebrated FKG Inequality whenever the local potentials defining the Hamiltonian satisfy a log-convexity condition. Under this condition, we give necessary and sufficient conditions for the existence of a unique infinite-volume Gibbs measure in terms of an explicit system of equations, prove the existence of a phase transition, and give explicit bounds on the associated critical activity, which we prove to exhibit a certain robustness. For potentials which are small perturbations of those coinciding with the hardcore model at the critical activity, we characterize whether the resulting specification has a unique infinite-volume Gibbs measure in terms of whether these perturbations satisfy an explicit linear inequality. Our analysis reveals an interesting non-monotonicity with regard to biasing towards excluded nodes with no included neighbors.

Submitted 18 June, 2015; v1 submitted 9 January, 2013; originally announced January 2013.

arXiv:1001.5454 [pdf, other]

doi 10.1007/s10955-010-0018-5

Non-Equilibrium Statistical Physics of Currents in Queuing Networks

Authors: Vladimir Y. Chernyak, Michael Chertkov, David A. Goldberg, Konstantin Turitsyn

Abstract: We consider a stable open queuing network as a steady non-equilibrium system of interacting particles. The network is completely specified by its underlying graphical structure, type of interaction at each node, and the Markovian transition rates between nodes. For such systems, we ask the question "What is the most likely way for large currents to accumulate over time in a network?", where time is large compared to the system correlation time scale. We identify two interesting regimes. In the first regime, in which the accumulation of currents over time exceeds the expected value by a small to moderate amount (moderate large deviation), we find that the large-deviation distribution of currents is universal (independent of the interaction details), and there is no long-time and averaged over time accumulation of particles (condensation) at any nodes. In the second regime, in which the accumulation of currents over time exceeds the expected value by a large amount (severe large deviation), we find that the large-deviation current distribution is sensitive to interaction details, and there is a long-time accumulation of particles (condensation) at some nodes. The transition between the two regimes can be described as a dynamical second order phase transition. We illustrate these ideas using the simple, yet non-trivial, example of a single node with feedback.

Submitted 19 June, 2010; v1 submitted 29 January, 2010; originally announced January 2010.

Comments: 26 pages, 5 figures

Report number: LA-UR 10-00419

arXiv:0912.0338 [pdf, ps, other]

Correlation Decay in Random Decision Networks

Authors: David Gamarnik, David Goldberg, Theophane Weber

Abstract: We consider a decision network on an undirected graph in which each node corresponds to a decision variable, and each node and edge of the graph is associated with a reward function whose value depends only on the variables of the corresponding nodes. The goal is to construct a decision vector which maximizes the total reward. This decision problem encompasses a variety of models, including maximum-likelihood inference in graphical models (Markov Random Fields), combinatorial optimization on graphs, economic team theory and statistical physics. The network is endowed with a probabilistic structure in which costs are sampled from a distribution. Our aim is to identify sufficient conditions to guarantee average-case polynomiality of the underlying optimization problem. We construct a new decentralized algorithm called Cavity Expansion and establish its theoretical performance for a variety of models. Specifically, for certain classes of models we prove that our algorithm is able to find near optimal solutions with high probability in a decentralized way. The success of the algorithm is based on the network exhibiting a correlation decay (long-range independence) property. Our results have the following surprising implications in the area of average case complexity of algorithms. Finding the largest independent (stable) set of a graph is a well known NP-hard optimization problem for which no polynomial time approximation scheme is possible even for graphs with largest connectivity equal to three, unless P=NP. We show that the closely related maximum weighted independent set problem for the same class of graphs admits a PTAS when the weights are i.i.d. with the exponential distribution. Namely, randomization of the reward function turns an NP-hard problem into a tractable one.

Submitted 2 December, 2009; originally announced December 2009.

arXiv:0807.1277 [pdf, ps, other]

Randomized greedy algorithms for independent sets and matchings in regular graphs: Exact results and finite girth corrections

Authors: David Gamarnik, David Goldberg

Abstract: We derive new results for the performance of a simple greedy algorithm for finding large independent sets and matchings in constant degree regular graphs. We show that for $r$-regular graphs with $n$ nodes and girth at least $g$, the algorithm finds an independent set of expected cardinality $f(r)n - O\big(\frac{(r-1)^{\frac{g}{2}}}{\frac{g}{2}!} n\big)$, where $f(r)$ is a function which we explicitly compute. A similar result is established for matchings. Our results imply improved bounds for the size of the largest independent set in these graphs, and provide the first results of this type for matchings. As an implication we show that the greedy algorithm returns a nearly perfect matching when both the degree $r$ and girth $g$ are large. Furthermore, we show that the cardinality of independent sets and matchings produced by the greedy algorithm in arbitrary bounded degree graphs is concentrated around the mean. Finally, we analyze the performance of the greedy algorithm for the case of random i.i.d. weighted independent sets and matchings, and obtain a remarkably simple expression for the limiting expected values produced by the algorithm. In fact, all the other results are obtained as straightforward corollaries from the results for the weighted case.

Submitted 8 July, 2008; originally announced July 2008.

Comments: 24 pages

ACM Class: F.2.2; G.1.6; G.2.1; G.2.2; G.3
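For readers unfamiliar with the setting of the abstract above, here is a minimal Python sketch of one common randomized greedy procedure for independent sets: visit the nodes in uniformly random order and keep a node whenever none of its neighbours has been kept. The abstract does not spell out the selection rule, so this reading is an assumption, and the names `greedy_independent_set` and `cycle_graph` are hypothetical helpers introduced only for illustration.

```python
import random


def greedy_independent_set(adj, rng=None):
    """Return a maximal independent set of the graph `adj`.

    `adj` maps each node to the set of its neighbours. Nodes are visited
    in uniformly random order (an assumed selection rule); a node is kept
    only if it has not been blocked by an earlier choice.
    """
    rng = rng or random.Random()
    order = list(adj)
    rng.shuffle(order)
    chosen = set()
    blocked = set()
    for v in order:
        if v in blocked:
            continue
        chosen.add(v)
        # Block the chosen node and all of its neighbours.
        blocked.add(v)
        blocked.update(adj[v])
    return chosen


def cycle_graph(n):
    """Adjacency of the n-cycle, a 2-regular graph with girth n (for n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


if __name__ == "__main__":
    graph = cycle_graph(12)
    print(sorted(greedy_independent_set(graph, random.Random(1))))
```

Visiting nodes in a random order is equivalent to repeatedly picking a uniformly random surviving node and deleting it together with its neighbours, which is why a single shuffle suffices here.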

arXiv:0801.3113 [pdf, ps, other]

iBOA: The Incremental Bayesian Optimization Algorithm

Authors: Martin Pelikan, Kumara Sastry, David E. Goldberg

Abstract: This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation of distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably.

Submitted 20 January, 2008; originally announced January 2008.

Comments: Also available at the MEDAL web site, http://medal.cs.umsl.edu/

Report number: MEDAL Report No. 2008002

ACM Class: I.2.6; I.2.8; G.1.6

Journal ref: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, 455-462

arXiv:cs/0502057 [pdf, ps, other]

Decomposable Problems, Niching, and Scalability of Multiobjective Estimation of Distribution Algorithms

Authors: Kumara Sastry, Martin Pelikan, David E. Goldberg

Abstract: The paper analyzes the scalability of multiobjective estimation of distribution algorithms (MOEDAs) on a class of boundedly-difficult additively-separable multiobjective optimization problems. The paper illustrates that even if the linkage is correctly identified, massive multimodality of the search problems can easily overwhelm the nicher and lead to exponential scale-up. Facetwise models are subsequently used to propose a growth rate of the number of differing substructures between the two objectives that prevents the niching method from being overwhelmed and leads to polynomial scalability of MOEDAs.

Submitted 12 February, 2005; originally announced February 2005.

Comments: Submitted to Genetic and Evolutionary Computation Conference, GECCO-2005

Report number: IlliGAL Report No. 2005004

arXiv:cs/0502034 [pdf, ps, other]

Multiobjective hBOA, Clustering, and Scalability

Authors: Martin Pelikan, Kumara Sastry, David E. Goldberg

Abstract: This paper describes a scalable algorithm for solving multiobjective decomposable problems by combining the hierarchical Bayesian optimization algorithm (hBOA) with the nondominated sorting genetic algorithm (NSGA-II) and clustering in the objective space. It is first argued that for good scalability, clustering or some other form of niching in the objective space is necessary and the size of each niche should be approximately equal. Multiobjective hBOA (mohBOA) is then described that combines hBOA, NSGA-II and clustering in the objective space. The algorithm mohBOA differs from the multiobjective variants of BOA and hBOA proposed in the past by including clustering in the objective space and allocating an approximately equally sized portion of the population to each cluster. The algorithm mohBOA is shown to scale up well on a number of problems on which standard multiobjective evolutionary algorithms perform poorly.

Submitted 7 February, 2005; originally announced February 2005.

Comments: Also IlliGAL Report No. 2005005 (http://www-illigal.ge.uiuc.edu/). Submitted to GECCO-2005

Report number: IlliGAL Report No. 2005005

ACM Class: I.2.8; I.2.6; G.1.6; I.5.3

arXiv:cs/0502023 [pdf, ps, other]

Sub-structural Niching in Estimation of Distribution Algorithms

Authors: K. Sastry, H. A. Abbass, D. E. Goldberg, D. D. Johnson

Abstract: We propose a sub-structural niching method that fully exploits the problem decomposition capability of linkage-learning methods such as the estimation of distribution algorithms and concentrates on maintaining diversity at the sub-structural level. The proposed method consists of three key components: (1) problem decomposition and sub-structure identification, (2) sub-structure fitness estimation, and (3) sub-structural niche preservation. The sub-structural niching method is compared to restricted tournament selection (RTS), a niching method used in the hierarchical Bayesian optimization algorithm, with special emphasis on sustained preservation of multiple global solutions of a class of boundedly-difficult, additively-separable multimodal problems. The results show that sub-structural niching successfully maintains multiple global optima over a large number of generations and does so with a significantly smaller population than RTS. Additionally, the market share of each niche is much closer to the expected level in sub-structural niching than in RTS.

Submitted 3 February, 2005; originally announced February 2005.

Report number: IlliGAL Report No. 2005002
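The baseline named in the abstract above, restricted tournament selection (RTS), can be sketched in a few lines of Python. This follows the usual textbook description of RTS rather than anything stated in the abstract: an offspring competes with the most similar member of a small random window of the population and replaces it only if it is fitter. The function names, the use of Hamming distance on equal-length sequences, and the `window` parameter are illustrative assumptions.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def rts_insert(population, fitnesses, child, child_fitness, window, rng):
    """Try to insert `child` using restricted tournament selection.

    Draws `window` distinct members at random, finds the one closest to
    the child (Hamming distance, an assumed similarity measure), and
    replaces it in place if the child has strictly higher fitness.
    Returns the replaced index, or None if the child was rejected.
    """
    candidates = rng.sample(range(len(population)), window)
    nearest = min(candidates, key=lambda i: hamming(population[i], child))
    if child_fitness > fitnesses[nearest]:
        population[nearest] = child
        fitnesses[nearest] = child_fitness
        return nearest
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[0, 0, 0, 0], [1, 1, 1, 1]]
    fit = [1.0, 1.0]
    print(rts_insert(pop, fit, [1, 1, 1, 0], 2.0, window=2, rng=rng))
```

Because the child only ever displaces a similar individual, distant niches are left alone, which is the diversity-preserving property that niching methods of this kind rely on.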

arXiv:cs/0502022 [pdf, ps, other]

Sub-Structural Niching in Non-Stationary Environments

Authors: K. Sastry, H. A. Abbass, D. E. Goldberg

Abstract: Niching enables a genetic algorithm (GA) to maintain diversity in a population. It is particularly useful when the problem has multiple optima where the aim is to find all or as many as possible of these optima. When the fitness landscape of a problem changes over time, the problem is called a non-stationary, dynamic or time-variant problem. In these problems, niching can maintain useful solutions to respond quickly, reliably and accurately to a change in the environment. In this paper, we present a niching method that works on the problem substructures rather than the whole solution; it therefore has lower space complexity than previously known niching mechanisms. We show that the method responds accurately when environmental changes occur.

Submitted 3 February, 2005; originally announced February 2005.

Comments: Final version published in 2005 Australian Artificial Intelligence Conference, pp. 873-885

Report number: IlliGAL Report No. 2004035

arXiv:cs/0502021 [pdf, ps, other]

Oiling the Wheels of Change: The Role of Adaptive Automatic Problem Decomposition in Non-Stationary Environments

Authors: H. A. Abbass, K. Sastry, D. E. Goldberg

Abstract: Genetic algorithms (GAs) that solve hard problems quickly, reliably and accurately are called competent GAs. When the fitness landscape of a problem changes over time, the problem is called a non-stationary, dynamic or time-variant problem. This paper investigates the use of competent GAs for optimizing non-stationary optimization problems. More specifically, we use an information theoretic approach based on the minimum description length principle to adaptively identify regularities and substructures that can be exploited to respond quickly to changes in the environment. We also develop a special type of problem with bounded difficulty to test non-stationary optimization problems. The results provide new insights into non-stationary optimization problems and show that a search algorithm which automatically identifies and exploits possible decompositions is more robust and responds more quickly to changes than a simple genetic algorithm.

Submitted 3 February, 2005; originally announced February 2005.

arXiv:cs/0502020 [pdf, ps, other]

Population Sizing for Genetic Programming Based Upon Decision Making

Authors: K. Sastry, U. -M. O'Reilly, D. E. Goldberg

Abstract: This paper derives a population sizing relationship for genetic programming (GP). Following the population-sizing derivation for genetic algorithms in Goldberg, Deb, and Clark (1992), it considers building block decision making as a key facet. The analysis yields a GP-unique relationship because it has to account for bloat and for the fact that GP solutions often use a subsolution multiple times. The population-sizing relationship depends upon tree size, solution complexity, problem difficulty and building block expression probability. The relationship is used to analyze and empirically investigate population sizing for three model GP problems named ORDER, ON-OFF and LOUD. These problems exhibit bloat to differing extents and differ in whether their solutions require the use of a building block multiple times.

Submitted 3 February, 2005; originally announced February 2005.

Comments: Final version published in O'Reilly, U.-M., et al. (2004). Genetic Programming Theory and Practice II. Boston, MA: Kluwer Academic Publishers. 49-66

Report number: IlliGAL Report No. 2004028

arXiv:cs/0405065 [pdf, ps, other]

doi 10.1109/CEC.2004.1330930

Efficiency Enhancement of Genetic Algorithms via Building-Block-Wise Fitness Estimation

Authors: Kumara Sastry, Martin Pelikan, David E. Goldberg

Abstract: This paper studies fitness inheritance as an efficiency enhancement technique for a class of competent genetic algorithms called estimation of distribution algorithms. Probabilistic models of important sub-solutions are developed to estimate the fitness of a proportion of individuals in the population, thereby avoiding computationally expensive function evaluations. The effects of fitness inheritance on the convergence time and population sizing are modeled and the speed-up obtained through inheritance is predicted. The results show that a fitness-inheritance mechanism which utilizes information on building-block fitnesses provides significant efficiency enhancement. For additively separable problems, fitness inheritance reduces the number of function evaluations to about half and yields a speed-up of about 1.75 to 2.25.

Submitted 18 May, 2004; originally announced May 2004.

Comments: IEEE International Conference on Evolutionary Computation (CEC-2004)

Report number: IlliGAL Report No. 2004010

ACM Class: G.1.6; G.3; I.2.6; I.2.8

arXiv:cs/0405064 [pdf, ps, other]

Designing Competent Mutation Operators via Probabilistic Model Building of Neighborhoods

Authors: Kumara Sastry, David E. Goldberg

Abstract: This paper presents a competent selectomutative genetic algorithm (GA) that adapts linkage and solves hard problems quickly, reliably, and accurately. A probabilistic model building process is used to automatically identify key building blocks (BBs) of the search problem. The mutation operator uses the probabilistic model of linkage groups to find the best among competing building blocks. The competent selectomutative GA successfully solves additively separable problems of bounded difficulty, requiring only a subquadratic number of function evaluations. The results show that for additively separable problems the probabilistic model building BB-wise mutation scales as $O(2^k m^{1.5})$, and requires $O(k^{0.5}\log m)$ fewer function evaluations than its selectorecombinative counterpart, confirming theoretical results reported elsewhere (Sastry & Goldberg, 2004).

Submitted 18 May, 2004; originally announced May 2004.

Comments: Genetic and Evolutionary Computation Conference (GECCO-2004)

Report number: IlliGAL Report No. 2004006 ACM Class: G.1.6; G.3; I.2.6; I.2.8

arXiv:cs/0405063

Let's Get Ready to Rumble: Crossover Versus Mutation Head to Head

Authors: Kumara Sastry, David E. Goldberg

Abstract: This paper analyzes the relative advantages of crossover and mutation on a class of deterministic and stochastic additively separable problems. This study assumes that the recombination and mutation operators have knowledge of the building blocks (BBs) and effectively exchange or search among competing BBs. Facetwise models of convergence time and population sizing have been used to determine the scalability of each algorithm. The analysis shows that for additively separable deterministic problems, BB-wise mutation is more efficient than crossover, while crossover outperforms mutation on additively separable problems perturbed with additive Gaussian noise. The results show that the speed-up of using BB-wise mutation on deterministic problems is O(k^{0.5} log m), where k is the BB size and m is the number of BBs. Likewise, the speed-up of using crossover on stochastic problems with fixed noise variance is O(m k^{0.5} log m).

Submitted 18 May, 2004; originally announced May 2004.

Comments: Genetic and Evolutionary Computation Conference (GECCO-2004)

Report number: IlliGAL Report No. 2004005 ACM Class: G.1.6; G.3; I.2.6; I.2.8

arXiv:cs/0405062

Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms

Authors: Kumara Sastry, David E. Goldberg, Martin Pelikan

Abstract: This paper presents two different efficiency-enhancement techniques for probabilistic model building genetic algorithms. The first technique proposes the use of a mutation operator which performs local search in the sub-solution neighborhood identified through the probabilistic model. The second technique proposes building and using an internal probabilistic model of the fitness along with the probabilistic model of variable interactions. The fitness values of some offspring are estimated using the probabilistic model, thereby avoiding computationally expensive function evaluations. The scalability of the aforementioned techniques is analyzed using facetwise models for convergence time and population sizing. The speed-up obtained by each of the methods is predicted and verified with empirical results. The results show that for additively separable problems the competent mutation operator requires O(k^{0.5} log m) fewer function evaluations than its selectorecombinative counterpart, where k is the building-block size and m is the number of building blocks. The results also show that the use of an internal probabilistic fitness model reduces the required number of function evaluations to as low as 1–10% and yields a speed-up of 2–50.

Submitted 18 May, 2004; originally announced May 2004.

Comments: Optimization by Building and Using Probabilistic Models. Workshop at the 2004 Genetic and Evolutionary Computation Conference

Report number: IlliGAL Report No. 2004020 ACM Class: G.1.6; G.3; I.2.6; I.2.8

Showing 1–19 of 19 results for author: Goldberg, D