Evolutionary Computation (진 화 연 산) 장병탁 서울대 컴퓨터공학부 E-mail: [email protected] http://scai.snu.ac.kr./~btzhang/ Byoung-Tak Zhang School of Computer Science and Engineering Seoul National University This material is also available online at http://scai.snu.ac.kr/ Outline 1. Basic Concepts 2. Theoretical Backgrounds 3. Applications 4. Current Issues 5. References and URLs 2 1. Basic Concepts 3 Charles Darwin (1859) “Owing to this struggle for life, any variation, however slight and from whatever cause proceeding, if it be in any degree profitable to an individual of any species, in its infinitely complex relations to other organic beings and to external nature, will tend to the preservation of that individual, and will generally be inherited by its offspring.” 4 Evolutionary Algorithms A Computational Model Inspired by Natural Evolution and Genetics Proved Useful for Search, Machine Learning and Optimization Population-Based Search (vs. Point-Based Search) Probabilistic Search (vs. Deterministic Search) Collective Learning (vs. Individual Learning) Balance of Exploration (Global Search) and Exploitation (Local Search) 5 Biological Terminology Gene Functional entity that codes for a specific feature e.g. eye color Set of possible alleles Allele Value of a gene e.g. blue, green, brown Codes for a specific variation of the gene/feature Locus Position of a gene on the chromosome Genome Set of all genes that define a species The genome of a specific individual is called genotype The genome of a living organism is composed of several Chromosomes Population Set of competing genomes/individuals 6 Analogy to Evolutionary Biology Individual (Chromosome) = Possible Solution Population = A Collection of Possible Solutions Fitness = Goodness of Solutions Selection (Reproduction) = Survival of the Fittest Crossover = Recombination of Partial Solutions Mutation = Alteration of an Existing Solution 7 The Evolution Loop initialize population evaluate select mating partners (terminate) recombine select mutate evaluate 8 Basic Evolutionary Algorithm Generation of initial solutions (A priori knowledge, results of earlier run, random) Evaluation Generation of variants by mutation and crossover Selection Solution Sufficiently good? NO YES END 9 Procedure begin t <- 0; initialize P(t); evaluate P(t); while (not termination condition) do recombine P(t) to yield C(t); evaluate C(t); select P(t+1) from P(t) and C(t); t <- t+1; end end 10 General Structure of EAs crossover chromosomes encoding solutions 1100101010 1011101110 0011011001 1100110001 110010 1010 101110 1110 110010 1110 101110 1010 mutation 00110 1 1001 new population selection evaluation 1100101110 1011101010 0011001001 roulette wheel 00110 0 1001 solutions fitness computation 11 Population and Fitness 6 4 6 2 8 4 6 6 12 Selection, Crossover, and Mutation 6 Distinction 4 Mutation 6 2 Crossover 8 10 Reproduction 8 4 8 6 6 13 Simulated Evolution Population (chromosomes) Offspring New generation Parents Genetic operators Decoded strings Evaluation (fitness) Manipulation Mating Selection (mating pool) Reproduction 14 Selection Strategies Reproduce offspring in proportion to fitness fi. Ranking Selection ps t i f ai j 1 Select individuals according to rank(fi). a f a t Proportionate Selection t p s ai t j 1 , 1 i 0, i Tournament Selection Choose q individuals at random, the best of which survives. p s ai t 1 q i 1 q i q Other Ways 15 Roulette Wheel Selection Selection is a stochastic process Probability of reproduction pi = fi / S k fk intermediate parent population: 01011 11010 10001 10001 16 Genetic Operators for Bitstrings • Reproduction: make copies of chromosome (the fitter the chromosome, the more copies) 10000100 10000100 10000100 • Crossover: exchange subparts of two chromosomes 100|00100 10011111 111|11111 11100100 • Mutation: randomly flip some bits 00000100 00000000 17 Mutation For a binary string, just randomly “flip” a bit For a more complex structure, randomly select a site, delete the structure associated with this site, and randomly create a new sub-structure Some EAs just use mutation (no crossover) Normally, however, mutation is used to search in the “local search space”, by allowing small changes in the genotype (and therefore hopefully in the phenotype) 18 Recombination (Crossover) Crossover is used to swap (fit) parts of individuals, in a similar way to sexual reproduction Parents are selected based on fitness Crossover sites selected (randomly, although other mechanisms exist), with some prob. Parts of the parents are exchanged to produce children 19 Crossover One-point crossover parent A parent B 11010 10001 offspring A offspring B 1101 1 offspring A offspring B 11 00 0 1000 0 Two-point crossover parent A parent B 11010 10001 10 01 1 20 2. Theoretical Backgrounds 21 Major Evolutionary Algorithms Genetic Programming Classifier Systems Evolution Strategies Genetic Algorithms Evolutionary Programming Hybrids: BGA • Genetic representation of candidate solutions • Genetic operators • Selection scheme • Problem domain 22 Variants of Evolutionary Algorithms Genetic Algorithm (Holland et al., 1960’s) Bitstrings, mainly crossover, proportionate selection Evolution Strategy (Rechenberg et al., 1960’s) Real values, mainly mutation, truncation selection Evolutionary Programming (Fogel et al., 1960’s) FSMs, mutation only, tournament selection Genetic Programming (Koza, 1990) Trees, mainly crossover, proportionate selection Hybrids: BGA (Muehlenbein et al., 1993) BGP (Zhang et al., 1995) and others. 23 Evolution Strategy (ES) Problem of real-valued optimization Find extremum (minimum) of function F(X): Rn ->R Operate directly on real-valued vector X Generate new solutions through Gaussian mutation of all components Selection mechanism for determining new parents 24 ES: Representation One individual: a x1 , , xn , 1 , , n , 1 , , n I x The three parts of an individual: x : Object variables : Standard deviations : Rotation angles Fitness f (x ) Variances Covariances 25 ES: Operator - Recombination rr x r r , where rx, r , r {-, d, D, i, I, g, G}, e.g. rdII x S ,i x S ,i x S ,i xi xS ,i x S ,i x S ,i x S ,i no recombinat ion r or xT ,i discrete rd or xTi ,i panmictic discrete rD ( xT ,i xS ,i ) / 2 intermedia te ri ( xTi ,i xS ,i ) / 2 panmictic intermedia te rI ( xT ,i xS ,i ) generalize d intermedia te rg i ( xTi ,i xS ,i ) panmictic generalize d intermedia te rG 26 ES: Operator - Mutation m{,’,} : I I is an asexual operator. n = n, n = n(n-1)/2 i i exp( N (0,1) N i (0,1)) 2 n j j N j (0,1) x x N (0, C( , )) 1 1 2n 0.0873 1 < n < n, n = 0 i i exp( N (0,1) N i (0,1)) xi xi i N i (0,1) n = 1, n = 0 exp( 0 N (0,1)) xi xi N i (0,1) 0 1/ n 27 ES: Illustration of Mutation Hyperellipsoids Line of equal probability density to place an offspring 28 ES: Evolution Strategy vs. Genetic Algorithm Create random initial population Insert into population Evaluate population Create random initial population Insert into population Evaluate population Select individuals for variation Vary Vary Selection 29 Evolutionary Programming (EP) Original form (L. J. Fogel) Uniform random mutations Discrete alphabets ( ) selection Extended Evolutionary Programming (D. B. Fogel) Continuous parameter optimization Similarities with ES • Normally distributed mutation • Self-adaptation of mutation parameters 30 EP: Representation Constraints for domain and variances Initialization only-operators does not survey • Search space is principally unconstrained. Individual a ( x, ) I Ax As , Ax R n and As (std EP) or As R (meta EP) n 31 EP: Operator - Recombination No recombination Gaussian mutation does better (Fogel and Atmar). • Not all situations Evolutionary biology • The role of crossover is often overemphasized. Mutation-enhancing evolutionary optimization Crossover-segregating defects The main point of view from researchers in the field of Genetic Algorithms 32 EP: Operator - Mutation General form (std. EP) xi xi i ( x ) i N i (0,1) Exogenous parameters i , i must be tuned for a particular task Usually xi xi ( x ) Ni (0,1) Problem • If global minimum’s fitness value is not zero, exact approachment is not possible • If fitness values are very large, the search is almost random walk • If user does not know about approximate position of global minimum, parameter tuning is not possible 33 EP: Differences from ES Procedures for self-adaptation Deterministic vs. Probabilistic selection ( , ) -ES vs. ( ) -EP Level of abstraction: Individual vs. Species 34 Genetic Programming Applies principles of natural selection to computer search and optimization problems - has advantages over other procedures for “badly behaved” solution spaces [Koza, 1992] Genetic programming uses variable-size tree-representations rather than fixed-length strings of binary values. Program tree = S-expression = LISP parse tree Tree = Functions (Nonterminals) + Terminals 35 GP: Representation S-expression: (+ 1 2 (IF (> TIME 10) 3 4)) Terminals = {1, 2, 3, 4, 10, TIME} Functions = {+, >, IF} + 1 TIME 2 IF > 3 4 10 36 GP: Operator - Crossover + + b a + b b a a b + + a b a + b b b a 37 GP: Operator - Mutation + + / b a a + / b b a b b a 38 Breeder GP (BGP) [Zhang and Muehlenbein, 1993, 1995] ES (real-vector) GA (bitstring) GP (tree) Muehlenbein et al. (1993) Breeder GA (BGA) (real-vector + bitstring) Zhang et al. (1993) Breeder GP (BGP) (tree + real-vector + bitstring) 39 GAs: Theory of Simple GA Assumptions Bitstrings of fixed size Proportionate selection Definitions Schema H: A set of substrings (e.g., H = 1**0) Order o: number of fixed positions (FP) (e.g., o(H) = 2) Defining length d: distance between leftmost FP and rightmost FP (e.g., d(H) = 3) 40 GAs: Schema Theorem (Holland et al. 1975) f (H , t) d (H ) o( H ) m( H , t 1) m( H , t ) 1 pc 1 pm f (t ) n 1 m( H , t ) pc , pm Number of members of H Probability of crossover and mutation, respectively Interpretation: Fit, short, low-order schemata (or building blocks) exponentially grow. 41 ES: Theory Convergence velocity of (+, )-ES 1 v v 1 k p p ( 1 p ) dk k j k k j k k j k , ) k min k v 1 v 1 where k min 0 for a ( ) - ES and k min for a ( , ) - ES ( 2 ~ ~ 1 1 1 1 1 1 exp erf 2 exp 1 1 2 1 2 1 2 2 1 ~1 1 1 1 exp 2 ~ 2 2 2 2 2 exp 1 erf 2 2 1 2 2 8 8 8 2 2 42 EP: Theory (1) Analysis of std. EP(Fogel) Aims at giving a proof of convergence for resulting algorithm EP(1,0, q, ) • Mutation: xi xi ( x ) Ni (0,1) Analysis of a special case EP(1,0,q,1) • Identical to a (1+1)-ES having Objective function f (x ) • Simplified sphere model n ~ 2 2 f 2 ( xi ) xi r i 1 43 EP: Theory (2) Combination with the optimal SD 2* 1.224 r / n ~ 2n f 2 ( xi ) r 1.224 * When dimension is increased, the performance is worse than an algorithm that is able to retain the opt. SD 2 1 n 2 2n 2n 2 2 1 erf exp 4r 2 r 8 8 r 2 The convergence rate of a (1+1)-EP by ~2 2 ( 2 r ) ~ 2 n2 exp 2 8 1 n n r 1 erf 4 8 44 Breeder GP: Motivation for GP Theory In GP, parse trees of Lisp-like programs are used as chromosomes. Performance of programs are evaluated by training error and the program size tends to grow as training error decreases. Eventual goal of learning is to get small generalization error and the generalization error tends to increase as program size grows. How to control the program growth? 45 Breeder GP: MDL-Based Fitness Function Fi ( g ) F ( Ai | D) E ( D | Ai ) C ( Ai ) g g E ( D | Ai ) g C ( Ai ) , g Training error of neural trees A for data set D Structural complexity of neural trees A Relative importance of each term 46 Breeder GP: Adaptive Occam Method (Zhang et al., 1995) Fi (t ) Ei (t ) (t )Ci (t ) 2 Ebest (t 1) N if Ebest (t 1) Cbest (t ) (t ) 1 2 N otherwise Ebest (t 1)Cbest (t ) Ebest (t 1) Cbest (t ) Desired performance level in error Training error of best progr. at gen t-1 Complexity of best progr. at gen. t 47 Bayesian Evolutionary Computation (1/2) The best individual is defined as the most probable model of data D given the priori knowledge P( A | D) P( D | A) P( A) P( D) P( D | A) P( A) P( D | A) P( A)dA P( D | A) P( A) P( D | A) P( A) AA ( g ) The objective of evolutionary computation is defined to find the model A* that maximizes the posterior probability A arg max P( A | D) * A Bayesian theorem is used to estimate P(A|D) from a population A(g) of individuals at each generation. 48 Bayesian Evolutionary Computation (2/2) Bayesian process: The BEAs attempt to explicitly estimate the posterior distribution of the individuals from their prior probability and likelihood, and then sample offspring from the distribution. [Zhang, 99] 49 Canonical BEA (Initialize) Generate A0 { A10 ,, AM0 } from the prior distribution P0(A). Set generation count t 0. (P-step) Estimate posterior distribution Pt(A|D) of the individuals in At. (V-step) Generate L variations At { A1t 1 ,, AMt 1} by sampling from Pt(A|D). (S-step) Select M individuals from A´ into A {A1,, AL } based on Pt(A´i |D). Set the best individual . (Loop) If the termination condition is met, then stop. Otherwise, set t t+1 and go to (P-step). 50 Basic Idea behind BEA 51 Evolving Neural Trees with BEA Posterior probability of a neural tree A P ( A | D ) P ( D | A) P ( A) P ( D | w , k ) p ( w , k ) P( D | w, k ) P(w | k ) P(k ) N 2 ( yc f ( w ,k ) ( x c )) N 1 c 1 exp 2 2 2 k 1 2 wj k 1 1 j 1 exp 2 2 k 3 exp( ) ( k 3)! 52 Features of EAs Evolutionary techniques are good for problems that are ill-defined or difficult Many different forms of representation Many different types of EA Leads to many different types of crossover, mutation, etc. Some problems with convergence, efficiency However, they are able to solve a diverse range of problems 53 Advantages of EAs Efficient investigation of large search spaces Quickly investigate a problem with a large number of possible solutions Problem independence Can be applied to many different problems Best suited to difficult combinatorial problems 54 Disadvantages of EAs No guarantee for finding optimal solutions with a finite amount of time: True for all global optimization methods. No complete theoretical basis (yes). But much process is being made. Parameter tuning is largely based on trial and error (genetic algorithms); solution: Self-adaptation (evolution strategies). Often computationally expensive: Parallelism. 55 3. Applications 56 Application Fields (1) Experimental optimization & optimization with subjective evaluation Coffee recipes; general food recipes Biochemical fermentation processes Wind tunnel experiments Two-phase nozzle optimization experiments Technical optimization Design & Production Logistics Control of dynamic processes 57 Application Fields (2) Structure optimization Structure & parameters of plants Connection structure & weights of neural nets Number of thicknesses of layers in multilayer structures Data Mining Clustering (number & centers of clusters) Fitting models to data Time series prediction 58 Application Fields (3) Path Planning Traveling Salesman Problem Robot Control Evolutionary Robotics Evolvable Hardware 59 Application Example 1 Hot Water Flashing Nozzle (1) Hans-Paul Schwefel performed the original experiments Start Hot water entering Steam and droplet at exit At throat: Mach 1 and onset of flashing 60 Application Example 1 Hot Water Flashing Nozzle (2) 61 Application Example 2 Minimal Weight Trust Layout Load Start 922 kp (LP minimum) Optimum 738 kp 62 Application Example 3 Concrete Shell Roof under own and outer load (snow and wind) Spherical shape Optimal shape Height 1.34m Half span 5.00m max m min Orthogonal bending strength Savings : 36% shell thickness 27% armation 63 Application Example 4 Dipole Magnet Structure (1) y y-Range y y Magnet P1 Magnet Magnet N S N S N S n x Field region of interest x x 64 Application Example 4 Dipole Magnet Structure (2) Analysis of the magnetic field by Finite Element Analysis (FEM) Minimize sum of squared deviations from the ideal Individuals: Vectors of positions (y1, …, yn) Middle: 9.8% better than upper graphic; bottom: 2.7% better 65 Application Example 5 Optical Multilayers (1) Desired Substrate (Glass) Wavelength …….. Current Reflection Filter layers; - Thicknesses - Materials Goal: Find a filter structure such that the real reflection behavior matches the desired one as close as possible. 66 Application Example 5 Optical Multilayers (2) Problem parameters; Thickness d d1 ,, d n of layers Layer materials 1 ,,n (integer values) Number of layers n. Mixed-integer, variable-dimensional problem. 67 Application Example 5 Optical Multilayers (3) Objective function: 2 ~ f d , R d , , R d R d , , d : Reflection of the actual filter for wavelength Calculation according to matrix method. ~ R : Desired reflection value 68 Application Example 5 Optical Multilayers (4) Example topology: Only layer thicknesses vary; n=2 69 Application Example 5 Optical Multilayers (5) Example structure: 70 Application Example 5 Optical Multilayers (6) Parallel evolutionary algorithm Per node: EA for mixed-integer representation Isolation and migration of best individuals Mutation of discrete variables: Fixed pm per population 71 Application Example 6 Circuit Design (1) Difficulty of automated circuit design: A vary hard problem Exponential in the number of components More than 10300 circuits with a mere 20 components An important problem Too few analog designers There is an “Egg-shell” of analog circuitry around almost all digital circuits Analog circuits must be redesigned with each new generation of process technology No existing automated techniques In contrast with digital Existing analog techniques do only sizing of components, but do not create the topology 72 Application Example 6 Circuit Design (2) Development of a circuit with Genetic Programming In Out Developing Circuit 73 Application Example 6 Circuit Design (3) Each function in the circuit-constructing tree acts on a part of the circuit and changes it in some way (e.g. creates a capacitor, creates a parallel structure, adds a connection to ground, etc) A “writing head” points from each function to the part of the circuit that the function will act on. Each function inherits writing heads from its parent in the tree 74 Application Example 6 Circuit Design (4) Example of circuit – constructing program tree (LIST (C (-0.963 (- (- -0.875 – 0.113) 0.880)) (series (flip end) (series (flip end) (L –0.277 end) end) (L (- -0.640 0.749) (L –0.123 end)))) (flip (nop (L –0.657 end))))) LIST C FLIP -0.963 - -0.875 SERIES FLIP -0.880 -0.113 NOP SERIES END FLIP END L -0.277 L END END -0.658 L - L -0.749 -0.658 -0.658 END END 75 Application Example 7 Neural Network Design (1) Introduction: EC for NNs Preprocessing of Training Data Feature selection Training set optimization Training of Network Weights Non-gradient search Optimization of Network Architecture Topology adaptation Pruning unnecessary connections/units 76 Application Example 7 Neural Network Design (2) Encoding Schemes for NNs Bit-strings Properties of network structure are encoded as bitstrings. Rules Network configuration is specified by a graphgeneration grammar. Trees Network is represented as “neural trees”. 77 Application Example 7 Neural Network Design (3) Neural Tree Models Neural trees Tree-structured neural networks Nonterminal nodes: Neural units F {, } Terminal nodes: Inputs T {x1 , x2 ,, xn } Links: connection weights wij from j to i Layer of node i: path length of the longest path to a terminal node of the substrees of i. Type of Units neti wij y j Sigma unit: the sum of weighted inputs j wij y j Pi unit: the product of weighted inputs neti j Output of a neuron: sigmoid transfer function yi f (neti ) 1 1 e neti 78 Application Example 7 Neural Network Design (4) Tree-Structured Model S1 W11 W12 X2 W22 X4 W31 W32 S3 X1 W41 X2 W14 S4 P1 S2 W21 W13 S5 W33 W51 W52 X3 P2 X4 W42 W61 W62 X1 X3 X2 W63 X5 W71 X4 W72 S6 W81 W82 X1 X3 79 Application Example 7 Neural Network Design (5) Evolving Neural Networks by BEAs Generate M Trees Prior Distribution Evaluate Fitness of Trees Posterior Distribution Yes Termination Condition? No STOP Create L New Trees Model Variation (Variation Operators) Select Fitter Trees Model Selection 80 Application Example 7 Neural Network Design (6) Structural Adaptation by Subtree Crossover Neuron type, topology, size and shape of networks are adapted by crossover. Neuron types and receptive fields are adapted by mutation. Connection weights are adapted by an EC-based stochastic hill-climbing. 81 Application Example 8 Fuzzy System Design (1) Fuzzy system comprises Fuzzy membership functions (MF) Rules Task is to tailor the MF and rules to get best performance Every change to MF affects the rules Every change to the rules affects the MF 82 Application Example 8 Fuzzy System Design (2) Solution is to design MF and rules simultaneously Encode in chromosomes Aarameters of the MF Associations and Certainty Factors in rules Fitness is measured by performance of the Fuzzy System 83 Application Example 8 Fuzzy System Design (3) Evolutionary Computation and Fuzzy Systems Fuzzy Sets (Zadeh): Points have Memberships FUZZY MEMBERSHIP FOR “COOL” 1.0 Membership 0.0 0o Temperature (C) 2.5o Evolutionary Computation can be Used to Optimize Fuzzy Control Systems Evolve Membership Functions/Rules 84 Application Example 8 Fuzzy System Design (4) GA for Fuzzy Control of Cart F m v x(v=x) NM NS ZE PS PM x NM NS ZE NM PM PM PM NS PM PM PM ZE PM PM PS PM PM PM Cart Centering and Fuzzy Partitions PS PM NM NM NS NM NM NM NM NM Fuzzy Control Strategy after 100 Gen. NM=Negative Medium, NS=Negative Small, ZE=Zero, PS=Positive Small, PM=Positive Medium, _=No Entry Defuzzification is Just Weighed Centroid Mutation Up/Down by One. Two-Point Xover on String Representing the 5x5 Metrix 85 Application Example 9 Data Mining (1) Data Mining Task Types of problem to be solved: Classification Clustering Dependence Modeling etc., etc. 86 Application Example 9 Data Mining (2) Basic Ideas of GAs in Data Mining Candidate rules are represented as individuals of a population Rule quality is computed by a fitness function Using task-specific knowledge 87 Application Example 9 Data Mining (3) Classification with Genetic Algorithms Each individual represents a rule set • i.e. an independent candidate solution Each individual represents a single rule A set of individuals (or entire population) represents a candidate solution (rule set) 88 Application Example 9 Data Mining (4) Individual Representation In GABIL, an individual is a rule set, encoded as a bit string [DeJong et al. 93] It uses k bits for the k values of a categorical attribute If all k bits of an attribute are set to 1 the attribute is not used by the rule Goal attribute: Buy furniture Marital_status: Single/Married/Divorced House:Own/Rented/University Marital_status House Buy? The string 011 100 1 Represents the rule IF(Marital_status=M or D) and (House = 0) THEN (Buy furniture=y) 89 Application Example 9 Data Mining (5) An individual is a variable-length string representing a set of fixed-length rules rule 1 011 100 1 rule 2 101 110 0 Mutation: traditional bit inversion Crossover: corresponding crossover points in the two parents must semantically match 90 Application Example 10 Information Retrieval (1) Preprocessing and Indexing Classification System Text Data Text Classification Information Extraction Information Filtering DB Template Filling & Information Extraction System Information Filtering System DB Record Location user profile Date question filtered data answer feedback DB 91 Application Example 10 Information Retrieval (2) Evolutionary Learning: Applications in IR - Web-Document Retrieval [Kim & Zhang, 2000] Link Information, HTML Tag Information Retrieval <TITLE> <H> <B> … <A> ww11 ww22 ww33 … wn w1 w2 w3 … … wwnn chromosomes Fitness ww11 ww22 ww33 … wn w1 w2 w3 … … wwnn 92 Application Example 10 Information Retrieval (3) Evolutionary Learning: Applications in IR – Tag Weighting Crossover Mutation chromosome X x1 x2 x3 chromosome Y … xn y1 y2 y3 … yn chromosome X x1 x2 x3 change value w.p. Pm zi = (xi + yi ) / 2 w.p. Pc z1 z2 z3 … zn chromosome Z (offspring) … xn x1 x2 x3 … xn chromosome X’ Truncation selection 93 Application Example 10 Information Retrieval (4) Evolutionary Learning : Applications in IR - Experimental Results Datasets TREC-8 Web Track Data 2GB, 247491 web documents (WT2g) No. of training topics: 10, No. of test topics: 10 Results 94 Application Example 11 Time Series Prediction (1) [Zhang & Joung, 2000] Autonomous Model Discovery Raw Time Series Preprocessing Evolutionary Neural Trees (ENTs) Prediction Combining Neural Trees Combine Neural Trees Outputs 95 Application Example 11 Time Series Prediction (2) Experimental Setup for the far-infrared NH3 Laser Design Paramenters population size max generation crossover rate mutation rate no. of iterations training set size test set size size of committee pool no. of committee members Values Used 200 50 0.95 0.1 100 1000 1000 10 3 96 Application Example 11 Time Series Prediction (3) Experimental Result 1 0.9 0.8 0.7 x(t) 0.6 Target Function 0.5 0.4 0.3 0.2 0.1 0 1 201 401 601 801 1001 t 1201 1401 1601 1801 2001 1 0.9 0.8 Computed Approximation 0.7 0.6 x(t) 0.5 0.4 0.3 0.2 0.1 0 0 200 400 600 800 1000 1200 1400 1600 1800 2000 t 97 Application Example 12 Traveling Salesman Problem (1) This is a minimization problem Given n cities, what is the shortest route to each city, visiting each city exactly once Want to minimize total distance traveled Also must obey the “Visit Once” constraint 98 Application Example 12 Traveling Salesman Problem (2) Encoding represents the order of cities to visit Candidates scored by total distance traveled 4 3 1 0 6 2 5 99 Application Example 12 Traveling Salesman Problem (3) Traveling Salesman Problem (TSP) combinatorial optimization problems a salesman seeks the shortest tour through n cities NP-complete problem 100 Application Example 12 Traveling Salesman Problem (4) Simple Example with the TSP The “house-call problem” • Problem: Doctor must visit patients once and only once and return home in the shortest possible path Difficulty: The number of possible routes increases faster than exponentially • 10 patients = 191,440 possible routes • 22 patients = 10,000,000,000,000,000,000,000 101 Application Example 12 Traveling Salesman Problem (5) Representation Permutation Representation • cities are listed in the order in which they are visited [3 2 5 4 7 1 6 9 0 8] :3-2-5-4-7-6-1-9-0-8 • path presentation, order representation • lead to illegal tour if the traditional one-point crossover operator is used Random Keys Representation • encodes a solution with random numbers from (0, 1) [0.23 0.82 0.45 0.74 0.87 0.11 0.56 0.69 0.78] position i in the list represent city I random number in position I determines the visiting order of city I in a TSP tour sort random keys in ascending order to get tour 6-3-7-8-4-9-2-5 eliminate the infeasibility of the offspring 102 Application Example 13 Cooperating Robots (1) What is Evolutionary Robotics? An attempt to create robots which evolve using various evolutionary computational methods Evolve behaviors or competence modules implemented in various manners: several languages, relay, neuro chips, FPGA’s, etc. “Intelligence is emergent” Presently, limited to mostly evolution of robot’s control software. However, some attempts to evolve hardware began. GA and its variants used. Most current attempts center around {population size=50 ~ 500, generations = 50 ~ 500} 103 Application Example 13 Cooperating Robots (2) Industrial Robots Autonomous Mobile Robots OPEN !! CLOSED 104 Application Example 13 Cooperating Robots (3) Cooperating Autonomous Robots 105 Application Example 13 Cooperating Robots (4) Why Build Cooperating Robots? Increased scope for missions inherently distributed in: • Space • Time • Functionality Increased reliability, robustness (through redundancy) Decreased task completion time (through parallelism) Decreased cost (through simpler individual robot design) 106 Application Example 13 Cooperating Robots (5) Cooperating Autonomous Robots: Application domain Mining Construction Planetary exploration Automated manufacturing Search and rescue missions Cleanup of hazardous waste Industrial/household maintenance Nuclear power plant decommissioning Security, surveillance, and reconnaissance 107 Application Example 14 Co-evolving Soccer Softbots (1) Co-evolving Soccer Softbots With Genetic Programming At RoboCup there are two "leagues": the "real" robot league and the "virtual" softbot league How do you do this with GP? GP breeding strategies: homogeneous and heterogeneous Decision of the basic set of function with which to evolve players Creation of an evaluation environment for our GP individuals 108 Application Example 14 Co-evolving Soccer Softbots (2) Initial Random Population 109 Application Example 14 Co-evolving Soccer Softbots (3) Kiddie Soccer 110 Application Example 14 Co-evolving Soccer Softbots (4) Learning to Block the Goal 111 Application Example 14 Co-evolving Soccer Softbots (5) Becoming Territorial 112 Application Example 15 Evolvable Hardware (1) EVOLVABLE HARDWARE HARDWARE IMPLEMENTATION OF EVOLVABLE SOFTWARE (i.e. GP) Reconfigurable logic is too slow to make it worthwhile 113 Application Example 15 Evolvable Hardware (2) FPGAs Bad, because they are designed for conventional electronic design Good, because they can be abused and allow the exploitation of the physics WHAT WE NEED FPMAs Field Programmable Matter Arrays 114 Application Example 15 Evolvable Hardware (3) What is a FPMA? wires Chemical substrate KEY REQUIREMENT Removing the voltage must cause the material to relax to its former state A single piece of material Region to which voltage may be applied Can we, by evolving the voltages configure the material to carry out a useful function? 115 Application Example 15 Evolvable Hardware (4) XC6216 FPGA 116 Application Example 15 Evolvable Hardware (5) The Evolvable Hardware Robot Controller The logic functions are evolved, and implemented in a RAM chip A allows a signal to be either synchronised to a clock of evolved frequency, or passed straight through asynchronously, all under genetic control. All evaluations performed using the REAL HARDWARE 117 Application Example 15 Evolvable Hardware (6) 118 4. Current Issues 119 Innovative Techniques for EC Effective Operators Novel Representation Exploration/Exploitation Population Sizing Niching Methods Dynamic Fitness Evaluation Multi-objective Optimization Co-evolution Self-Adaptation EA/NN/Fuzzy Hybrids Distribution Estimation Algorithms Parallel Evolutionary Algorithms Molecular Evolutionary Computation 120 1000-Pentium Beowulf-Style Cluster Computer for Parallel GP 121 Molecular Evolutionary Computing 011001101010001 ATGCTCGAAGCT 122 Applications of EAs Optimization Machine learning Data mining Intelligent Agents Bioinformatics Engineering Design Telecommunications Evolvable Hardware Evolutionary Robotics 123 5. References and URLs 124 Journals & Conferences Journals: Evolutionary Computation (MIT Press) Trans. on Evolutionary Computation (IEEE) Genetic Programming & Evolvable Hardware (Kluwer) Evolutionary Optimization Conferences: Congress on Evolutionary Computation (CEC) Genetic and Evolutionary Computation Conference (GECCO) Parallel Problem Solving from Nature (PPSN) 125 WWW Resources • Hitch-Hiker’s Guide to Evolutionary Computation • http://alife.santafe.edu/~joke/encore/www FAQ for comp.ai.genetic • Genetic Algorithms Archive • http://www.aic.nrl.navy.mil/galist Repository for GA related information, conferences, etc. • EVONET European Network of Excellence on Evolutionary Comp. : http://www.dcs.napier.ac.uk/evonet • Genetic Programming Notebook • http://www.geneticprogramming.com software, people, papers, tutorial, FAQs 126 Books • Bäck, Th., Evolutionary Algorithms in Theory and Practice, Oxford University Press, New York, 1996. • Mitchell, M., An Introduction to Genetic Algorithms, MIT Press, Cambridge, MA, 1996. • Fogel, D., Evolutionary Computation: Toward a New Philosophy of Machine Intelligence, IEEE Press, NJ, 1995. • Schwefel, H-P., Evolution and Optimum Seeking, Wiley, New York, 1995. • Koza, J., Genetic Programming, MIT Press, Cambridge, MA, 1992. • Goldberg, D., Genetic Algorithms in Search and Optimization, AddisonWesley, Reading, MA, 1989. • Holland, J., Adaptation in Natural and Artificial Systems, Univeristy of Michigan Press, Ann Arbor, 1975. • Rechenberg, I., Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Frommann-Holzboog Verlag, Stuttgart, 1973. • Fogel, L.J., Owens, A.J, and Walsh, M.J., Artificial Intelligence through Simulated Evolution, John Wiley, NY, 1966. 127 For more information: http://scai.snu.ac.kr/

Descargar
# Machine Learning for Information Retrieval