Learning Fuzzy Controllers in Mobile Robotics with Embedded Preprocessing
I. Rodríguez-Fdez∗, M. Mucientes, A. Bugarín
Centro de Investigación en Tecnoloxías da Información (CITIUS), Universidade de Santiago de Compostela, SPAIN
Abstract
The automatic design of controllers for mobile robots usually requires two stages. In the first stage, sensorial data are preprocessed or transformed into high-level and meaningful values of variables, which are usually defined from expert knowledge. In the second stage, a machine learning technique is applied to obtain a controller that maps these high-level variables to the control commands that are actually sent to the robot. This paper describes an algorithm that is able to embed the preprocessing stage into the learning stage in order to get controllers directly starting from sensorial raw data with no expert knowledge involved. Due to the high dimensionality of the sensorial data, this approach uses Quantified Fuzzy Rules (QFRs), which are able to transform low-level input variables into high-level input variables, reducing the dimensionality through summarization. The proposed learning algorithm, called Iterative Quantified Fuzzy Rule Learning (IQFRL), is based on genetic programming. IQFRL is able to learn rules with different structures, and can manage linguistic variables with multiple granularities. The algorithm has been tested with the implementation of the wall-following behavior both in several realistic simulated environments of different complexity and on a Pioneer 3-AT robot in two real environments. Results have been compared with several well-known learning algorithms combined with different data preprocessing techniques, showing that IQFRL exhibits a better and statistically significant performance. Moreover, three real-world applications for which IQFRL plays a central role are also presented: path and object tracking with static and moving obstacles avoidance.

Keywords: mobile robotics, Quantified Fuzzy Rules, Iterative Rule Learning, Genetic Fuzzy System
1. Introduction
The control architecture of mobile robots usually includes a number of behaviors that are implemented as controllers, which are able to solve specific tasks such as motion planning, following a moving object, wall-following, or avoiding collisions in real time. The outputs of these controllers at each time point (control commands) depend on both the internal state of the robot and the environment in which it evolves. The robot sensors (e.g. laser range finders, sonars, cameras) are used in order to obtain the augmented state of the robot (internal state and environment). When the robot operates in real environments, both the data obtained by these sensors and the internal state of the robot present uncertainty or noise. Therefore, mechanisms that manage them properly are necessary. The use of fuzzy rules is convenient to cope with this uncertainty, since it combines the interpretability and expressiveness of rules with the ability of fuzzy logic to represent uncertainty.

∗ Corresponding author. Tel.: +34 881816392.
Email addresses: [email protected] (I. Rodríguez-Fdez), [email protected] (M. Mucientes), [email protected] (A. Bugarín)
Preprint submitted to Applied Soft Computing, October 4, 2018

The first step for designing controllers for mobile robots consists of the preprocessing of the raw sensor data: the low-level input variables obtained by the sensors are transformed into high-level variables that are significant for the behavior to be learned. Usually, expert knowledge is used for the definition of these high-level variables and the mapping from the sensorial data. After this preprocessing stage, machine learning algorithms can be used to automatically obtain the mapping from the high-level input variables to the robot control commands. This paper describes an algorithm that is able to perform the preprocessing stage embedded in the learning stage, thus avoiding the use of expert knowledge. Therefore, the mapping between low-level and high-level input variables is done automatically during the learning phase of the controller.

The data provided by the sensors is of high dimensionality. For example, a robot equipped with two laser range finders can generate over 720 low-level variables. However, in mobile robotics it is more common to work with sets or groupings of these variables (e.g. "frontal sector") that are much more significant and relevant for the behavior. As a result, it is necessary to use a model that is capable of grouping low-level variables, thus reducing the dimensionality of the problem and providing meaningful descriptions. The model should provide propositions that are able to summarize the data with expressions like "part of the distances in the frontal sector are high". This kind of expression can model the underlying knowledge better than just using average, maximum or minimum values of sets of low-level variables. Moreover, these expressions also include the definition of the set of low-level variables to be used. Since these propositions involve fuzzy quantifiers (e.g. "part"), they are called Quantified Fuzzy Propositions (QFPs) [1]. QFPs provide a formal model that is capable of modeling the knowledge involved in this grouping task.

Evolutionary algorithms have some characteristics that make them suitable for learning fuzzy rules. The well-known combination of evolutionary algorithms and fuzzy logic (genetic fuzzy systems) is one of the approaches that aims to manage the balance between accuracy and interpretability of the rules [2, 3]. As pointed out before, fuzzy rules can be composed of both conventional propositions and QFPs (therefore, they will be referred to as QFRs). Furthermore, the transformation from low-level to high-level variables using QFPs produces a variable number of propositions in the antecedent of the rules. Therefore, genetic programming, where the structure of individuals is a tree of variable size derived from a context-free grammar, is here the most appropriate choice.

This paper describes an algorithm that is able to learn QFRs of variable structure for the design of controllers with embedded preprocessing in mobile robotics. This proposal, called Iterative Quantified Fuzzy Rule Learning (IQFRL), is based on the Iterative Rule Learning (IRL) approach and uses linguistic labels defined with unconstrained multiple granularity, i.e. without limiting the granularity levels. The proposal has been designed to solve control (regression) problems in mobile robotics having as input variables the internal state of the robot and the sensors data. Expert knowledge is only used to generate the training data for each of the situations of the task to be learned and, also, to define the context-free grammar that specifies the structure of the rules.

The main contributions of the paper are: (i) the proposed algorithm is able to learn using the state of the robot and the sensors data, with no preprocessing.
Instead, the mapping between low-level variables and high-level variables is embedded in the algorithm; (ii) the algorithm uses QFPs, a model able to summarize the low-level input data; (iii) moreover, IQFRL uses linguistic labels with unconstrained multiple granularity. With this approach, the interpretability of the membership functions used in the resulting rules is unaffected, while the flexibility of representation remains.

The proposal was validated in several simulated and real environments with the wall-following behavior. Results show a better and statistically significant performance of IQFRL over several combinations of well-known learning algorithms and preprocessing techniques. The approach was also tested in three real-world behaviors that were built as a combination of controllers: path tracking with obstacles avoidance, object tracking with fixed obstacles avoidance, and object tracking with moving obstacle avoidance.

The paper is structured as follows: Section 2 summarizes recent work related with this proposal and Section 3 presents the QFRs model and its advantages in mobile robotics. Section 4 describes the IQFRL algorithm that has been used to learn the QFRs. Section 5 presents the obtained results, and Section 6 shows three real-world applications of IQFRL in robotics. Finally, Section 7 points out the most relevant conclusions.
2. Related Work
The learning of controllers for autonomous robots has been dealt with using different machine learning techniques. Among the most popular approaches are evolutionary algorithms [4, 5], neural networks [6] and reinforcement learning [7, 8]. Also, hybridizations of them, like evolutionary neural networks [9], reinforcement learning with evolutionary algorithms [10, 11], the widely used genetic fuzzy systems [12, 13, 14, 15, 16, 17, 18], or even more uncommon combinations like ant colony optimization with reinforcement learning [19], differential evolution [20], or evolutionary group-based particle swarm optimization [21], have been successfully applied. Furthermore, over the last few years, mobile robotic controllers have been getting some attention as a test case for the automatic design of type-2 fuzzy logic controllers [8, 5, 20].

An extensive use of expert knowledge is made in all of these approaches. In [12], 360 laser sensor beams are used as input data, and are heuristically combined into 8 sectors as inputs to the learning algorithm. On the other hand, in [9, 13, 14, 15, 16, 18, 19, 21] the input variables of the learning algorithm are defined by an expert. Moreover, in [13, 14, 16, 18, 20] the evaluation function of the evolutionary algorithm must be defined by an expert for each particular behavior. As in the latter case, the reinforcement learning approaches need the definition of an appropriate reward function using expert knowledge.

The approaches based on genetic fuzzy systems use different alternatives in the definition of the membership functions. In [10, 12, 16] the membership functions are defined heuristically. In [14, 15] labels have been uniformly distributed, but the granularity of each input variable is defined using expert knowledge.
On the other hand, in [13, 17, 18, 19, 21] an approximative approach is used, i.e., different membership functions are learned for each rule, reducing the interpretability of the learned controller.

The main problem of learning behaviors using raw sensor input data is the curse of dimensionality. In [7], this issue has been managed from the reinforcement learning perspective, by using a probability density estimation of the joint space of states. Among all the approaches based on evolutionary algorithms, only in [4] has no expert knowledge been taken into account. In this work, the number of sensors and their position are learned from a reduced number of sensors.

In [22] a Genetic Cooperative-Competitive Learning (GCCL) approach was presented. The proposal learns knowledge bases without preprocessing raw data, but the rules involved approximative labels, while the IQFRL proposal uses unconstrained multiple granularity. Moreover, in this approach it is difficult to adjust the balance between cooperation and competition, which is typical when learning rules in GCCL. As a result, the obtained rules were quite specific and the performance of the behavior was not comparable to other proposals based on expert knowledge.

3. Quantified Fuzzy Rules (QFRs)

Machine learning techniques in mobile robotics are used to obtain the mapping from inputs to outputs (control commands). In general, two categories can be established for the input variables:

• High-level input variables: variables that provide, by themselves, information that is relevant and meaningful to the expert for modeling the system (e.g. the linear velocity of the robot, or the right-hand distance from the robot to a wall).

• Low-level input variables: variables that do not provide by themselves information for the expert to model the system (e.g. a single distance measure provided by a sensor). The relevance of these variables emerges when they are grouped into more significant sets of variables.
For example, the control actions cannot be decided by simply analyzing the individual distance values provided by each beam of a laser range finder, since noisy measurements or gaps between objects (very frequent in cluttered environments) may occur. Instead, more significant variables and models involving complex groupings and structures are used.

Usually, high-level variables, or sectors, consisting of a set of laser beam measures instead of the beam measures themselves (e.g., right distance, frontal distance, etc.), are used in mobile robotics. The low-level input variables are transformed into high-level input variables in a preprocessing stage previous to the learning of the controller. Traditionally, this transformation and the resulting high-level input variables are defined using expert knowledge. Doing this preprocessing automatically during the learning phase demands a model that groups the low-level input variables in an expressive and meaningful way. Within this context, Quantified Fuzzy Propositions (QFPs) such as "part of the distances of the frontal sector are low" are useful for representing relevant knowledge for the experts and, therefore, for performing intelligent control. Modeling with QFPs as in the previous example demands the definition of several elements:

• part: how many distances of the frontal sector must be low?
• frontal sector: which beams belong to the frontal sector?
• low: what is the actual semantics of low?

This example clearly sets out the need to use propositions that are different from the conventional ones. The use of QFPs in robotics eliminates the need for expert knowledge in two ways: i) the preprocessing of the low-level variables can be embedded in the learning stage; ii) the definition of the high-level variables obtained from low-level variables is done automatically, also during the learning stage. In this paper, QFPs are used for representing knowledge about high-level variables that are defined as the grouping of low-level variables. Conventional fuzzy propositions are also used to represent conventional high-level variables, i.e., high-level variables not related to low-level ones (e.g. velocity).

    IF part of distances of FRONTAL SECTOR are LOW   (1)
    and velocity is HIGH                              (2)
    THEN vlin is VERY LOW and vang is TURN LEFT

Figure 1: An example of QFR to model the behavior of a mobile robot.
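For concreteness, a rule like the one in Fig. 1 can be written down as a small data structure. This is only an illustrative sketch: the class and field names below are our own and are not part of the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class QFP:
    """Quantified Fuzzy Proposition: 'Q of F_b are [d(h) is F_d]'."""
    quantifier: str  # Q, e.g. "part"
    sector: str      # F_b, e.g. "FRONTAL SECTOR"
    distance: str    # F_d, e.g. "LOW"

@dataclass
class QFR:
    """Quantified Fuzzy Rule: a variable number of QFPs, an optional
    conventional antecedent, and fuzzy consequents for both commands."""
    sectors: List[QFP] = field(default_factory=list)  # quantified antecedents
    velocity: Optional[str] = None                    # conventional antecedent F_v
    vlin: str = ""                                    # consequent label F_lv
    vang: str = ""                                    # consequent label F_av

# The rule of Fig. 1 in this representation:
rule = QFR(sectors=[QFP("part", "FRONTAL SECTOR", "LOW")],
           velocity="HIGH", vlin="VERY LOW", vang="TURN LEFT")
```

Note that `sectors` is a list: rules may contain any number of quantified propositions, which is why the learning algorithm must handle individuals of variable structure.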
An example of a QFR is shown in Fig. 1, involving both QFPs (1) and conventional propositions (2); the outputs of the rule are also fuzzy sets. In order to determine the degree to which the output of the rule will be applied, it is necessary to reason about the propositions (using, for example, Mamdani's reasoning scheme).

The general expression for QFPs in this case is:

    d(h) is F_d^i in Q^i of F_b^i   (3)

where, for each i = 1, ..., g_b^max (g_b^max being the maximum possible number of sectors of distances):

• d(h) is the signal. In this example, it represents the distance measured by beam h.
• F_d^i is a linguistic value for variable d(h) (e.g., "low").
• Q^i is a (spatial, defined in the laser beam domain) fuzzy quantifier (e.g., "part").
• F_b^i is a fuzzy set in the laser beam domain (e.g., the "frontal sector").

Evaluation of the Degree of Fulfillment (DOF) for a QFP (Eq. 3) is carried out using Zadeh's quantification model for proportional quantifiers (such as "most of", "part of", ...) [23]. This model allows one to consider non-persistence, partial persistence and total persistence situations for the event "d(h) is F_d^i" in the range of laser beams (spatial interval F_b^i). Therefore, for the considered example, it is possible to make a total or partial assessment of how many distances should be low, in order to decide the corresponding control action. This is a relevant feature of this model, since it allows partial, single or total fulfillment of an event within the laser beam set to be considered.

The number of analyzed sectors of distances and their definition may vary for each of the rules. There can be very generic rules that only need to evaluate a single sector consisting of many laser beams, while other rules may need a finer granularity, with more specific laser sectors. Moreover, the rules may require a mix of QFPs and standard fuzzy propositions (for conventional high-level variables). Therefore, the automatic learning of QFRs demands an algorithm with the capability of managing rules with different structures.
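Under Zadeh's model for proportional quantifiers, the DOF of a QFP is obtained by applying the quantifier's membership function to the relative sigma-count of the event within the (fuzzy) sector. A minimal sketch follows; the triangular membership functions and the "part" quantifier used here are illustrative assumptions, not the paper's actual label definitions.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def qfp_dof(distances, mu_d, mu_b, mu_q):
    """DOF of 'Q of F_b are [d(h) is F_d]': the quantifier mu_q is applied
    to the relative sigma-count of beams in the sector whose distance
    satisfies F_d (Zadeh's model for proportional quantifiers)."""
    num = sum(min(mu_d(d), mu_b(h)) for h, d in enumerate(distances))
    den = sum(mu_b(h) for h in range(len(distances)))
    return mu_q(num / den) if den > 0 else 0.0

# Illustrative use: "part of the distances in the frontal sector are low"
distances = [0.4, 0.5, 0.5, 3.0, 4.0]       # metres, one value per beam
low = lambda d: tri(d, 0.0, 0.0, 1.0)       # hypothetical label "low"
frontal = lambda h: tri(h, 0, 2, 4)         # hypothetical frontal sector
part = lambda r: min(1.0, 2.0 * r)          # hypothetical quantifier "part"
dof = qfp_dof(distances, low, frontal, part)
```

The same sigma-count ratio reappears later in the paper as Eq. 6, where it is used to initialize the quantifier Q of each sector.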
4. Iterative Quantified Fuzzy Rule Learning of Controllers
Evolutionary learning methods follow two approaches in order to encode rules within a population of individuals [3, 24]:

• Pittsburgh approach: each individual represents the entire rule base.

• Michigan, IRL [25], and GCCL [26] approaches: each individual codifies a rule. The learned rule base is the result of combining several individuals. The way in which the individuals interact during the learning process distinguishes these three approaches.

The discussion here is focused on those approaches in which an individual represents a rule, discarding the Michigan approach as it is used in reinforcement learning problems in which the reward from the environment needs to be maximized [27]. Therefore, the IRL and GCCL approaches are analyzed.

In the IRL approach, the individuals compete among themselves, but only a single rule is learned in each run (epoch) of the evolutionary algorithm. After each sequence of iterations, the best rule is selected and added to the final rule base. The selected rule must be penalized in order to induce niche formation in the search space. A common way to penalize the obtained rules is to delete the training examples that have been covered by the set of rules in the final rule base. The final step of the IRL approach is to check whether the obtained set of rules is a complete knowledge base. In case it is not, the process is repeated. A weak point of this approach is that the cooperation among rules is not taken into account when a rule is evaluated. For example, a new rule could be added to the final rule base, deteriorating the behavior of the whole rule base over a set of examples that were already covered. The cooperation among rules can be improved with a posterior rule selection process.

In the GCCL approach, the entire population codifies the rule base. That is, rules evolve together, competing among themselves to obtain the highest fitness. For this type of algorithm it is fundamental to include a mechanism to maintain the diversity of the population (niche induction).
This mechanism must guarantee that individuals of the same niche compete among themselves, but it must also avoid deleting those weak individuals that occupy a niche that remains uncovered. This is usually done using token competition [24].

Although GCCL works well for classification problems [1], the same does not occur for regression problems [22], mostly due to the difficulty of achieving in this realm an adequate balance between cooperation and competition. It is frequent in regression that an individual tries to capture examples seized by another individual, improving the performance on many of the examples but decreasing the accuracy on a few of them. In subsequent iterations, new and more specific individuals replace the rule that was weakened. As a result, the individuals improve their individual fitness, but the performance of the knowledge base does not increase. In particular, for mobile robotics, the obtained knowledge bases over-fit the training data due to a polarization effect of the rule base: a few very general rules and many very specific rules. Moreover, many times the errors of the individual rules compensate each other, generating a good output of the rule base over the training data, but not on test data.

This proposal, called IQFRL (Iterative Quantified Fuzzy Rule Learning), is based on IRL. The learning process is divided into epochs (sets of iterations), and at the end of each epoch a new QFR (Sec. 3.2) is obtained. The following sections describe each of the stages of the algorithm (Fig. 2).

    KB_cur := ∅
    repeat
        it := 0
        equal_ind := 0
        Initialization
        Evaluation
        repeat
            Selection
            Crossover and Mutation
            Evaluation
            Replacement
            if best_ind^it = best_ind^{it−1} then
                equal_ind := equal_ind + 1
            else
                equal_ind := 0
            end if
            it := it + 1
        until (it ≥ it_min ∧ equal_ind ≥ it_check) ∨ (it ≥ it_max)
        KB_cur := KB_cur ∪ best_ind
        uncov_ex := uncov_ex − cov_ex
    until uncov_ex = ∅

Figure 2: IQFRL algorithm.

The learning process is based on a set of training examples. In mobile robotics, each example can be composed of several variables that define the state of the robot (position, orientation, linear and angular velocity, etc.) and the data measured by the sensors. If the robot is equipped with laser range finders, the sensor data are vectors of distances. A laser range finder provides the distances to the closest obstacle in each direction (Fig. 3) with a given angular resolution (number of degrees between two consecutive beams). In this paper, each example e_l is represented by a tuple:

    e_l = (d(1), ..., d(N_b), velocity, vlin, vang)   (4)

where d(h) is the distance measured by beam h, N_b is the number of beams (e.g. 722 for a robot equipped with two Sick LMS200 laser range scanners as in Fig.
3), velocity is the measured linear velocity of the robot, and vlin and vang are the output variables (control commands for the linear and angular velocities, respectively).

Figure 3: Some of the distances measured by a robot equipped with two laser range finders.

The individuals in the population include both conventional propositions and QFPs (Sec. 3.2). Also, the number of relevant inputs can be different. Therefore, genetic programming is the most appropriate approach, as each individual is a tree of variable size. In order to generate valid individuals of the population, and to produce right structures for the individuals after crossover and mutation, some constraints have to be added. With a context-free grammar, all the valid structures of a tree (genotype) in the population can be defined in a compact form. A context-free grammar is a quadruple (V, Σ, P, S), where V is a finite set of variables, Σ is a finite set of terminal symbols, P is a finite set of rules or productions, and S is the start symbol. The basic grammar is described in Fig. 4. As usual, different productions for the same variable are separated by the symbol "|".

Fig. 5 represents a typical chromosome generated with this context-free grammar. Terminal symbols (leaves of the tree) are represented by ellipses, and variables as rectangles. There are two different types of antecedents:

• The sector antecedent. Consecutive beams are grouped into sectors in order to generate more general (high-level) variables (frontal distance, right distance, etc.). This type of antecedent is defined by the terminal symbols F_d, F_b and Q: i) the linguistic label F_d represents the measured distances (LOW in Fig. 1, prop. 1); ii) F_b is the linguistic label that defines the sector, i.e., which beams belong to the sector (FRONTAL SECTOR in Fig. 1, prop. 1); iii) Q is the quantifier (part in Fig. 1, prop. 1).
• The velocity antecedent. The measured linear velocity is defined by the F_v linguistic label.

Finally, F_lv and F_av are the linguistic labels of the linear and angular velocity control commands respectively, which are the consequents of the rule.

    • V = {rule, antecedent, consequent, sector}
    • Σ = {F_lv, F_av, F_v, F_d, F_b, Q}
    • S = rule
    • P:
        1. rule → antecedent consequent
        2. antecedent → sector F_v | sector
        3. consequent → F_lv F_av
        4. sector → F_d Q F_b sector | F_d Q F_b

Figure 4: Basic context-free grammar for controllers in robotics.

Figure 5: An individual representing a QFR that models the behavior of a robot.

The linguistic labels of the antecedent (F_v, F_d, F_b) are defined using a multiple granularity approach. The universe of discourse of a variable is divided into a different number of equally spaced labels for each granularity. Specifically, a granularity g_var^i divides the variable var into i uniformly spaced labels, i.e., A_var^i = {A_var^{i,1}, ..., A_var^{i,i}}. Fig. 6 shows a partitioning of up to granularity five. On the other hand, the linguistic labels of the consequents (F_lv, F_av) are defined using a single granularity approach, since multiple granularity makes no sense if the labels are defined as singletons, which is the usual choice for the output variables in control applications.

Figure 6: Multiple granularity approach from g_x^2 to g_x^5.

    Require: mask_var
    1: i := g_var
    2: result := ∅
    3: loop
    4:     for all j ∈ [1, i] do
    5:         if support(mask_var) ≥ support(A_var^{i,j}) then
    6:             if similarity(mask_var, A_var^{i,j}) > similarity(mask_var, result) then
    7:                 result := A_var^{i,j}
    8:             end if
    9:         else
    10:            break loop
    11:        end if
    12:    end for
    13:    i := i + 1
    14: end loop
    15: return result

Figure 7: Function that searches for the most similar label to mask_var.

An individual (Fig. 5) is generated for each example in the training set. The consequent part (F_lv and F_av) is initialized as F_var = A_var^{g_var, β}, where β = argmax_j µ_var^{g_var, j}(e_l), i.e., the label with the largest membership value for the example.

The initialization of the antecedent part of a rule requires obtaining the most similar linguistic label to a given fuzzy membership function (which is called the mask label). As the maximum granularity of the linguistic labels in the antecedent part of a rule is not limited, the function maskToLabel (Fig. 7) is applied to obtain the most appropriate linguistic label. This function uses a similarity measure defined as [28]:

    similarity(F_φ, F_ψ) = 1 − (Σ_{x∈X} |µ_φ(x) − µ_ψ(x)|) / |X|   (5)

where F_φ and F_ψ are the labels being compared and X is a finite set of points x uniformly distributed on the support of φ ∪ ψ.

The maskToLabel function (Fig.
7) receives a triangular membership function (mask_var) and searches for the label A_var^{i,j} with the highest similarity (Eq. 5, line 6) and with less or equal support (line 5), starting from granularity g_var (line 1).

For the initialization of the quantified propositions (sectors), the distances measured in the example are divided into groups of consecutive laser beams whose deviation does not exceed a certain threshold (σ_bd). Each group represents a sector that is going to be included in the individual. Afterwards, for each of the previously obtained sectors, the components (F_b, F_d and Q) are calculated:

1. F_b = maskToLabel(mask_b), with mask_b = (left_b, center_b, right_b), where left_b is the lowest beam of the group, right_b is the highest beam, center_b is the middle beam, and the following properties are satisfied: µ(left_b) = µ(right_b) = 0 and µ(center_b) = 1.

2. F_d = maskToLabel(mask_d), with mask_d = (d̄ − σ_d, d̄, d̄ + σ_d), where d̄ is the mean of the distances measured by the beams of the group, σ_d is the standard deviation of these distances, and the following properties are satisfied: µ(d̄ − σ_d) = µ(d̄ + σ_d) = 0 and µ(d̄) = 1.

3. Q (Fig. 9) is calculated as the percentage of beams of the sector (h ∈ F_b) that fulfill F_d:

    Q = (Σ_{h∈F_b} min(µ_{F_d}(d(h)), µ_{F_b}(h))) / (Σ_{h∈F_b} µ_{F_b}(h))   (6)

Figure 8: mask_var representations for the beam, (a) mask_b, and distance, (b) mask_d, variables.

Figure 9: Example of a definition of the quantified label Q.

Finally, the velocity antecedent F_v is initialized as F_v = A_v^{g_v^i, β}, where β = argmax_j µ_v^{g_v^i, j}(e_l) and g_v^i is the granularity that satisfies that two consecutive linguistic labels have a separation of σ_v, where σ_v is a threshold of the velocity deviation.

The fitness of an individual of the population is calculated as follows.
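The sector-initialization step above, grouping consecutive beams while the deviation of their distances stays below σ_bd and then building a triangular mask from the group statistics, can be sketched as follows. The threshold and the short distance vector are arbitrary illustrative values; the real algorithm operates on full laser scans.

```python
import statistics

def split_into_sectors(distances, sigma_bd):
    """Group consecutive beams while the standard deviation of the
    group's distances does not exceed the threshold sigma_bd."""
    groups, current = [], [0]
    for h in range(1, len(distances)):
        candidate = current + [h]
        vals = [distances[i] for i in candidate]
        if len(vals) > 1 and statistics.pstdev(vals) > sigma_bd:
            groups.append(current)   # close the sector and start a new one
            current = [h]
        else:
            current = candidate
    groups.append(current)
    return groups

def distance_mask(group, distances):
    """Triangular mask (mean - std, mean, mean + std) for a beam group,
    as used to initialize F_d via maskToLabel."""
    vals = [distances[h] for h in group]
    mean = statistics.fmean(vals)
    std = statistics.pstdev(vals)
    return (mean - std, mean, mean + std)

# Illustrative scan: a near wall on the first three beams, free space after
distances = [0.5, 0.55, 0.5, 4.0, 4.1, 4.05]
sectors = split_into_sectors(distances, sigma_bd=0.5)
mask = distance_mask(sectors[0], distances)
```

Each resulting group would then be turned into a QFP by mapping its beam mask and distance mask to linguistic labels with maskToLabel and computing Q with Eq. 6.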
Firstly, it is necessary to estimate the probability that an example e_l matches the output (C_j) associated with the j-th individual (rule):

    P(C_j | e_l) = exp(−error_j^l / ME)   (7)

where ME is a parameter that defines the meaningful error and error_j^l is the difference between output C_j and the output codified in the example:

    error_j^l = Σ_k ((y_k^l − c_{j,k}) / (max_k − min_k))²   (8)

where y_k^l is the value of the k-th output variable of example e_l, c_{j,k} is the value of the k-th output variable associated with individual j, and max_k and min_k are the maximum and minimum values of output variable k. In regression problems, there can be several consequents that are different from the one codified in the example, but that produce small errors, i.e., that are very similar to the desired output. Thus, P(C_j | e_l) can be interpreted as a normal distribution with covariance ME, and error_j^l is the square of the difference between the mean (output codified in the example) and the output value proposed in the rule codified by the individual.

In an IRL approach, C_j = C_{R_j}, i.e., the output coded in individual j is the output associated with rule j. The fitness of an individual in the population is calculated as the combination of two values: on the one hand, the accuracy with which the individual covers the examples, called confidence; on the other hand, the ability of generalization of the rule, called support. The confidence can be defined as:

    confidence = ρ_u / Σ_l DOF_j(e_u^l)   (9)

where DOF_j(e_u^l) is the degree of fulfillment of e_u^l for rule j, and e_u^l ∈ uncov_ex, where uncov_ex is defined as:

    uncov_ex = {e_l : DOF_{KB_cur}(e_l) < DOF_min}   (10)

i.e., the set of examples that are covered with a degree of fulfillment below DOF_min by the current final knowledge base (KB_cur) (line 19, Fig.
2)), and ρ_u can be defined as:

    ρ_u = Σ_l DOF_j(e_u^l) : P(C_j | e_u^l) > P_min and DOF_j(e_u^l) > DOF_min   (11)

where P_min is the minimum admissible accuracy. Therefore, the higher the accuracy over the examples covered by the rule (and not yet covered by the current knowledge base), the higher the confidence. Support is calculated as:

    support = ρ_u / |uncov_ex|   (12)

Thus, support measures the percentage of examples that are covered with accuracy, relative to the total number of uncovered examples. Finally, fitness is defined as a linear combination of both values:

    fitness = α_f · confidence + (1 − α_f) · support   (13)

which represents the strength of an individual over the set of examples in uncov_ex. α_f ∈ [0,
1] is a parameter that codifies the trade-off between accuracy and generalization of the rule.

The matching of the pairs of individuals that are going to be crossed is implemented following a probability distribution defined as:

    P_close(α, β) = 1 − (Σ_{k=1}^{N_c} ((c_{α,k} − c_{β,k}) / (max_k − min_k))²) / N_c   (14)

where c_{α,k} (c_{β,k}) is the value of the k-th output variable of individual α (β), and N_c is the number of consequents. With this probability distribution, the algorithm selects with higher probability mates that have similar consequents. The objective is to extract information on which propositions of the antecedent part of the rules are important, and which are not.

    Require: ind_α, ind_β
    a_α = a_β = ∅
    N_a = g_b^max + 1
    repeat
        m = random ∈ [1, N_a]
        if m is a sector then
            a_α = argmax_r similarity(F_{b,r}, A_b^{g_b^max, m}) > 0, ∀ r ∈ ind_α
            a_β = argmax_r similarity(F_{b,r}, A_b^{g_b^max, m}) > 0, ∀ r ∈ ind_β
        else
            a_α = F_v ∈ ind_α
            a_β = F_v ∈ ind_β
        end if
    until (a_α ≠ ∅) ∨ (a_β ≠ ∅)

Figure 10: Selection of antecedents for crossover.

Crossover has been designed to generate more general individuals, as the initialization of the population produces very specific rules. The crossover operator generates two offspring:

    offspring_1 = crossover(ind_i, ind_j)
    offspring_2 = crossover(ind_j, ind_i)   (15)

This operator modifies a single proposition in the antecedent part of the rule. As individuals have a variable number of antecedents, the total number of propositions can be different for two individuals. Moreover, the propositions can be defined using different granularities. Therefore, the first step is to select the propositions (one for each individual) to be crossed between both individuals (Fig. 10), as follows:

1. Get the most specific granularity of the sectors of the individuals to cross (g_b^max). Then, an antecedent m ∈ [1, N_a] is selected, where N_a is g_b^max plus one, due to the velocity proposition.

2.
Check the existence of this antecedent in both individuals, according to the following criteria:

(a) If the antecedent m is a sector, then calculate, for each proposition of each individual, the similarity between the definition of the sector for the proposition and the linguistic label that defines sector m. Finally, select for each individual the proposition with the highest similarity.
(b) If the antecedent m is the velocity, then the corresponding proposition is F_v (in case it exists).

Once the propositions to be crossed have been selected, an operation must be picked depending on the existence of the antecedent in both parents (Table 1):

Table 1: Crossover operations
Individual 1 | Individual 2 | Action
no  | yes | copy proposition from individual 2 to 1
yes | no  | delete proposition in individual 1
yes | yes | combine propositions

• If the proposition does not exist in the first individual but exists in the second one, then the proposition of the second individual is copied to the first one, as this proposition could be meaningful.
Figure 11: Different possibilities of similarity (total, partial, or none) between the labels of equal propositions of two individuals, used in the crossover operator.

• If the situation is the opposite of the previous one, then the proposition of the first individual is deleted, as it might not be important.
• If the proposition exists in both individuals, then both propositions are combined in order to obtain a proposition that generalizes both antecedents.

In this last case, the combination of propositions is done taking into account the degree of similarity (Eq. 5) between them (Fig. 11). If the proposition is of type sector, the similarity takes into account both the F_b and F_d labels, and the propositions are merged only when both similarities are partial:

• If there is no similarity, then the propositions correspond to different situations; for example, "the distance is high in part of the frontal sector" and "the distance is low in part of the frontal sector". This means that the proposition of the first individual might not contain meaningful information, and it could be deleted to generalize the rule.
• If the similarity is total (for example, both individuals have the proposition "the distance is high in part of the frontal sector"), then, in order to obtain a new individual with different antecedents, the proposition is eliminated.
• Finally, if the similarity is partial, then the propositions are merged in order to obtain a new one that combines the information provided by the two original propositions; for example, "the distance is high in part of the frontal sector" and "the distance is medium-high in part of the frontal sector". Therefore, the individual is generalized. The merge action is defined as the process of finding the label with the highest possible granularity that has some similarity with the labels of both original propositions. This is done for both the F_b and F_d labels. Q is calculated as the minimum Q of both individuals.
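The crossover actions of Table 1 for the case in which the proposition exists in both parents can be sketched as follows. This is a minimal illustration, assuming triangular labels encoded as (center, half-width) pairs and approximating the similarity classes of Eq. 5 (total, partial, none) by support overlap; all names are illustrative, not from the paper.

```python
def similarity_class(a, b):
    # Triangular labels as (center, half_width); the three cases of Fig. 11:
    # "total" (same label), "partial" (overlapping supports), "none" (disjoint).
    (ca, wa), (cb, wb) = a, b
    if a == b:
        return "total"
    return "partial" if abs(ca - cb) < (wa + wb) else "none"

def combine_propositions(prop_a, prop_b):
    """Crossover combination of two propositions of the same antecedent."""
    sim = similarity_class(prop_a["label"], prop_b["label"])
    if sim == "none":
        return None  # contradictory situations: delete to generalize the rule
    if sim == "total":
        return None  # identical: delete to obtain different antecedents
    # Partial similarity: merge into a single label that covers both originals,
    # taking the quantifier Q as the minimum of both parents.
    (ca, wa), (cb, wb) = prop_a["label"], prop_b["label"]
    lo = min(ca - wa, cb - wb)
    hi = max(ca + wa, cb + wb)
    return {"label": ((lo + hi) / 2.0, (hi - lo) / 2.0),
            "Q": min(prop_a["Q"], prop_b["Q"])}
```

For instance, merging "high" (0.8, 0.2) with "medium-high" (0.65, 0.2) yields a wider covering label with the smaller Q, while disjoint labels ("high" vs. "low") lead to deletion.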
If crossover is not performed, both individuals are mutated. Mutation implements two different strategies (Fig. 12): generalize or specialize a rule. The higher the value of confidence (Eq. 9), the higher the probability of generalizing the rule by mutation. This occurs with rules that cover their examples with high accuracy and that could be modified to cover other examples. On the contrary, when the confidence of the individual is low, it is covering some of its examples with low performance. In order to improve the rule, some of the examples that are currently covered should be discarded so as to get a more specific rule.

For generalization, the following steps are performed:

1. Select an example e_sel ∈ uncov_ex^j, where uncov_ex^j = {e_u^l : DOF_j(e_u^l) < DOF_min}, i.e. the set of examples that belong to uncov_ex and are not covered by individual j. The example is selected with a probability distribution given by P(C_j | e_u^l) (Eq. 7). The higher the similarity between the output of the example and the consequent of rule j, the higher the probability of being selected.

2. The individual is modified in order to cover e_sel. Therefore, all the propositions that are not covering the example (those with µ_prop(e_sel) < DOF_min) are selected for mutation.

(a) For sector propositions (Eq. 1), there are three different ways in which the proposition can be modified: F_d, F_b, and Q. The modification is selected among the three possibilities, with a probability proportional to the µ_prop(e_sel) value after applying each one.
  i. F_d and F_b are generalized by choosing the most similar label in the adjacent partition with lower granularity. The process is repeated until µ_prop(e_sel) ≥ DOF_min.
  ii. On the other hand, Q is decreased until µ_prop(e_sel) ≥ DOF_min.
(b) For velocity propositions (Eq. 
2), generalization is done by choosing the most similar label in the adjacent partition with lower granularity until µ_prop(e_sel) > DOF_min.

For specialization, the process is equivalent:

1. Select an example e_sel ∈ cov_ex^j, where cov_ex^j = {e_u^l : DOF_j(e_u^l) > DOF_min}, i.e. the set of examples that belong to uncov_ex and are covered by individual j. The example is selected with a probability distribution that is inversely proportional to P(C_j | e_u^l) (Eq. 7). The higher the similarity between the output of the example and the consequent of rule j, the lower the probability of being selected.

2. Only one proposition needs to be modified to specialize the individual. This proposition is selected randomly.

(a) For sector propositions there are, again, three different ways in which the proposition can be modified: F_d, F_b, and Q. The modification

Figure 12: The strategies (generalization and specialization) used for mutation for variables d, b and v.
is selected among these three possibilities, with a probability that is inversely proportional to the µ_prop(e_sel) value after applying each one.
  i. F_d and F_b are specialized by choosing the most similar label in the adjacent partition with higher granularity. The process is repeated until µ_prop(e_sel) < DOF_min.
  ii. On the other hand, Q is increased until µ_prop(e_sel) < DOF_min.
(b) For velocity propositions, specialization is done by choosing the most similar label in the adjacent partition with higher granularity until µ_prop(e_sel) < DOF_min.

Finally, once the antecedent is mutated, the consequent also mutates. Again, this mutation requires the selection of an example. If generalization was selected for the mutation of the antecedent, then the example will be e_sel. On the other hand, for specialization an example is randomly selected from those currently in cov_ex^j. For each variable in the consequent part of the rule, the label of the individual is modified by selecting a label following a probability distribution (Fig. 13):

P(A_var^{g,γ} | A_var^{g,α}, A_var^{g,β}) = 1 − |α − γ| / (|α − β| + 1)   (16)

where A_var^{g,α} is the label of each of the consequents of the individual, A_var^{g,β} is the label with the largest membership value for e_sel, and A_var^{g,γ} is a label between them. Thus, the labels closer to the label of the individual have a higher probability of being selected, while the labels closer to the example label have a lower one.

Figure 13: Probability distribution example for consequent mutation. Labels closest to the individual output have a higher probability of being selected.

Selection has been implemented following the binary tournament strategy. Replacement follows a steady-state approach: the new individuals and those of the previous population are joined, and the best pop_max individuals are selected for the next population.
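The consequent mutation can be sketched as follows, assuming the reading P = 1 − |α − γ| / (|α − β| + 1) for the (partially garbled) distribution of Eq. 16, with α the index of the individual's label, β the index of the label that best matches the example, and γ ranging between them; all names are illustrative.

```python
import random

def consequent_label_weights(alpha, beta):
    """Unnormalized selection weights for the labels between the individual's
    label (index alpha) and the example's best label (index beta):
    P = 1 - |alpha - gamma| / (|alpha - beta| + 1)."""
    lo, hi = sorted((alpha, beta))
    return {g: 1.0 - abs(alpha - g) / (abs(alpha - beta) + 1.0)
            for g in range(lo, hi + 1)}

def mutate_consequent(alpha, beta, rng=random):
    # Labels closer to the individual's label get a higher probability,
    # labels closer to the example's label a lower (but non-zero) one.
    weights = consequent_label_weights(alpha, beta)
    labels = list(weights)
    return rng.choices(labels, weights=[weights[g] for g in labels])[0]
```

With alpha = 1 and beta = 4 the weights are 1.0, 0.75, 0.5 and 0.25, decreasing toward the example's label, as stated in the text.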
An epoch is a set of iterations at the end of which a new rule is added to KB_cur. The stopping criterion of each epoch (inner loop in Fig. 2) is the number of iterations, but this limit varies according to the following criteria: once the number of iterations (it) reaches it_min, the algorithm stops if there are it_check consecutive iterations (counted by equal_ind) with no change in the best individual (best_ind). If the number of iterations reaches the maximum (it_max), then the algorithm stops regardless of the previous condition.

When the epoch ends, the rule defined in best_ind is added to KB_cur. Moreover, the examples that are covered with accuracy (according to the criterion in Eq. 11) are marked as covered by the algorithm (line 20, Fig. 2). Finally, the algorithm stops when there are no uncovered examples.

After the end of the iterative part of the algorithm, the performance of the obtained rule base can be improved by selecting a subset of rules with better cooperation among them. The rule selection algorithm described in [1] has been used. The rule selection process has the following steps:

1. Generate R_gp rule bases, where R_gp is the number of rules of the population obtained by the IQFRL algorithm (RB_gp). Each rule base is coded as RB_i = (r_1^i, ..., r_{R_gp}^i), with:

r_j^i = 1 if j ≤ i; 0 if j > i   (17)

where r_j^i indicates whether the j-th rule of RB_gp is included (r_j^i = 1) or not (r_j^i = 0) in RB_i. With this codification, RB_i will contain the best i rules of RB_gp, as these rules have been ranked in decreasing order of their individual fitness. Notice that RB_{R_gp} is RB_gp.
2. Evaluate all the rule bases, and select the best one, RB_sel.
3. Execute a local search on RB_sel to obtain the best rule set, RB_best.

The last step was implemented with the iterated local search (ILS) algorithm [29], which runs until the number of restarts reaches a threshold (maxRestarts).
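Steps 1 and 2 of the rule selection process can be sketched as follows (the ILS refinement of step 3 is omitted); `evaluate_rb` stands for whatever global evaluation of a rule base is used, and all names are illustrative.

```python
def select_rule_subset(rules, fitness, evaluate_rb):
    """Build the nested rule bases RB_1, RB_2, ..., RB_Rgp, where RB_i
    contains the best i rules ranked by decreasing individual fitness
    (the coding of Eq. 17), and keep the best-evaluated one (RB_sel)."""
    ranked = sorted(rules, key=fitness, reverse=True)
    candidates = [ranked[:i] for i in range(1, len(ranked) + 1)]
    return max(candidates, key=evaluate_rb)
```

For instance, with a global score that rewards individual fitness but penalizes rule-base size, the selected subset keeps only the top-ranked rules that still improve the overall evaluation.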
5. Results
The proposed algorithm has been validated with the wall-following behavior, well known in mobile robotics. The main objectives of a controller for this behavior are: to keep a suitable distance between the robot and the wall, to move at the highest possible velocity, and to implement smooth control actions.

Figure 14: Pioneer 3-AT robot equipped with two laser range scanners.

Figure 15: The three different situations for the wall-following behavior.

The Player/Stage robot software [30] has been used for the tests in the simulated environments and also for the connection with the real robot
Pioneer 3-AT (Fig. 14). This real robot was equipped with two laser range scanners with an amplitude of 180° and a precision of 0.5° (i.e. 361 measurements for each laser scan). Without loss of generality, all the examples and tests described here were made with the robot following the wall at its right.

The examples that have been used for learning were generated for three different situations (Fig. 15) that have been identified by an expert:

1. Convex corner: it is characterized by the existence of a gap in the wall (like an open door) (labeled CX in Fig. 15).
2. Concave corner: it is a situation in which the robot finds a wall in front of it (labeled CC in Fig. 15).
3. Straight wall: any other situation (labeled SW in Fig. 15).

For each of the above situations, the robot was placed in different positions and the associated control order was the one that minimized the error. Therefore, each example consists of 722 distances (one for each laser beam), the current linear velocity of the robot, and the control commands (linear and angular velocity). The expert always tried to follow the wall at, approximately, 50 cm, and the maximum values for the linear and angular velocities were 50 cm/s and 45°/s respectively. 572 training examples were generated for the straight wall situation, 540 for the convex corner and 594 for the concave corner.

The IQFRL algorithm was used to learn a different controller for each of the three situations. In order to decide which knowledge base should be used at each time instant, the classification version of IQFRL (IQFRL-C, see Appendix A) was used. In this way, IQFRL learning could be tested with three completely different controllers.

In order to analyze the performance of the proposed learning algorithm, several tests were done in 15 simulated environments and two real ones. Table 2 shows some of the characteristics of the environments: the dimensions of the environment, the path length, and the number of concave (CC) and convex (CX) corners, together with the number of situations of special difficulty, in which the robot has to negotiate a convex corner with a very close wall in front of it.

Table 2: Characteristics of the test environments
Environment | Dim. (m × m) | Length (m) | CC | CX | Hard
home       | … × 10  | 20   | 8  | 3  | 1
gfs b      | 14 × 10 | 43   | 10 | 6  | 0
dec        | 19 × 12 | 53   | 8  | 4  | 0
domus      | 26 × 16 | 60   | 9  | 6  | 3
citius     | 16 × 10 | 63   | 12 | 6  | 2
raid a     | 16 × 16 | 66   | 16 | 12 | 0
wsc8a      | 15 × 15 | 70   | 4  | 7  | 1
home b     | 18 × 11 | 76   | 17 | 6  | 2
raid b     | 20 × 10 | 86   | 12 | 10 | 2
rooms      | 19 × 19 | 86   | 12 | 6  | 4
flower     | 22 × 20 | 98   | 9  | 6  | 1
office     | 26 × 26 | 146  | 23 | 10 | 8
autolab    | 26 × 28 | 154  | 21 | 11 | 10
maze       | 18 × 18 | 205  | 13 | 9  | 0
hospital   | 74 × 45 | 1046 | 98 | 69 | 43
real env 1 | 9 × …   | …    | …  | …  | …
real env 2 | … × …   | …    | …  | …  | …

The simulated environments are shown in Figs. 16 and 17. The trace of the robot is represented by marks, and the higher the concentration of marks, the lower the velocity of the robot. Furthermore, Fig. 18 shows the real environments. Each of them represents an occupancy grid map of the environment, together with the trajectory of the robot.

Fixed values were used for the parameters of the evolutionary algorithm ME, DOF_min, α_f, P_cross, pop_max, it_min, it_check, it_max, σ_bd and σ_v. P_min is a parameter that has a high influence on the performance of the system. A single value of P_min was used in testing, obtained from Eqs. 7 and 8 for the case in which the error for each consequent is one label (Eq. 8). The granularities and the universe of discourse of each output of a rule are shown in Table 3. For the rule subset selection algorithm, fixed values of the parameters radius_nbhood and maxRestarts were used. IQFRL was compared with the following different algorithms:

Figure 16: Path of the robot along the simulated environments (I): (a) home, (b) gfs b, (c) dec, (d) domus, (e) citius, (f) raid a, (g) wsc8a, (h) home b, (i) raid b.

Table 3: Universe of discourse and granularities
Variable | Min | Max | Granularities
Distance | 0 | … | …
Velocity | 0 | … | {…}
Angular velocity | −π/… | π/… | {…}

• Methodology to Obtain Genetic fuzzy rule-based systems Under the iterative Learning approach (MOGUL): a three-stage genetic algorithm [31]:
1. An evolutionary process for learning fuzzy rules, with two components: a fuzzy-rule generating method based on IRL, and an iterative covering method.
2. A genetic simplification process for selecting rules.
3. A genetic tuning process, which tunes the membership functions for each fuzzy rule or for the complete rule base.
The soft-constrained MOGUL was used, as it has better performance in very hard problems [25]. The implementation in Keel [32], an open source (GPLv3) Java software tool to assess evolutionary algorithms for data mining problems, was used.

• Multilayer Perceptron Neural Network (MPNN): a single-hidden-layer neural network trained with the BFGS method [33]; its stopping parameters (abstol, reltol, maxit) were fixed, and the number of hidden neurons ranged from n to 2·n, n being the number of inputs. The package nnet [34] of the statistical software R was used.

• ν-Support Vector Regression (ν-SVR): a ν-SVM [36] version for regression with a Gaussian RBF kernel. The parameter sigma is estimated based upon the 0.1 and 0.9 quantiles of ||x − x′||. The package kernlab [35] of the statistical software R was used.

Figure 17: Path of the robot along the simulated environments (II): (a) rooms, (b) flower, (c) office, (d) autolab, (e) maze, (f) hospital.
Figure 18: Path of the robot along the real environments: (a) real environment 1, (b) real environment 2.

Table 4: Different configurations for the preprocessing methods
Preprocessing | Configuration
Min (n) | {…}
Sample (n) | {…}
PCA (σ_PCA) | {0.90, 0.95, 0.975, 0.99, 0.999}

Table 5: Number of inputs obtained with PCA
σ_PCA | Straight | Convex | Concave
0.90  | 35  | 15 | 27
0.95  | 51  | 24 | 40
0.975 | 66  | 35 | 53
0.99  | 85  | 57 | 68
0.999 | 127 | 99 | 109
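The input counts of Table 5 follow from keeping the fewest principal components whose cumulative variance reaches the fraction σ_PCA. A minimal SVD-based sketch of that criterion (the function name is illustrative, not from the paper):

```python
import numpy as np

def pca_inputs(X, sigma_pca):
    """Project the raw inputs onto the fewest principal components whose
    cumulative variance fraction reaches sigma_pca."""
    Xc = X - X.mean(axis=0)                      # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = (S ** 2) / np.sum(S ** 2)        # variance per component
    k = int(np.searchsorted(np.cumsum(var_ratio), sigma_pca) + 1)
    return Xc @ Vt[:k].T, k
```

Applied to the raw laser inputs, higher σ_PCA thresholds keep more components, which is why the counts grow from 35 (σ_PCA = 0.90) to 127 (σ_PCA = 0.999) for the straight-wall dataset.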
As mentioned before, in the IQFRL proposal the preprocessing of raw sensor data is embedded in the learning algorithm. Since the algorithms used for comparison need to preprocess the data before the learning phase, three different approaches were used for the transformation of the sensor data:

• Min: the beams of the laser range finder are grouped into n equal-sized sectors. For each sector, the minimum distance value is selected as input.
• Sample: n equidistant beams are selected as the input data.
• PCA: Principal Component Analysis computes the most meaningful basis to re-express the data. It is a simple, non-parametric method for extracting relevant information. The variances associated with the principal components can be examined in order to select only those that cover a given percentage of the total variance.

Different parameters have been used for the preprocessing approaches. For the Min and Sample methods, the number of obtained inputs (n) was changed. For PCA, the percentage of variance (σ_PCA) indicates the principal components selected as input data. Table 4 shows the parameters used for the preprocessing methods. Moreover, Table 5 shows the number of inputs obtained with PCA for the three datasets with each configuration.
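The Min and Sample transformations described above can be sketched as follows (a minimal version, assuming for Min that leftover beams after integer division are ignored; helper names are illustrative):

```python
def min_sectors(distances, n):
    """'Min' preprocessing: group the beams into n equal-sized sectors and
    keep the minimum distance of each sector as the input value."""
    size = len(distances) // n
    return [min(distances[i * size:(i + 1) * size]) for i in range(n)]

def sample_beams(distances, n):
    """'Sample' preprocessing: keep n equidistant beams as the input data."""
    step = (len(distances) - 1) / (n - 1)
    return [distances[round(i * step)] for i in range(n)]
```

For the 722-beam scans used here, `min_sectors(scan, 16)` yields the 16 inputs of the "min 16" configuration.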
Table 6 shows the training and test errors over a 5-fold cross-validation. For each algorithm and dataset, the mean and standard deviation of the error (Eq. 8) were calculated. For each preprocessing technique, a 5-fold cross-validation was performed for each combination of the parameters of the algorithms. For example, for the Min preprocessing with 16 equal-sized sectors, a 5-fold cross-validation was run for each number of neurons between 17 and 34 for the MPNN approach. Only the configuration of the algorithm with the lowest test error for each configuration of the preprocessing methods was used for comparison purposes. Moreover, only those configurations of preprocessing techniques with the best results are shown in the tables of this section. Results for 
PCA preprocessing have not been included, as the learning algorithms were not able to obtain adequate controllers.

Table 6: Training and test errors (mean ± standard deviation, Eq. 8) of each algorithm, preprocessing configuration, and dataset over a 5-fold cross-validation.

Although the MSE (Mean Squared Error) is the usual measure of the performance of the algorithms, it is not a sufficient criterion in mobile robotics. A good controller must be robust and able to provide a good and smooth output in any situation. The only way to validate the controller is to test it in environments (simulated and real) of different difficulties, assessing on these tests a number of quality parameters such as mean distance to the wall, mean velocity along the paths, etc.

Table 8 contains the results of the execution of each of the algorithms in the different simulated environments (Figs. 16 and 17). Furthermore, Table 9 shows the average results for the following five different indicators: the distance to the wall at its right (Dist.), the linear velocity (Vel.), the change in the linear velocity between two consecutive cycles (Vel.ch.), which reflects the smoothness of the control, the time per lap, and the number of times the robot gets blocked along the path and cannot recover. The robot is blocked if it hits a wall or if it does not move for 5 s. In this situation the robot is placed parallel to the wall at a distance of 0.5 m. The average values of the five indicators are calculated for each lap that the robot performs in the environment. Results presented in the table are the average and standard deviation values, over five laps, of the average values of the indicators over one lap. The dash symbol in the results table indicates that the controller could not complete the path. This usually occurs when the number of blockades per meter is high (more than 5 blockades in a short period of time) or when the robot completely deviates from the path.

Moreover, in order to evaluate the performance of a controller with a numerical value, a general quality measure was defined.
It is based on the error measure defined in [15], but including the number of blockades:

quality = 1 / (1 + (1 + Blockades) · (w_d · |Dist − d_wall| + w_v · |Vel − v_max|))   (18)

where d_wall is the reference distance to the wall (50 cm), v_max is the maximum value of the velocity (50 cm/s), and w_d and w_v are fixed weights for the distance and velocity error terms. The higher the quality, the better the controller. This measure takes the number of blockades into account in a linear form for comparison purposes. However, it should be noted that controllers with just a single blockade are not reliable and should not be implemented on a real robot.

In general, all the algorithms, except MPNN with Sample 16 preprocessing, produced a distance that is very close to the reference (between 40 cm and 60 cm to the wall at its right). Note that in cases where the best distance is very different from that obtained by IQFRL, this is because several blockades happened; therefore, those controllers have the advantage of being continually repositioned into the perfect situation. The best results in speed are those obtained by ν-SVR and MOGUL but, in general, at the cost of a worse distance to the wall or an increase in the number of blockades. The same applies to the speed change: in those cases where it is too low, as in some cases for MOGUL or MPNN, the robot is not able to trace some curves safely. IQFRL is the approach that gets the best quality values, reflecting not only adequate values for the distance, velocity and smoothness in all the environments but also its robustness: it is the only approach that never blocked or failed to complete the laps in any of the environments.

In order to compare the experimental results, non-parametric tests of multiple comparisons have been used. Their use is recommended in those cases in which the objective is to compare the results of a new algorithm against various methods simultaneously. The Friedman test with the Holm post-hoc test was selected as the method for detecting significant differences among the results.
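The Holm post-hoc correction used together with the Friedman test can be sketched as a standard step-down adjustment of the pairwise p-values (a generic implementation, not tied to the statistical software used in the paper):

```python
def holm_adjust(p_values):
    """Holm step-down adjustment: multiply the i-th smallest p-value
    (i starting at 0) by (m - i), enforce monotonicity, and cap at 1.
    Returns the adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted
```

A comparison against the control algorithm is significant at level α when its adjusted p-value is below α.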
The test was performed for the quality indicator in Table 8. The statistical test (Table 7) shows that the difference in the quality of the IQFRL approach is statistically significant. Only ν-SVR and MOGUL with Sample 16 preprocessing are comparable to IQFRL, as their number of blockades is very low or null in some environments.

Table 7: Non-parametric test (Friedman ranking and Holm p-values) for the quality indicator of Table 8.

Additionally, Table 10 shows the results obtained by IQFRL in the two real environments. As in the previous tables, the results are the average and standard deviation over 5 laps. The distance to the wall is lower than 60 cm, showing good behavior. Although the velocity may seem low, this is because the corners are very close to each other and the robot does not have time to accelerate. Also, the velocity change reflects a very smooth movement, as changes in velocity take more time in the real robot.

Finally, the IQFRL proposal was compared with the proposals presented in [15] for learning rules for the wall-following behavior. The purpose of this comparison is to check whether IQFRL is competitive against other methods which use expert knowledge for sensor data preprocessing. Four different approaches were used: the COR methodology, the weighted COR methodology (WCOR), Hierarchical Systems of Weighted Linguistic Rules (HSWLR), and a local evolutionary learning of Takagi-Sugeno rules (TSK). For these approaches, four input variables were defined by an expert: right distance, left distance, velocity, and the orientation (alignment) of the robot with respect to the wall at its right. Moreover, the granularities of each variable were also defined by the expert. Table 12 presents the comparison between these approaches and the IQFRL proposal on those environments which are common.

The IQFRL approach exhibited the highest quality in the two most complex environments (office and hospital). Moreover, Table 11 shows the non-parametric tests performed over quality. The Friedman p-value is higher than in Table 9, due to the low number of environments available for comparison. As can be seen, there is no statistically significant difference regarding quality. That is, the controllers learned with embedded preprocessing have a performance similar to that of the methods that use expert knowledge to preprocess the data.

An example of a rule learned by IQFRL is presented in Fig. 19.
The antecedent part is composed of a single QFP.The linguistic value A , d indicates a low distance, while A , b denotes that the beams sector of the proposition is formed bythe frontal and right parts of the robot. Therefore, the ruledescribes a situation where the robot is too close to the wall and,if it continues, it will collide. Because of that, the consequentindicates a zero linear velocity and a turn of the robot to the left,in order to get away from the wall without getting the robot intorisk.Table 13 shows the number of rules learned for the di ff erentsituations by each of the methods based on rules. MOGULis implemented as a multiple-input single-output (MISO) Table 11: Non-parametric test for quality of table 12.
Alg. Ranking Holm p -valueIQFRL 2 . − COR 3 . . . . . . . . p -value = . p -value < = . able 8: Average results ( x ± σ ) for each simulated environment Alg. Prepr. Env. Dist.(cm) Vel.(cm / s) Vel.ch.(cm / s) Time(s) quality IQFRL − home 55 . ± .
25 27 . ± .
66 5 . ± .
14 164 . ± .
26 0 . ± .
00 0 . . ± .
70 22 . ± .
92 7 . ± .
74 163 . ± .
72 0 . ± .
00 0 . . ± .
02 32 . ± .
67 5 . ± .
18 168 . ± .
76 0 . ± .
00 0 . . ± .
53 29 . ± .
19 5 . ± .
49 198 . ± .
39 0 . ± .
00 0 . . ± .
87 26 . ± .
78 6 . ± .
64 249 . ± .
25 0 . ± .
00 0 . . ± .
72 25 . ± .
14 6 . ± .
55 262 . ± .
15 0 . ± .
00 0 . . ± .
00 27 . ± .
84 7 . ± .
33 233 . ± .
28 0 . ± .
00 0 . . ± .
72 25 . ± .
46 7 . ± .
36 300 . ± .
15 0 . ± .
00 0 . . ± .
44 28 . ± .
40 6 . ± .
47 242 . ± .
82 0 . ± .
00 0 . . ± .
34 30 . ± .
34 6 . ± .
43 261 . ± .
60 0 . ± .
00 0 . . ± .
25 33 . ± .
40 4 . ± .
34 290 . ± .
13 0 . ± .
00 0 . ffi ce 51 . ± .
57 24 . ± .
18 6 . ± .
25 578 . ± .
92 0 . ± .
00 0 . . ± .
20 28 . ± .
31 5 . ± .
48 499 . ± .
74 0 . ± .
00 0 . . ± .
22 35 . ± .
40 3 . ± .
28 567 . ± .
29 0 . ± .
00 0 . . ± .
19 26 . ± .
10 6 . ± .
35 3608 . ± .
72 0 . ± .
00 0 . . ± .
69 30 . ± .
30 5 . ± .
45 181 . ± .
70 7 . ± .
05 0 . . ± .
96 24 . ± .
02 6 . ± .
39 208 . ± .
86 14 . ± .
16 0 . . ± .
14 36 . ± .
43 5 . ± .
15 190 . ± .
29 8 . ± .
25 0 . . ± .
64 31 . ± .
80 5 . ± .
33 224 . ± .
25 8 . ± .
47 0 . . ± .
27 29 . ± .
13 5 . ± .
53 302 . ± .
58 18 . ± .
62 0 . . ± .
32 26 . ± .
28 6 . ± .
09 363 . ± .
21 27 . ± .
40 0 . . ± .
35 27 . ± .
63 6 . ± .
59 346 . ± .
40 26 . ± .
44 0 . . ± .
49 27 . ± .
62 6 . ± .
37 379 . ± .
05 22 . ± .
41 0 . . ± .
70 32 . ± .
17 5 . ± .
38 280 . ± .
29 14 . ± .
40 0 . . ± .
38 30 . ± .
04 5 . ± .
37 350 . ± .
04 20 . ± .
89 0 . . ± .
39 38 . ± .
00 3 . ± .
71 310 . ± .
41 11 . ± .
03 0 . ffi ce 51 . ± .
29 25 . ± .
49 6 . ± .
20 762 . ± .
28 49 . ± .
25 0 . . ± .
71 30 . ± .
24 5 . ± .
27 612 . ± .
46 31 . ± .
45 0 . . ± .
55 37 . ± .
53 2 . ± .
25 690 . ± .
92 32 . ± .
38 0 . . ± .
06 26 . ± .
08 5 . ± .
29 4908 . ± .
12 313 . ± .
34 0 . . ± .
20 29 . ± .
35 4 . ± .
15 161 . ± .
69 1 . ± .
47 0 . . ± .
62 23 . ± .
88 8 . ± .
51 160 . ± .
65 1 . ± .
25 0 . . ± .
66 37 . ± .
96 6 . ± .
25 148 . ± .
94 0 . ± .
47 0 . . ± .
20 36 . ± .
14 6 . ± .
57 165 . ± .
93 1 . ± .
82 0 . . ± .
72 27 . ± .
21 6 . ± .
52 241 . ± .
38 1 . ± .
82 0 . . ± .
51 26 . ± .
39 6 . ± .
44 275 . ± .
82 3 . ± .
05 0 . . ± .
24 30 . ± .
65 9 . ± .
42 220 . ± .
26 1 . ± .
94 0 . . ± .
36 26 . ± .
91 6 . ± .
72 303 . ± .
52 3 . ± .
82 0 . . ± .
38 34 . ± .
26 6 . ± .
41 206 . ± .
12 0 . ± .
47 0 . . ± .
09 32 . ± .
49 6 . ± .
64 254 . ± .
03 0 . ± .
94 0 . . ± .
58 41 . ± .
32 3 . ± .
27 244 . ± .
56 2 . ± .
41 0 . ffi ce 50 . ± .
10 22 . ± .
56 6 . ± .
46 655 . ± .
32 11 . ± .
36 0 . . ± .
27 29 . ± .
69 5 . ± .
32 498 . ± .
12 2 . ± .
25 0 . . ± .
34 40 . ± .
78 3 . ± .
18 512 . ± .
86 0 . ± .
47 0 . . ± .
25 25 . ± .
14 6 . ± .
34 3964 . ± .
81 64 . ± .
18 0 . . ± .
07 29 . ± .
66 4 . ± .
11 122 . ± .
36 6 . ± .
47 0 . . ± .
41 28 . ± .
71 7 . ± .
28 149 . ± .
03 4 . ± .
70 0 . . ± .
32 35 . ± .
38 4 . ± .
19 173 . ± .
58 3 . ± .
94 0 . − − − − − . . ± .
90 27 . ± .
68 4 . ± .
28 324 . ± .
51 15 . ± .
56 0 . . ± .
61 23 . ± .
01 6 . ± .
26 403 . ± .
17 27 . ± .
70 0 . . ± .
81 30 . ± .
03 7 . ± .
11 238 . ± .
45 7 . ± .
47 0 . . ± .
54 24 . ± .
81 7 . ± .
45 922 . ± .
90 66 . ± .
02 0 . . ± .
57 27 . ± .
64 5 . ± .
63 1137 . ± .
79 38 . ± .
53 0 . − − − − − . − − − − − . ffi ce 55 . ± .
48 28 . ± .
19 8 . ± .
32 626 . ± .
59 32 . ± .
41 0 . − − − − − . . ± .
32 42 . ± .
09 2 . ± .
38 621 . ± .
13 28 . ± .
56 0 . . ± .
02 28 . ± .
33 7 . ± .
17 3730 . ± .
69 205 . ± .
93 0 . − − − − − . . ± .
47 21 . ± .
35 7 . ± .
14 172 . ± .
78 1 . ± .
00 0 . . ± .
80 30 . ± .
21 6 . ± .
49 285 . ± .
55 1 . ± .
82 0 . − − − − − . . ± .
48 20 . ± .
23 6 . ± .
18 603 . ± .
12 16 . ± .
81 0 . . ± .
62 19 . ± .
61 6 . ± .
03 450 . ± .
59 5 . ± .
56 0 . . ± .
62 23 . ± .
41 7 . ± .
52 279 . ± .
70 1 . ± .
82 0 . . ± .
36 16 . ± .
09 6 . ± .
62 2477 . ± .
27 26 . ± .
15 0 . . ± .
50 21 . ± .
56 7 . ± .
48 1780 . ± .
69 49 . ± .
69 0 . . ± .
47 34 . ± .
65 6 . ± .
22 237 . ± .
88 0 . ± .
00 0 . . ± .
11 28 . ± .
35 7 . ± .
21 912 . ± .
62 51 . ± .
02 0 . ffi ce 57 . ± .
31 21 . ± .
31 5 . ± .
15 783 . ± .
34 28 . ± .
47 0 . − − − − − . − − − − − . . ± .
21 21 . ± .
14 5 . ± .
18 555 . ± .
71 16 . ± .
10 0 . ν -SVR min 16 home − − − − − . . ± .
58 26 . ± .
81 8 . ± .
52 140 . ± .
94 0 . ± .
00 0 . . ± .
05 39 . ± .
98 6 . ± .
48 143 . ± .
39 0 . ± .
00 0 . − − − − − . . ± .
78 25 . ± .
29 5 . ± .
28 258 . ± .
23 0 . ± .
00 0 . . ± .
29 30 . ± .
67 9 . ± .
28 208 . ± .
84 0 . ± .
00 0 . . ± .
24 30 . ± .
27 10 . ± .
38 218 . ± .
56 0 . ± .
00 0 . . ± .
46 27 . ± .
50 8 . ± .
10 289 . ± .
94 1 . ± .
00 0 . − − − − − . − − − − − . . ± .
11 38 . ± .
64 4 . ± .
22 257 . ± .
64 0 . ± .
00 0 . ffi ce 51 . ± .
47 23 . ± .
40 7 . ± .
10 582 . ± .
47 0 . ± .
00 0 . . ± .
25 28 . ± .
22 6 . ± .
19 522 . ± .
83 0 . ± .
47 0 . . ± .
70 32 . ± .
02 3 . ± .
17 675 . ± .
68 1 . ± .
82 0 . . ± .
36 25 . ± .
08 6 . ± .
21 3833 . ± .
48 6 . ± .
70 0 . − − − − − . . ± .
24 27 . ± .
64 8 . ± .
62 132 . ± .
00 0 . ± .
00 0 . . ± .
03 39 . ± .
29 5 . ± .
10 142 . ± .
66 0 . ± .
00 0 . − − − − − . . ± .
82 29 . ± .
58 6 . ± .
24 221 . ± .
99 0 . ± .
00 0 . . ± .
09 31 . ± .
20 8 . ± .
35 201 . ± .
59 0 . ± .
00 0 . . ± .
03 32 . ± .
21 9 . ± .
31 203 . ± .
84 0 . ± .
00 0 . . ± .
70 29 . ± .
06 8 . ± .
09 275 . ± .
91 0 . ± .
47 0 . . ± .
24 36 . ± .
70 6 . ± .
44 273 . ± .
39 12 . ± .
25 0 . . ± .
49 34 . ± .
26 6 . ± .
26 233 . ± .
12 0 . ± .
00 0 . . ± .
14 40 . ± .
13 5 . ± .
06 244 . ± .
13 0 . ± .
00 0 . ffi ce 51 . ± .
13 26 . ± .
11 7 . ± .
16 522 . ± .
41 0 . ± .
00 0 . . ± .
17 31 . ± .
38 6 . ± .
04 462 . ± .
94 0 . ± .
00 0 . − − − − − . . ± .
13 28 . ± .
15 6 . ± .
19 3359 . ± .
58 0 . ± .
Table 9: Average results (x̄ ± σ) for all simulated environments. Columns: Alg., Prepr., Dist.(cm), Vel.(cm/s), Vel.ch.(cm/s), quality. (Numeric entries for IQFRL and the compared algorithms are not legible in this version.)

Table 10: Average results (x̄ ± σ) of IQFRL for the real environments. Columns: Env., Dist.(cm), Vel.(cm/s), Vel.ch.(cm/s), Time(s), quality. (Numeric entries for the two real environments are not legible in this version.)
Table 12: Average results (x̄ ± σ) of IQFRL and several approaches with preprocessing based on expert knowledge [15]. Columns: Alg., Env., Dist.(cm), Vel.(cm/s), Vel.ch.(cm/s), Time(s), quality. (Numeric entries are not legible in this version.)
Figure 19: A typical rule learned by IQFRL: IF d(h) is A_d in 50 percent of A_b THEN vlin is A_vlin and vang is A_vang, where A_d indicates a low distance and A_b indicates the frontal and right sectors.

As MOGUL learns each output separately, a different rule base was learned for each output. Moreover, Table 14 shows the complexity of the learned rules in terms of mean and standard deviation of the number of propositions and granularities for each input variable. The IQFRL approach is able to learn knowledge bases with a much lower number of rules than MOGUL, even though it learns both outputs at the same time. The learning of QFRs results in a low number of propositions per rule, thus demonstrating its generalization ability, in spite of the huge input space dimensionality. Moreover, the granularities of each of the input variables are, in general, also low. Therefore, the learned knowledge bases show a low complexity without losing accuracy.

Table 13: Number of rules learned. Columns: Alg., Preproc., Output, R_straight, R_convex, R_concave. (Numeric entries are not legible in this version.)

Table 14: Complexity of the rules. Columns: Alg., Preproc., Dataset, Output, Propositions, g_d, g_b, g_v. (Numeric entries are not legible in this version.)
6. Real World Applications
Two of the most used behaviors in mobile robotics are path and object tracking. In recent years several real applications of these behaviors have been described in the literature in different realms. For instance, in [37] a tour-guide robot that can either follow a predefined route or a tour-guide person was shown. With a similar goal, an intelligent hospital service robot was presented in [38]. In this case, the robot can improve the services provided in the hospital through autonomous navigation based on following a path. More recently, in [39] a team of robots that cooperate in a building on maintenance and surveillance tasks was presented. More dynamic environments were described in [40, 41], where the robot had to operate in buildings and populated urban areas. These environments introduce numerous challenges to autonomous mobile robots, as they are highly complex. Finally, in [42] the authors presented a motion planner that was able to generate paths taking into account the uncertainty due to controls and measurements.

In these and other real applications, the robot has to deal with static and moving objects, including the presence of people surrounding the robot. All these difficulties make the combination of behaviors necessary to perform tasks like path or people tracking in real environments. In order to implement these tasks in a safe way, the robot must be endowed with the ability to avoid collisions with all the objects in the environment while carrying out the tasks. These behaviors are challenging tasks that allow us to show the performance of the IQFRL-based approach in realistic conditions. The following behaviors are considered in this section, in order of increasing complexity:

1. Path tracking with obstacles avoidance. In this behavior, the mobile robot must follow a path with obstacles in it. A typical application of this behavior is a tour-guide robot that has to follow a predefined tour in a museum. Although in the initial path there were no obstacles in the trajectory, the modification of the environment with new exhibitors and the presence of people make it necessary that the robot modify the predefined route, avoiding collisions with the obstacles and returning to the predefined path as quickly as possible.

2. Object tracking with fixed obstacles avoidance. In this case, the robot has to follow the path of a moving object while keeping a reference distance to the object. For instance, a tour-guide person being followed by a robot with extended information on a screen. If the followed object comes too close to an obstacle, the robot must avoid the collision while maintaining the tracking behavior.

3. Object tracking with moving obstacle avoidance. This behavior is a modification of the previous one, and presents a more difficult problem. In addition to the fixed obstacles avoidance, the robot has to track an object while preventing collisions with moving obstacles that are crossing between the robot and the tracked object. These moving obstacles can be persons walking around or even other mobile robots performing their own behaviors.

In order to perform these behaviors, a fusion of two different controllers has been developed. On one hand, a tracking controller [43] was used in order to follow the path or the moving object. On the other hand, the wall-following controller learned with the IQFRL algorithm was used as the collision avoidance behavior. Section 5.3 showed that this controller is robust and operates safely while performing the task: there were no blockades during the behavior in any of the tests, neither from collisions nor from other reasons. The wall-following behavior is used to avoid collisions as follows: given an obstacle that is too close to the robot, the robot can surround it, following the border of the obstacle in order to avoid a collision with it. The controller described in this paper follows the wall on its right, while for this task the obstacle can be on either side. This can easily be solved by a simple permutation of the laser beams depending on the side on which the obstacle is detected.

The wall-following behavior is only executed when the robot is too close to an object (a value of 0.4 m has been used as threshold). The objective of the controller is to drive the robot to a state in which there is no danger of collision (a value of 0.5 m has been established as a safe distance). As long as the robot is in a safe state, the tracking behavior is resumed. This behavior controls the linear and angular velocities of the robot in order to place it at an objective point in every control cycle.
This point is defined using the desired distance between the robot and the moving object. The tracking controller uses four different input variables:

• The distance between the robot and the objective point:

d = sqrt((x_r − x_obj)² + (y_r − y_obj)²) / d_ref    (19)

where (x_r, y_r) are the coordinates of the robot, (x_obj, y_obj) are the coordinates of the objective point, and d_ref is the reference distance between the robot and the objective point.

• The deviation of the robot with respect to the objective point:

dev = arctan((y_obj − y_r) / (x_obj − x_r)) − θ_r    (20)

where θ_r is the angle of the robot. A negative value of the deviation indicates that the robot is moving in a direction to the left of the objective point, while a positive value means that it is moving to the right.

• The difference of velocity between the robot and the objective point:

∆v = (v_r − v_m) / v_max    (21)

where v_r, v_m and v_max are the linear velocity of the robot, the linear velocity of the moving object, and the maximum velocity attainable by the robot.

• The difference in angle between the object and the robot:

∆θ = θ_m − θ_r    (22)

where θ_m is the angle of the moving object.

The reference distance (d_ref) is different depending on the type of behavior; for the path tracking behavior there is no moving object and the robot simply follows the path. The behaviors were tested in two different environments (M1 and Domus) which try to reproduce the plant of a museum (Fig. 20). Figs. 20(a) and 20(b) show the path tracking with obstacles avoidance behavior. The orange (medium grey) path represents the trajectory that has to be followed by the robot. This path also includes information on the velocity that the robot should have at each point: the higher the concentration of marks, the lower the linear velocity at that point of the path.
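The four input variables above can be computed directly from the robot and object poses. A minimal sketch (function and variable names are ours; Eq. 20 is implemented with atan2 for numerical robustness, and the deviation is wrapped to (−π, π], which we assume is the intended convention):

```python
import math

def tracking_inputs(x_r, y_r, theta_r, v_r,
                    x_obj, y_obj, theta_m, v_m,
                    d_ref, v_max):
    """Compute the four tracking-controller inputs (Eqs. 19-22)."""
    # Eq. 19: Euclidean distance to the objective point, normalized by d_ref
    d = math.hypot(x_r - x_obj, y_r - y_obj) / d_ref
    # Eq. 20: heading deviation of the robot w.r.t. the objective point
    dev = math.atan2(y_obj - y_r, x_obj - x_r) - theta_r
    dev = math.atan2(math.sin(dev), math.cos(dev))  # wrap to (-pi, pi]
    # Eq. 21: velocity difference, normalized by the maximum robot velocity
    dv = (v_r - v_m) / v_max
    # Eq. 22: heading difference between the moving object and the robot
    dtheta = theta_m - theta_r
    return d, dev, dv, dtheta
```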
Moreover, the path was generated without obstacles and, once the obstacles were added to the environment, the robot was placed at the beginning of the path in order to track it. The cyan (light grey) path indicates the trajectory implemented by the robot using the proposed combination of controllers (wall-following and tracking). It can be seen that the robot successfully avoids all the obstacles in its path, i.e., the wall-following behavior deviates the robot from the predefined path when an obstacle generates a possibility of collision. When the robot overcomes the obstacle, it returns to the predefined path as quickly as possible.

In the case of the moving object tracking with fixed obstacles avoidance behavior (Figs. 20(c) and 20(d)), the cyan (light grey) line represents the path of the robot due to the combination of the controllers. Also, the orange (medium grey) path shows the trajectory of the moving object tracked by the robot. In this behavior, the moving object goes too close to some obstacles in several situations, forcing the controller to execute the wall-following behavior in order to avoid collisions. Moreover, the wall-following controller is also executed when the moving object turns the corners very close to the obstacles, at a distance that is unsafe for the robot.

The last and most complex behavior is moving object tracking with moving obstacle avoidance (Figs. 20(e) and 20(f)). The cyan (light grey) path shows, once again, the path followed by the robot when it tracks the moving object (orange/medium grey path) while avoiding static and moving obstacles. Also, the path followed by the moving obstacle that should be avoided by the robot is shown in blue (dark grey). The arrows along the path indicate the places in which the obstacle interferes with the robot.
This behavior shows the ability of the controller learned with the IQFRL algorithm to avoid collisions, even when the moving obstacle tries to force the robot to fail: the controller detects the situation and performs the task safely.
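The arbitration between the two controllers described above is a switch with hysteresis: wall-following takes over below 0.4 m and hands control back once the robot is beyond the 0.5 m safe distance. Only those two thresholds come from the text; the class below is our own sketch of such a switch:

```python
# Sketch of the behavior arbitration described above (assumed structure;
# only the 0.4 m / 0.5 m thresholds are taken from the text).
ACTIVATE = 0.4  # switch to wall-following below this distance (m)
RELEASE = 0.5   # resume tracking above this safe distance (m)

class BehaviorFusion:
    def __init__(self):
        self.avoiding = False  # True while wall-following is in control

    def select(self, min_obstacle_dist):
        """Return which behavior drives the robot in this control cycle."""
        if self.avoiding:
            # keep avoiding until the robot reaches the safe distance
            if min_obstacle_dist > RELEASE:
                self.avoiding = False
        elif min_obstacle_dist < ACTIVATE:
            self.avoiding = True
        return "wall-following" if self.avoiding else "tracking"
```

The gap between the two thresholds prevents the robot from chattering between behaviors when it hovers near a single switching distance.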
7. Conclusions
This paper describes a new algorithm which is able to learn controllers with embedded preprocessing for mobile robotics. The transformation of the low-level variables into high-level variables is done through the use of Quantified Fuzzy Propositions and Rules. Furthermore, the algorithm involves linguistic labels defined by multiple granularity, without limiting the granularity levels. The algorithm was extensively tested with the wall-following behavior, both in several simulated environments and on a Pioneer 3-AT robot in two real environments. The results were compared with some of the most well-known algorithms for learning controllers in mobile robotics. Non-parametric significance tests have been performed, showing a very good and statistically significant performance of the IQFRL approach.

Figure 20: Experiments on real applications. (a) Path tracking with obstacles avoidance in M1. (b) Path tracking with obstacles avoidance in Domus. (c) Object tracking with fixed obstacles avoidance in M1. (d) Object tracking with fixed obstacles avoidance in Domus. (e) Object tracking with moving obstacle avoidance in M1. (f) Object tracking with moving obstacle avoidance in Domus. Colors code: 1) Original path to be tracked in orange (medium grey); 2) Robot path in cyan (light grey); 3) Moving obstacle path in blue (dark grey). The arrows along the path in Figs. 20(e) and 20(f) indicate the places in which the moving obstacle interferes with the robot.
8. Acknowledgements
This work was supported by the Spanish Ministry of Economy and Competitiveness under grants TIN2011-22935 and TIN2011-29827-C02-02. I. Rodríguez-Fdez is supported by the Spanish Ministry of Education, under the FPU national plan (AP2010-0627). M. Mucientes is supported by the Ramón y Cajal program of the Spanish Ministry of Economy and Competitiveness. This work was supported in part by the European Regional Development Fund (ERDF/FEDER) under the projects CN2012/151 and CN2011/058 of the Galician Ministry of Education.

NOTICE: this is the authors' version of a work that was accepted for publication in Applied Soft Computing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Applied Soft Computing, 26:123-142, 2015, doi:10.1016/j.asoc.2014.09.021.

Appendix A. IQFRL for Classification (IQFRL-C)
This section describes the modifications that are necessary to adapt the IQFRL algorithm to classification problems.
Appendix A.1. Examples and Grammar
The structure of the examples used for classification is very similar to the one described in expression 4:

e^l = (d(1), ..., d(N_b), velocity, class)    (A.1)

where class represents the class of the example. Furthermore, the consequent production (production 3) of the grammar (Fig. 4) must be modified to:

3. consequent −→ F_c

where F_c is the linguistic label of the class. The output variable (class) has a granularity g_c^class.

Appendix A.2. Initialization
The consequent of the rules is initialized as F_c = A_c^γ, where γ is the class that represents the example. Only those examples whose class is different from the default class (A_c^f) are used in the initialization of a new individual.

Appendix A.3. Evaluation
For each individual (rule) of the population, the following values are calculated:

• True positives (tp):

– tp = |{e^l : C^l = C_j ∧ DOF_j(e^l) > 0}|, where C^l is the class of example e^l, C_j is the class in the consequent of the j-th rule, and DOF_j(e^l) is the DOF of the j-th rule for the example e^l. tp represents the number of examples that have been correctly classified by the rule.

– tpd = Σ_l DOF_j(e^l) : C^l = C_j, i.e., the sum of the DOFs of the examples contributing to tp.

– tp′ = tp + tpd/tp

• False positives (fp):

– fp = |{e^l : C^l ≠ C_j ∧ DOF_j(e^l) > 0}|: the number of patterns that have been classified by the rule but belong to a different class.

– fpd = Σ_l DOF_j(e^l) : C^l ≠ C_j, i.e., the sum of the DOFs of the patterns that contribute to fp.

– fp′ = fp + fpd/fp

• False negatives (fn):

– fn = n_ex^{C_j} − tp, where n_ex^{C_j} = |{e^l : C^l = C_j}|. fn is the number of examples that have not been classified by the rule but belong to the class in the consequent of the rule.

The values of tp′ and fp′ take into account not only the number of examples that are correctly/incorrectly classified, but also the degree of fulfillment of the rule for each of the examples. If tpd ≈ 0, then tp′ ≈ tp, while if it is high (tpd ≈ tp) then tp′ ≈ tp + 1. Taking into account these definitions, the accuracy of an individual of the population can be described as:

confidence = tp′ / (tp′ + fp′)    (A.2)

while the ability of generalization of a rule is calculated as:

support = tp / (tp + fn)    (A.3)

Finally, fitness is defined as the combination of both values:

fitness = confidence · support    (A.4)

which represents the strength of an individual.

Appendix A.4. Mutation
For classification, the probability that an example matches the output associated to a rule (Eq. 7) is binary. Therefore, in order to select the example (e_sel) that is going to be used for mutation, the following criteria are used:

• For generalization, the probability for an example e^l to be selected is:

P(e^l = e_sel) = 1 − (Σ_j DOF_j(e^l) · confidence_j) / (Σ_j DOF_j(e^l))    (A.5)

where confidence_j is the confidence (Eq. A.2) of the j-th individual. This probability measures the accuracy with which the individuals of the population cover the example e^l.

Table A.15: Number of rules learned per dataset by IQFRL-C. Columns: R_straight, R_convex, R_concave. (Numeric entries are not legible in this version.)

Table A.16: Complexity of the rules learned by IQFRL-C. Columns: Propositions, g_d, g_b, g_v. (Numeric entries are not legible in this version.)

Table A.17: Confusion matrix for the classifier
Actual \ Predicted    Straight    Convex    Concave
Straight                 30.85      2.40       0.23
Convex                    0.70     30.97       0.00
Concave                   0.23      0.06      34.55
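Accuracy and Cohen's κ [44] can be recomputed from a confusion matrix like Table A.17. A minimal sketch of ours (rows are actual classes, columns are predicted classes; entries may be counts or percentages, since both measures are scale-invariant):

```python
def accuracy_and_kappa(matrix):
    """Observed accuracy and Cohen's kappa from a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # observed agreement: diagonal mass over the total
    po = sum(matrix[i][i] for i in range(len(matrix))) / total
    # expected agreement under independence of actual and predicted labels
    pe = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
             for i in range(len(matrix))) / total ** 2
    return po, (po - pe) / (1 - pe)
```

Applied to the matrix above, both values come out close to 1, consistent with the discussion in Appendix A.5.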
(Accuracy and κ values are not legible in this version.)

• For specialization, the mutated individual uncovers the example e_sel. The probability to select e^l for specialization is calculated as follows:

P(e^l = e_sel) = 1 − DOF_j(e^l)    (A.6)

Finally, the consequent is mutated considering the class of the examples covered by the individual. Thus, the probability that the consequent of the individual j changes to the class C_γ is defined as:

P(j | C_γ) = (Σ_l DOF_j(e^l) : C^l = C_γ) / (Σ_l DOF_j(e^l))    (A.7)

Appendix A.5. Performance
The parameters used for IQFRL-C are the same as for regression (Sec. 5.2). Moreover, the default class is straight wall. Tables A.15 and A.16 show the number of rules learned by the classification method IQFRL-C and the complexity of the learned rules in terms of mean and standard deviation of the number of propositions and granularities for each input variable. The number of rules for each situation is very low, resulting in very interpretable knowledge bases. Furthermore, the complexity of the rules is also low, as the number of propositions and granularities learned show that the rules are very general.

Table A.17 shows the confusion matrix for the learned classifier. The matrix was obtained as the average of a 5-fold cross-validation over the sets. Moreover, the performance of the classifier was analyzed with the accuracy and Cohen's κ [44]. Both measures are very close to 1, showing the high performance of the classifier obtained with IQFRL-C.

References

[1] M. Mucientes, A. Bugarín, People detection through quantified fuzzy temporal rules, Pattern Recognition 43 (2010) 1441–1453.
[2] O. Cordón, F. Gomide, F. Herrera, F. Hoffmann, L. Magdalena, Ten years of genetic fuzzy systems: current framework and new trends, Fuzzy Sets and Systems 141 (1) (2004) 5–31.
[3] F. Herrera, Genetic fuzzy systems: taxonomy, current research trends and prospects, Evolutionary Intelligence 1 (1) (2008) 27–46.
[4] B. Bonte, B. Wyns, Automatically designing robot controllers and sensor morphology with genetic programming, in: Proceedings of the 6th IFIP Artificial Intelligence Applications and Innovations (AIAI), 2010, pp. 86–93.
[5] R. Martínez-Soto, O. Castillo, J. R. Castro, Genetic algorithm optimization for type-2 non-singleton fuzzy logic controllers, Recent Advances on Hybrid Approaches for Designing Intelligent Systems (2014) 3–18.
[6] M. Umar Suleman, M. Awais, Learning from demonstration in robots: Experimental comparison of neural architectures, Robotics and Computer-Integrated Manufacturing 27 (4) (2011) 794–801.
[7] A. Agostini, E. Celaya Llover, Reinforcement learning for robot control using probability density estimations, in: Proceedings of the 7th International Conference on Informatics in Control (ICINCO), 2010, pp. 160–168.
[8] C. W. Lo, K. L. Wu, Y. C. Lin, J. S. Liu, An intelligent control system for mobile robot navigation tasks in surveillance, in: Robot Intelligence Technology and Applications 2, Springer, 2014, pp. 449–462.
[9] T. Kondo, Evolutionary design and behavior analysis of neuromodulatory neural networks for mobile robots control, Applied Soft Computing 7 (1) (2007) 189–202.
[10] K. Samsudin, F. Ahmad, S. Mashohor, A highly interpretable fuzzy rule base using ordinal structure for obstacle avoidance of mobile robot, Applied Soft Computing 11 (2) (2011) 1631–1637.
[11] S. Mabu, A. Tjahjadi, S. Sendari, K. Hirasawa, Evaluation on the robustness of genetic network programming with reinforcement learning, in: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC), 2010, pp. 1659–1664.
[12] K. Senthilkumar, K. Bharadwaj, Hybrid genetic-fuzzy approach to autonomous mobile robot, in: Proceedings of the IEEE International Conference on Technologies for Practical Robot Applications (TePRA), 2009, pp. 29–34.
[13] M. Mucientes, D. Moreno, A. Bugarín, S. Barro, Design of a fuzzy controller in mobile robotics using genetic algorithms, Applied Soft Computing 7 (2) (2007) 540–546.
[14] M. Mucientes, R. Alcalá, J. Alcalá-Fdez, J. Casillas, Learning weighted linguistic rules to control an autonomous robot, International Journal of Intelligent Systems 24 (3) (2009) 226–251.
[15] M. Mucientes, J. Alcalá-Fdez, R. Alcalá, J. Casillas, A case study for learning behaviors in mobile robotics by evolutionary fuzzy systems, Expert Systems With Applications 37 (2010) 1471–1493.
[16] J. Kuo, Y. Ou, An evolutionary fuzzy behaviour controller using genetic algorithm in RoboCup soccer game, in: Proceedings of the Ninth International Conference on Hybrid Intelligent Systems (HIS), Vol. 1, 2009, pp. 281–286.
[17] M. Khanian, A. Fakharian, M. Chegini, B. Jozi, An intelligent fuzzy controller based on genetic algorithms, in: Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), 2009, pp. 486–491.
[18] M. Mucientes, D. L. Moreno, A. Bugarín, S. Barro, Evolutionary learning of a fuzzy controller for wall-following behavior in mobile robotics, Soft Computing 10 (10) (2006) 881–889.
[19] C. Juang, C. Hsu, Reinforcement ant optimized fuzzy controller for mobile-robot wall-following control, IEEE Transactions on Industrial Electronics 56 (10) (2009) 3931–3940.
[20] C. Hsu, C. Juang, Evolutionary robot wall-following control using type-2 fuzzy controller with species-DE-activated continuous ACO, IEEE Transactions on Fuzzy Systems 21 (1) (2013) 100–112.
[21] C. Juang, Y. Chang, Evolutionary-group-based particle-swarm-optimized fuzzy controller with application to mobile-robot navigation in unknown environments, IEEE Transactions on Fuzzy Systems 19 (2) (2011) 379–392.
[22] M. Mucientes, I. Rodríguez-Fdez, A. Bugarín, Evolutionary learning of quantified fuzzy rules for hierarchical grouping of laser sensor data in intelligent control, in: Proceedings of the IFSA-EUSFLAT 2009 conference, 2009, pp. 1559–1564.
[23] L. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Computers & Mathematics with Applications 9 (1) (1983) 149–184.