[PDF] Fusion of laser diffraction and chord length distribution data for estimation of particle size distribution using multi-objective optimisation

Abstract

The in situ measurement of the particle size distribution (PSD) of a suspension of particles presents huge challenges. Various effects from the process could introduce noise to the data from which the PSD is estimated. This in turn could lead to the occurrence of artificial peaks in the estimated PSD. Limitations in the models used in the PSD estimation could also lead to the occurrence of these artificial peaks. This could pose a significant challenge to in situ monitoring of particulate processes, as there will be no independent estimate of the PSD to allow a discrimination of the artificial peaks to be carried out. Here, we present an algorithm which is capable of discriminating between artificial and true peaks in PSD estimates based on fusion of multiple data streams. In this case, chord length distribution and laser diffraction data have been used. The data fusion is done by means of multi-objective optimisation using the weighted sum approach. The algorithm is applied to two different particle suspensions. The estimated PSDs from the algorithm are compared with offline estimates of PSD from the Malvern Mastersizer and Morphologi G3. The results show that the algorithm is capable of eliminating an artificial peak in a PSD estimate when this artificial peak is sufficiently displaced from the true peak. However, when the artificial peak is too close to the true peak, it is only suppressed but not completely eliminated.

Full PDF

FFusion of laser diﬀraction and chord length distributiondata for estimation of particle size distribution usingmulti-objective optimisation

Okpeafoh S. Agimelen a, ∗ , Carla Ferreira a , Bilal Ahmed b , Javier Cardona c ,Yi-chieh Chen a , Alastair J. Florence b , Ivan Andonovic c , Jan Sefcik a ,Anthony J. Mulholland d, ∗ a EPSRC Future Manufacturing Hub for Continuous Manufacturing and AdvancedCrystallisation, Department of Chemical and Process Engineering, University ofStrathclyde, James Weir Building, 75 Montrose Street, Glasgow, G1 1XJ, UnitedKingdom. b EPSRC Future Manufacturing Hub for Continuous Manufacturing and AdvancedCrystallisation, Strathclyde Institute of Pharmacy and Biomedical Sciences, Technologyand Innovation Centre, University of Strathclyde, 99 George Street, Glasgow, G1 1RD,United Kingdom c Centre for Intelligent Dynamic Communications, Department of Electronic andElectrical Engineering, University Of Strathclyde, Royal College Building, 204 GeorgeStreet, Glasgow, G1 1XW, United Kingdom. d Department of Mathematics and Statistics, University of Strathclyde, LivingstoneTower, 26 Richmond Street, Glasgow, G1 1XH, United Kingdom.

Abstract

The in situ measurement of the particle size distribution (PSD) of a sus-pension of particles presents huge challenges. Various eﬀects from the processcould introduce noise to the data from which the PSD is estimated. This inturn could lead to the occurrence of artiﬁcial peaks in the estimated PSD.Limitations in the models used in the PSD estimation could also lead to theoccurrence of these artiﬁcial peaks. This could pose a signiﬁcant challenge toin situ monitoring of particulate processes, as there will be no independentestimate of the PSD to allow a discrimination of the artiﬁcial peaks to becarried out. ∗ Corresponding authors

Email addresses: [email protected] (Okpeafoh S. Agimelen), [email protected] (Anthony J. Mulholland)

Preprint submitted to Elsevier April 17, 2018 a r X i v : . [ phy s i c s . d a t a - a n ] A p r ere, we present an algorithm which is capable of discriminating betweenartiﬁcial and true peaks in PSD estimates based on fusion of multiple datastreams. In this case, chord length distribution and laser diﬀraction data havebeen used. The data fusion is done by means of multi-objective optimisationusing the weighted sum approach. The algorithm is applied to two diﬀerentparticle suspensions. The estimated PSDs from the algorithm are comparedwith oﬄine estimates of PSD from the Malvern Mastersizer and MorphologiG3. The results show that the algorithm is capable of eliminating an artiﬁcialpeak in a PSD estimate when this artiﬁcial peak is suﬃciently displaced fromthe true peak. However, when the artiﬁcial peak is too close to the true peak,it is only suppressed but not completely eliminated. Keywords:

Particle size distribution, chord length distribution, particleshape, crystallisation, inverse problems, laser diﬀraction, multi-objectiveoptimisation, data fusion

1. Introduction

The particle size distribution (PSD) is a key quality attribute of theparticles in powders. This is particularly important in the manufacturingof particulate products in the pharmaceutical, food, personal care and ﬁnechemicals industries where the success or failure of the process depends heav-ily on the PSD of the particles [1]. In addition, the perfomance of the ﬁnalproducts depend heavily on the PSD of the constituent particles [2].Various methods and techniques [3] exist for the estimation of the PSDof a suspension of particles. These methods typically involve the processingof sensor data or analysis of images. However, each method is limited by therange of applicability of the sensor used in the measurement. For example,imaging methods could yield inaccurate results in very dense suspensionswhere there are too many overlapping objects in the images or insuﬃcientcontrast between the objects and image background [4, 5]. Methods based onanalysis of laser diﬀraction data could produce artiﬁcial particle size modesif there is multiple scattering or the model used in the analysis is inadequatefor the shape of the particles in the suspension [6, 7]. Methods based onanalysis of chord length distribution (CLD) could overestimate the ﬁnes ina suspension for example, when bubbles are produced in a process and theycontribute to the CLD counts [5].Some of the inaccuracies involved in the estimation of PSD from a single2ensor data could be alleviated by adopting the approach of multi-sensor datafusion. This data fusion approach involves integrating data from multiplesensors for greater accuracy of estimated quantities [8]. Various techniquesexist for the implementation of multi-sensor data fusion. They are mostlyaimed at dealing with noise and imperfections of data from multiple sensors[8]. Here, we adopt the weighted sum approach of multi-objective optimisa-tion [9]. This approach is easy to implement as it allows a decision makerto unambiguously choose from a range of mathematically feasible solutions.We demonstrate the applicability of this approach in the estimation of PSDin particulate processes by applying this weighted sum method to combinedCLD and laser diﬀraction data. We vary the weights on the combined datastreams to produce a Pareto curve. We then apply a method for estimat-ing the knee point of the Pareto curve at which the desired PSD is chosen.Our results show that some of the artefacts encountered in estimating PSDfrom single sensors can be eliminated or mitigated by this multi-objectiveapproach.

2. Materials and experimental procedure

The materials used in this work consisted of polystyrene (PS) particlesmade by suspension polymerisation [10]. The particles were in a wide range ofsizes, hence sieving was carried out to obtain particles in diﬀerent size ranges.The sieve fraction PS 300-500 µ m exhibited multiple peaks upon analysis, andhence was included in this paper as it demonstrated the phenomenon whichis the subject of this paper. The particles in the PS 300-500 µ m size rangewere mostly spherical in shape (see supplementary information for sampleimages).Sieve fractions of metformin hydrochloride (MET), which was purchasedfrom Molekula, were also prepared for measurements in this work. The sievefraction MET 180-250 µ m was used in this work. Similar to the PS 300-500 µ msieve fraction, the MET 180-250 µ m sieve fraction also exhibited multiplepeaks, and hence was included in this paper. The MET particles were rod-like (see supplementary information for sample images), so that the largersieve fractions contained a signiﬁcant quantity of particles larger than 1mm.These particles would be too large for the measurement instruments usedin this work, hence the larger sieve fractions were not used. Similarly, the3maller sieve fractions contained a signiﬁcant quantity of ﬁnes which wouldcomplicate the analysis, so that the smaller sieve fractions were not used.A sample consisting of 6.160g of the PS 300-500 µ m was suspended in20.045g of water. The suspension was stirred with an overhead stirrer at 400rpm. The suspension was monitored with both the focus beam reﬂectancemeasurement (FBRM) G400 and particle vision and measurement (PVM)V819 probes from Mettler Toledo. Another sample of the PS 300-500 µ m wassuspended in water in the Malvern Mastersizer instrument, and subsequentlymeasured. Both the estimated volume based PSD and raw measured scat-tering intensity data were obtained from the instrument software. Finally,a sample of the PS 300-500 µ m sieve fraction was dispersed in the MalvernMorphologi G3 instrument for PSD estimation.A similar procedure was applied for the MET 180-250 µ m sieve fraction.In the case of MET 180-250 µ m sieve fraction, a mass of 6.968g of the samplewas suspended in 68.902g of isopropanol (IPA) for FBRM and PVM mea-surement. Samples of the MET 180-250 µ m sieve fraction were also analysedwith the Malvern Mastersizer and Morphologi instruments as in the case ofthe PS 300-500 µ m sieve fraction.

3. Method

The fusion of CLD and laser diﬀraction data was done in a multi-objectiveoptimisation manner in this work. The weighted sum approach [9, 11, 12] inmulti-objective optimisation was adopted because of its ease of implemen-tation and interpretation. The procedure for performing the data fusion isdescribed in the following subsections.

In the weighted sum approach involving two objective functions, the op-timisation problem is formulated as follows [9]min X F ( X , γ ) = γf ( X ) + (1 − γ ) f ( X )subject to X ≥ γ ∈ [0 , , (1)where γ is a weight parameter and F is the single objective function formedfrom the combination of f and f . Each objective function f and f depends4n the N -dimensional PSD X . The objective functions f i , i = 1 , f i : R N (cid:55)→ R .In this work each of the objective functions is the sum of squares errorbetween the experimentally measured quantity and the corresponding mod-elled quantity. Each objective function is weighted by the estimated standarddeviation of each measurement [13, 14]. In the case of CLD the objectivefunction is given as f = M (cid:88) i =1 (cid:20) C ∗ i − C i σ i (cid:21) , (2)where C ∗ i is the measured CLD and C i is the modelled CLD. The CLD wasmodelled using an ellipsoidal CLD model as in previous works [15–17]. Thestandard deviation σ i is estimated from multiple CLD measurements. ThePSD is discretised into N = 200 geometrically spaced bins and the CLD into M = 100 geometrically spaced bins [15–17].The elliptical model used in this work [15] admits aspect ratios ( r ) withinthe range r ∈ (0 , r = 1, implying that a spherical model for the CLD wasused. This is to make it consistent with the spherical model used for laserdiﬀraction in this work.Similar to the case of f , the objective function f for laser diﬀraction isgiven as f = M (cid:88) i =1 (cid:20) I ∗ i − I i σ i (cid:21) , (3)where I ∗ i is the experimentally measured scattering intensity by laser diﬀrac-tion. The modelled scattering intensity I i is obtained using the Mie model forspherical particles [7, 18]. This is because of its common use in the softwareof commercial laser diﬀraction instruments for particle sizing. The scatteringintensity data from the red laser (used in this work) of the Malver Master-sizer instrument is collected in M = 47 bins. These detectors cover the focalplane and wide angle directions [19]. Like the case of CLD, the standarddeviation σ i is estimated from multiple scattering intensity measurements.The PSD X i is modelled as a mixture of N b log-normal distribution (whichis applicable in modelling a large number of physical processes [20]) basis5unctions as X i = N b (cid:88) j =1 w j D i σ j √ π e − ( ln( Di ) − µj ) σ j . (4)The discretised particle size D i of bin i , which is the spherical equivalentdiameter, is the geometric mean of the particle sizes in the boundaries ofbins i and i + 1 consistent with the PSD [7, 16, 17]. The parameters µ j and σ j set the position of the peak and the width of each of the log-normal basisfunction. A minimal number of basis function that allows a good ﬁt to beobtained for each measured quantity is used in the calculations. The weights w j on the basis functions satisfy the following requirement N b (cid:88) j w j = 1 . (5)This implies that the PSD X i is parametrised by w, µ and σ . Hence themulti-objective optimisation problem in Eq. (1) can be rewritten asmin X ( w,µ,σ ) F ( X ( w, µ, σ ) , γ ) = γf ( X ( w, µ, σ )) + (1 − γ ) f ( X ( w, µ, σ ))subject to X ( w, µ, σ ) ≥ γ ∈ [0 , N b (cid:88) j w j = 1 σ + µ ≤ a . (6)The inequality for µ and σ in Eq. (6) constrains the search for µ and σ to acircular region of speciﬁed radius a .The combined objective function F in Eq. (6) is minimised using a gradi-ent based constrained optimisation solver in MATLAB to generate the Paretooptimal solutions. The set of Pareto optimal solutions is obtained using theprinciple of non-dominance [9].A solution vector X is said to dominate another solution vector X ifthese two conditions hold :1. The solution X is no worse than X in all objectives. That is, f i ( X ) ≤ f i ( X ), i = 1 ,

2. 6. The solution X is strictly better than X in at least one objective.That is, f i ( X ) < f i ( X ) for at least one i = 1 , f and f in Eqs. (2) and (3) occur at diﬀerentscales, a normalisation is carried out to put them on the same scale. This isdone using the Utopian f Ui and Nadir f Ni points estimated for each function[9, 12] as ˜ f i = f i − f Ui f Ni − f Ui , i = 1 , , (7)where the Utopian points are the set of global minima for each objectivefunction when minimised separately, and the Nadir points are the set ofupper bounds of each objective function in the Pareto set [9].The multi-objective optimisation problem in Eq. (6) is then solved byminimising the combined objective function F , where each of the constituentobjective function f and f has been replaced by it corresponding normalisedversion using Eq. (7).For each value of the weight parameter γ in Eq. (6), it may be possibleto ﬁnd a non-dominated solution, and corresponding values of ˜ f and ˜ f .Hence, a vector of γ values can be constructed to obtain a possible set ofnon-dominated solutions, which correspond to diﬀerent pairs of ˜ f and ˜ f values. The graph of these ˜ f and ˜ f values constitute the Pareto curve.The initially estimated Pareto curve obtained by the above proceduretypically contain non-uniformly distributed points even for uniformly spacedvalues of γ . This is a common issue [9, 11, 12] with the weighted sum ap-proach described in Eq. 6. However, the weight parameter γ in Eq. (6) isrelated to the slope of the Pareto curve as [11] γ = 11 − ∂ ˜ f ∂ ˜ f . (8)This assumes a functional form of the Pareto curve say ψ so that the mapping ψ : ˜ f (cid:55)→ ˜ f holds, that is ˜ f = ψ ( ˜ f ) [11]. This allows a new vector of weights γ to be estimated that gives a Pareto curve with better spread of points. Thesteps to estimate the Pareto curve are outlined in the next section.7 .2. Multi-objective optimisation algorithm3.2.1. Estimating guess starting points for searching The process of minimising the objective functions F in Eq. (6), f in Eq.(2) and f in Eq. (3) involves searching for w j , µ j and σ j values in Eq. (4)for a PSD, such that the corresponding CLD and/or scattering intensity datagive good ﬁts to the corresponding experimentally measured data. However,as the gradient based algorithms used can only guarantee ﬁnding values atthe local minima of corresponding function, then it is necessary to start thesearch with values of the quantities such that the corresponding objectivefunctions are close to their global minima. This is done using the method oftruncated singular value decomposition (TSVD) [13] as outlined below.1. Construct transformation matrices A and A with N columns. Thecolumns of A are the simulated CLDs C for diﬀerent particle sizes D to D N . Similarly, the columns of A are the simulated scatteringintensities I for diﬀerent particle sizes as in the case of A .2. Obtain the singular values s ki , i = 1 , , ..., N ks , k = 1 , A k , k = 1 ,

2. The singular values are ordered such that s k < s k <... < s kN ks , k = 1 , X i = ˜ A i C , i = 1 , , ..., N s − X i = ˜ A i I , i =1 , , ..., N s − A k , k = 1 , A k , k = 1 ,

2. Each of the pseudo inverseis constructed with a tolerance on the singular values, such that, onlysingular values s > s ki , i = 1 , , ..., N s − , k = 1 , ˜X ki , i = 1 , , ..., N s − , k = 1 , ˜X ki , i = 1 , , ..., N s − , k = 1 ,

2, ﬁt the PSD X in Eq.(4) for a speciﬁed value of N b to obtain diﬀerent sets of values of w ki , µ ki and σ ki .6. For each set of values of w ki , µ ki and σ ki , obtain the L norm L = (cid:107) ˜X ki − X (cid:107) , i = 1 , , ..., N ks − , k = 1 , w k , µ k and σ k , k = 1 , L as the guess starting points for a subsequent search algorithm.8 .2.2. Estimating the Nadir and Utopian points In order to perform the normalisation in Eq. (7), it is necessary to esti-mate the Nadir and Utopian points of the objective functions f and f inEq. (6) as follows.For the Nadir points:1. Set N b = 1 in Eq. (4).2. Obtain guess starting values for µ and σ in Eq. (4) for each objectivefunction f and f using the method of TSVD outlined in section 3.2.1.3. Using the respective starting values for µ and σ , minimise each of theobjective function f and f separately. Obtain the corresponding PSDs X and X at the minimum of each function.4. Estimate the Nadir point as f Ni = max[ f i ( X o ) , f i ( X o )] , i = 1 ,

2. Thisis similar to the approach used in [12].For the Utopian points:1. Set N b = 1 in Eq. (1), set tolerances T ol and T ol .2. Obtain guess starting values for µ and σ in Eq. (4) for objective func-tion f using the method of TSVD.3. Using the starting values for µ and σ , obtain the minimum f of theobjective function f .4. set counter i = 2enter while loop:set N b = N b + 1Repeat steps 2 and 3 to obtain f i and w i , µ i , σ i in Eq. (4) . if(abs( f i − f i − ) ≤ T ol )Set N b = N b , f U = f i accept w i , µ i and σ i valuesexit while loopelseset i = i + 1end while loop5. Repeat steps 2 to 4 for f using T ol and obtain f U , N b , w, µ and σ values for f . 9 .2.3. Obtaining combined solution To obtain the combined solution from the two objective functions ˜ f and˜ f , minimise the combined objective function F in Eq. (6) (with f and f replaced by ˜ f and ˜ f respectively) as follows.1. Set N b = max [ N b , N b ].2. If N b ≥ N b , choose the starting values for w , µ and σ correspond-ing to f obtained in step 4 in section 3.2.2. Else, choose the valuescorresponding to f .3. Create a vector of γ values in Eq. (6).4. For each value of γ in the vector of γ values in step 3, minimise theobjective function F in Eq. (6).5. Apply the principle of non-dominance described in section 3.1 to obtainthe Pareto optimal set corresponding to the initial vector of γ values.Delete all dominated solutions and corresponding γ values. After obtaining an estimate of the Pareto curve using the procedure de-scribed in section 3.2.3, then apply the following procedure to get anotherestimate of the Pareto curve with more uniformly spaces points.1. Let the functional form of the mapping ψ : ˜ f (cid:55)→ ˜ f be given as ˜ f = α exp( ˜ f ) + α exp( ˜ f ), α and α are arbitrary ﬁtting parameters.The form of mapping chosen in this step depends on the shape of thePareto curve estimated in section 3.2.3. Fit the curve ψ to the initiallyestimated Pareto curve to obtain values of the parameters α and α .2. Compute the Euclidean distance d between successive points on theinitially estimated Pareto curve. Set the constant spacing ˜ d betweensuccessive points on the new Pareto curve to be estimated as the min-imum value of d .3. Find the x coordinate of the point of intersection of a circle of radius˜ d centred on the ﬁrst point ( ˜ f , ˜ f ) on the initially estimated Paretocurve and the straight line joining the ﬁrst two points. This x coordi-nate corresponds to the value ˜ f of the objective function ˜ f , such that10he arc length between ( ˜ f , ˜ f ) and ( ˜ f , ˜ f ) is approximately equalto ˜ d , where ˜ f is the y coordinate of the point whose x coordinate is˜ f .4. Estimate the value of ˜ f by substituting ˜ f for ˜ f in the mapping ψ in step 1.5. Repeat steps 3 and 4 to get successive points on the ﬁtted curve ψ whose arc length distance is approximately eaqual to ˜ d . This gives adiscretisation of the ﬁtted curve ψ at the points ( ˜ f i , ˜ f i ) , i = 1 , , ..., N p such that the arc length distance between successive points is ˜ d , andthe number of these points is N p .6. Compute the derivative given in Eq. (8) at the points ( ˜ f i , ˜ f i ) , i =1 , , ..., N p to obtain a new γ vector.7. Minimise the objective function F in Eq. (6) again using this new γ vector, and obtain an improved Pareto curve with a better spread ofpoints. Since all solutions on the Pareto curve are mathematically feasible, thesolution chosen is just a decision making process depending on how muchweight the decision maker chooses to place on a particular sensor data stream.However, to have a fully automated algorithm, it is necessary to have acriterion for choosing a ﬁnal solution, independent of the decision maker.The solution at the knee of the Pareto curve is chosen for this purpose here.The boundary line method [21] for estimating the knee point of the Paretocurve is applied here. This involves drawing a boundary line between twoextreme points in the Pareto curve. Then the perpendicular distance of eachpoint on the curve from the boundary line is calculated. The point with themaximum perpendicular distance is chosen as the knee point [21].

4. Results and discussions

As mentioned in the introductory section, the data fusion approach de-veloped in this work allows a discrimination between artiﬁcial and real PSDpeaks. The artiﬁcial peaks are either completely eliminated, when they aresuﬃciently separated from the true peak, or they get suppressed. These cases11re illustrated with results from the two samples analysed in this work. Thecase where an artiﬁcial peak is suppressed is illustrated with results fromthe PS 300-500 µ m sample. The MET 125-180 µ m sample shows an examplewhere an artiﬁcial peak (from one sensor modality) that is suﬃciently sep-arated from the true peak emerges, and hence gets eliminated. Alongside,it shows an artiﬁcial peak from the other sensor modality which is not suﬃ-ciently spaced from the true peak, which only gets suppressed. These resultsare summarised below. Figure 1: The measured (black diamonds) and estimated CLDs from the objective function f in Eq. (2) (red crosses) and at the knee point of the Pareto curve (magenta triangles)for the PS 300-500 µ m sample are shown in (a). The measured (black diamonds) andestimated scattering intensities from the objective function f in Eq. (3) (blue ﬁlledsymbols) and the knee point of the Pareto curve (magenta triangles) for the PS 300-500 µ m sample are shown in (b). The Pareto curve obtained from the bi-objective function F in Eq. (6) for the PS 300-500 µ m sample are shown by the magenta diamonds in (c).The knee point is indicated by the black ﬁlled symbol. Volume based PSDs estimatedby various methods: black diamonds (Morphologi), green pentagrams (Mastersizer), redcrosses (objective function f ), blue ﬁlled symbols (objective function f ) and magentadiamonds (at the knee point of the Pareto curve in (c)) for the PS 300-500 µ m sample areshown in (d). for the PS 300-500 µ m sample suggests a single particle size mode at D ≈ µ m as shown by the black diamonds in Fig. 1(d). This agrees withthe estimated volume based PSD by the Mastersizer for the same samplewith a single particle size mode at D ≈ µ m as shown by the green penta-grams in Fig. 1(d). This estimate from the Mastersizer coincides with thatfrom the minimisation of objective function f in Eq. 3 for laser diﬀraction,as shown by the blue ﬁlled symbols in Fig. 1(d).However, the estimated volume based PSD (red crosses in Fig. 1(d))obtained from the minimisation of objective function f in Eq. (2) for CLDshows a mode at D ≈ µ m alonside that at d ≈ µ m. the particle sizemode at D ≈ µ m is most likely an artifact judging by the estimate fromthe Morphologi instrument. The reason for this could be due to the bumpson the surface of the PS 300-500 µ m particles (see supplementary informationfor sample images), which results to signiﬁcant chord splitting, and hence anartiﬁcial high counts of short chords. This introduces a left shoulder on themeasured CLD as shown by the black diamonds in Fig. 1(a). This causes theoptimisation solver to introduce another particle size mode at D ≈ µ min order to ﬁt the experimentally measured CLD as shown by the red crossesin Fig. 1(a).The Pareto curve obtained from the minimisation of the combined objec-tive function F in Eq. (6) for the PS 300-500 µ m is shown by the magentatriangles in Fig. 1(c), and the knee point is indicated by the black ﬁlled sym-bol. The estimated volume based PSD at this knee point, which is shownby the magenta triangles in Fig. 1(d), shows that the artiﬁcial peak at D ≈ µ m associated with the estimated volume based PSD from CLD hasbeen suppressed. The main peak of this estimated volume based PSD at theknee point of the Pareto curve is close to that of the estimated volume basedPSD from the oﬄine Morphologi. Hence, the estimated volume based PSD The estimated PSDs from the Morphologi instrument is used as reference in this workbecause the materials had already been manufactured by various processes and sieved.Samples of the materials were then suspended for inline sensor measurement as well asoﬄine measurement with the Morphologi instrument. Since this did not involve any par-ticle recovery from suspension by means of ﬁltering, drying and subsequent dispersion inthe Morphologi instrument, the estimated PSD with the Morphologi instrument, whichis a highly focused imaging instrument, will be a good representation of the sizes of theparticles in the materials. µ m material than the corresponding esti-mate from the CLD. The estimated volume based PSDs from the Matersizerand that obtained from the minimsation of the objective function f are alsogood representation of the sizes of the particles in the PS 300-500 µ m materi-als in this case. This is due to their closeness to the corresponding estimatefrom the oﬄine Morphologi instrument and the absence of any artiﬁcial peak.Eventhough the estimated volume based PSD at the knee point of thePareto curve is a good representation of the PS 300-500 µ m material, theestimated CLD corresponding to this PSD at the knee point shows a poorﬁt to the experimentally measured CLD, as shown by the magenta trianglesin Fig. 1(a). Similarly, the estimated scattering intensity at the knee pointof the Pareto curve (magenta triangles in Fig. 1(b)) shows a poor ﬁt to theexperimentally measured scattering intensity (black diamonds in Fig. 1(b)).However, the estimated scattering intensity (blue ﬁlled symbols in Fig. 1(b))obtained by minimising the objective function f for laser diﬀraction, showsa better ﬁt to the experimentally measured scattering intensity. This is thegoal of multi-objective optimisation, to obtain more realistic estimates byavoiding overﬁtting experimental data.The case where both complete elimination and suppression of artiﬁcialpeaks occur is seen in the MET 125-180 µ m sample. Due to the rod-likeshape (see supplementary information for sample images) of the particles, themeasured scattering intensity (black diamonds in Fig. 2(b)) show a broadshoulder at larger q ( q (cid:38) µ m − ) values, which is not associated with spheri-cal particles of the same size. Hence, in order to ﬁt this measured scatteringintensity data with a spherical model, the optimisation solver introduces aparticle size mode at D ≈ µ m in addition to the one at D ≈ µ m. Thissituation is seen both in the estimated volume based PSD from the min-imisation of the objective function f in Eq. (3) (blue ﬁlled symbols in Fig.2(d)) and from the Mastersizer (green pentagrams in Fig. 2(d)). The particlesize mode at D ≈ µ m is clearly an artifact judging by the correspondingestimate from the Morphologi instrument, which is shown by the black dia-monds in Fig. 2(d). This estimated volume based PSD from the Morphologiinstrument suggests there is no particle size mode in the MET 125-180 µ msample at particle sizes around 1 µ m.Similar to the case of the PS 300-500 µ m sample, the MET 125-180 µ msample contains particles with bumps on the surface. Hence the measuredCLD for this sample contains a small left shoulder as shown by the black14 igure 2: The experimentally measured and estimated CLD, scattering intensity andvolume based PSDs for the MET 125-180 µ m sample similar to the case of the PS 300-500 µ m sample in Fig. 1. The symbols are the same as in Fig. 1. The inset in (d) is ablow up of the PSDs between D = 10 µ m and D = 3000 µ m for clarity. diamonds in Fig. 2(a). Hence, the estimated volume based PSD obtainedfrom the minimisation of the objective function f in Eq. (2) for CLD,broadens to the left, with a minor peak at D ≈ µ m. This is in additionto the main peak at D ≈ µ m. This is shown by the red crosses in Fig.2(d).The Pareto curve obtained from the minimisation of the objective func-tion F in Eq. (6), along with the knee point of the curve are shown bythe magenta triangles and black ﬁlled symbol in Fig. 2(c) respectively. Theartiﬁcial peak in the estimated volume based PSD at D ≈ µ m from laserdiﬀraction, and the broadened left shoulder and artiﬁcial peak at D ≈ µ mfrom the corresponding estimate from CLD have either been eliminated orsuppressed in the estimated volume based PSD at the knee point of thePareto curve. The estimated volume based PSD at the knee point of thePareto curve is shown by the magenta triangles in Fig. 2(d).15imilar to the case of the PS 300-500 µ m sample, the estimated CLD at theknee point of the Pareto curve (magenta triangles in Fig. 2(a)) give a poorerﬁt to the experimentally measured CLD. The estimated CLD obtained fromthe minimisation of the objective function f (red crosses in Fig. 2(a)) givesa better ﬁt to the experimentally measured CLD. Similarly, the estimatedscattering intensity at the knee point of the Pareto curve (magenta trianglesin Fig. 2(b)) gives a poorer ﬁt to the experimentally measured scatteringintensity (black diamonds in Fig. 2(b)). The estimated scattering intensityobtained from the minimisation of the objective function f for laser diﬀrac-tion (blue ﬁlled symbols in Fig. 2(b)) gives a better ﬁt to the experimentallymeasured scattering intensity.

5. Conclusions

We have presented an algorithm which is capable of removing or signiﬁ-cantly reducing artiﬁcial peaks occurring in PSD estimates. The algorithmis based on a fusion of CLD and laser diﬀraction data. The fusion is doneby means of multi-objective optmisation using the weighted sum approach.The algorithm has been applied to CLD and laser diﬀraction data from twodiﬀerent particle suspensions. The results show the promise held by thismulti-objective optimisation approach in obtaining more accurate PSD esti-mates.In situations where an artiﬁcial peak is produced, and is signiﬁcantlyseparated from the true peak, the algorithm produces a solution where theartiﬁcial peak is completely eliminated. In the situation where the artiﬁcialpeak is too close to the true peak, the algorithm produces a solution in whichthe artiﬁcial peak is signiﬁcantly reduced.This algorithm is particularly useful in real-time estimates of PSD fromdata obtained with inline sensors. This is because various factors due to theprocess conditions could lead to PSD estimates with artiﬁcial peaks. Theoccurrence of artiﬁcial peaks in PSD estimates could be very misleading inparticulate processes where the PSD is a critical attribute of the end product.Crystallisation is a good example of a process in which the PSD of the endproduct is a critical quality attribute. The development of algorithms, whichare capable of obtaining very accurate PSD from multiple data streams willallow the process to be monitored more eﬃciently and eventually controlled.16 cknowledgement

This work was performed within the UK EPSRC funded project(EP/K014250/1) ‘Intelligent Decision Support and Control Technologies forContinuous Manufacturing and Crystallisation of Pharmaceuticals and FineChemicals’ (ICT-CMAC). The authors would like to acknowledge ﬁnancialsupport from EPSRC, AstraZeneca and GSK. The authors are also grate-ful for useful discussions with industrial partners from AstraZeneca, GSK,Mettler-Toledo, Perceptive Engineering and Process Systems Enterprise.17 upplementary Information

1. Sample images

Some sample images of the materials analysed in this work are shown inFig. 1. The images were captured with the Mettler Toledo particle visionand measurement V819 instrument.