[PDF] A fast and portable Re-Implementation of Piskunov and Valenti's Optimal-Extraction Algorithm with improved Cosmic-Ray Removal and Optimal Sky Subtraction

Abstract

We present a fast and portable re-implementation of Piskunov and Valenti's optimal-extraction algorithm (Piskunov & Valenti, 2002} in C/C++ together with full uncertainty propagation, improved cosmic-ray removal, and an optimal background-subtraction algorithm. This re-implementation can be used with IRAF and most existing data-reduction packages and leads to signal-to-noise ratios close to the Poisson limit. The algorithm is very stable, operates on spectra from a wide range of instruments (slit spectra and fibre feeds), and has been extensively tested for VLT/UVES, ESO/CES, ESO/FEROS, NTT/EMMI, NOT/ALFOSC, STELLA/SES, SSO/WiFeS, and finally, P60/SEDM-IFU data.

Full PDF

aa r X i v : . [ a s t r o - ph . I M ] M a r A fast and portable Re-Implementation of Piskunov and Valenti’s Optimal-ExtractionAlgorithm with improved Cosmic-Ray Removal and Optimal Sky Subtraction

A. Ritter

National Central University300 Jhongda Rd, Jhongli City, Taoyuan County, Taiwan (R.O.C) [email protected]

E. A. Hyde

Department of Physics & Astronomy, Macquarie University, Sydney, NSW 2109 Australia andQ. A. Parker

Department of Physics & Astronomy, Macquarie University, Sydney, NSW 2109 AustraliaResearch Centre for Astronomy, Astrophysics and Astrophotonics, Macquarie University, Sydney, NSW2109 AustraliaAustralian Astronomical Observatory, PO Box 296, Epping, NSW 1710, Australia

Accepted Nov. 26 2013We present a fast and portable re-implementation of Piskunov and Valenti’s optimal-extraction algo-rithm (Piskunov & Valenti 2002) in

C/C++ together with full uncertainty propagation, improved cosmic-rayremoval, and an optimal background-subtraction algorithm. This re-implementation can be used with IRAFand most existing data-reduction packages and leads to signal-to-noise ratios close to the Poisson limit. Thealgorithm is very stable, operates on spectra from a wide range of instruments (slit spectra and ﬁbre feeds),and has been extensively tested for VLT/UVES, ESO/CES, ESO/FEROS, NTT/EMMI, NOT/ALFOSC,STELLA/SES, SSO/WiFeS, and ﬁnally, P60/SEDM-IFU data.instrumentation: spectrographs – methods: data analysis – techniques: image processing – techniques:spectroscopic.

1. Introduction

The concept of a variance-weighted (optimal) extraction for spectroscopic data has been developedover the last two and a half decades in a series of papers (Robertson 1986; Horne 1986; Marsh 1989; Mukai1990; Vershueren & Hensberge 1990; Kinney et al. 1991; Valdes 1992; Hall et al. 1994; Piskunov & Valenti2002; Bolton & Schlegel 2010; Sharp & Birchall 2010). The most critical step in this procedure is the con-struction of an accurate proﬁle model for the slit function (spatial proﬁle perpendicular to the dispersion 2 –direction ) for which several methods have been devised. While Horne (1986) and Marsh (1989) describemethods of ﬁtting low-order polynomials to straight and curved orders, the spectrum-extraction algorithmsby Valdes (1992) use averaging of nearby proﬁles. Kinney et al. (1991) ﬁt binned, normalised proﬁles withhigh order polynomials to generate proﬁle models. Piskunov and Valenti (Piskunov & Valenti 2002, here-after P&V) again use an iteration algorithm to ﬁnd the most probable two-dimensional spatial proﬁle. Bolton& Schlegel (Bolton & Schlegel 2010) introduced a 2D Point-Spread Function (PSF) deconvolution model,but state themselves that no computer in the near future will be able to solve the equations using a brute-force approach. Sharp & Birchall (Sharp & Birchall 2010) again present an optimal-extraction algorithmfor multi-object ﬁbre spectroscopy assuming Gaussian spatial proﬁles based on the 2dF+AAOmega multi-object ﬁbre spectroscopy system (Smith et al. 2004) on the Anglo-Australian Telescope (AAT).Traditional sky-subtraction algorithms for slit spectra only look at the ends of the slit to determine the un-derlying sky. For short slits, extended objects, or objects positioned at one end of the slit, this is obviouslyproblematic as the sky background might only be measurable in very few pixels or only one wing of thespatial proﬁle. More sophisticated methods (e.g. Sembach & Tonry 1996, Glazebrook & Bland-Hawthorn2001) have been developed to handle these problems, but they come with huge costs in terms of observingtime. Here we present a new optimal background-subtraction algorithm that avoids all these problems anddelivers results close to the Poisson limit.If the spatial proﬁle or 2D PSF is not known a-priori, variance-weighted extraction only makes sense if theproﬁle goes down to zero at the borders. The reason for this is the integral-normalisation of the (spatial)proﬁle for each CCD row, hence a proper background subtraction is vital for the success of the determinationof the spatial proﬁle. If the scattered light has not been subtracted accurately or if the proﬁle width is not setwide enough, the resulting proﬁle image will show jumps where the aperture borders cross a new column.P&V published their new and extremely powerful algorithms for reducing cross-dispersed echelle spectrain 2002. Their Data-Reduction Pipeline (DRP) REDUCE was written in IDL and its only purpose was todemonstrate the power of their new algorithms. They never intented to present a stable pipeline which couldeasily be adopted to other instruments. Unfortunately the

IDL -based REDUCE package is slow and allowsonly experienced programmers to make use of it and adopt it to different instruments.Here, we are presenting a re-implementation of the optimal-extraction algorithm from the REDUCE pack-age (status 10/09/2012) in

C/C++ . The original intention was to use P&V’s algorithm within the ImageReduction and Analysis Facility (IRAF, Tody 1986) as part of the automatic DRP (Ritter & Wash¨uttl 2004)for the ﬁbre-fed STELLA Echelle Spectrograph (SES, Strassmeier et al. 2001). As additional features, wehave introduced improved cosmic-ray and CCD-defect removal and new optimal background-subtractionalgorithms. In the case of the background being nearly constant for each dispersion element, these newalgorithms can determine the spatial proﬁle AND the background (scattered light in case of ﬁbre feeds, orscattered light + sky in case of slit spectra) in the same reduction step. If the aperture deﬁnitions are providedin the form of an IRAF database ﬁle, these new programs can be called from any data-reduction package Note that in this paper we assume that the dispersion direction is only slightly tilted/curved with respect to the columns of theCCD, and the spatial proﬁle is the transcept across a spectrum perpendicular to the dispersion direction (one resolution elementaffects only one CCD row).

C/C++ code or external binaries. When run fromwithin the STELLA-pipeline framework, all parameters are stored in a single parameter ﬁle, allowing for aneasy overview and editing.Our programs are freely available (including all used libraries), fast, portable, extremely stable, and havebeen extensively bug ﬁxed and tested for a wide range of spectrographs, including the European South-ern Observatory (ESO) Very Large Telescope (VLT) Ultraviolet and Visual Echelle (slit) Spectrograph(UVES, Dekker et al. 2000), ESO/Coude (ﬁbre-fed) Echelle Spectrometer (CES, Enard 1982), ESO/Fiber-fed Extended Range Optical (Echelle) Spectrograph (FEROS, Kaufer et al. 1999), ESO New TechnologyTelescope (NTT)/ESO Multi-Mode Instrument (EMMI, Dekker et al. 1986), Northern Optical Telescope(NOT)/Andalucia Faint Object (slit) Spectrograph and Camera (ALFOSC), STELLA/SES, Siding SpringObservatory (SSO) Australian National University (ANU) Wide Field Spectrograph (WiFeS, Dopita et al.2010) slitlet-based Integral Field Unit (IFU), and ﬁnally, Palomar Observatory 60inch Telescope (P60)/SpectralEnergy Distribution (SED)-Machine (Ben-Ami et al. 2012) lenslet-based IFU data.This paper is organised as follows: In Chapter 2 we will introduce our new algorithms; Chapter 3 is dedi-cated to the test results, and the conclusions and outlook to future work are presented in Chapter 4.

2. Method2.1. Review of the original algorithm by P&V

P&V (please refere to the original paper for a full explanation of the algorithm) calculate the mostprobable spatial proﬁle by solving the problem F = X x,λ (cid:20) S λ X j W jxλ P jxλ − D xλ (cid:21) + Λ X j ( P j +1 xλ − P jxλ ) = minimum , (1)where x is the CCD column number of one spectral aperture row, λ is the wavelength (or CCD row), S λ is the object spectrum, P jxλ is the oversampled spatial proﬁle, and D xλ is the ﬂat-ﬁelded and bias- andscattered-light subtracted CCD aperture. The last term smooths the spatial proﬁle, restricting point-to-pointvariations, even if the order geometry does not constrain every point in P . Given an oversampling factor O ,the subpixel weights W jxλ at a given wavelength λ (CCD row y ) for the subpixel j are: W jxλ =  j < j FRACT ( x ( λ ) O ) j = j /O j = j + 1 , . . . , j + O − − FRACT ( x ( λ ) O ) j = j + O j > j + O (2) j is the index of the ﬁrst subpixel that (partially) falls in detector pixel [ x, λ ] , and FRACT is thefraction of that ﬁrst subpixel that is contained in the detector pixel. This structure is exactly the same foreach pixel x in a given CCD row.Minimization of Eq. 1 produces two systems of linear equations: X j ( A jk + Λ · B jk ) P jxλ = R k (3) S λ = C λ / E λ (4)with matrices given by: A jk = X x,λ S λ W jxλ W kxλ (5) C λ = X x D xλ X j P jxλ W jxλ (6) E λ = X x (cid:20)X j P jxλ W jxλ (cid:21) (7) R k = X x,λ D xλ S λ W kxλ (8) B jk is the tri-diagonal matrix with − on both subdiagonals and 2 on the main diagonal, except for the upperleft and the bottom right corners, which contain 1. In the matrix equations above, the generic nomenclature N = N x · O is used.Cosmic-ray hits and CCD defects are automatically detected and marked as such in a mask M xλ : M xλ = ( D xλ − P xλ × S λ ) > σ (9)where σ clip is determined from the whole swath in the ﬁrst iteration, and from the individual row in succes-sive iteration steps.The original programs by P&V assumed that the scattered light was properly removed and the trace func-tions of the spectra/apertures were precisely known. Sky subtraction was then done by subtracting themedian value of the lowest pixel values of one aperture row before the start of the calculation of the apertureproﬁle. This works quite well and can even take care of (residual) scattered light if the slit is long and theobserved star is close to the center of the slit. However, this procedure can be based on very few pixel valuesand fails to deliver optimal results for short slits or if the star is close to a slit edge, or if the traced apertureposition is off by even a small percentage of a pixel. To avoid these problems alternative methods for thebackground subtraction and the ﬁnal extraction were developed and are presented in the following sections. 5 – Optimal extraction is formally equivalent to scaling a known spatial proﬁle across an object’s spectrumto ﬁt the sky-subtracted data. Here the latter approach is used to disentangle and extract the background(scattered light and/or sky) and the object spectrum in the same iterative procedure. An initial normalisedspatial proﬁle is determined ﬁrst using only the rows with the highest maxima close to the peak of theintegral-normalised spatial proﬁle. Since adding a constant background to a proﬁle results in lower maximafor the normalised spatial proﬁle, this procedure only takes into account rows that are not affected by skyemission lines. After the initial determination of the spatial proﬁle, the wings of the proﬁle function are setto zero, removing background that affects all rows. This is done by subtracting the minimum of the higherproﬁle wing from the whole proﬁle function, setting resulting negative values in the lower wing to zero, andsubsequent re-normalisation of the whole proﬁle. For each row, the most probable values for the backgroundand the object spectrum are then calculated by a weighted least-squares linear ﬁt of the constant backgroundand a scaled spatial proﬁle to the data (See Bevington & Robinson 2003): D xλ = S λ P xλ + B λ , (10)where B λ is the (scattered light +) sky background. Using the error estimates σ xλ from σ xλ = D xλ G + σ RON , (11)where D xλ is the number of counts in pixel [ x, λ ] after bias subtraction, G is the detector gain, and σ RON isthe read-out noise of the detector, the χ λ to be minimised for each resolution element λ can be written as χ λ = X x W xλ [ D xλ − ( S λ P xλ + B λ )] (12)with the pixel weight W xλ = 1 σ xλ , (13)where σ xλ is the uncertainty of pixel x in a detector row ( y or λ ) of the aperture. When χ λ is minimised,the estimated parameter values of the linear model can be computed as S λ = 1∆ λ (cid:18)X x W xλ X x W xλ P xλ D xλ − X x W xλ P xλ X x W xλ D xλ (cid:19) , (14) B λ = 1∆ λ (cid:18)X x W xλ P xλ X x W xλ D xλ − X x W xλ P xλ X x W xλ P xλ D xλ (cid:19) , (15)where ∆ λ = X x W xλ X x W xλ P xλ − (cid:18)X x W xλ P xλ (cid:19) (16)The background is then subtracted from the input image and the spatial proﬁle is re-calculated using allrows. This procedure is repeated with the new proﬁle until the spatial proﬁles P xλ converge. Convergence 6 –is assumed when X x,λ (cid:18) P oldxλ − P newxλ (cid:19) < P newxλ , (17)and normally reached within 10 iterations. Tests have shown that this is a good convergence criterion.Negative values for the background are not allowed during the calculation as they are not physical.Another ﬂaw in the original algorithm by P&V was that problematic pixels were detected by either lookingat the whole swath or individual rows. The signal (and therefore the uncertainty per pixel) is a lot higherclose to the center of the spatial proﬁle P xλ compared to the wings of the proﬁle. Therefore not all pixelsaffected by a cosmic-ray hit were identiﬁed, only the ones in the central area of the hit. In our new programscosmic rays and bad pixels are identiﬁed and marked as such in a two step procedure. First we calculate thevariance per column instead of row. This ensures that each pixel is only compared to pixels with a similarsignal strength and uncertainty. Setting the sigma-clipping factor high enough ( > ) makes sure that skylines are not misidentiﬁed as problematic pixels. Repeating this sigma-clipping procedure a number of times(which can be adjusted by the user) then also removes pixels which were affected by cosmic-ray hits but arenot in the central region of the hit. The second step is done during the weighted linear ﬁtting of the spatialproﬁle to the CCD row containing the spectrum of the object and the background. In this step pixels with ( D xλ − P xλ S λ − B λ ) > σ , (18)are added to the problematic pixel mask by setting the mask value to zero. Multiplying the CCD row andthe spatial proﬁle with the mask before the ﬁnal ﬁtting procedure then results in proper removal of thecosmic-ray hits/detector defects in the ﬁnal extracted spectra (See Fig. 1). In the case one prefers to rejectcosmic rays before the optimal-extraction procedure with LACosmics in IRAF (van Dokkum 2001), thisis also supported by our STELLA pipeline. However, as shown in the same ﬁgure, at least the IRAF task lacos spec can still leave traces of the cosmic-ray hit on the CCD.The STELLA pipeline offers multiple possibilities for removing the scattered light. First of all, standard-IRAF’s apscatter task from the noao.twodspec.apextract package is implemented in the STELLADRP. Additionally, we re-implemented P&V’s scattered-light subtraction algorithm, as well as the Krigingalgorithm (Krige 1951, Matheron 1963). Kriging is very successfully used in Geostatistics for predictingsedimentary rock layers from a small number of bore holes, but is very expensive in terms of computingtime. However, clustering the scattered light leads to good results in an acceptable time frame. For theclustering two versions are implemented. The ﬁrst version assigns the minimum of each rectangular area(cluster) on the CCD (size deﬁned by the user) as scattered-light value to the center of the cluster. Thisversion is very useful if the scattered light needs to be subtracted before the spectral apertures are identiﬁed.The second version only looks at areas on the CCD which are not part of a spectral aperture, and assigns themean value of each cluster (again the sizes of clusters are deﬁned by the user) to the center of each cluster.The complete algorithm will be described in detail in a subsequent paper. Last but not least, setting the fits ﬁles the cfitsio library has been included. Mathemati-cal procedures use the BLITZ++ library (Veldhuizen & Jernigan 1997), which provides performance on parwith

FORTRAN

IDL code from P&V was ported as closely as possible to

C++ . Bugs like possibledivisions by zero, which caused the program to crash, were identiﬁed and removed in our re-implementation.

In the REDUCE pipeline, the actual uncertainties of the individual pixel values are not taken intoaccount at all during the extraction or for the ﬁnal estimation of the uncertainties. As an error estimate ofthe ﬁnal object spectrum P&V calculate σ λ = P x ( D xλ − S λ P xλ ) P x P xλ − P xλ , (19)where P xλ is the mean value of the spatial proﬁle for the CCD row λ ( y ). In our re-implementation of theoriginal algorithm, we calculate the error estimate for the ﬁnal extracted object spectrum from Eq. 4 and theuncertainties for each pixel in the CCD row: σ λ = P x σ xλ F xλ ( P x F xλ ) (20)where F xλ = X j P jx W jxλ (21)The new algorithms presented here use the estimated pixel errors, which are given to the program in anadditional error image, to calculate the most probable spectrum and background, and propagate the errors 8 –through the entire extraction process. This procedure results in much more realistic error estimates for theﬁnal object spectra (See Sec. 3). Assuming that ﬂat-ﬁelding and scattered-light subtraction do not introduceadditional errors, the uncertainties of the ﬁnal spectrum and constant background can be calculated as σ S,λ = 1∆ λ X x W xλ and (22) σ B,λ = 1∆ λ X x W xλ P xλ (23)where σ S,λ and σ B,λ are the uncertainties for the object spectrum and the background, respectively.

If the new programs are invoked by the STELLA pipeline, all parameters for the optimal extractionprocedure are stored in one parameterﬁle. This provides for both an easy overview of the parameters andtheir values, as well as easy editing. The ﬂow chart for this case is shown in Fig. 2.Our new

C/C++ programs can be utilised to calculate the scattered light, the normalised Flat, and the in-strument proﬁle (quantum efﬁciency and Blaze function) in the case that a continuous light source over thewhole spectral range is available (or the spectrum of the Flat ﬁeld lamp is known), as well as to remove CCDdefects and cosmic-ray hits, and to optimally extract the target data and sky to one-dimensional spectra. Asan additional feature, they can create the proﬁle image and the reconstructed object and sky images in 2D,where always possible failures can be easily spotted by comparing them to the original image.

3. Test results

Our new

C/C++ modules/classes were tested using the ‘black box’ testing strategy. Testing moduleswere written to test the different constructors and procedures.‘Black box’ means that the source code is not known. The tester only knows the interface and checks theresults of the procedures due to the possible entry points and variable limits. The testing modules comparethe results of the procedures to the expected ones. If a single test goes wrong, the testing module returnsimmediately.The comparison with the original IDL pipeline (REDUCE) by P&V could not be done completely, becauseREDUCE crashed. After ﬁxing a few bugs, the optimal-extraction algorithm was working properly for atleast a few spectral orders of ESO/FEROS. As shown in Fig. 3, the resulting proﬁles are indistinguishable.The comparison of the original REDUCE pipeline without sky subtraction to the individual extractionmethods performed by the STELLA pipeline (IRAF sum, ﬁt1d, ﬁt2d, and our re-implementation of P&V’salgorithm) is shown in Fig. 4. While the IRAF ﬁt2d algorithm completely fails, the other extraction methods 9 –deliver comparable results. Note however that our re-implementation leads to the smoothest object spectrum.As a test spectrum for the comparison of our new algorithms to the REDUCE pipeline and the UVESpipeline (Ballester et al. 2000) we chose a low-SNR observation of LMC-X1 (a stellar-mass black-holecandidate as revealed by the properties of the surrounding material, Wilms et al. 1998) showing strong skylines as well as cosmic-ray hits. The spectral region chosen for this test example is . ˚ A < λ < . ˚ A (CCD rows 1363–914 respectively), because it shows 3 sky-emission lines at 5888.192, 5889.959,and 5895.932 ˚A (CCD rows 1300, 1220, and 940 respectively), 2 stellar absorption lines at . ˚ A and . ˚ A (CCD rows 288 and 72), and 3 cosmic-ray hits, one in the middle of the spatial proﬁle, one in thewings of an absorption line, and one in a sky line. The mean value of the SNR in this region is approximately20, making this a low-SNR spectrum where the optimal-extraction algorithm of the UVES pipeline shouldperform at its best. Fig. 5 shows the original CCD section after bias subtraction, ﬂat-ﬁelding, and scattered-light subtraction as well as a comparison of the spatial proﬁles calculated with our new optimal sky- andobject-extraction algorithms to the algorithm by P&V.In Fig. 6 we compare the reconstructed object spectra from the original version by P&V (a), our re-implementation (b), and our new algorithm (c). While the cosmic-ray hit near the center of the spatialproﬁle causes a spike at row 1206 in the extracted object spectrum shown in a), the cosmic rays are properlyremoved by our algorithms shown in (b) and (c). The main difference between b) and c) is that in c) theproﬁle goes down to zero at the wings (as one would expect for a point source), while in b) too much weightis given to the background.The comparison of the reconstructed sky spectra in the cases of (a), (b), and (c) is shown in Fig. 7. Wehave three sky lines visible at rows 940 (5895.932 ˚A), 1220 (5889.959 ˚A), and 1300 (5888.192 ˚A). In (a)the sky is relatively noisy and the strongest line is 13% higher than in our (c) method. As expected, ourre-implementation of the original algorithm leads to nearly identical results (b). Our new optimal extractionand sky-subtraction algorithm (c) leads to the smoothest sky results. All sky lines are properly reproduced,as shown below. Using our new algorithm we additionally ﬁnd a new sky line at row 982. This line canonly be inferred in the other sky spectra. A comparison to the catalogue of optical sky emission from UVES(Hanuschik 2003) identiﬁes it as the 5894.472 ˚A line.The differences between the CCD image shown in Fig. 5 and the reconstructed object and sky spectrafrom Fig.s 6 and 7 are shown in Fig. 8. No remaining structures which would indicate systematic discrep-ancies between the CCD image and the reconstruction are obvious in any of the images.The comparison of the resulting sky and object spectra from the original implementation by P&V toour re-implementation, our new algorithms, as well as to the UVES pipeline, is presented in Fig. 9. Shownare the optimally extracted object spectra, the sky spectra, and the estimated SNRs. The UVES pipelineuses a similar approach as we present here, but always assumes a Gaussian for the spatial proﬁle, leading tosystematic errors. The sky is calculated by ﬁtting the Gaussian proﬁle plus constant (sky) to the CCD row.Again, assuming a Gaussian for the spatial proﬁle is leading to systematic errors in the sky as well. As it canbe seen in the ﬁgure, our new algorithms are leading to the smoothest sky and the least sky- and cosmic-ray 10 –residuals in the object spectrum. As expected, the sky values calculated with the REDUCE pipeline andour re-implementation are nearly identical. Note however that the REDUCE pipeline, as well as the UVESpipeline, produce a spike at . ˚ A caused by a residual cosmic-ray hit. The SNR calculated by theREDUCE pipeline is unrealistic (we still cannot reproduce Eq. 19). In contrast, our error estimates (Eq. 20for our re-implementation and Eq.s 22 and 23 for our new algorithms), using the propagated uncertaintiesfor every pixel, are similar to the error estimates stated by the UVES pipeline. However, as we could notﬁnd a documentation on the algorithm used to calculate the UVES uncertainties, we cannot comment on thestated UVES SNR too much. As the object extraction and sky subtraction are done in a way very similarto our new algorithms, we can assume that the errors are also calculated in a similar way. Given that thespatial proﬁle is close to a Gaussian, this explains why the SNR stated by the UVES pipeline is very closeto the SNR for our new algorithms. Note however that the UVES object spectrum was rebinned usingan oversampling factor of ∼ . . While oversampling preserves the resolution, it smooths the spectrum,making it appear as if it had a higher SNR. Visibly, our new extraction method leads to an object spectrumeven smoother than the rebinned spectrum produced by the UVES pipeline. Compared to the SNR for theoriginal algorithm by P&V, calculated from the propagated pixel uncertainties, our new algorithms lead toa SNR ≈ − % higher. The simple-sum extraction (which is not shown in the plot because of the skylines and cosmic-ray hits) leads to a calculated SNR of only ≈ . , even with an aperture width only slightlylarger than the object spectrum!In Fig.s 10 and 11 a comparison of the performance of the original sky-subtraction algorithm by P&V to ournew programs is shown for the case that the object is close to the slit edge. The images show a short part ofa spectrum of LMC-X1, taken with UVES in the blue arm. While the original sky-subtraction algorithm byP&V is struggling to remove the background properly, our new algorithm performs much better, resulting ina ≈ gain for the SNR of the object spectrum. Note that the spectrum actually shows an accretion disksurrounding the black hole and not a sky emission line.Shown in Fig. 12 is the computing time of our new programs and the original REDUCE proceduresfor one spectral order of the ESO/FEROS spectrograph (18x4089 pixels). It shows that our C/C++ re-implementation is about 10 times faster than the

IDL version. In times of major spectroscopic surveys withmulti-object spectrographs, this is a major advantage over the original programs.Our new programs have been extensively tested with a wide range of spectrographs, including slitsand ﬁbre feeds, prisms and gratings, Echelle spectra, multi-object spectra, and IFUs. Comparisons of theSTELLA pipeline to the results of the pipelines provided by the individual observatories and to standardIRAF have shown that our pipeline can lead to signiﬁcant SNR improvements. The UVES pipeline hasbeen optimised for low SNRs and always assumes a Gaussian proﬁle perpendicular to the dispersion axisfor the optimal-extraction algorithm. For low SNR the random error of the Poisson noise (photon noise)is larger than the systematic error introduced by this approximation. In the low-SNR regime, assuming aGaussian spatial proﬁle is therefore an acceptable approximation. However, proﬁle determination is themost critical step for variance weighted (optimal) extraction. Assuming an incorrect spatial proﬁle can leadto large systematic errors for high-SNR data. These systematics drop the achievable SNR signiﬁcantly and 11 –can introduce ripples in the extracted spectra, both leading to larger uncertainties in the stellar parametersderived from these spectra.The optimal-extraction algorithms of standard IRAF are also known to have room for improvement. Thesimple sum nearly always leads to a higher SNR than these ‘optimal extraction’ algorithms, even if apsum is fed with the most probable proﬁle calculated with P&V’s proﬁle-ﬁtting algorithm. Our new algorithmsare now allowing for the utilisation of a state-of-the-art optimal-extraction algorithm within the IRAF envi-ronment, as well as most other existing data-reduction packages.

4. Conclusions

The presented new implementation of the state-of-the-art optimal-extraction algorithm developed byP&V can now easily be integrated in IRAF and most other existing data-reduction packages and program-ming languages. It allows for scattered-light subtraction, proﬁle calculation, error propagation, and optimalextraction of Coud´e and Echelle slit spectra as well as ﬁbre feeds. For slit spectra a new optimal sky ex-/subtraction algorithm at the Poisson limit was developed, which works even for extended objects, shortslits, or if the observed star is at the slit’s end. Our progams offer the new optimal extraction and sky-subtraction algorithms as well as the original ones by P&V. The re-implementation in C++ is about 10times faster than the original one, making it perfectly suitable for large surveys. Our new programs havebeen extensively bug ﬁxed and successfully tested with spectra from VLT/UVES, ESO/CES, ESO/FEROS,NTT/EMMI, NOT/ALFOSC, STELLA/SES, SSO/WiFeS, and, ﬁnally P60/SEDM-IFU. As already shownby P&V, their optimal-extraction algorithm can lead to a signiﬁcant improvement in the SNR comparedto standard IRAF and other DRPs. In this paper we have shown that using our improved algorithms forcosmic-ray/CCD-defect removal and background subtraction we can again achieve a SNR gain of 15-50%compared to the original algorithms by P&V. This is leading to much smaller errors in parameter estimatescalculated from our optimally extracted spectra compared to other DRPs.

5. Acknowledgements

We would like to thank the anonymous referee for the helpful comments, which signiﬁcantly increasedthe scientiﬁc content of this paper, and Heidi Viets and Andrew Fathy for proof reading the manuscript.Funding for the project has been provided by the Taiwanese National Science Council (grant numbers NSC101-2112-M-008-017-MY3 and NSC 101-2119-M-008-007-MY3).

REFERENCES

P. Ballester, A. Modigliani, O. Boitquin, S. Cristiani, R. Hanuschik, A. Kaufer, S. Wolf: 2000, ESO Mes-senger 101, 31 12 –S. Ben-Ami, N. Konidaris, R. Quimby, J. T. C. Davis, C. C. Ngeow, A. Ritter, A. Rudy: 2012, SPIEConference Series 8446, 316P. R. Bevington and D. K. Robinson: Data reduction and error analysis for the physical sciences: 2003,McGraw-Hill, 98A. S. Bolton and D. J. Schlegel: 2010, PASP 122, 248H. Dekker, B. Delabre, and S. D’Odorico: SPIE 627, 39H. Dekker, S. D’Odorico, A. Kaufer, B. Delabre and H. Kotzlowski: SPIE Conference Series 4008, 534M. Dopita, J. Rhee, C. Farage, P. McGregor, G. Bloxham, A. Green, B. Roberts, J. Neilson, G. Wilson, P.Young, P. Firth, G. Busarello, P. Merluzzi: 2010, AP&SS 327, 245D. Enard: 1982, SPIE Conference Series 331, 232Jeffrey C. Hall, Eliza E. Fulton, David P. Huenemoerder, Alan D. Welty and James E. Neff: 1994, PASP106, 315K. Glazebrook and J. Bland-Hawthorn: 2001, PASP 113, 197R. Hanuschik: 2003, A&A 407, 1157Keith Horne: 1986, PASP 98, 609A. Kaufer, O. Stahl, S. Tubbesing, P. Nørregaard, G. Avila, P. Francois, L. Pasquini, A. Pizzella: 1999, TheMessenger 95, 8A. L. Kinney, R. C. Bohlin and J. D. Neill: 1991, PASP 103, 694Krige D.G, 1951, Master’s thesis, University of WitwatersrandT. R. Marsh: 1989, PASP 101, 1032G. Matheron: 1963, Economic Geology, 58, 1246–1266K. Mukai: 1990, PASP 102, 183N. E. Piskunov and J. A. Valenti: 2002, A&A 385, 1095A. Ritter and A. Wash¨uttl: 2004, Astronomische Nachrichten 325, 663J. G. Robertson: 1986, PASP 98, 1220K. R. Sembach and J. L. Tonry: 1996, AJ 112, 797R. Sharp and M. N. Birchall: 2010, PASA 27, 91 13 –G. A. Smith, W. Saunders, T. Bridges, V. Churilov, A. Lankshear, J. Dawson, D. Correll, L. Waller, R.Haynes, and G. Frost: 2004, SPIE Vonference Series 5492, 410K. G. Strassmeier, T. Granzer, M. Weber, M. Woche, G. Hildebrandt, S.-M. Bauer, J. Paschke, M. M. Roth,A. Washuettl, K. Arlt, P. A. Stolz, J. H. M. M. Schmitt, A. Hempelmann, H.-J. Hagen, H. Ruder,P. L. Palle, R. Arnay: 2001, Astronomische Nachrichten 322, 287Doug Tody: 1986, SPIE Instrumentation in Astronomy VI 627, 733F. Valdes: 1992, Astronomical Data Analysis Software and Systems I, ASP Conf. Ser. 25, 398P. G. van Dokkum: 2001, PASP 113, 1420T. L. Veldhuizen and M. E. Jernigan: 1997, Proceedings of the 1st International Scientiﬁc Computing inObject-Oriented Parallel Environments (ISCOPE’97), Lecture Notes in Computer ScienceW. Vershueren and H. Hensberge: 1990, A&A 240, 216J. Wilms, M. A. Nowak, J. B. Dove, K. Pottschmidt, W. A. Heindl, M. C. Begelman, R. Staubert: 1998,STIN 99, 70941

This preprint was prepared with the AAS L A TEX macros v5.2.

14 –Fig. 1.— From left to right: Comparison of the original CCD image with cosmics-ray hits and sky line, thecosmic-ray removal of LACosmics (still with sky line), of the REDUCE pipeline (identiﬁed pixels affectedby cosmic rays set to zero and sky removed), and our new algorithm implemented in the STELLA pipeline(same as for the REDUCE pipeline) for the two parts of the UVES spectrum shown in Fig. 5 which areaffected by cosmic-ray hits. LACosmics does not completely remove the cosmic-ray hits and replaces theidentiﬁed pixels with averages of the surrounding pixels. Both the REDUCE and the STELLA pipelines setthe mask value of the identiﬁed pixels to zero and remove these pixels from the optimal extraction. Whileonly the cores of the cosmic-ray hits are removed by the REDUCE pipeline, the STELLA pipeline leavesno traces of them. 15 – stextract.cl * encapsulates extraction tasks* reads extraction parameters optional * contains extracted error estimates* contains aperture−definition data* contains two−dimensional objectspectrum* contains two−dimensional errorimage* contains optimally extracted objectspectrum * reads aperture−definition data* creates aperture profile* optimally extracts object spectrum* reads general parameters* creates file lists stall.cl * encapsulates main tasks* contains all parameters parameterfile.propaperture−definition fileerror_image_2d.fits optextracterror_image_1d.fitsobject_spectrum_2d.fitsobject_spectrum_1d.fits

Fig. 2.— Flow chart of the parameters and data for our new optextract program, if invoked by theSTELLA pipeline. The yellow boxes show STELLA tasks, the light blue boxes input ﬁles, and the greenboxes the output ﬁles. Files within the brown box are optional for the execution of the task. 16 –Fig. 3.— Comparison of the aperture proﬁles perpendicular to the dispersion axis for ESO/FEROS (with2-slice image slicer). The thick black line shows the proﬁle calculated with the

IDL -REDUCE packagefrom P&V with 10 times oversampling. The over-plotted thin gray line shows the same proﬁle calculatedwith the STELLA pipeline. Also shown are the histograms of 3 CCD rows in the same aperture: row 41 is arow just before the aperture trace function crosses the border between two CCD columns, row 42 is the rowjust after that, and row 23 is a row where the trace function goes through the center of the CCD column. Allcurves/histograms are normalised to an integral of one. 17 –Fig. 4.— Comparison of the extraction algorithms from the REDUCE and STELLA pipelines for apart of one spectral order (5417–5622 ˚A) of the variable K0V-star LQ Hya (HD 82558), observed withESO/FEROS. The lower spectra were shifted downwards for better visibility. 18 –a) b) c)Fig. 5.— Part of a spectral order of LMC-X1 in the red, observed with UVES, showing strong sky lines aswell as cosmic-ray hits. The spectral region chosen for this test example is . ˚ A < λ < . ˚ A (CCDrows 1363–914 respectively), because it shows 3 sky-emission lines at 5888.192, 5889.959, and 5895.932˚A (CCD rows 1300, 1220, 940 respectively), 2 stellar absorption lines at . ˚ A and . ˚ A (CCDrows 288 and 72), and 3 cosmic-ray hits, one in the middle of the spatial proﬁle, one in the wings of anabsorption line, and one in a sky line: a) Original image after bias subtraction, ﬂat-ﬁelding, and scattered-light subtraction. b) Spatial proﬁle calculated with our re-implementation of the original algorithm by P&V.c) Spatial proﬁle calculated with our new algorithm. Note that the wings of the proﬁle shown in b) showsmall steps at the limits of the extraction width, which disappeared in c). The ripples visible at the top of theproﬁles are caused by the trace function crossing columns of the CCD.a) b) c)Fig. 6.— Comparison of the reconstructed object spectra from Fig. 5. a) Original version by P&V withoutproper cosmic-ray removal at row 1206. While the spike at the wing of the spatial proﬁle is properlyremoved, the cosmic ray close to the proﬁle center causes a spike in an absorption line in the extractedobject spectrum, which partially ﬁlls in the absorption feature. b) Our re-implementation, all cosmic-ray hitsproperly removed. c) Our new algorithm with all spikes removed and the wings of the spatial proﬁle goingdown to zero as they should. 19 –a) b) c)Fig. 7.— Comparison of the reconstructed sky spectra from Fig. 5. Visible are three sky lines at rows 940(5895.932 ˚A), 1220 (5889.959 ˚A), and 1300 (5888.192 ˚A). The sky images are scaled the same and thesurrounding pixels are set to zero for a better visiblity of the sky. a) Original version by P&V – as expectednearly identical to b) (our re-implementation). The sky line at row 1220 is slightly overestimated. c) Ournew algorithm with the smoothest sky and all sky lines properly reproduced (as shown below). Even anadditional sky line at row 982 (5894.472 ˚A) clearly shows up which completely vanishes in the noise of theother reconstructed sky spectra.a) b) c)Fig. 8.— Comparison of the differences between the CCD image from Fig. 5 and the reconstructed objectand sky spectra from Fig.s 6 and 7. No obvious remaining structures which would indicate systematicdiscrepancies between the CCD image and the reconstruction can be seen. 20 –Fig. 9.— Plot comparing the resulting object and sky spectra of the spectral region shown in Fig. 5, extractedwith the original sky-subtraction algorithm from P&V, our new programs, and the UVES pipeline. The upperpart of the image shows the extracted object spectra, the middle part shows the extracted sky, and the lowerpart shows the estimated SNR. In the upper two parts the spectra were shifted upwards for better visibility.Note the spike at . ˚ A in the UVES spectrum and the original implementation by P&V, which iscaused by a cosmic-ray hit. 21 –Fig. 10.— Plot comparing the resulting sky (actually accretion disk) and object spectra for a short partof an UVES spectrum of LMC-X1 in the blue arm, extracted with our re-implementation of the originalsky-subtraction algorithm from P&V and our new programs. The most left image is the bias and scattered-light subtracted and ﬂat-ﬁelded CCD image, followed by the reconstructed sky (original algorithm and ournew programs), CCD minus reconstructed sky, reconstructed object image, and CCD minus reconstructedobject. As the object is close to the slit edge, the original algorithm by P&V is struggling to remove the skyproperly, while our new programs deliver a much better result. Note that the apparent over subtraction ofthe sky in the middle panel is not real and only due to the fact that the right aperture limit is outside the slit’simage where there is no signal. 22 –Fig. 11.— Plot of the object and sky spectra shown in Fig. 10. The SNR measured with IRAF splot is 6.3for the original algorithm by P&V and 9.1 for our new algorithms – an improvement of nearly 50%. 23 –Fig. 12.— Plot comparing the computing time for the calculation of the aperture proﬁle and optimal ex-traction of one order (15x4089 pixels) for the ESO/FEROS spectrograph. Our C/C++ re-implementation ofP&V’s algorithms is about 10 times faster than the original