Towards a Self-calibrating, Empirical, Light-Weight Model for Tellurics in High-Resolution Spectra

Abstract

To discover Earth analogs around other stars, next generation spectrographs must measure radial velocity (RV) with 10 cm/s precision. To achieve 10 cm/s precision, however, the effects of telluric contamination must be accounted for. The standard approaches to telluric removal are: (a) observing a standard star and (b) using a radiative transfer code. Observing standard stars, however, takes valuable observing time away from science targets. Radiative transfer codes, meanwhile, rely on imprecise line data in the HITRAN database (typical line position uncertainties range from a few to several hundred m/s) and require difficult-to-obtain measurements of water vapor column density for best performance. To address these issues, we present SELENITE: a SELf-calibrating, Empirical, Light-Weight liNear regressIon TElluric model for high-resolution spectra. The model exploits two simple observations: (a) water tellurics grow proportionally to precipitable water vapor and therefore proportionally to each other and (b) non-water tellurics grow proportionally to airmass. Water tellurics can be identified by looking for pixels whose growth correlates with a known calibration water telluric and modelled by regression against it, and likewise non-water tellurics with airmass. The model does not require line data, water vapor measurements or additional observations (beyond one-time calibration observations), achieves fits with a reduced chi squared of 1.17 on B stars and 2.95 on K dwarfs, and leaves residuals of 1% (B stars) and 1.1% (K dwarfs) of continuum. Fitting takes seconds on laptop PCs: SELENITE is light-weight enough to guide observing runs.


Draft version March 21, 2019

Typeset using LaTeX preprint style in AASTeX61

TOWARDS A SELF-CALIBRATING, EMPIRICAL, LIGHT-WEIGHT MODEL FOR TELLURICS IN HIGH-RESOLUTION SPECTRA

Christopher Leet, Debra A. Fischer, and Jeff A. Valenti

Yale University, 52 Hillhouse, New Haven, CT 06511, USA
Space Telescope Science Institute, 3700 San Martin Dr., Baltimore, MD 21218, USA

(Received; Revised; Accepted)

Submitted to ApJ

Keywords: techniques: radial velocities – methods: data analysis

Corresponding author: Christopher Leet

1. INTRODUCTION

To expand the success of exoplanet searches, next generation spectrographs are aiming for sub-meter-per-second precision in radial velocity measurements. If the 10 cm s$^{-1}$ instrumental precision goal of the Echelle SPectrograph for Rocky Exoplanets Search and Stable Spectroscopic Observations (ESPRESSO; Pepe et al. 2013) and the EXtreme PREcision Spectrograph (EXPRES; Jurgenson et al. 2016) is reached, we will be able to detect small rocky planets orbiting in the habitable zones of their host stars. Such high precision requires extraordinary new fidelity in spectroscopic data: high resolution, high S/N and greater instrumental stability. In addition to controlling instrumental errors, success requires accounting for any systematic temporal changes in the spectral line profiles, which can arise from photospheric velocities or telluric contamination (Fischer et al. 2016).

Most work on modeling telluric contamination has been tested at near infrared wavelengths where the telluric line depths are comparable to stellar absorption lines. However, the next generation optical spectrographs aiming for 10 cm s$^{-1}$ radial velocity precision will be affected by time variable microtellurics that raster across the stellar spectrum because of barycentric velocity shifts. If we do not identify pixels that are producing the small perturbations to spectral line profiles, then microtellurics may dominate the error budget for extreme precision radial velocity programs.

2. TELLURIC SPECTRA

Atomic and molecular species in the Earth's atmosphere interact with solar radiation and produce absorption and emission lines that are imprinted in stellar spectra obtained with ground-based spectrographs. The non-water constituents (e.g., N$_2$, O$_2$, Ar, Ne, He) are well-mixed, and maintain a nearly fixed element ratio throughout the troposphere, stratosphere, and mesosphere. The concentrations of some non-water species (CO$_2$, CH$_4$, NO$_x$) exhibit seasonal changes or modulation from post-industrial human activities. However, these gases have stable concentrations on timescales of (at least) several days. In contrast, 99% of atmospheric water vapor is confined to the troposphere and exhibits both temporal and spatial variability that can change by more than 10% on timescales of an hour (Blake & Shaw 2011).

Figure 1 shows the telluric spectrum with a wavelength range of 4500 - 6800 Å [...]. The S/N and resolution of the FTS telluric spectrum show that the optical spectrum is peppered with microtelluric lines with depths that are only a few percent of the continuum. Many of the lines shallower than 1% in Figure 1 will disappear when convolved with the instrumental line spread function (LSF) of high-resolution (R [...]) [...], producing small, but systematic, time-variable line profile variations. Optical RV programs aiming for 10 - 20 cm s$^{-1}$ precision will need to account for microtelluric lines because they introduce errors that exceed the target RV precision (Cunha et al. 2014).

Figure 1. The FTS solar spectrum from 4500 - 6800 Å has a resolution, R ∼ [...],000 and S/N > [...].

3. CURRENT BEST PRACTICES

Since telluric contamination is a serious error source for high precision spectroscopy, there is a rich literature of practices for telluric modelling. These practices fall into three categories: (a) modelling using telluric standard stars (Section 3.1), (b) modelling using radiative transfer codes (Section 3.2) and (c) modelling using principal component analysis (Section 3.3). Finally, we discuss the literature surrounding a new challenge in telluric modelling: microtelluric modelling (Section 3.4).

3.1. Telluric Standard Stars

The classical approach to removing telluric absorption features is to observe a telluric standard star close in time and airmass to the science object (Vacca et al. 2003; Vidal-Madjar et al. 1986). The science target's spectrum is then divided by the spectrum of the standard star. Typically, early type stars from early A to late B are chosen as standard stars because they exhibit few and weak metal lines, and their rapid rotation helps smear out the lines that remain. The high S/N afforded by bright stars means that with high spectral resolution, even shallow telluric lines are discernible. These stars have the drawback that their strong hydrogen absorption features at the Brackett and Paschen lines blend with their tellurics (Rudolf, N. et al. 2016). As an alternative, a solar type star can be used as a telluric standard together with a high-resolution solar spectrum (Maiolino et al. 1996).

Using any standard star as a telluric reference model has several well known drawbacks.

First, it takes away precious observing time from an observation's science targets, especially when high S/N requirements are to be met (Seifahrt, A. et al. 2010).

Second, its accuracy is limited by how well the standard star's spectrum is known. Early type stars often display spectral features such as oxygen or carbon lines in the near-infrared. Similarly, absorption line depths of solar-type stars may deviate from the solar FTS atlas due to metal abundance or surface temperature deviations, leaving residuals from the star's intrinsic features in the telluric model (Rudolf, N. et al. 2016). Compounding this problem, the need to pick a star close to the science target often forces the observation of less well known stars.

Finally, for telescopes with an adaptive optics system (e.g. CRIRES), the change in source brightness between the science target and the standard star will affect the instrumental profile (Seifahrt, A. et al. 2010). In practice, Ulmer-Moll, S. et al. (2019) find that standard stars consistently underperform other telluric removal approaches.

3.2. Radiative Transfer Codes

Another approach is to use line-by-line radiative transfer model (LBLRTM) codes to model telluric lines. This technique requires accurate atmospheric temperature and pressure profiles, an excellent model for the spectrograph line spread function, and a complete and accurate atomic line database. The atmospheric inputs to these codes have benefited from the commercial interest and investments in making more accurate weather predictions. Most radiative transfer codes, including TERRASPEC (Bender et al. 2012), Transmissions Atmosphériques Personnalisées Pour l'AStronomie (TAPAS; Bertaux et al. 2014), Telfit (Gullikson et al. 2014) and Molecfit (Smette et al. 2015), use the HIgh Resolution TRANsmission line database (HITRAN; Rothman et al. 2013) and are able to model non-water telluric lines with an accuracy of around 2%.

Unfortunately, radiative transfer codes also suffer from documented drawbacks.

First, radiative transfer codes are limited by imprecise line data in the HITRAN database. The uncertainty in each HITRAN line's position is typically a few to several hundred m/s, but can be up to multiple km/s. HITRAN line strengths are rarely accurate to the 1% level (Seifahrt, A. et al. 2010). Rudolf, N. et al. (2016) also remark on this problem when modelling tellurics in the near IR.

Second, radiative transfer codes often struggle to model water lines. Bertaux et al. (2014) identify some cases in TAPAS where two adjacent water lines required different amounts of water for an adequate model. This is clearly non-physical (there is only one column density of water), but the authors are uncertain why this discrepancy appears. Rudolf, N. et al. (2016) note that HITRAN has imperfect water line information, which induces substantial residuals in their radiative transfer code.

3.3. Principal Component Analysis

Artigau et al. (2014) investigated the use of principal component analysis (PCA) for empirically modeling telluric lines at near infrared wavelengths. They used observations of hot, rapidly rotating stars to build a library of telluric standards with a range of water column density and air mass. The first five principal components of the telluric absorption features were used to fit telluric lines in spectra of program stars using least squares fitting. This empirical approach self-calibrates spectra and avoids the need for atomic line data or estimates of water column density. We believe that PCA's empirical approach is on the right track. However, PCA is a very generic model, and could benefit by incorporating the well-studied physics of telluric line formation. By introducing principled physical priors, we aim to improve the sophistication of this approach.

3.4. The Challenge of Microtellurics

Most methods for modeling telluric lines have been applied to lines that are redward of 6800 Å. The telluric features at these red wavelengths are easier to identify, both because the telluric lines are deeper and because the density of stellar lines is decreasing. Currently there is no robust method for modeling microtellurics. Unfortunately, simulations by Cunha et al. (2014) show that if ignored, microtelluric contamination in the optical spectrum will introduce RV errors between 0.2 - 1.0 m s$^{-1}$, swamping the error budget of next generation RV surveys. Cunha et al. (2014) modeled microtelluric lines in HARPS optical spectra using TAPAS, an online service that simulates atmospheric transmission with input from the ETHER Atmospheric Chemistry Data Centre, atomic line data from [...]. Achieving RV accuracies of 10 cm s$^{-1}$ necessitates accurate modelling of microtellurics.

4. SELENITE: A SELF-CALIBRATING LINEAR REGRESSION MODEL

We now describe SELENITE's telluric model. Since water and non-water tellurics exhibit different behavior (Hadrava, P. 2006), SELENITE treats their lines separately, and so we develop the model as follows. First, we describe the training data used to illustrate and evaluate SELENITE (Section 4.1). We proceed to describe the model for water tellurics (Section 4.2) and evaluate its performance on the B star HR3982 (Section 4.3). We then describe the model for non-water tellurics (Section 4.4) and evaluate its performance (Section 4.5), before finally combining the two halves and applying them to Alpha Centauri B, a K dwarf with significant stellar features (Section 4.6).

4.1. Training Data

The training data included 51 spectra of rapidly rotating B stars observed with the fiber-fed CHIRON spectrograph (Tokovinin et al. 2013), which is located at the 1.5-m telescope at the Cerro Tololo Interamerican Observatory (CTIO). The B-type stars are ideal for this calibration because they are bright and have few spectral lines, providing high S/N spectra that are relatively easy to continuum normalize. The iodine cell that is used for Doppler measurements with CHIRON was not in the light path for any of these observations. These spectra were obtained with the narrow slit mask, which yields a spectral resolution, λ/δλ, of R = 140,000, and exposure times were set to reach a typical S/N of 100. The air mass for each observation was recorded in the FITS header; however, no information was available regarding the PWV or other atmospheric conditions.

Figueira et al. (2010) demonstrate long-term stability of telluric lines at the level of 10 m s$^{-1}$ (corresponding to 0.01 of a pixel) at the La Silla Observatory using the environmentally stabilized and fiber-fed HARPS spectrograph. The CHIRON spectrograph does not have the stability of HARPS, and the spectral format can drift by a fraction of a pixel from night to night. To correct for these small drifts, the spectral orders were cross-correlated to align the telluric absorption lines.

4.2. Water Tellurics

4.2.1. The Theory of Water Tellurics

Each water vapor line has a specific absorption coefficient, σ, which depends on fundamental atomic and molecular line data, including the log(gf) value, excitation potential, and the partition function. The line strength of water tellurics also changes with the number of absorbers along the line of sight, or the column density. The radiative transfer equation for the intensity of light with wavelength λ passing through a plane-parallel atmosphere with a single species of absorber is:

$$I_\lambda = I_{\lambda,0}\, e^{-\sigma_\lambda \cdot n_j \cdot z} \tag{1}$$

$$\ln I_\lambda = -\sigma_\lambda \cdot n_j \cdot z \tag{2}$$

where $I_{\lambda,0}$ and $I_\lambda$ are the initial and final intensity, $\sigma_\lambda$ is the effective cross-section for absorption, and $n_j$ is the average number density of water vapor absorbers. Path length, $z$, is measured in units of airmass at zenith. The column density of water vapor, PWV, is $n_j \cdot z$. If a spectrum is normalized ($I_{\lambda,0} = 1.0$), Equation 2 shows that $\ln I_\lambda$ is proportional to the column density ($n_j \cdot z$).

The depth of any two water lines is therefore linearly related: by measuring the depth of an arbitrary water line (or set of lines), we can predict the depth of every other water line in the spectrum. We refer to the water telluric used to construct the telluric spectrum as the calibration telluric, and the pixel at the core of the calibration line as the calibration pixel.

As an example, Figure 2 shows two water telluric lines from the set of training spectra. Both sets of spectra (Figure 2 right, top and bottom) have been color-coded by the intensity of the pixel at λ = 5898.[...] Å [...].

Figure 2. (left) Correlation between $\ln(I_\lambda)$ in a pixel containing a telluric signal at 5946.[...] Å [...] $\ln(I_{cal})$, at 5898.[...] Å [...].

We now derive the precise relationship between the depths of any two water tellurics. From the radiative transfer equation, the intensities of any pair of water lines, ($I_{\lambda_i}$, $I_{\lambda_{cal}}$), grow proportionally to each other in log space.
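The proportionality stated above can be checked numerically. The following Python lines are a minimal sketch of Equation 2 for a normalized spectrum, not part of SELENITE itself; the cross-sections and column densities are hypothetical values in arbitrary units, chosen only for illustration.

```python
import math

def log_intensity(sigma, column_density):
    """ln(I) for a normalized spectrum (I_0 = 1): ln I = -sigma * n * z (Equation 2)."""
    return -sigma * column_density

# Hypothetical cross-sections (arbitrary units) for some water line and a calibration line.
sigma_line, sigma_cal = 0.3, 1.2
# Hypothetical water column densities (n * z) at five epochs.
columns = [0.1, 0.25, 0.4, 0.8, 1.0]

ratios = []
for c in columns:
    ln_line = log_intensity(sigma_line, c)
    ln_cal = log_intensity(sigma_cal, c)
    ratios.append(ln_line / ln_cal)
    print(f"n*z={c:4.2f}  I_line={math.exp(ln_line):.3f}  "
          f"I_cal={math.exp(ln_cal):.3f}  ratio of logs={ln_line / ln_cal:.3f}")
```

Whatever the column density, the ratio of the two log intensities stays at sigma_line / sigma_cal, which is the constant of proportionality the model exploits.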
Since the average number density of water absorbers and the airmass are fixed at any given time $t_i$, the constant of proportionality between the growth of two lines, as shown [...], is $\sigma_{\lambda_i}/\sigma_{\lambda_{cal}}$. We denote this constant of proportionality as $m_{\lambda_i cal}$.

$$\frac{\ln I_{\lambda_i,t_1} - \ln I_{\lambda_i,t_2}}{\ln I_{\lambda_{cal},t_1} - \ln I_{\lambda_{cal},t_2}} = \frac{\sigma_{\lambda_i}\,[n_{t_1} \cdot z_{t_1} - n_{t_2} \cdot z_{t_2}]}{\sigma_{\lambda_{cal}}\,[n_{t_1} \cdot z_{t_1} - n_{t_2} \cdot z_{t_2}]} = \frac{\sigma_{\lambda_i}}{\sigma_{\lambda_{cal}}} \equiv m_{\lambda_i cal} \tag{3}$$

A similar linear regression is carried out to empirically relate every other pixel in the spectrum to the calibration pixel, implying an equation of the form $\ln I_{\lambda_i} = m_{\lambda_i cal} \ln I_{\lambda_{cal}} + b$. During this process, the y-intercept was always found to be zero, simplifying the regression model to:

$$\ln I_{\lambda_i} = m_{\lambda_i cal} \ln I_{\lambda_{cal}} \tag{4}$$

One exception to the above is saturated tellurics, which have left the linear regime of growth and do not obey Equation 1. In both our water and non-water analysis, however, we find no telluric deeper than 50% of continuum between 4500 Å and 6800 Å and so no saturated telluric. Saturated tellurics are therefore considered outside of this paper's scope. Another exception arises when the instrumental line spread function (LSF) varies over time and changes a telluric's profile. SELENITE does not model instrumental errors, and these variations can only be handled by observing new training data under the new LSF. Fortunately, at CHIRON's resolution tellurics are marginally resolved, attenuating LSF changes. In practice, CHIRON's LSF is relatively stable over years, allowing 2012 K-dwarf observations to be fit by a model built on 2014 B star observations (Section 4.6).

The correlated growth of water tellurics can also be exploited to identify water tellurics. The Pearson correlation coefficient (PCC) of each pixel's growth with the calibration pixel can be measured, and each pixel whose PCC exceeds a threshold, $k$, can be flagged as containing a water telluric.
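The regression of Equation 4 and the PCC test can be sketched for a single pixel as follows. This is an illustrative Python example on synthetic data, not the authors' code: the range of ln(I_cal), the true slope of 0.5 and the helper names are assumptions, while the 51 spectra, the noise level of 0.01 and the threshold of 0.425 are taken from the text.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def slope_through_origin(x, y):
    """Least-squares slope m for y = m * x (zero intercept, as in Equation 4)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

rng = random.Random(0)
n_spectra = 51
# Assumed calibration-pixel log intensities, evenly spaced over [-1, 0].
ln_cal = [-i / (n_spectra - 1) for i in range(n_spectra)]
# Synthetic pixel that grows with a hypothetical slope of 0.5 plus Gaussian noise.
ln_pix = [0.5 * v + rng.gauss(0.0, 0.01) for v in ln_cal]

m_hat = slope_through_origin(ln_cal, ln_pix)
pcc = pearson(ln_cal, ln_pix)
is_water = pcc > 0.425  # threshold k discussed in Section 4.2.2
print(f"m = {m_hat:.3f}, PCC = {pcc:.3f}, flagged as water: {is_water}")
```

Fitting without an intercept mirrors the finding that the y-intercept was always zero; a fit with a free intercept would be the natural check of that assumption on new data.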
Usefully, SELENITE can discover new water tellurics not contained in HITRAN and correct the position of HITRAN's water tellurics.

Three additional tests are applied to pixels with PCC > $k$ to eliminate false positives. First, the line spread function for CHIRON has a full width half maximum of 3 pixels. Therefore, we require a minimum of three consecutive pixels with PCC values that exceed $k$. Single or double pixels are assumed to be spurious. Second, because telluric lines have Gaussian profiles, the cluster of flagged pixels must pass a peak detection algorithm. Finally, the high resolution FTS solar spectrum (Figure 1) indicates that telluric lines appear in clusters rather than as single isolated lines. Any isolated telluric without another telluric within 10 Å is therefore rejected.

4.2.2. Establishing a PCC Threshold

The threshold PCC ($k$) for flagging pixels with a telluric signal must be chosen to minimize both the number of spurious detections (false positives) and the number of missed telluric lines (false negatives). This critical step ensures that the model telluric spectrum will have the highest possible fidelity. If spurious features are included in a model, they will be used to assign zero weight to pixels, resulting in lost data for the radial velocity cross-correlation. If telluric features are missed in a model, they will remain in the stellar spectrum and increase the radial velocity errors.

The selection process begins by profiling the false positive rates of different values of $k$. The correlation between a calibration pixel and a noise pixel in the data set is simulated by generating $n = 51$ points of the form [$\ln(I_{cal})$, $\ln(I_\lambda)$]. The values of $\ln(I_{cal})$ evenly fill the range [−[...], 0] and represent a range of possible calibration line depths, while values of $\ln(I_\lambda)$ are drawn at random from a Gaussian distribution with σ = 0.01, representing shot noise typical of the CHIRON spectra (S/N ∼ 100). [...] 0.425 and fewer than 0.01% of pixels generate a PCC > [...]. [...] $k$ = 0.425 has just a [...] = $10^{-7}$% chance of generating a false positive. Since the CHIRON spectrum has about 200,000 pixels, this threshold has just a 0.02% chance of generating a false positive.

Figure 3. (left) A cumulative histogram for 100,000 trials to measure the PCC between a designated calibration pixel $\ln(I_{cal})$ and pixels representing $\ln(I_\lambda)$ with only Gaussian noise scaled to σ = 0.01. (right) The probability of detecting a signal as a function of telluric line depth in the presence of the same level of Gaussian noise. The purple solid, red dashed, orange dash-dot and grey dotted lines show the 90, 99, 99.9, and 99.99% limits respectively.
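The false positive profiling described above can be approximated with a short Monte Carlo run. The Python sketch below is an assumption-laden illustration rather than the authors' simulation: it uses 10,000 trials instead of 100,000 to keep the pure-Python run time short, and it assumes ln(I_cal) spans [-1, 0] (the PCC of a pure-noise pixel does not depend on the scale of that range).

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def false_positive_rate(k, n_spectra=51, sigma=0.01, trials=10000, seed=1):
    """Fraction of pure-noise pixels whose PCC with the calibration pixel exceeds k."""
    rng = random.Random(seed)
    ln_cal = [-i / (n_spectra - 1) for i in range(n_spectra)]  # assumed range [-1, 0]
    hits = 0
    for _ in range(trials):
        ln_pix = [rng.gauss(0.0, sigma) for _ in range(n_spectra)]
        if pearson(ln_cal, ln_pix) > k:
            hits += 1
    return hits / trials

rate = false_positive_rate(0.425)
print(f"single-pixel false positive rate at k = 0.425: {rate:.4%}")
```

The printed value is a single-pixel rate; the requirement of three consecutive flagged pixels makes a spurious detection far less likely than this number alone suggests.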

Once a threshold PCC is established, the minimum line depth detectable under the threshold in spectra with S/N ∼ 100 is evaluated. A PCC threshold that is too high will fail to detect shallow lines (generating false negatives), reducing the sensitivity of the model. We again generated points representing pixels from 51 spectra with the form [$\ln(I_{cal})$, $\ln(I_\lambda)$]. The calibration line depth, $\ln(I_{cal})$, was again evenly distributed across the range [−[...], [...] $\ln(I_\lambda)$ were scaled according to $\ln(I_\lambda) = c \cdot \ln(I_{cal})$. By randomly selecting values of $c \in [0, [...]$ [...] S/N ∼ 100 was then added to $\ln(I_\lambda)$, and the percentage of time that the PCC was greater than $k$ for pixel pairs was recorded. This simulation was repeated for 100,000 trials, and the results show that 90% of lines deeper than 2.3% and 99.9% of lines deeper than 3% of the continuum will be identified with the linear regression method described here (Figure 3, right). However, there is a precipitous drop in our ability to model tellurics with line depths shallower than 2%. This result is, of course, dependent on the S/N of the training population and should improve if the training set had higher S/N and better continuum normalization.

4.2.3. SELENITE's Water Telluric Model

The steps taken to identify and model water tellurics in Section 4.2.1 are summarized below.

1. The PCC of each pixel's growth with a calibration pixel is calculated. A threshold PCC, $k$, is established, and pixels with PCC > $k$ are flagged as significant.
2. Single or double pixels with PCC > $k$ are rejected as spurious.
3. The training data set is coadded and a peak detection algorithm is applied to each cluster of three or more pixels. Clusters which do not contain a peak are rejected as tellurics.
4. Any cluster of flagged pixels with no other cluster within 10 Å is rejected as a telluric feature.
5. Linear regression is carried out on pixels that are flagged as tellurics to measure $m_{\lambda_i cal}$ relative to a pre-identified calibration pixel. The wavelength, regression coefficient, PCC and water/non-water classification of each flagged pixel is then stored in a database.

Table 1 lists an excerpt of a database generated from the training data's content using the 5901.6 Å telluric as a calibrator. To generate a model of telluric water lines, the intensity of the central pixel in a calibration line is measured and information in the database is used to generate water tellurics for every pixel in the spectrum:

$$\ln I_{\lambda_i} = \begin{cases} m_{\lambda_i cal} \cdot \ln I_{cal} & \text{when } \lambda_i \in \text{valid peak} \wedge PCC_\lambda > k \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

where $m_{\lambda_i cal}$ is the ratio of the effective cross-section for absorption at $\lambda_i$ to the effective cross-section at the calibration line wavelength $\lambda_{cal}$ (or the weighted average for an ensemble of calibration lines), $I_{cal}$ is the intensity at the calibration line wavelength, and $k$ is the threshold correlation coefficient indicating telluric presence.
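The steps above and Equation 5 can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the "valid peak" condition is reduced to the consecutive-pixel test (the peak detection and 10 Å isolation tests are omitted), and the function names, slopes and PCC values are hypothetical.

```python
def keep_runs(flags, min_run=3):
    """Keep only flagged pixels that belong to a run of at least min_run pixels."""
    kept = [False] * len(flags)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last pixel.
    for i, flagged in enumerate(list(flags) + [False]):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    kept[j] = True
            start = None
    return kept

def water_model(ln_cal, slopes, pccs, k=0.425, min_run=3):
    """Equation 5: ln I = m * ln I_cal on accepted pixels, 0 (unit flux) elsewhere."""
    flags = keep_runs([p > k for p in pccs], min_run)
    return [m * ln_cal if f else 0.0 for m, f in zip(slopes, flags)]

# Hypothetical database columns for eight adjacent pixels.
pccs = [0.10, 0.95, 0.20, 0.94, 0.99, 0.99, 0.94, 0.10]
slopes = [0.00, 0.30, 0.00, 0.34, 0.66, 0.89, 0.50, 0.00]
model = water_model(ln_cal=-0.2, slopes=slopes, pccs=pccs)
print(model)
```

In this example the isolated high-PCC pixel at index 1 is dropped as spurious, while the run of four pixels at indices 3 to 6 is scaled by the measured calibration intensity.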
Generation of the telluric water model takes less than 3 minutes on a 2015 Macbook Air with a 2.2 GHz Intel Core i7 processor and 8 GB of 1600 MHz DDR3 RAM and allows for identification of variable numbers of telluric-contaminated pixels, depending on the PWV. This is valuable since, as Figure 2 shows, water telluric size can vary by an order of magnitude. On nights with high PWV, at a threshold $k$ of 0.425 (see Section 4.2.2), up to ∼[...].1% of pixels under 6800 Å are flagged. On dry nights, as few as ∼[...].2% of pixels under 6800 Å are flagged. This is a savings of ∼75% of an order.

Table 1. An excerpt from the telluric database generated for our training spectra.

λ_i (Å)       σ_λ/σ_cal    PCC      Species Flag
5898.12061    0.49523      0.992    W
5898.14209    0.89206      0.994    W
5898.18457    0.66062      0.991    W
5898.20556    0.34039      0.937    W
5898.99121    0.47828      0.977    W

4.2.4. Identifying and Modelling Water Microtellurics

SELENITE is successful at identifying relatively shallow telluric features. Figure 4 shows the training set spectra for the wavelength range between 5075 Å and 5120 Å. From the NSO atlas (Figure 1) it is clear that this wavelength range should only contain weak microtelluric lines. Spectra in Figure 4 (left) are color-coded by the intensity of the calibrating water telluric line at 5898.16 Å and it is difficult to see correlated growth for any microtelluric lines. However, when the pixels in each spectrum are color-coded by the strength of the PCC (regressed against a pixel in the core of the 5898.16 Å line), even telluric lines with a depth close to the photon noise in the continuum emerge with high confidence (Figure 4, middle). A close-up view (see right panel of Figure 4) highlights a detected microtelluric line with a depth only slightly greater than the photon noise.

Figure 4. (left) Segments of 51 overplotted CHIRON spectra in the wavelength range between 5082 - 5094 Å, color-coded by airmass. It is difficult to pick out telluric lines in this image. However, when the pixels are color-coded according to the PCC (middle), several microtellurics can be detected with high confidence. Zooming in on the wavelength segment at 5086 Å (right), the correlated pixel structure for identified weak telluric lines appears to be cleanly identified.
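The effect shown in Figure 4 can be reproduced on synthetic data. The Python sketch below is illustrative only: it builds a hypothetical stack of 51 noisy spectra containing one weak line (a few percent deep at its strongest) and computes the PCC of every pixel against an assumed calibration pixel. The line strengths, pixel count and range of ln(I_cal) are assumptions; the noise level and threshold follow the text.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)
n_spectra, n_pixels, noise, k = 51, 21, 0.01, 0.425
# Assumed calibration-pixel log intensities, evenly spaced over [-1, 0].
ln_cal = [-i / (n_spectra - 1) for i in range(n_spectra)]
# Hypothetical relative strengths: a weak three-pixel line centered on pixel 10.
strength = [0.0] * n_pixels
strength[9:12] = [0.04, 0.06, 0.04]

# Each row is one spectrum in log intensity: scaled line plus Gaussian noise.
stack = [[s * c + rng.gauss(0.0, noise) for s in strength] for c in ln_cal]
# PCC of each pixel column against the calibration pixel.
pcc = [pearson(ln_cal, [row[p] for row in stack]) for p in range(n_pixels)]
flagged = [p for p, r in enumerate(pcc) if r > k]
print("pixels with PCC > k:", flagged)
```

In any single synthetic spectrum the line is hard to distinguish from noise, yet its core stands out clearly in the per-pixel PCC because the correlation pools information from all 51 spectra.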

Moreover, SELENITE is accurate for microtellurics whose depth is close to the shot noise of the spectra. As an example, the pixel intensity at the center of a shallow microtelluric line is plotted against the pixel intensity of the 5898.16 Å calibration line in Figure 5. Following the format for Figure 2, the telluric spectra in the wavelength region around 5898.16 Å and the spectra near 5086.3 Å (Figure 5 right) are color-coded according to the depth of the 5898.16 Å line. The linear regression between the calibration line and the underscored microtelluric line at 5086.3 Å is shown in the left panel of Figure 5 and models the intensity of the microtelluric line with a mean SSE of 0.009, comparable to the noise level implied by the S/N of the spectrum.

4.2.5. Using an Arbitrary Pixel as a Calibrator

A powerful feature of SELENITE is that any arbitrary pixel or ensemble of pixels in the database can be substituted for the calibration pixel without requiring additional analysis by dividing each linear coefficient by the scale factor from the original calibration pixel to the new calibration pixel. As an example, Equation 6 shows how a model based on calibration line $A$ can be transformed to a model based on calibration line $B$.

$$\ln I_{\lambda_i} = \frac{m_{\lambda_i A}}{m_{BA}} \cdot \ln I_B \tag{6}$$

The linear coefficients in the regression model were derived with B stars (telluric stars) because these spectra have both high S/N and few spectral lines. However, once the linear coefficients have been derived, the coefficients can be used to model telluric contamination in spectra of later type stars as long as the selected telluric calibration line is isolated from the stellar absorption lines or the stellar absorption feature is well enough known (for example, by spectral synthesis modeling) that it can be divided out. The ability to use the database to switch between different calibrating pixels (described above) offers critical flexibility for modeling tellurics in spectra of late type stars.

Figure 5. (left) The correlation between a microtelluric line intensity at 5086.3 Å and a telluric calibration line at 5898.16 Å. (right-top) A plot of a set of telluric lines at 5898.16 Å for our 51 B star spectra, color-coded by line depth. (right-bottom) The microtelluric line at 5086.3 Å (underlined). As before, the spectra for the microtelluric line are color-coded by the intensity of the calibrating telluric line, emphasizing the correlation.

4.3. Results for Water Tellurics

4.3.1. Model Goodness of Fit

We evaluate SELENITE's goodness of fit using the B star HR3982's telluric spectrum. The HR3982 spectrum used was generated by averaging 3 unique observations taken over 40 min to drive up its S/N. Goodness of fit was measured using the reduced chi squared ($\chi^2_{red}$) test statistic. HR3982's observed flux was treated as the true model, $F_{obs,i}$, SELENITE's model of the flux as the "data", $F_{model,i}$, and the error calculated by the data reduction pipeline (0.75% of continuum), scaled by (a) the root of the number of spectra coadded ($\sqrt{3}$) and (b) the root of the model's flux ($\sqrt{F_{model,i}}$), as the statistical errors, $\sigma_{model,i} = 0.[...] / \sqrt{F_{model,i}}$.

First, to estimate the data quality independent of telluric removal, we measured the $\chi^2_{red}$ of a 3200 px wavelength range unaffected by telluric lines, 4892 Å - 4952 Å, against a model of unity. We found a $\chi^2_{red}$ of 1.03, suggesting that our errors were well-calibrated. Next, the $\chi^2_{red}$ of our model's fit in a 3200 px wavelength range with heavy water tellurics, 6472 Å - 6545 Å, was measured. This range was chosen because (a) it contains the most intense water tellurics bluewards of 6800 Å and (b) it is free from stellar features. Only pixels where a telluric was detected were included in the $\chi^2_{red}$ calculation. A 25 px range from 6521.5 Å - 6522.5 Å was found to have errors 20× higher than any other error; this region was flagged as an outlier and excluded. The $\chi^2_{red}$ of the telluric model was found to be 1.25. In particular, the line cores were fit well, with a $\chi^2_{red}$ of 1.11. To reach a similar $\chi^2_{red}$ in the affected and unaffected region, errors in the affected region need to be increased by ∼[...].

Figure 6 (top) plots a 5 Å excerpt from the affected region, with HR3982's spectrum shown in purple and our model shown in blue. The fit's residuals deviate from unity by 1.0% on average, comparable to the unaffected regions of the spectrum and the performance of radiative transfer codes (Ulmer-Moll, S. et al. 2019). One potential flaw in our model is that modelling all points without significant telluric signal as unity creates discontinuities in the telluric wings. However, Figure 6 (bottom) indicates these discontinuities are small, and most users will prefer to mask affected pixels rather than divide them out.
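The goodness-of-fit statistic used in this section can be sketched as follows. This Python example is an assumption-based illustration, not the authors' pipeline: it takes the per-pixel error to be the pipeline error (0.75% of continuum) divided by the root of the number of coadded spectra and by the root of the model flux, following the description in the text, and the function name, the `n_params` argument and the flux values are hypothetical.

```python
import math

def reduced_chi_squared(f_obs, f_model, sigma0, n_coadded=1, n_params=0):
    """Sum of squared, error-normalized residuals divided by the degrees of freedom.

    Assumed error model: sigma_i = sigma0 / sqrt(n_coadded) / sqrt(f_model_i).
    """
    chi2 = 0.0
    for fo, fm in zip(f_obs, f_model):
        sigma = sigma0 / math.sqrt(n_coadded) / math.sqrt(fm)
        chi2 += ((fo - fm) / sigma) ** 2
    return chi2 / (len(f_obs) - n_params)

sigma0 = 0.0075  # pipeline error as a fraction of continuum (from the text)
# Hypothetical normalized model fluxes across a shallow telluric line.
f_model = [1.0, 0.95, 0.80, 0.95, 1.0]
# Synthetic "observations" offset from the model by exactly one sigma per pixel.
f_obs = [fm + sigma0 / math.sqrt(3) / math.sqrt(fm) for fm in f_model]

value = reduced_chi_squared(f_obs, f_model, sigma0, n_coadded=3)
print(f"reduced chi squared = {value:.3f}")
```

Because every synthetic residual equals one standard error, the statistic comes out at unity, which is the value expected when the error estimates are well calibrated.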

Figure 6. (top) Excerpt of SELENITE's fit (blue) to HR3982's science spectrum (purple). The fit's goodness is quantified in Section 4.3.1. (bottom) Residuals generated by dividing out the telluric model. The residuals deviate from unity by ∼1% of the continuum on average, comparable to unaffected regions of the spectrum.

Relative Contribution of PWV and Airmass to Water Line Depth

A further result is that the contribution of PWV to water line depth generally dominates over that of airmass. As an example, Figure 7 shows that a low airmass (z = 1.144) observation of the 5900 Å water lines can exhibit significantly greater line depth than a subsequent higher airmass (z = 1.454) observation because of changes in PWV. While the water column density for an observation depends on both the average number density of absorbers along the line of sight (PWV) and the path length (airmass), PWV can vary by as much as an order of magnitude while airmass generally ranges between 1 and 2. In general, water line depth only weakly correlates with airmass. This lack of correlation can be exploited to distinguish water and non-water lines.

4.4. Non-water Tellurics

In this section, telluric absorption lines from molecules other than water are considered. Like water tellurics, each non-water telluric can be modeled by the radiative transfer equation for a plane parallel atmosphere, and thus its signal intensity is given by $\sigma_{\lambda_i} \cdot n_j \cdot z$, where $n_j$ is the number density of the molecular species $j$.

Unlike water tellurics, however, non-water tellurics have no equivalent of PWV. Ignoring small seasonal variations in gases such as CO$_2$, $n_j$ is spatially and temporally fixed. Each non-water species in the atmosphere is evenly distributed with a constant number density. Therefore the column density of non-water lines only varies with airmass: by measuring airmass, we can predict the depth of every non-water line in the spectrum.

Figure 7. A spectrum observed at an airmass of 1.144 (red) displays significantly deeper telluric lines than a spectrum observed at an airmass of 1.454 because of changes in PWV between observations.

As an example, Figure 8 (right) shows that over our observed range of airmass ($z$ between 1.[…] and […].8) the signal intensity of the oxygen telluric feature at 6277.7 Å (Figure 8, left) is well fitted by the linear regression model $\ln(I) = m \cdot z + b$. The slope of the regression model, $m$, measures $\sigma_{\lambda_i} \cdot n_j$. Another difference from the model for water lines is that the y-intercept (a fictitious extrapolation to zero airmass) is small but non-zero.

Figure 8. (left) The correlation between the 6277.6 Å oxygen feature and airmass. (right) The telluric oxygen feature at 6277.6 Å for our set of 51 CHIRON spectra, color-coded by airmass.
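The per-pixel airmass regression described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not SELENITE's implementation: the function names are hypothetical, `intensity` is taken to be the continuum-normalized flux in a single pixel, and an ordinary least-squares fit via `numpy.polyfit` stands in for whatever regression routine the authors used. The airmass and flux values are synthetic.

```python
import numpy as np

def fit_airmass_regression(airmass, intensity):
    """Least-squares fit of ln(I) = m * z + b for one pixel.

    Returns (m, b). Under the text's model, m tracks sigma * n for the
    absorbing species and b is the extrapolation to zero airmass.
    """
    z = np.asarray(airmass, dtype=float)
    ln_i = np.log(np.asarray(intensity, dtype=float))
    m, b = np.polyfit(z, ln_i, 1)  # degree-1 fit returns [slope, intercept]
    return float(m), float(b)

def predict_intensity(z, m, b):
    """Predicted intensity of the pixel at airmass z."""
    return float(np.exp(m * z + b))

# Synthetic pixel (made-up numbers) that follows the model exactly.
z_obs = np.array([1.0, 1.2, 1.5, 1.8])
i_obs = np.exp(-0.05 * z_obs + 0.01)
m_fit, b_fit = fit_airmass_regression(z_obs, i_obs)
i_pred = predict_intensity(1.3, m_fit, b_fit)
```

Once `m` and `b` are stored for each non-water pixel, predicting the telluric depth for a new observation needs only that observation's airmass.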

Like water lines, non-water lines can be identified by measuring the correlation of their growth with airmass. Each pixel whose growth's PCC with airmass is above a threshold, $k$, is assumed to contain a non-water telluric and undergoes the same procedure as water telluric pixels. Again, this potentially allows for the detection of tellurics not listed in the HITRAN database.
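The identification step can be sketched as a vectorized Pearson correlation between each pixel's growth and airmass. Several details here are assumptions made for illustration: growth is measured as $-\ln(\mathrm{flux})$ of the normalized spectrum, the threshold value `k=0.6` is a placeholder rather than the paper's value, and the function name is hypothetical.

```python
import numpy as np

def flag_airmass_correlated_pixels(flux, airmass, k=0.6):
    """Flag pixels whose growth correlates with airmass.

    flux    : (n_obs, n_pix) continuum-normalized spectra
    airmass : (n_obs,) airmass of each observation
    k       : PCC threshold (placeholder value, an assumption)
    Returns (pcc, mask), where mask marks pixels with pcc > k.
    """
    flux = np.asarray(flux, dtype=float)
    z = np.asarray(airmass, dtype=float)
    depth = -np.log(flux)               # assumed definition of "growth"
    z_c = z - z.mean()
    d_c = depth - depth.mean(axis=0)
    num = z_c @ d_c                     # covariance numerator per pixel
    den = np.sqrt(np.sum(z_c ** 2) * np.sum(d_c ** 2, axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        pcc = num / den
    pcc = np.nan_to_num(pcc, nan=0.0)   # constant pixels carry no correlation
    return pcc, pcc > k
```

Pixels that never vary (pure continuum) give an undefined correlation, so the sketch maps them to zero and leaves them unflagged.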

Non-water lines can be readily distinguished from water lines because non-water lines have a low correlation with the water calibration pixels but a high correlation with airmass, and vice versa for water lines (see Section 4.3.2). Separating components that vary with airmass from those that don't is a benefit of SELENITE that might well be useful outside the scope of this paper, such as in the near IR, where H$_2$O, CO$_2$ and CH$_4$ lines mix. When a water and a non-water line blend, the composite line can have a significant correlation with both the water calibrator and airmass. A regression model is not fit to composite lines, but they are flagged in the database.

4.5. Results for Non-Water Tellurics

We evaluate SELENITE's goodness of fit using the B star HR 3982's telluric spectrum, following the procedure described in Section 4.3.1. This time, however, we measured the $\chi^2_{\mathrm{red}}$ of the model's fit from 6257 Å-6328 Å, a 3200 px wavelength range which encompasses the heart of the 6280 Å O$_2$ γ atmospheric band. Only pixels where a non-water telluric was detected were measured. The $\chi^2_{\mathrm{red}}$ of the telluric model was found to be 1.17. To reach a similar $\chi^2_{\mathrm{red}}$ in the affected and unaffected regions, errors in the affected region need to be increased by 2.[…] γ atmospheric band. The fit's residuals deviate from unity by ∼0.75% on average, comparable to unaffected regions of the spectrum.

Figure 9. (top) Excerpt of the model's fit (blue) to two oxygen doublets in the O$_2$ γ atmospheric band in HR3982's science spectrum (purple). The fit's goodness is quantified in Section 4.5. (bottom) Residuals generated by dividing out the telluric model. The residuals deviate from unity by ∼0.75% of the continuum on average, comparable to unaffected regions of the spectrum.

Unfortunately, there are no non-water species other than oxygen with telluric lines bluewards of 6800 Å, so we cannot evaluate our model on other species. In principle, however, any well-mixed non-water species should behave as oxygen does.

4.6. Modelling Tellurics in a K Dwarf Spectrum

Late-type stars display complex absorption features. These absorption features do not complicate SELENITE's non-water modelling, which only measures airmass, but they do complicate water modelling, since they may blend with a calibration pixel's line. To compensate for the loss of any given calibration pixel, a large (50+) ensemble of potential calibration pixels is given in the database.

Calibration pixels which are blended with stellar lines are identified and removed as follows. Initially, a telluric model is built by regression against the average of all calibration pixel depths. If any […] α Centauri B. We measured the $\chi^2_{\mathrm{red}}$ of the model's fit at the 6450 Å water band described in Section 4.3.1. This measurement, however, was complicated by α Centauri B's stellar lines: if a telluric line is blended with a stellar line, the model's fit will appear incorrect. This problem was overcome by noticing that changes in the Earth's barycentric velocity substantially shift the stellar lines between two observations of α Centauri B taken months apart, while leaving the telluric lines in the same position. Tellurics that are blended in the first observation will often be unblended in the second observation, and vice versa.

To illustrate, Figure 10 (top) shows SELENITE's fit to two observations of α Centauri B, at barycentric velocities of 1860 m/s and 20500 m/s, for the same 5 Å wavelength range shown in Section 4.3.1. In the 20500 m/s observation, the deep line at 6475 Å seems ill fit by the model's pair of water lines (underlined), but in the 1860 m/s observation the deep line has shifted, revealing that it was a stellar line blended with a pair of water lines which the model now fits well. The fit's residuals, shown in Figure 10 (bottom), show that when tellurics are removed the two spectra are indeed the same. Where the residuals do not contain a stellar line, they deviate from unity by an average of 1.1%, comparable to the results of a radiative transfer code.
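The size of the stellar-line shift that makes this deblending possible follows from the non-relativistic Doppler relation $\Delta\lambda = \lambda \, \Delta v / c$. The sketch below evaluates it for the 6475 Å line and the two barycentric velocities quoted in the text; the pixel scale is an assumption derived from the 3200 px, 6472 Å-6545 Å range quoted in Section 4.3.1, and the function name is hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def doppler_shift_angstrom(wavelength_angstrom, delta_v_m_per_s):
    """Non-relativistic wavelength shift of a stellar line for a velocity change."""
    return wavelength_angstrom * delta_v_m_per_s / C_M_PER_S

# Velocity difference between the two observations quoted in the text.
delta_v = 20500.0 - 1860.0                           # m/s
shift_angstrom = doppler_shift_angstrom(6475.0, delta_v)

# Assumed pixel scale: 3200 px spanning 6472-6545 Angstrom.
angstrom_per_pixel = (6545.0 - 6472.0) / 3200.0
shift_pixels = shift_angstrom / angstrom_per_pixel
```

Under these assumptions the stellar line moves by roughly 0.4 Å, or on the order of 18 pixels, relative to the telluric lines, which stay fixed in the observer's frame.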

Figure 10. (top) Excerpt of the model's fit (blue) to α Centauri B's science spectrum (purple) at barycentric velocities of 1820 m/s and 20500 m/s. The apparent misfit underlined in the 20500 m/s spectrum is revealed to be a good fit in the 1820 m/s spectrum when the deep stellar line at 6475 Å shifts bluewards. (bottom) The model's residuals reveal that the two underlying spectra are the same.

When we compute $\chi^2_{\mathrm{red}}$, if the spectrum grossly deviates from the fit at a pixel (by 3.0% or more of the continuum), we assume that the pixel is blended with a stellar line and reject it. Following this procedure, we found a $\chi^2_{\mathrm{red}}$ of 2.95 and 3.17 for the 1840 m/s and 20500 m/s α Centauri B observations, respectively. This fit, while acceptable, is somewhat poorer than HR3982's fit, in large part because telluric lines often blend with stellar line tails, disrupting their profile slightly. For example, the wings of the small telluric at 6472.5 Å (at the far left of Figure 10) are blended with a small stellar line, inflating the measurement of $\chi^2_{\mathrm{red}}$.

DISCUSSION

Because of the barycentric velocity of the Earth, telluric lines raster across the stellar line profiles in time-series Doppler measurements. Even shallow microtelluric features will degrade the fidelity of high-resolution spectra and may contribute up to 0.5 m s$^{-1}$ to the RV error budget. Since the Earth induces a radial velocity of 10 cm s$^{-1}$ in the Sun, telluric contamination is a significant challenge in the search for analogs of our world. In this paper, we present SELENITE, an empirical technique for identifying and modelling telluric features in the optical (4500 Å-6800 Å), using two observations: (a) water tellurics grow proportionally to PWV and therefore proportionally to each other and (b) non-water tellurics grow proportionally to airmass. Water tellurics are identified by looking for pixels whose growth correlates with a known calibration water telluric and modelled by regression against it. Non-water tellurics are identified by looking for pixels whose growth correlates with airmass and modelled by regression against it. SELENITE has several advantages over the alternatives:

• Runtime: Once the database is built (< […] min on a standard PC), fitting a spectrum takes several seconds, permitting SELENITE to be used at the telescope to help guide observing runs.

• Observing time: Unlike standard stars, after a one-time observation of a few dozen B stars to build the database, SELENITE requires no further observations, saving observing time.

• Requires no atomic/molecular line data: Unlike radiative transfer codes, SELENITE does not require atomic/molecular line data. This is useful because the literature suggests HITRAN is not always accurate. In particular, Seifahrt, A. et al. (2010) note: "Line data in HITRAN have strongly varying accuracy levels. Typical uncertainties of line positions range from a few to several hundred m/s, but can be as high as several km/s in extreme cases. Line strengths are rarely precise to the 1% level." Further, Rudolf, N. et al. (2016) find that inaccuracies in the HITRAN database frustrate their ability to model water lines accurately.

• Distinguishes tellurics that vary primarily with airmass from those that don't: Although outside the paper's scope, this feature could be very useful in the near IR, where H$_2$O, CO$_2$ and CH$_4$ lines mix.

We acknowledge, however, that SELENITE has certain limitations. First, stellar features in the set of training B stars (e.g., the Paschen and Brackett lines) will distort its model. This problem can be solved by interpolating over each absorption feature, at the cost of introducing additional uncertainty to regions of scientific interest.

Second, SELENITE only varies with airmass and PWV. Other atmospheric phenomena which may affect line profiles (e.g., wind speed (Caccin et al. 1985)) are not taken into account. Instrumental changes, such as a varying LSF, are also not considered, and can only be handled by rebuilding the database for each instrumental profile change.

Third, SELENITE's […] / N data, at lower S/N a line's wings may not clear the PCC threshold, truncating them.

Despite these limitations, evaluations show that SELENITE provides excellent fits. The model's fits to regions of intense water tellurics and non-water tellurics in the B star HR3982 had $\chi^2_{\mathrm{red}}$ of 1.25 and 1.17, and thus errors just 10.5% and 2.0% bigger than the continuum's fit to unity. Further, SELENITE's fits to the K-dwarf α Centauri B observations had $\chi^2_{\mathrm{red}}$ of 2.95 and 3.17, despite the $\chi^2_{\mathrm{red}}$ test statistic being inflated by stellar line blending, confirming that it provides a good fit to late-type stars. SELENITE's average residual is 1.0% and 0.75% for HR3982 and 1.1% for α Centauri B, comparable to the residuals of radiative transfer codes (Ulmer-Moll, S. et al. 2019).

ACKNOWLEDGEMENTS

The authors gratefully acknowledge enabling support from the following grants: NSF-1616086, NSF-MRI0923441, NASA-NNH17ZDA001N-XRP, NASA-NNH11ZDA001N-OSS. NSO/Kitt Peak FTS data used here were produced by NSF/NOAO.

Facilities:

CTIO: (CHIRON)


REFERENCES