Towards a Self-calibrating, Empirical, Light-Weight Model for Tellurics in High-Resolution Spectra
Draft version March 21, 2019
Typeset using LaTeX preprint style in AASTeX61
TOWARDS A SELF-CALIBRATING, EMPIRICAL, LIGHT-WEIGHT MODEL FOR TELLURICS IN HIGH-RESOLUTION SPECTRA
Christopher Leet, Debra A. Fischer, and Jeff A. Valenti Yale University, 52 Hillhouse, New Haven, CT 06511, USA Space Telescope Science Institute, 3700 San Martin Dr., Baltimore, MD 21218 USA (Received; Revised; Accepted)
Submitted to ApJ

ABSTRACT

To discover Earth analogs around other stars, next generation spectrographs must measure radial velocity (RV) with 10 cm/s precision. To achieve 10 cm/s precision, however, the effects of telluric contamination must be accounted for. The standard approaches to telluric removal are: (a) observing a standard star and (b) using a radiative transfer code. Observing standard stars, however, takes valuable observing time away from science targets. Radiative transfer codes, meanwhile, rely on imprecise line data in the HITRAN database (typical line position uncertainties range from a few to several hundred m/s) and require difficult-to-obtain measurements of water vapor column density for best performance. To address these issues, we present SELENITE: a SELf-calibrating, Empirical, Light-Weight liNear regressIon TElluric model for high-resolution spectra. The model exploits two simple observations: (a) water tellurics grow proportionally to precipitable water vapor and therefore proportionally to each other and (b) non-water tellurics grow proportionally to airmass. Water tellurics can be identified by looking for pixels whose growth correlates with a known calibration water telluric and modelled by regression against it, and likewise non-water tellurics with airmass. The model requires no line data, water vapor measurements, or additional observations (beyond one-time calibration observations), achieves fits with a $\chi^2_{\rm red}$ of 1.17 on B stars and 2.95 on K dwarfs, and leaves residuals of 1% (B stars) and 1.1% (K dwarfs) of continuum. Fitting takes seconds on laptop PCs: SELENITE is light-weight enough to guide observing runs.
Keywords: techniques: radial velocities – methods: data analysis
Corresponding author: Christopher Leet

1. INTRODUCTION

To expand the success of exoplanet searches, next generation spectrographs are aiming for sub-meter-per-second precision in radial velocity measurements. If the 10 cm s$^{-1}$ instrumental precision goal of the Echelle SPectrograph for Rocky Exoplanets Search and Stable Spectroscopic Observations (ESPRESSO; Pepe et al. 2013) and the EXtreme PREcision Spectrograph (EXPRES; Jurgenson et al. 2016) is reached, we will be able to detect small rocky planets orbiting in the habitable zones of their host stars. Such high precision requires extraordinary new fidelity in spectroscopic data: high resolution, high S/N and greater instrumental stability. In addition to controlling instrumental errors, success requires accounting for any systematic temporal changes in the spectral line profiles, which can arise from photospheric velocities or telluric contamination (Fischer et al. 2016).

Most work on modeling telluric contamination has been tested at near infrared wavelengths, where the telluric line depths are comparable to stellar absorption lines. However, the next generation optical spectrographs aiming for 10 cm s$^{-1}$ radial velocity precision will be affected by time-variable microtellurics that raster across the stellar spectrum because of barycentric velocity shifts. If we do not identify the pixels that are producing these small perturbations to spectral line profiles, then microtellurics may dominate the error budget for extreme precision radial velocity programs.

2. TELLURIC SPECTRA

Atomic and molecular species in the Earth’s atmosphere interact with solar radiation and produce absorption and emission lines that are imprinted in stellar spectra obtained with ground-based spectrographs. The non-water constituents (e.g., N$_2$, O$_2$, Ar, Ne, He) are well-mixed, and maintain a nearly fixed element ratio throughout the troposphere, stratosphere, and mesosphere.
The concentration of some non-water species (CO$_2$, CH$_4$, NO$_x$) exhibits seasonal changes or modulation from post-industrial human activities. However, these gases have stable concentrations on timescales of (at least) several days. In contrast, 99% of atmospheric water vapor is confined to the troposphere and exhibits both temporal and spatial variability that can change by more than 10% on a timescale of an hour (Blake & Shaw 2011).

Figure 1 shows the telluric spectrum over the wavelength range 4500-6800 Å. The high S/N and resolution of the FTS telluric spectrum show that the optical spectrum is peppered with microtelluric lines with depths that are only a few percent of the continuum. Many of the lines shallower than 1% in Figure 1 will disappear when convolved with the instrumental line spread function (LSF) of high-resolution spectrographs, producing small, but systematic, time-variable line profile variations. Optical RV programs aiming for 10-20 cm s$^{-1}$ precision will need to account for microtelluric lines because they introduce errors that exceed the target RV precision (Cunha et al. 2014).

Figure 1. The FTS solar spectrum from 4500-6800 Å has a resolution, R ∼ [...],000 and S/N > [...].

3. CURRENT BEST PRACTICES

Since telluric contamination is a serious error source for high precision spectroscopy, there is a rich literature of practices for telluric modelling. These practices fall into three categories: (a) modelling using telluric standard stars (Section 3.1), (b) modelling using radiative transfer codes (Section 3.2) and (c) modelling using principal component analysis (Section 3.3). Finally, we discuss the literature surrounding a new challenge in telluric modelling: microtelluric modelling (Section 3.4).

3.1. Telluric Standard Stars
The classical approach to removing telluric absorption features is to observe a telluric standard star close in time and airmass to the science object (Vacca et al. 2003; Vidal-Madjar et al. 1986). The science target’s spectrum is then divided by the spectrum of the standard star. Typically, early type stars from early A to late B are chosen as standard stars because they exhibit few and weak metal lines, and their rapid rotation helps smear out the lines that remain. The high S/N afforded by bright stars means that with high spectral resolution, even shallow telluric lines are discernible. These stars have the drawback that their strong hydrogen absorption features at the Brackett and Paschen lines blend with their tellurics (Rudolf, N. et al. 2016). As an alternative, a solar type star can be used as a telluric standard together with a high-resolution solar spectrum (Maiolino et al. 1996).

Using any standard star as a telluric reference model has several well known drawbacks. First, it takes away precious observing time from an observation’s science targets, especially when high S/N requirements are to be met (Seifahrt, A. et al. 2010). Second, its accuracy is limited by how well the standard star’s spectrum is known. Early type stars often display spectral features such as oxygen or carbon lines in the near-infrared. Similarly, absorption line depths of solar-type stars may deviate from the solar FTS atlas due to metal abundance or surface temperature deviations, leaving residuals from the star’s intrinsic features in the telluric model (Rudolf, N. et al. 2016). Compounding this problem, the need to pick a star close to the science target often forces the observation of less well known stars. Finally, for telescopes with an adaptive optics system (e.g. CRIRES), the change in source brightness between the science target and the standard star will affect the instrumental profile (Seifahrt, A. et al. 2010). In practice, Ulmer-Moll, S. et al. (2019) find that standard stars consistently underperform other telluric removal approaches.

3.2. Radiative Transfer Codes
Another approach is to use line-by-line radiative transfer model (LBLRTM) codes to model telluric lines. This technique requires accurate atmospheric temperature and pressure profiles, an excellent model for the spectrograph line spread function, and a complete and accurate atomic line database. The atmospheric inputs to these codes have benefited from the commercial interest and investments in making more accurate weather predictions. Most radiative transfer codes, including TERRASPEC (Bender et al. 2012), Transmissions Atmosphériques Personnalisées Pour l’AStronomie (TAPAS; Bertaux et al. 2014), Telfit (Gullikson et al. 2014) and Molecfit (Smette et al. 2015), use the HIgh Resolution TRANsmission line database (HITRAN; Rothman et al. 2013) and are able to model non-water telluric lines with an accuracy of around 2%.

Unfortunately, radiative transfer codes also suffer from documented drawbacks. First, radiative transfer codes are limited by imprecise line data in the HITRAN database. The uncertainty in each HITRAN line’s position is typically a few to several hundred m/s, but can be up to multiple km/s. HITRAN line strengths are rarely accurate to the 1% level (Seifahrt, A. et al. 2010). Rudolf, N. et al. (2016) also remark on this problem when modelling tellurics in the near IR.

Second, radiative transfer codes often struggle to model water lines. Bertaux et al. (2014) identify some cases in TAPAS where two adjacent water lines required different amounts of water for an adequate model. This is clearly non-physical (there is only one column density of water), but the authors are uncertain why this discrepancy appears. Rudolf, N. et al. (2016) note that HITRAN’s imperfect water line information induces substantial residuals in their radiative transfer models.

3.3. Principal Component Analysis
Artigau et al. (2014) investigated the use of principal component analysis (PCA) for empirically modeling telluric lines at near infrared wavelengths. They used observations of hot, rapidly rotating stars to build a library of telluric standards with a range of water column density and air mass. The first five principal components of the telluric absorption features were used to fit telluric lines in spectra of program stars using least squares fitting. This empirical approach self-calibrates spectra and avoids the need for atomic line data or estimates of water column density. We believe that PCA’s empirical approach is on the right track. However, PCA is a very generic model, and could benefit from incorporating the well-studied physics of telluric line formation. By introducing principled physical priors, we aim to improve the sophistication of this approach.

3.4. The Challenge of Microtellurics
Most methods for modeling telluric lines have been applied to lines that are redward of 6800 Å. The telluric features at these red wavelengths are easier to identify, both because the telluric lines are deeper and because the density of stellar lines is decreasing. Currently there is no robust method for modeling microtellurics. Unfortunately, simulations by Cunha et al. (2014) show that, if ignored, microtelluric contamination in the optical spectrum will introduce RV errors of 0.2-1.0 m s$^{-1}$, swamping the error budget of next generation RV surveys. Cunha et al. (2014) modeled microtelluric lines in HARPS optical spectra using TAPAS, an online service that simulates atmospheric transmission with input from the ETHER Atmospheric Chemistry Data Centre, atomic line data from [...]. Achieving RV accuracies of 10 cm s$^{-1}$ necessitates accurate modelling of microtellurics.

4. SELENITE: A SELF-CALIBRATING LINEAR REGRESSION MODEL

We now describe SELENITE’s telluric model. Since water and non-water tellurics exhibit different behavior (Hadrava, P. 2006), SELENITE treats their lines separately, and so we develop the model as follows. First, we describe the training data used to illustrate and evaluate SELENITE (Section 4.1). We proceed to describe the model for water tellurics (Section 4.2) and evaluate its performance on the B star HR3982 (Section 4.3). We then describe the model for non-water tellurics (Section 4.4) and evaluate its performance (Section 4.5), before finally combining the two halves and applying them to Alpha Centauri B, a K dwarf with significant stellar features (Section 4.6).

4.1. Training Data
The training data included 51 spectra of rapidly rotating B stars observed with the fiber-fed CHIRON spectrograph (Tokovinin et al. 2013), which is located at the 1.5-m telescope at the Cerro Tololo Interamerican Observatory (CTIO). The B-type stars are ideal for this calibration because they are bright and have few spectral lines, providing high S/N spectra that are relatively easy to continuum normalize. The iodine cell that is used for Doppler measurements with CHIRON was not in the light path for any of these observations. These spectra were obtained with the narrow slit mask, which yields a spectral resolution, $\lambda/\delta\lambda$, of R = 140,000, and exposure times were set to reach a typical S/N of 100. The air mass for each observation was recorded in the FITS header; however, no information was available regarding the PWV or other atmospheric conditions.

Figueira et al. (2010) demonstrate long-term stability of telluric lines at the level of 10 m s$^{-1}$ (corresponding to 0.01 of a pixel) at the La Silla Observatory using the environmentally stabilized and fiber-fed HARPS spectrograph. The CHIRON spectrograph does not have the stability of HARPS, and the spectral format can drift by a fraction of a pixel from night to night. To correct for these small drifts, the spectral orders were cross-correlated to align the telluric absorption lines.

4.2. Water Tellurics

4.2.1. The Theory of Water Tellurics
Each water vapor line has a specific absorption coefficient, $\sigma$, which depends on fundamental atomic and molecular line data, including the log(gf) value, excitation potential, and the partition function. The line strength of water tellurics also changes with the number of absorbers along the line of sight, or the column density. The radiative transfer equation for the intensity of light with wavelength $\lambda$ passing through a plane-parallel atmosphere with a single species of absorber is:

$$I_\lambda = I_{\lambda,0}\, e^{-\sigma_\lambda \cdot n_j \cdot z} \qquad (1)$$

$$\ln I_\lambda = -\sigma_\lambda \cdot n_j \cdot z \qquad (2)$$

where $I_{\lambda,0}$ and $I_\lambda$ are the initial and final intensity, $\sigma_\lambda$ is the effective cross-section for absorption, and $n_j$ is the average number density of water vapor absorbers. Path length, $z$, is measured in units of airmass at zenith. The column density of water vapor, PWV, is $n_j \cdot z$. If a spectrum is normalized ($I_{\lambda,0} = 1.0$), Equation 2 holds and the log intensity of a water line is proportional to the column density ($n_j \cdot z$).

The depth of any two water lines is therefore linearly related: by measuring the depth of an arbitrary water line (or set of lines), we can predict the depth of every other water line in the spectrum. We refer to the water telluric used to construct the telluric spectrum as the calibration telluric, and the pixel at the core of the calibration line as the calibration pixel.

As an example, Figure 2 shows two water telluric lines from the set of training spectra. Both sets of spectra (Figure 2 right, top and bottom) have been color-coded by the intensity of the pixel at $\lambda$ = 5898.16 Å.

Figure 2. (left) Correlation between ln($I_\lambda$) in a pixel containing a telluric signal at 5946.[...] Å and the calibration pixel, ln($I_{\rm cal}$), at 5898.16 Å. [...]

We now derive the precise relationship between the depths of any two water tellurics. From the radiative transfer equation, the intensities of any pair of water lines, ($I_{\lambda_i}$, $I_{\lambda_{\rm cal}}$), grow proportionally to each other in log space.
Since the average number density of water absorbers and the airmass are the same for every line at any time $t_i$, the constant of proportionality between the growth of two lines is $\sigma_{\lambda_i}/\sigma_{\lambda_{\rm cal}}$. We denote this constant of proportionality as $m^{\lambda_i}_{\rm cal}$.

$$\frac{\ln I_{\lambda_i,t_2} - \ln I_{\lambda_i,t_1}}{\ln I_{\lambda_{\rm cal},t_2} - \ln I_{\lambda_{\rm cal},t_1}} = \frac{\sigma_{\lambda_i}\,[n_{t_2} \cdot z_{t_2} - n_{t_1} \cdot z_{t_1}]}{\sigma_{\lambda_{\rm cal}}\,[n_{t_2} \cdot z_{t_2} - n_{t_1} \cdot z_{t_1}]} = \frac{\sigma_{\lambda_i}}{\sigma_{\lambda_{\rm cal}}} \equiv m^{\lambda_i}_{\rm cal} \qquad (3)$$

A similar linear regression is carried out to empirically relate every other pixel in the spectrum to the calibration pixel, implying an equation of the form $\ln I_{\lambda_i} = m^{\lambda_i}_{\rm cal} \ln I_{\lambda_{\rm cal}} + b$. During this process, the y-intercept was always found to be zero, simplifying the regression model to:

$$\ln I_{\lambda_i} = m^{\lambda_i}_{\rm cal} \ln I_{\lambda_{\rm cal}} \qquad (4)$$

One exception to the above is saturated tellurics, which have left the linear regime of growth and do not obey Equation 1. In both our water and non-water analysis, however, we find no telluric deeper than 50% of continuum between 4500 Å and 6800 Å and so no saturated telluric. Saturated tellurics are therefore considered outside of this paper’s scope. Another exception is variation in the instrumental line spread function (LSF) over time, which changes a telluric’s profile. SELENITE does not model instrumental errors, and these variations can only be handled by observing new training data under the new LSF. Fortunately, at CHIRON’s resolution tellurics are marginally resolved, attenuating the effect of LSF changes. In practice, CHIRON’s LSF is relatively stable over years, allowing 2012 K-dwarf observations to be fit by a model built on 2014 B star observations (Section 4.6).

The correlated growth of water tellurics can also be exploited to identify water tellurics. The Pearson correlation coefficient (PCC) of each pixel’s growth with the calibration pixel can be measured, and each pixel whose PCC exceeds a threshold, $k$, can be flagged as containing a water telluric.
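The regression and correlation test described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors’ implementation: the function names `pearson_cc` and `slope_through_origin` and the toy intensities are hypothetical, and the zero-intercept fit simply follows the form of Equation 4.

```python
import math

def pearson_cc(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def slope_through_origin(ln_cal, ln_pix):
    """Least-squares slope m of ln_pix = m * ln_cal (zero intercept, Eq. 4)."""
    return sum(a * b for a, b in zip(ln_cal, ln_pix)) / sum(a * a for a in ln_cal)

# Hypothetical toy data: ln(I) of the calibration pixel at five epochs, and a
# second water pixel whose cross-section is half that of the calibrator.
ln_cal = [-0.05, -0.10, -0.20, -0.30, -0.40]
ln_pix = [0.5 * v for v in ln_cal]

m = slope_through_origin(ln_cal, ln_pix)   # estimate of sigma_i / sigma_cal
pcc = pearson_cc(ln_cal, ln_pix)           # strength of the correlated growth
is_water = pcc > 0.425                     # threshold k, assumed here
```

In real data each pixel would carry noise, so `m` is an estimate of the cross-section ratio rather than an exact value, and the PCC decides whether the pixel is modelled at all.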
Usefully, SELENITE can discover new water tellurics not contained in HITRAN and correct the position of HITRAN’s water tellurics.

Three additional tests are applied to pixels with PCC > $k$ to eliminate false positives. First, the line spread function for CHIRON has a full width at half maximum of 3 pixels. Therefore, we require a minimum of three consecutive pixels with PCC values that exceed $k$. Single or double pixels are assumed to be spurious. Second, because telluric lines have Gaussian profiles, the cluster of flagged pixels must pass a peak detection algorithm. Finally, the high resolution FTS solar spectrum (Figure 1) indicates that telluric lines appear in clusters rather than as single isolated lines. Any isolated telluric without another telluric within 10 Å is therefore rejected.

4.2.2. Establishing a PCC Threshold
The threshold PCC ($k$) for flagging pixels with a telluric signal must be chosen to minimize both the number of spurious detections (false positives) and the number of missed telluric lines (false negatives). This critical step ensures that the model telluric spectrum will have the highest possible fidelity. If spurious features are included in a model, they will be used to assign zero weight to pixels, resulting in lost data for the radial velocity cross-correlation. If telluric features are missed in a model, they will remain in the stellar spectrum and increase the radial velocity errors.

The selection process begins by profiling the false positive rates of different values of $k$. The correlation between a calibration pixel and a noise pixel in the data set is simulated by generating n = 51 points of the form [ln($I_{\rm cal}$), ln($I_\lambda$)]. The values of ln($I_{\rm cal}$) evenly fill a range of negative values up to 0 and represent a range of possible calibration line depths, while values of ln($I_\lambda$) are drawn at random from a Gaussian distribution with $\sigma$ = 0.01, representing shot noise typical of the CHIRON spectra (S/N ∼ 100). [...] 0.425, and fewer than 0.01% of pixels generate a PCC > [...]. [...] $k$ = 0.425 has just a $10^{-7}$% chance of generating a false positive. Since the CHIRON spectrum has about 200,000 pixels, this threshold has just a 0.02% chance of generating a false positive per spectrum.

Figure 3. (left) A cumulative histogram for 100,000 trials to measure the PCC between a designated calibration pixel ln($I_{\rm cal}$) and pixels representing ln($I_\lambda$) with only Gaussian noise scaled to $\sigma$ = 0.01. (right) The probability of detecting a signal as a function of telluric line depth in the presence of the same level of Gaussian noise. The purple solid, red dashed, orange dash-dot and grey dotted lines show the 90, 99, 99.9, and 99.99% limits respectively.
Once a threshold PCC is established, the minimum line depth detectable under the threshold in spectra with S/N ∼ 100 is evaluated. A PCC threshold that is too high will fail to detect shallow lines (generating false negatives), reducing the sensitivity of the model. We again generated points representing pixels from 51 spectra with the form [ln($I_{\rm cal}$), ln($I_\lambda$)]. The calibration line depth, ln($I_{\rm cal}$), was again evenly distributed across the same range, and the values of ln($I_\lambda$) were scaled according to ln($I_\lambda$) = $c$ · ln($I_{\rm cal}$), with values of $c$ selected at random [...]. Gaussian noise corresponding to S/N ∼ 100 was then added to ln($I_\lambda$), and the percentage of time that the PCC was greater than $k$ for pixel pairs was recorded. This simulation was repeated for 100,000 trials, and the results show that 90% of lines deeper than 2.3% and 99.9% of lines deeper than 3% of the continuum will be identified with the linear regression method described here (Fig 3, right). However, there is a precipitous drop in our ability to model tellurics with line depths shallower than 2%. This result is, of course, dependent on the S/N of the training population and should improve if the training set had higher S/N and better continuum normalization.

4.2.3. SELENITE’s Water Telluric Model
The steps taken to identify and model water tellurics in Section 4.2.1 are summarized below.

1. The PCC of each pixel’s growth with a calibration pixel is calculated. A threshold PCC, $k$, is established, and pixels with PCC > $k$ are flagged as significant.
2. Single or double pixels with PCC > $k$ are rejected as spurious.
3. The training data set is coadded and a peak detection algorithm is applied to each cluster of at least three pixels. Clusters which do not contain a peak are rejected as tellurics.
4. Any cluster of flagged pixels with no other cluster within 10 Å is rejected as a telluric feature.
5. Linear regression is carried out on pixels that are flagged as tellurics to measure $m^{\lambda_i}_{\rm cal}$ relative to a pre-identified calibration pixel. The wavelength, regression coefficient, PCC and water/non-water classification of each flagged pixel is then stored in a database.

Table 1 lists an excerpt of a database generated from the training data using the 5901.6 Å telluric as a calibrator. To generate a model of telluric water lines, the intensity of the central pixel in a calibration line is measured and information in the database is used to generate water tellurics for every pixel in the spectrum:

$$\ln I_{\lambda_i} = \begin{cases} m^{\lambda_i}_{\rm cal} \cdot \ln I_{\rm cal} & \text{when } \lambda_i \in \text{valid peak} \wedge {\rm PCC}_\lambda > k \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

where $m^{\lambda_i}_{\rm cal}$ is the ratio of the effective cross-section for absorption at $\lambda_i$ to the effective cross-section at the calibration line wavelength $\lambda_{\rm cal}$ (or the weighted average for an ensemble of calibration lines), $I_{\rm cal}$ is the intensity at the calibration line wavelength, and $k$ is the threshold correlation coefficient indicating telluric presence.
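As a rough illustration of how steps 1, 2 and 5 combine with Equation 5, the sketch below keeps only runs of at least three consecutive pixels above the threshold and then scales the stored slopes by the measured calibration intensity. It is a simplified sketch under stated assumptions: the helper names `runs_at_least` and `water_model` and all numbers are hypothetical, and the peak-detection and 10 Å isolation tests (steps 3 and 4) are deliberately left out.

```python
def runs_at_least(flags, min_len=3):
    """Keep flagged pixels only if they sit in a run of >= min_len consecutive flags."""
    keep = [False] * len(flags)
    start = None
    for i, flagged in enumerate(list(flags) + [False]):  # sentinel closes a trailing run
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_len:
                for j in range(start, i):
                    keep[j] = True
            start = None
    return keep

def water_model(ln_i_cal, slopes, pccs, k=0.425, min_len=3):
    """Eq. 5: ln I = m * ln I_cal on accepted pixels, 0 (unity flux) elsewhere."""
    accepted = runs_at_least([p > k for p in pccs], min_len)
    return [m * ln_i_cal if ok else 0.0 for m, ok in zip(slopes, accepted)]

# Hypothetical database excerpt for seven adjacent pixels.
pccs = [0.10, 0.90, 0.95, 0.90, 0.20, 0.80, 0.10]
slopes = [0.30, 0.50, 0.90, 0.60, 0.20, 0.70, 0.10]
ln_i_cal = -0.2  # measured ln(I) at the core of the calibration line

accepted = runs_at_least([p > 0.425 for p in pccs])
model_ln_flux = water_model(ln_i_cal, slopes, pccs)
```

In this toy case pixels 1 to 3 form a valid run and are modelled, while the isolated pixel 5 exceeds the threshold but is rejected as spurious and left at unity.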
Generation of the telluric water model takes less than 3 minutes on a 2015 Macbook Air with a 2.2 GHz Intel Core i7 processor and 8 GB of 1600 MHz DDR3 RAM and allows for identification of variable numbers of telluric-contaminated pixels, depending on the PWV. This is valuable since, as Figure 2 shows, water telluric size can vary by an order of magnitude. On nights with high PWV, at a threshold $k$ of 0.425 (see Section 4.2.2), up to ∼[...].1% of pixels under 6800 Å are flagged as telluric-contaminated. On dry nights, as few as ∼[...].2% of pixels under 6800 Å are flagged. This is a savings of ∼75% of an order.

λ_i (Å)       σ_λ/σ_cal   PCC     Species Flag
5898.12061    0.49523     0.992   W
5898.14209    0.89206     0.994   W
5898.18457    0.66062     0.991   W
5898.20556    0.34039     0.937   W
5898.99121    0.47828     0.977   W

Table 1. An excerpt from the telluric database generated for our training spectra.

4.2.4. Identifying and Modelling Water Microtellurics
SELENITE is successful at identifying relatively shallow telluric features. Figure 4 shows the training set spectra for the wavelength range between 5075 Å and 5120 Å. From the NSO atlas (Figure 1) it is clear that this wavelength range should only contain weak microtelluric lines. Spectra in Figure 4 (left) are color-coded by the intensity of the calibrating water telluric line at 5898.16 Å and it is difficult to see correlated growth for any microtelluric lines. However, when the pixels in each spectrum are color-coded by the strength of the PCC (regressed against a pixel in the core of the 5898.16 Å line), even telluric lines with a depth close to the photon noise in the continuum emerge with high confidence (Figure 4, middle). A close-up view (see right panel of Figure 4) highlights a detected microtelluric line with a depth only slightly greater than the photon noise.

Figure 4. (left) Segments of 51 overplotted CHIRON spectra in the wavelength range between 5082-5094 Å, color-coded by airmass. It is difficult to pick out telluric lines in this image. However, when the pixels are color-coded according to the PCC (middle), several microtellurics can be detected with high confidence. Zooming in on the wavelength segment at 5086 Å (right), the correlated pixel structure for identified weak telluric lines appears to be cleanly identified.
Moreover, SELENITE is accurate for microtellurics, whose depth is close to the shot noise of the spectra. As an example, the pixel intensity at the center of a shallow microtelluric line is plotted against the pixel intensity of the 5898.16 Å calibration line in Figure 5. Following the format for Figure 2, the telluric spectra in the wavelength region around 5898.16 Å and the spectra near 5086.3 Å (Figure 5 right) are color-coded according to the depth of the 5898.16 Å line. The linear regression between the calibration line and the underscored microtelluric line at 5086.3 Å is shown in the left panel of Figure 5 and models the intensity of the microtelluric line with a mean SSE of 0.009, comparable to the noise level of the spectrum (S/N ∼ 100).

Figure 5. (left) The correlation between a microtelluric line intensity at 5086.3 Å and a telluric calibration line at 5898.16 Å. (right-top) A plot of a set of telluric lines at 5898.16 Å for our 51 B star spectra, color-coded by line depth. (right-bottom) The microtelluric line at 5086.3 Å (underlined). As before, the spectra for the microtelluric line are color-coded by the intensity of the calibrating telluric line, emphasizing the correlation.

4.2.5. Using an Arbitrary Pixel as a Calibrator

A powerful feature of SELENITE is that any arbitrary pixel or ensemble of pixels in the database can be substituted for the calibration pixel without requiring additional analysis, by dividing each linear coefficient by the scale factor from the original calibration pixel to the new calibration pixel. As an example, Equation 6 shows how a model based on calibration line A can be transformed to a model based on calibration line B.

$$\ln I_{\lambda_i} = \frac{m^{\lambda_i}_{A}}{m^{B}_{A}} \cdot \ln I_{B} \qquad (6)$$

The linear coefficients in the regression model were derived with B stars (telluric stars) because these spectra have both high S/N and few spectral lines. However, once the linear coefficients have been derived, the coefficients can be used to model telluric contamination in spectra of later type stars as long as the selected telluric calibration line is isolated from the stellar absorption lines or the stellar absorption feature is well enough known (for example, by spectral synthesis modeling) that it can be divided out. The ability to use the database to switch between different calibrating pixels (described above) offers critical flexibility for modeling tellurics in spectra of late type stars.

4.3. Results for Water Tellurics

4.3.1. Model Goodness of Fit
We evaluate SELENITE’s goodness of fit using the B star HR3982’s telluric spectrum. The HR3982 spectrum used was generated by averaging 3 unique observations taken over 40 min to drive up its S/N. Goodness of fit was measured using the reduced chi-squared ($\chi^2_{\rm red}$) test statistic. HR3982’s observed flux was treated as the true model, $F_{{\rm obs},i}$, SELENITE’s model of the flux as the "data", $F_{{\rm model},i}$, and the error calculated by the data reduction pipeline (0.75% of continuum), scaled by (a) the root of the number of spectra coadded ($\sqrt{3}$) and (b) the root of the model’s flux ($\sqrt{F_{{\rm model},i}}$), as the statistical errors, $\sigma_{{\rm model},i} = 0.[\ldots]/\sqrt{F_{{\rm model},i}}$.

First, to estimate the data quality independent of telluric removal, we compared a 3200 px wavelength range unaffected by telluric lines, 4892-4952 Å, with unity. We found a $\chi^2_{\rm red}$ of 1.03, suggesting that our errors were well-calibrated. Next, the $\chi^2_{\rm red}$ of our model’s fit in a 3200 px wavelength range with heavy water tellurics, 6472-6545 Å, was measured. This range was chosen because (a) it contains the most intense water tellurics bluewards of 6800 Å and (b) it is free from stellar features. Only pixels where a telluric was detected were included in the $\chi^2_{\rm red}$ calculation. A 25 px range from 6521.5-6522.5 Å was found to have errors 20× higher than any other error; this region was flagged as an outlier and excluded. The $\chi^2_{\rm red}$ of the telluric model was found to be 1.25. In particular, the line cores were fit well, with a $\chi^2_{\rm red}$ of 1.11. To reach a similar $\chi^2_{\rm red}$ in the affected and unaffected region, errors in the affected region need to be increased by ∼[...].
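For readers who want to reproduce the statistic, the sketch below computes a reduced chi-squared from observed flux, model flux and per-pixel errors. It is an illustration only: the function name `reduced_chi2` and the toy pixel values are hypothetical, the number of degrees of freedom is assumed to equal the number of pixels (no fitted parameters subtracted), and the errors are simplified to a single constant, 0.75% of continuum reduced by $\sqrt{3}$, with any flux-dependent scaling ignored.

```python
import math

def reduced_chi2(f_obs, f_model, sigma, n_fit_params=0):
    """Reduced chi-squared: sum(((obs - model) / sigma)^2) / degrees of freedom."""
    dof = len(f_obs) - n_fit_params
    if dof <= 0:
        raise ValueError("need more pixels than fitted parameters")
    chi2 = sum(((o - m) / s) ** 2 for o, m, s in zip(f_obs, f_model, sigma))
    return chi2 / dof

# Hypothetical toy pixels (normalized flux); real inputs would be the
# telluric-flagged pixels of the evaluation window.
f_model = [1.00, 0.90, 0.80, 0.95]
# Assumption: one constant per-pixel error for three coadded spectra.
sigma = [0.0075 / math.sqrt(3)] * len(f_model)
# Observations offset from the model by exactly one sigma, alternating in sign.
f_obs = [m + s if i % 2 == 0 else m - s
         for i, (m, s) in enumerate(zip(f_model, sigma))]

chi2_red = reduced_chi2(f_obs, f_model, sigma)
```

Because every toy pixel deviates from the model by one sigma, `chi2_red` comes out at 1 by construction, which is the value expected when the errors are well calibrated.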
Figure 6 (top) plots a 5 Å excerpt from the affected region, with HR3982’s spectrum shown in purple and our model shown in blue. The fit’s residuals deviate from unity by 1.0% on average, comparable to the unaffected regions of the spectrum and the performance of radiative transfer codes (Ulmer-Moll, S. et al. 2019). One potential flaw in our model is that modelling all points without significant telluric signal as unity creates discontinuities in the telluric wings; however, Figure 6 (bottom) indicates these discontinuities are small, and most users will prefer to mask affected pixels rather than dividing out.

Figure 6. (top) Excerpt of SELENITE’s fit (blue) to HR3982’s science spectrum (purple). The fit’s goodness is quantified in Section 4.3.1. (bottom) Residuals generated by dividing out the telluric model. The residuals deviate from unity by ∼1% of the continuum on average, comparable to unaffected regions of the spectrum.
4.3.2. Relative Contribution of PWV and Airmass to Water Line Depth

A further result is that the contribution of PWV to water line depth generally dominates over airmass. As an example, Figure 7 shows that a low airmass (z = 1.144) observation of the 5900 Å water lines can exhibit significantly greater line depth than a subsequent higher airmass (z = 1.454) observation because of changes in PWV. While the water column density for an observation depends on both the average number density of absorbers along the line of sight (PWV) and the path length (airmass), PWV can vary by as much as an order of magnitude while airmass generally ranges between 1 and 2. In general, water line depth only weakly correlates with airmass. This lack of correlation can be exploited to distinguish water and non-water lines.

4.4. Non-water Tellurics
In this section, telluric absorption lines from molecules other than water are considered. Like water tellurics, each non-water telluric can be modeled by the radiative transfer equation for a plane-parallel atmosphere, and thus its signal intensity is given by $\sigma_{\lambda_i} \cdot n_j \cdot z$, where $n_j$ is the number density of the molecular species $j$.

Unlike water tellurics, however, non-water tellurics have no equivalent of PWV. Ignoring small seasonal variations in gases such as CO$_2$, $n_j$ is spatially and temporally fixed. Each non-water species in the atmosphere is evenly distributed with a constant number density. Therefore the column density of non-water lines only varies with airmass: by measuring airmass, we can predict the depth of every non-water line in the spectrum. As an example, Figure 8 (right) shows that over our observed range of airmass ( z between 1 . − . 8) the signal intensity of the oxygen telluric feature at 6277.7 Å (Figure 8, left) is well fitted by the linear regression model $\ln(I) = m \cdot z + b$. The slope of the regression model, $m$, measures $\sigma_{\lambda_i} \cdot n_j$. Another difference from the model for water lines is that the y-intercept (a fictitious extrapolation to zero airmass) is small, but non-zero.

Figure 7. A spectrum observed at an airmass of 1.144 (red) displays significantly deeper telluric lines than a spectrum observed at an airmass of 1.454 because of changes in PWV between observations.

Figure 8. (left) The correlation between the 6277.6 Å oxygen feature and airmass. (right) The telluric oxygen feature at 6277.6 Å for our set of 51 CHIRON spectra, color-coded by airmass.
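The airmass regression described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the use of `numpy.polyfit`, the treatment of `intensity` as a continuum-normalized value per observation, and the synthetic slope, intercept and airmass grid are all assumptions made for the example.

```python
import numpy as np


def fit_airmass_regression(airmass, intensity):
    """Least-squares fit of ln(I) = m * z + b for a single pixel.

    airmass   : airmass z of each observation
    intensity : the pixel's intensity I in each observation (assumed > 0)
    Returns the slope m and the intercept b.
    """
    z = np.asarray(airmass, dtype=float)
    log_i = np.log(np.asarray(intensity, dtype=float))
    m, b = np.polyfit(z, log_i, 1)  # degree-1 polynomial: slope, intercept
    return m, b


def predict_intensity(m, b, airmass):
    """Predict the pixel's intensity at a new airmass from the fitted line."""
    return np.exp(m * np.asarray(airmass, dtype=float) + b)


# Synthetic, noise-free data (hypothetical values, for illustration only).
true_m, true_b = -0.05, -0.002
z_obs = np.linspace(1.0, 1.8, 9)
i_obs = np.exp(true_m * z_obs + true_b)

m_fit, b_fit = fit_airmass_regression(z_obs, i_obs)
i_new = predict_intensity(m_fit, b_fit, 1.5)
```

On noise-free input the fit recovers the slope and intercept used to generate the data; with real spectra the scatter about the line would set the uncertainty on `m_fit` and `b_fit`.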
Like water lines, non-water lines can be identified by measuring the correlation of their growth with airmass. Each pixel whose growth’s PCC with airmass is above a threshold, $k$, is assumed to contain a non-water telluric and undergoes the same procedure as water telluric pixels. Again, this potentially allows for the detection of tellurics not listed in the HITRAN database.
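As a rough sketch of the correlation test described above, the snippet below computes each pixel's Pearson correlation coefficient with airmass and flags pixels that exceed a threshold. The array layout (observations by pixels), the use of a per-pixel absorption depth as the "growth" quantity, the function names and the threshold value `k=0.8` are assumptions for illustration; this passage does not specify them.

```python
import numpy as np


def pcc_with_airmass(depth, airmass):
    """Pearson correlation of every pixel's depth with airmass.

    depth   : array of shape (n_obs, n_pix), absorption depth per pixel
    airmass : array of shape (n_obs,)
    Pixels with no variation get a coefficient of 0.
    """
    depth = np.asarray(depth, dtype=float)
    z = np.asarray(airmass, dtype=float)
    zc = z - z.mean()
    dc = depth - depth.mean(axis=0)
    denom = np.sqrt(np.sum(zc**2) * np.sum(dc**2, axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (zc @ dc) / denom
    return np.nan_to_num(r, nan=0.0)


def flag_non_water_pixels(depth, airmass, k=0.8):
    """Boolean mask of pixels whose PCC with airmass exceeds k (k is hypothetical)."""
    return pcc_with_airmass(depth, airmass) > k


# Three synthetic pixels: one growing linearly with airmass, one constant,
# one alternating in sign independently of airmass.
z_obs = np.linspace(1.0, 1.8, 10)
depth = np.column_stack([
    0.05 * z_obs,
    np.zeros_like(z_obs),
    0.01 * (-1.0) ** np.arange(10),
])
flags = flag_non_water_pixels(depth, z_obs, k=0.8)
```

Only the first synthetic pixel is flagged, since it is the only one whose depth tracks airmass.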
Non-water lines can be readily distinguished from water lines because non-water lines have a low correlation with the water calibration pixels but a high correlation with airmass, and vice versa for water lines (see Section 4.3.2). Separating components that vary with airmass from those that don’t is a benefit of SELENITE that might well be useful outside the scope of this paper, such as in the near IR, where H$_2$O, CO$_2$ and CH$_4$ lines mix. When a water and a non-water line blend, the composite line can have a significant correlation with both the water calibrator and airmass. A regression model is not fit to composite lines, but they are flagged in the database.

4.5. Results for Non-Water Tellurics
We evaluate SELENITE’s goodness of fit using the B star HR 3982’s telluric spectrum, following the procedure described in Section 4.3.1. This time, however, we measured the $\chi^2_{\rm red}$ of the model’s fit from 6257 Å to 6328 Å, a 3200-pixel wavelength range which encompasses the heart of the 6280 Å O$_2$ γ atmospheric band. Only pixels where a non-water telluric was detected were measured. The $\chi^2_{\rm red}$ of the telluric model was found to be 1.17. To reach a similar $\chi^2_{\rm red}$ in the affected and unaffected regions, errors in the affected region need to be increased by 2.0%. Figure 9 shows an excerpt of the model’s fit in the O$_2$ γ atmospheric band. The fit’s residuals deviate from unity by ∼0.75% on average, comparable to unaffected regions of the spectrum.
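For readers who want to reproduce this kind of goodness-of-fit measurement, the following is a minimal sketch of a reduced chi-squared restricted to flagged pixels. It is not the authors' code: the function name, the boolean-mask interface and the degrees-of-freedom convention (number of masked pixels minus `n_params`) are assumptions, and the numbers in the example are made up.

```python
import numpy as np


def reduced_chi_squared(observed, model, sigma, mask=None, n_params=0):
    """Reduced chi-squared of a model over the pixels selected by mask.

    observed, model, sigma : 1-D arrays of equal length
    mask     : boolean array, True where a telluric was detected (default: all)
    n_params : fitted parameters to subtract from the degrees of freedom
               (0 here by assumption)
    """
    observed = np.asarray(observed, dtype=float)
    model = np.asarray(model, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if mask is None:
        mask = np.ones(observed.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)

    dof = int(mask.sum()) - n_params
    if dof <= 0:
        raise ValueError("no degrees of freedom left after masking")

    resid = (observed[mask] - model[mask]) / sigma[mask]
    return float(np.sum(resid**2) / dof)


# Hypothetical four-pixel example; only the two middle pixels are evaluated.
obs = np.array([1.00, 0.98, 0.90, 1.00])
mod = np.array([1.00, 0.97, 0.92, 0.50])
err = np.full(4, 0.01)
telluric = np.array([False, True, True, False])

chi2_red = reduced_chi_squared(obs, mod, err, mask=telluric)
```

The masked pixels deviate by one and two standard deviations, so the statistic is (1 + 4) / 2 = 2.5; the badly fit fourth pixel is ignored because it is outside the mask.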
Figure 9. (top) Excerpt of the model’s fit (blue) to two oxygen doublets in the O$_2$ γ atmospheric band in HR3982’s science spectrum (purple). The fit’s goodness is quantified in Section 4.5. (bottom) Residuals generated by dividing out the telluric model. The residuals deviate from unity by ∼0.75% of the continuum on average, comparable to unaffected regions of the spectrum.
Unfortunately, there are no non-water species with telluric lines other than oxygen bluewards of 6800 Å, so we cannot evaluate our model on other species. In principle, however, any well-mixed non-water species should behave as oxygen does.

4.6. Modelling Tellurics in a K Dwarf Spectrum
Late-type stars display complex absorption features. These absorption features do not complicate SELENITE’s non-water modelling, which only measures airmass, but they do complicate water modelling, since they may blend with a calibration pixel’s line. To compensate for the loss of any given calibration pixel, a large (50+) ensemble of potential calibration pixels is given in the database.

Calibration pixels which are blended with stellar lines are identified and removed as follows. Initially, a telluric model is built by regression against the average of all calibration pixel depths. If any [...] α Centauri B. We measured the $\chi^2_{\rm red}$ of the model’s fit at the 6450 Å water band described in Section 4.3.1. This measurement, however, was complicated by α Centauri B’s stellar lines: if a telluric line is blended with a stellar line, the model’s fit will appear incorrect. This problem was overcome by noticing that changes in the Earth’s barycentric velocity will substantially shift the stellar lines in two observations of α Centauri B taken months apart while leaving the telluric lines in the same position. Tellurics that are blended in the first observation will often be unblended in the second observation, and vice versa.

To illustrate, Figure 10 (top) shows SELENITE’s fit to two observations of α Centauri B, at barycentric velocities of 1860 m/s and 20500 m/s, for the same 5 Å wavelength range shown in Section 4.3.1. In the 20500 m/s observation, the deep line at 6475 Å seems ill fit by the model’s pair of water lines (underlined), but in the 1860 m/s observation the deep line has shifted, revealing that it was a stellar line blended with a pair of water lines, which the model now fits well. The fit’s residuals, shown in Figure 10 (bottom), show that when tellurics are removed the two spectra are indeed the same. Where the residuals do not contain a stellar line, they deviate from unity by an average of 1.1%, comparable to the results of a radiative transfer code.
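To give a sense of scale for the shift discussed above, the sketch below applies the non-relativistic Doppler relation Δλ = λ · Δv / c to the two barycentric velocities quoted in the text. The function name is hypothetical, and the conversion to pixels is only a rough estimate: it assumes the mean dispersion implied by the 6257 Å to 6328 Å, 3200-pixel range quoted in Section 4.5 also holds near 6475 Å.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s


def doppler_shift_angstrom(wavelength_angstrom, delta_v_m_per_s):
    """Non-relativistic Doppler shift: delta_lambda = lambda * delta_v / c."""
    return wavelength_angstrom * delta_v_m_per_s / C_M_PER_S


# Velocity difference between the two observations quoted in the text.
delta_v = 20500.0 - 1860.0  # m/s

# Shift of a stellar line near 6475 Angstrom relative to the fixed tellurics.
shift_angstrom = doppler_shift_angstrom(6475.0, delta_v)

# Rough pixel scale (assumption): 71 Angstrom spread over 3200 pixels.
angstrom_per_pixel = (6328.0 - 6257.0) / 3200.0
shift_pixels = shift_angstrom / angstrom_per_pixel
```

Under these assumptions the stellar line moves by roughly 0.4 Å, on the order of 18 pixels, relative to the telluric lines, which is why a telluric blended in one observation is often unblended in the other.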
[Figure 10 panel titles: Barycentric v: 20500 m/s; Barycentric v: 1820 m/s]
Figure 10. (top) Excerpt of the model’s fit (blue) to α Centauri B’s science spectrum (purple) at barycentric velocities of 1820 m/s and 20500 m/s. The apparent misfit underlined in the 20500 m/s spectrum is revealed to be a good fit in the 1820 m/s spectrum when the deep stellar line at 6475 Å shifts bluewards. (bottom) The model’s residuals reveal that the two underlying spectra are the same.
When we compute $\chi^2_{\rm red}$, if the spectrum grossly deviates from a pixel’s fit (by 3.0% or more of the continuum) we assume that the pixel is blended with a stellar line and reject it. Following this procedure, we found a $\chi^2_{\rm red}$ of 2.95 and 3.17 for the 1840 m/s and 20500 m/s α Centauri B observations. This fit, while acceptable, is somewhat poorer than HR3982’s fit, in large part because telluric lines often blend with stellar line tails, disrupting their profile slightly. For example, the wings of the small telluric at 6472.5 Å (at the far left of Figure 10) are blended with a small stellar line, inflating the measurement of $\chi^2_{\rm red}$.

DISCUSSION

Because of the barycentric velocity of the Earth, telluric lines raster across the stellar line profiles in time-series Doppler measurements. Even shallow microtelluric features will degrade the fidelity of high-resolution spectra and may contribute up to 0.5 m s$^{-1}$ to the RV error budget. Since the Earth induces a radial velocity of 10 cm s$^{-1}$ in the Sun, telluric contamination is a significant challenge in the search for analogs of our world. In this paper, we present SELENITE, an empirical technique for identifying and modelling telluric features in the optical (4500 Å-6800 Å), using two observations: (a) water tellurics grow proportionally to PWV and therefore proportionally to each other and (b) non-water tellurics grow proportionally to airmass. Water tellurics are identified by looking for pixels whose growth correlates with a known calibration water telluric and modelled by regression against it. Non-water tellurics are identified by looking for pixels whose growth correlates with airmass and modelled by regression against it. SELENITE has several advantages over the alternatives:

• Runtime:
Once the database is built ( < min on a standard PC), fitting a spectrum takes several seconds, permitting SELENITE to be used at the telescope to help guide observing runs.

• Observing time:
Unlike standard stars, after a one-time observation of a few dozen B stars to build the database, SELENITE requires no further observations, saving observing time.

• Requires no atomic/molecular line data:
Unlike radiative transfer codes, SELENITE does not require atomic/molecular line data. This is useful because the literature suggests HITRAN is not always accurate. In particular, Seifahrt et al. (2010) note: “Line data in HITRAN have strongly varying accuracy levels. Typical uncertainties of line positions range from a few to several hundred m/s, but can be as high as several km/s in extreme cases. Line strengths are rarely precise to the 1% level.” Further, Rudolf et al. (2016) find that inaccuracies in the HITRAN database frustrate their ability to model water lines accurately.

• Distinguishes tellurics that vary primarily with airmass from those that don’t:
Although outside the paper’s scope, this feature could be very useful in the near IR, where H$_2$O, CO$_2$ and CH$_4$ lines mix.

We acknowledge, however, that SELENITE has certain limitations. First, stellar features in the set of training B stars (e.g., the Paschen and Brackett lines) will distort its model. This problem can be solved by interpolating over each absorption feature, at the cost of introducing additional uncertainty to regions of scientific interest.
Second, SELENITE only varies with airmass and PWV. Other atmospheric phenomena which may affect line profiles (e.g., wind speed (Caccin et al. 1985)) are not taken into account. Instrumental changes, such as a varying LSF, are also not considered, and can only be handled by rebuilding the database for each instrumental profile change.
Third, SELENITE’s [...] S/N data, at lower S/N a line’s wings may not clear the PCC threshold, truncating them.

Despite these limitations, evaluations show that SELENITE provides excellent fits. The model’s fits to regions of intense water tellurics and non-water tellurics in the B star HR 3982 had $\chi^2_{\rm red}$ of 1.25 and 1.17, and thus errors just 10.5% and 2.0% bigger than the continuum’s fit to unity. Further, SELENITE’s fits to the K dwarf α Centauri B observations had $\chi^2_{\rm red}$ of 2.95 and 3.17, despite the $\chi^2_{\rm red}$ test statistic being inflated by stellar line blending, confirming that it provides a good fit to late-type stars. SELENITE’s average residual is 1.0% and 0.75% for HR 3982 and 1.1% for α Centauri B, comparable to the residuals of radiative transfer codes (Ulmer-Moll et al. 2019).

ACKNOWLEDGEMENTS

The authors gratefully acknowledge enabling support from the following grants: NSF-1616086, NSF-MRI0923441, NASA-NNH17ZDA001N-XRP, NASA-NNH11ZDA001N-OSS. NSO/Kitt Peak FTS data used here were produced by NSF/NOAO.
Facilities:
CTIO: (CHIRON)
REFERENCES