A Review of the Enviro-Net Project

Abstract

Ecosystem monitoring is essential to properly understand ecosystem development and the effects of events, both climatological and anthropological in nature. The amount of data used in these assessments is increasing at very high rates. This is due to the increasing availability of sensing systems and the development of new techniques to analyze sensor data. The Enviro-Net Project encompasses several such sensor system deployments across five countries in the Americas. These deployments use a few different ground-based sensor systems, installed at different heights, monitoring the conditions in tropical dry forests over long periods of time. This paper presents our experience in deploying and maintaining these systems and in retrieving and pre-processing the data, and describes the Web portal developed to help with data management, visualization and analysis.


September 13, 2018

Gilberto Z. Pastorello, G. Arturo Sanchez-Azofeifa, Mario A. Nascimento

Department of Earth and Atmospheric Sciences, 1-26 Earth Sciences Building, University of Alberta, T6G 2E3, Edmonton, Alberta, Canada.

Department of Computing Science, 2-32 Athabasca Hall, University of Alberta, T6G 2E8, Edmonton, Alberta, Canada.

Text published in: G. Z. Pastorello, G. A. Sanchez-Azofeifa, M. A. Nascimento. Enviro-Net: From Networks of Ground-Based Sensor Systems to a Web Platform for Sensor Data Management. Sensors. 2011. 11(6):6454-6479. doi: 10.3390/s110606454

Monitoring ecosystems at high spatial and temporal resolutions is still a challenging endeavor. Satellite-borne sensors that offer regular passes support only coarse-resolution monitoring, and on-demand high-resolution satellite or airborne monitoring is still too expensive to be considered a viable option for frequent data collection. Furthermore, validation of satellite and airborne measurements against the values observed at ground level is often difficult to obtain. Ground-based, or in-situ, sensor systems for environmental monitoring have associated challenges as well [1], but have undergone a considerable evolution recently. Such systems are now capable of collecting data at very high temporal resolution for very specific ecosystems over long periods of time.
In particular, the use of wireless sensor systems has been shown to be very effective in this type of monitoring [2], from the cost perspective and increasingly in terms of performance and reliability as well.

There are many challenges associated with high-resolution (both spatial and temporal) in-situ environmental monitoring, many of which are already well recognized in the literature. Rundel et al. [1], for instance, discuss how these networks generate more data than can be managed by traditional methods for field research data: quality assurance and control surpass the capabilities of single individuals dealing with the data, yet remain necessary to produce high-quality data. The large variety of problems impacting quality can be more easily detected by using adequate cyberinfrastructure to automate the detection, which also allows more timely identification of problems in the deployments themselves. They also argue that, although data storage and retrieval are reasonably easy to attain, publishing and sharing data is not as straightforward. Still according to the authors, one of the advantages of this integrated approach for offering data from multiple sensors is the larger world view generated, which is not possible with single sensors, at least not at these spatio-temporal scales. The authors also acknowledge the importance of training scientists in using in-situ monitoring tools, the flexibility of power requirements for these systems (especially wireless) and the use of energy harvesting, problems related to gaps in the data (from numerous causes), the difficulty of assessing precision and fidelity in such systems, and the value of commercial availability for decreasing costs and scaling up deployment sizes.

Hart and Martinez [3] discuss power management, large volumes of data and the required cyberinfrastructure, the beginning of commercial efforts, and data quality control as important issues concerning in-situ environmental monitoring.
They also raise additional points that require more work, such as assessment of the environmental conditions that equipment needs to withstand (e.g., temperature, pressure, vibration); standardization requirements related to data and metadata representation; security requirements, preventing tampering with both equipment and datasets within the data management systems; and better means for data interpretation (e.g., by using new methods for data mining). Another relevant effort can be found in the report from Estrin et al. [4], who focus on cyberinfrastructure. Key points include: the need for better prototyping and design of end-to-end test-beds to allow validation across wide ranges of environments, applications and domains; creation of better services regarding time synchronization, in-situ calibration, and adaptive duty cycling, among others; seamless use of high performance computing facilities for data processing; tools to improve support for metadata; and collaboration efforts as a basis for training new scientists (from multiple domains) and as a mechanism for sustaining long term deployments.

This paper distills our experience in deploying and managing in-situ sensor systems within the Enviro-Net Project. Currently, Enviro-Net includes 39 deployments spread throughout nine sites in six different countries (Argentina, Brazil, Canada, Costa Rica, Mexico and Panama), and is coordinated at the University of Alberta, in cooperation with local partner research teams at each site. The initial goal of the deployments was to monitor vegetation phenology, the study of climate effects on periodic biological activity [5], correlating it with environmental variables, such as availability of light, air temperature, etc.
These and other variables are monitored by different types of sensing systems, with the collected data being transmitted back to Internet servers in Alberta either through a commercial satellite up-link or being manually retrieved from the data loggers and then sent via email, FTP or Enviro-Net's website. The following gives but one example of the applicability and usefulness of this type of system. From the data collected by a combination of two types of specialized solar radiation flux sensors, it is possible to derive different vegetation indexes, which can be used as proxies for monitoring phenological responses. In order to distinguish vegetation distribution, particularly from perspectives such as species distribution or successional stage, the areas to be monitored are numerous and relatively small. Similarly, short term effects of isolated climatic phenomena (e.g., a rainstorm or sharp changes in temperature) require higher rates of data acquisition. These characteristics require higher spatial and temporal resolutions that are only achieved through in-situ monitoring of each ecosystem.

In this context, detailed discussions of how we dealt with these challenges within the Enviro-Net Project form the main contributions of this paper, particularly considering the scenario under which the project was developed. The monitored sites are mostly tropical dry forests in remote locations, which are challenging environments for both equipment performance and personnel's ability to work. Also, all deployments are based on inexpensive and commercially available technology, essential characteristics to allow scalability and reproducibility of experiments. The heterogeneity of equipment from different manufacturers also introduces difficulties, mainly regarding systems maintenance. Having long term (multiple-year) deployments imposes extra management requirements.
Integrated data management, a fourth aspect, presents numerous challenges ranging from data quality control to user interface usability. Finally, and maybe most relevant, is the issue of high spatial and temporal resolutions, considered not only within a single deployment, but also among different deployments both in the same and in different sites. Some of these challenges have simple individual solutions; however, from a more holistic perspective, the integration of the solutions for all of them is what actually enables the use of sensor systems for in-situ environmental monitoring.

After a review of related work in Section 2, this paper describes our solutions regarding deployments of in-situ monitoring systems in Section 3, pre-processing and treatment of data in Section 4, and data publication and accessibility using a Web-based system in Section 5.

This section divides the discussion of related work into applications (covering the motivation for in-situ monitoring), deployments (showing experiences in installing and maintaining sensor systems), and data management (comparing different efforts in dealing with the large amounts of sensor data generated).

Environmental monitoring is one of the driving forces behind the adoption of ground-based sensing systems, pushing the need for higher spatial and temporal resolution. Examples of efforts in this direction include: (i) the creation of the National Ecological Observatory Network (NEON) [6], which aims at studying climate change, land-use change and invasive species on a continental scale using, among other methods and technologies, ground-based deployments of sensor systems; and (ii) FLUXNET [7], which uses micrometeorological and flux towers to measure exchanges of carbon dioxide, water vapor, and energy between terrestrial ecosystems and the atmosphere. These initiatives rely heavily on long term ground-based monitoring solutions. FLUXNET has a public data management framework called Fluxdata.org [8], which also offers flexible metadata support. However, due to the diversity of equipment and protocols for deployment and data pre-processing, data integration within the Fluxdata.org system is limited, mostly offering access to data in the original format provided by the data producers. This limits the possibilities of applying filters and aggregation operations to the data or generating derived data products within the system. Although our system also deals with a variety of equipment, the deployment protocols are largely uniform, and pre-processing protocols are developed using a centralized approach, which allows us to achieve a considerable level of data integration within Enviro-Net.

These and other initiatives, aiming at integration of ground-based monitoring efforts, are leading to an evolution from single-site environmental monitoring into networks for environment observation [3]. This evolution culminates with the current vision for a Sensor Web [9–11], encompassing several types of deployments of sensor systems, interconnecting them globally through a Web-based integration strategy using standards developed by the Sensor Web Enablement Working Group of the Open Geospatial Consortium, Inc. (OGC).

A small clarification on the definition of (wireless) sensor networks may be in order. Mainly within Computing Science (CS) research [12,13] and in earlier Sensor Web related efforts [9], this definition is narrower than what is used in this paper. In this more restrictive definition, a (wireless) sensor network is based on nodes (also known as “motes”) that have sensing, data storage/processing, and communication components plus a power source. These nodes are usually autonomous and operate cooperatively, by communicating amongst themselves, to collect and process data, also being programmable, i.e., able to behave differently according to, for instance, the type of application, power supply conditions, environmental conditions, etc. Although we have used this type of wireless node in our deployments, we do not require the capability of communication amongst the network's components. Instead, we adopt the centralized type of processing architecture as classified by [12], being more in line with the current Sensor Web approach to networks [11]. It is sufficient for us, for instance, that the connection of sensing elements be done at the level of integrated data products.

Applications of in-situ monitoring systems are also the topic of other research efforts. Porter et al. [2] present a good review of the capabilities of wireless sensor networks (WSN) to be applied within the ecological domain. Hamilton et al. [14], while covering capabilities of networks of sensors applied to ecology as well, also highlight the idea of ecological observatories, adopted within NEON. An extensive review of in-situ monitoring efforts is presented by Rundel et al. [1], classified according to their area of focus: above ground, underground, and aquatic environments. Porter et al. [15] discuss the state of the sensing technology, what can already be accomplished and a few areas that require more development (e.g., data management software and new types of sensors). Precision agriculture is a particularly relevant application area for pervasive sensing technology. For instance, Lee et al. [16] evaluate monitoring applied to specialty crops, while Matese et al. [17] use a wireless sensor network in vineyard monitoring, and Aquino-Santos et al. [18] evaluate data transmission protocols in small scale deployments in watermelon fields. In this paper, we discuss aspects that apply to many of these scenarios, particularly when considering them from a long term monitoring perspective. However, our focus is on practical and logistical aspects of deploying and maintaining equipment, retrieving and managing the data, and supporting analysis of data products.

Other research groups have discussed their efforts with ground-based deployments of sensor systems, mostly focusing on the use of wireless equipment. A pioneering effort in applying wireless sensor networks was the habitat monitoring experiment on Great Duck Island [19], off the coast of Maine in the United States, deployed to offer a less intrusive way to study the behavior and nesting of seabird colonies. The SensorScope project [20, 21] is another example, taking place mainly in Switzerland. Its authors have described their experience with developing the hardware and software for their wireless system, performing tests, and going on deployment expeditions, along with their architecture and communication protocols. With a focus on solar energy availability, AdaptSens [22] adopts system-wide levels of operation to cope with different amounts of available energy. GreenOrbs [23] is a long term effort for monitoring a university campus urban forest close to Hangzhou in China, using a large number of nodes. LUSTER [24] is a system for monitoring ecological variables that implements fault-tolerant distributed storage over a delay-tolerant network using a hierarchical architecture; the system also covers user interaction both in field expeditions and through a web interface for data retrieval. Another effort [25], aiming at monitoring the UNESCO World Heritage site Mogao Grottoes in Dunhuang, China, implemented a low power wireless monitoring system inside the site's caves with a tailored long distance connection to transmit the data back to an on-line server. Another World Heritage site, a rainforest ecosystem in Queensland, Australia, was monitored by a wireless sensor network project [26], which served as a prototype for future long term deployments using similar configurations. Another interesting application, monitoring the activities of volcanoes in Ecuador [27,28], entails addressing issues such as higher sampling rates (100 Hz or more) and the need for higher accuracy and more expensive sensors.
Changing the spatial scale a little, monitoring a single redwood tree [29] in California in the United States offered new insight into the microclimate surrounding this type of tree. Reports on deployment experiences also focus on the diversity of problems faced when using wireless sensing equipment, such as the LOFAR-agro project [30], which experienced problems ranging from hardware failures to network protocol errors and software problems. While deployment related efforts in our work focus on issues related to managing the life cycle of ground-based sensor data, other works [31,32] bring evaluations of technology for wireless sensor network equipment, including communication protocols, power consumption and data transmission issues.

To the best of our knowledge, none of the deployment efforts reviewed here addresses the same scenario as ours: long term (multi-year) deployments, based on cooperative efforts of several (heterogeneous) teams, using commercially available equipment from multiple manufacturers, with an integrated effort of data retrieval, quality control and data availability through an easy to use Web-based platform. We believe this is a more realistic scenario for ground-based environmental monitoring efforts. The current efforts within the Life Under Your Feet project (http://lifeunderyourfeet.org/) [33] are the closest to our own, also having long term, spatially distributed deployments with a Web-based data visualization interface integrated with geolocation information. However, they do not seem to deal with heterogeneous equipment and data formats, nor offer filtering/aggregation options, derived datasets or quality information in their data management solution.

2.3 Data Management

Many of the challenges related to sensor data management have been known for a while [4]. However, several technical and non-technical questions still remain unaddressed.
Broad scope projects for management of Earth observation data try to present a top-down approach to data management. One such project is DataOne, an effort towards distributed cyberinfrastructure for Earth observation data, bringing together a multitude of data providers and consumers. Another effort is our partner project GeoChronos, which implements means for sharing (and interacting with) tools, datasets and libraries of records within the Earth observation domain. Enviro-Net, however, uses more of a bottom-up approach, offering specialized solutions for the types of data supported, expanding these types as needed. This allows data management solutions that are geared towards specific needs to answer specific science questions.

Although it is common to think about sensor data management as stream data management, with the associated challenges (on-line aggregation, classification, etc.) [34], at least within environmental research, particularly in ground-based monitoring, this is not a frequent scenario. Most of the current applications based on sensor data use the perspective of historical (or an archive of) time series data. Applications using the stream data perspective are only beginning to appear, and the current applications that do require that perspective (e.g., volcano monitoring [27]) are still the exception. Data manipulation for most of the current applications is done after the data have been collected and stored, applying a variety of analytical operations in an offline fashion [8, 35]. Middleware software for automating control of deployments is also the focus of current research efforts, in the form of architectures for integrating different network deployments [36], or Web-based interfaces for interaction with and control of wireless deployments [37].
Our focus, on the other hand, is on managing the data products rather than controlling the equipment from within our system.

The data archival aspect of data management involves not only storage of data, but also retrieval, documentation, and access control, among other issues. Furthermore, data curation of long-term repositories involves not only handling the data but also helping scientists answer research questions and maintaining the underlying computational infrastructure [38]. Within Enviro-Net, although we are only beginning to devise our long term plans for infrastructure maintenance, our system already offers data access with a number of flexibility aspects to foster efficient use of the data. Efforts on applying digital library practices in support of sensor data management are also gaining acceptance [39]. Issues of data quality and integrity, as well as the elements of data collection that affect them, need to be an integral part of such efforts [40], particularly from the perspective of making data documentation available along with the datasets. In this scenario, metadata becomes as valuable as the datasets themselves, from quality metadata about deployments [41], to offering search and annotation options and enriching visualization [42]. Finally, Application Programming Interfaces (APIs) allow data to be accessed in a programmatic way, which can be achieved, for instance, using Web services interfaces (using Sensor Web Enablement standards) or using specialized solutions such as wrapper-based middleware [43] or REST-based APIs [44]. Data quality aspects are an integral part of Enviro-Net, and are being improved, particularly regarding documentation and metadata coverage. Although data ingestion is largely automated and data access is possible through the Web user interface within Enviro-Net, data access using a programmatic interface is still under development.
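As an illustration of the kind of quality metadata discussed above, the following minimal sketch locates gaps in an archived time series of sensor readings. It is not taken from Enviro-Net: the function name `find_gaps`, the 15-minute interval and the sample timestamps are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=timedelta(0)):
    """Return (last_seen, next_seen, missing_count) for each gap.

    `timestamps` must be sorted in ascending order. A gap is any pair of
    consecutive readings further apart than expected_interval + tolerance.
    """
    gaps = []
    for previous, current in zip(timestamps, timestamps[1:]):
        delta = current - previous
        if delta > expected_interval + tolerance:
            # Number of readings that should have been recorded in between.
            missing = round(delta / expected_interval) - 1
            gaps.append((previous, current, missing))
    return gaps


# Hypothetical series: eight expected readings, with readings 3 and 4 missing.
start = datetime(2010, 1, 1, 0, 0)
step = timedelta(minutes=15)
readings = [start + i * step for i in range(8) if i not in (3, 4)]

gaps = find_gaps(readings, step)
for last_seen, next_seen, missing in gaps:
    print(f"gap after {last_seen} until {next_seen}: {missing} readings missing")
```

Records like these could be stored next to the dataset so that users see where the series is incomplete before they filter or aggregate it.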
Apart from a few test installations, all of our deployments are intended to be long term, collecting data for a minimum of two to three years. The earliest deployments were installed in mid 2007, with the first wireless deployments installed in mid 2008. All deployments suffered from interruptions in data collection on some scale, usually from a few days up to a couple of months, depending on how early the problem was detected. Earlier deployments suffered a 100% failure rate due to equipment design being incompatible with tropical environments. Many problems were related to unexpected interactions of environmental conditions with the equipment. However, most of the deployments are still operational today, with secured funding for keeping them operational until at least 2013.

Currently, Enviro-Net has 39 permanent deployments, plus temporary deployments in Edmonton, Canada for equipment testing and calibration. The Biosphere Reserve of Chamela-Cuixmala in the state of Jalisco, Mexico has a tower (overlooking the top of the canopy) and a wireless understory sensor system. The number of nodes in a wireless deployment is usually 12, but there are deployments with as few as five and as many as 20 nodes, each node having between three and six sensors. The Santa Rosa National Park in Costa Rica hosts two more towers. The Parque Natural Metropolitano in Panama has the most recent deployment, with 24 thermocouples monitoring leaf temperatures. In Brazil, three sites have deployments: the Mata Seca State Park, the Serra do Cipó National Park, and the Environmental Protection Area of the Pandeiros River, all located in the state of Minas Gerais. The Mata Seca park hosts five towers and eight understory deployments (including four wireless deployments), all in the cerrado ecosystem, which is similar to a savanna; three understory deployments are active close to the Pandeiros river, also a cerrado ecosystem; and the Serra do Cipó park has five towers plus seven understory deployments, two of which use wireless systems, covering natural grasslands and forest vegetation in the cerrado. Finally, three deployments are operational in the province of San Luis in Argentina: a phenology tower monitoring a grassland ecosystem, and one tower and one wireless understory deployment installed in an adjacent chaco ecosystem. Two more wireless towers are operational in chaco and grassland ecosystems in the province of Córdoba, Argentina. Three more deployments are expected to start data collection in 2011 in the province of San Luis.

Although the ideal spatial scales for many applications require higher numbers of nodes to be considered high spatial density (more in line with our plans for future sensor networks), the intermediary step with 5–20 nodes per deployment was necessary to prove this kind of scale is feasible in remote locations with long term deployments. These are, however, dense enough to characterize many ecosystem-level behaviors (such as response to climatic events), and even differences between neighboring ecosystems. The experience acquired in these smaller deployments, which is the fundamental contribution of this text, serves as a basis for these larger scale expeditions.

The main challenge of having deployments across an entire continent is without question maintaining them. Partnerships with research groups based closer to the deployment sites proved essential, with the added issue of offering training to the people performing basic maintenance.
The small amount of time available for training leads to the choice of equipment that is simple to use and maintain. Hands-on experience has proven to be the most efficient method to train new users, especially when focusing on how to deal with common problems. Special attention needs to be given to data retrieval and manipulation methods in order to allow tracking of data problems later in the processing chain. Documentation of our own group's deployment protocols and data handling procedures complemented the equipment manuals and specifications.

Regularity in systems maintenance is key to keeping them running within long term deployments. Life expectancy and calibration deviation for sensors are usually parameters specified by the manufacturer. Enviro-Net deployments usually have two maintenance cycles: one for a basic overall system check (and data retrieval for off-line deployments) and another for complete verification of the equipment. The basic cycle has intervals ranging from two weeks to two months, depending on the accessibility of the site and the regularity of visits for other purposes. This task is usually performed by a member of the local research teams and involves cleaning the sensors if needed (mostly from dust build-up or obstructions such as leaves, insect or bird nests, etc.), verification of the general health of the system, and data retrieval, usually the most relevant part of a basic maintenance cycle. The complete cycle has intervals ranging from 6 to 12 months, and allows detection of a broader range of problems, e.g., battery charge retention capacity. This task is usually performed with one more experienced technician.

Tables 1 and 2 list the equipment used in our deployments. For datalogger systems, shown in Table 1, wired and wireless systems are available.
In wired systems all the sensors are connected directly to the data logger and communication with it is done mostly through a physical connection, using a cable (serial or USB, for instance) to connect to a laptop. For wired deployments, we mostly used Onset Computer Corp. data loggers; specifically, the HOBO Micro Station, the HOBO U12 Series and the HOBO U30 Series models were employed.

Wireless systems, on the other hand, offer different strategies to eliminate the need for cabled connections. As an example, the equipment manufactured by Olsonet Communications Corporation offers two types of nodes: a collector and an aggregator. The former is connected to the sensors and is responsible for wirelessly transmitting the readings to the aggregator, which works as a centralization point for the data collection, also doubling as a short term data logger. The aggregator, however, requires a cable connection for setup or data recovery. A different strategy is used by the equipment manufactured by MicroStrain, Inc., where each ENV-Link™ node works as an individual data logger, but the connection to these nodes for setup and data retrieval is done through a wireless connection. The storage capacity for samples in both types of loggers usually matches the power consumption characteristics to achieve similar longevity in field deployments. As discussed later in this section, a satellite up-link and a continuous battery recharging capability (e.g., using solar panels) would allow even longer time spans. However, since in practice maintenance is necessary long before these limits are reached, battery and storage lifetimes are not a limitation for most of these types of equipment.

| Logger Model | Connectivity | Storage Memory | Power (Battery Type) | Est. Longevity (a) |
|---|---|---|---|---|
| Onset U30 | wired data and setup | 512 KB | Int. (4.5 or 10 Ah, 4 V) + Solar | solar panel (b) |
| Onset U12 | wired data and setup | 43,000 samples (64 KB) | Int. (CR-2032 lithium 3 V) | 10–12 months |
| Onset Micro Station | wired data and setup | 512 KB | Int. (4 x AA 1.5 V) | 10–14 months |
| Olsonet Collector | wireless data / no setup | 256 KB | Int. (2 x AA 1.5 V) | 4–5 months |
| Olsonet Aggregator | wireless data / wired setup | 2 GB (remov. SD card) | Ext. (7–12 Ah) + Solar | solar panel (b) |
| Microstrain ENV-Link | wireless data and setup | 360,000 samples | Int. (650 mAh) + Ext. (9 Ah) | 10–14 months |

Table 1: Dataloggers summary. (a) Estimated longevity with 15 minutes sampling; (b) Dependent on sunlight availability.
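The relation between storage capacity, sampling interval and longevity can be made concrete with a small calculation. The sketch below is only a back-of-the-envelope aid: the function name `days_until_full` and the channel counts are hypothetical, it assumes one stored sample per channel per interval, and it ignores battery life, which may run out before storage does.

```python
def days_until_full(capacity_samples: int, channels: int, interval_minutes: float) -> float:
    """Days of logging before storage fills.

    Assumes one stored sample per channel per sampling interval.
    """
    if capacity_samples <= 0 or channels <= 0 or interval_minutes <= 0:
        raise ValueError("all arguments must be positive")
    samples_per_day = channels * (24 * 60 / interval_minutes)
    return capacity_samples / samples_per_day


# Capacities are taken from Table 1; the channel counts are hypothetical.
u12_days = days_until_full(43_000, channels=1, interval_minutes=15)
env_link_days = days_until_full(360_000, channels=6, interval_minutes=15)

print(f"Onset U12, 1 channel: {u12_days:.0f} days")
print(f"ENV-Link, 6 channels: {env_link_days:.0f} days")
```

With these assumptions the storage-only limits are about 448 and 625 days respectively, well beyond the basic maintenance interval of two weeks to two months.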

Table 2: Sensors summary.

| Sensor Model | Variable (Unit) | Sensor Type | Range | Accuracy |
|---|---|---|---|---|
| Sensirion SHT-75 | Temp. (°C) | silicon bandgap | [...] °C | 0.3–1.5 °C |
| | Rel. Hum. (%) | capacitive humidity | 0–100% RH | 1.8–4.0% RH |
| Onset S-THB-M00x | Temp. (°C) | silicon bandgap | [...] °C | 0.2–0.7 °C |
| | Rel. Hum. (%) | capacitive humidity | 0–100% RH | 2.5–4.5% RH |
| Onset RG3-M | Rainfall (mm/h) | tipping bucket | max 1,270 mm/h | 1.00% |
| Onset S-LIA-M003 | PAR (µmol/m²/sec) (a) | photons detector | 0–2,500 µmol/m²/sec (c) | [...] µmol/m²/sec |
| Onset S-LIB-M003 | Solar Radiation (W/m²) | silicon photovoltaic detector | 0–1,280 W/m² (d) | [...] |
| [...] | PAR (µmol/m²/sec) (a) | photons detector | 0–2,000 µmol/m²/sec (c) | [...] |
| [...] | [...] | [...] | [...] (d) | [...] |
| [...] | [...] (b) | 70 MHz capacitance/frequency | 0–100% VWC | 1.0–3.0% VWC |

(a) Photosynthetically Active Radiation; (b) Volumetric Water Content; (c) For wavelengths between 400 and 700 nm; (d) For wavelengths between 300 and 1,100 nm.

The biggest advantage of wired equipment is reliability, having been in use longer and tested under many combinations of conditions. Besides limited spatial coverage when compared to wireless systems, the most problematic aspect of this technology is accessibility. Because everything requires a physical connection between the logger and the laptop with the control software, having to climb up a tower to perform tasks as routine as retrieving data is a somewhat serious limitation. Even with longer cables for the sensors, which still have a limited maximum length on account of attenuation of the electric signal, towers for higher canopies require climbing to access the logger.

For environmental monitoring, the major advantage of wireless sensor systems is the possibility of covering larger areas, without giving up high spatial and temporal resolution, and at a reasonably low cost. One low point of the technology is that it is still fairly new as a commercial product, and still needs some adaptation. Errors in communication protocols, radio range limitations, power management related issues, lack of features in the control software packages, and breaches in weatherproofing cases weigh in as cons for wireless systems.
However, our experience shows the technology has already reached the tipping point to becoming viable for use in long term, harsh environment deployments.

Commercial availability of wireless sensor networks (WSN), as a technology, is still limited. Although the original ideas for WSN (i.e., large numbers of general purpose nodes distributed in very dense deployments, randomly placed, almost weightless, and disposable) have yet to materialize [20, 28, 45], wireless technology used in conjunction with sensory equipment is proving to be invaluable in monitoring larger areas at the scale of a single ecosystem.

Table 2 lists the main sensors used in our deployments, which are well known, commercially available, inexpensive, and based on established technologies. With the variables listed, it is possible to extract plenty of derived information, such as vegetation indexes and light absorption patterns for photosynthesis. In our deployments, we used solar radiation sensors provided by Onset and Apogee Instruments, Inc.; air temperature and relative humidity sensors by Sensirion Inc. and Onset (the Onset temperature and relative humidity sensors used are repackaged Sensirion sensors); and soil moisture sensors by Decagon Devices, Inc. Lower cost sensor systems usually do not offer calibration options for the user; their calibration is adjusted by the manufacturer (who usually offers recalibration services).

Within the Enviro-Net Project there are currently two main types of deployments: phenology towers and understory installations. A phenology tower uses two solar radiation flux sensors (also called pyranometers), measuring wavelengths between approximately 300 and 1,100 nm, and two Photosynthetically Active Radiation flux sensors (or PAR sensors), which measure wavelengths between approximately 400 and 700 nm.
Ratios of these measurements can be used to derive vegetation indexes such as the Normalized Difference Vegetation Index (NDVI) or the Enhanced Vegetation Index (EVI); see, for instance, [46–49]. Such indexes can be used as proxies to monitor vegetation phenology. Understory deployments are used to monitor the conditions below the canopy level, and usually cover a larger area.

Figure 1 shows the schematics of a phenology tower on the left, with two PAR sensors and two pyranometers, one of each measuring incoming solar radiation and one of each measuring reflected solar radiation. The right side of the figure is a photo of one phenology tower installed in Brazil, which raises the sensors eight meters from the ground, six meters above the canopy.

Figure 1: Phenology tower schematics (left) and a tower in Brazil (right).
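As a rough illustration of how ratios of the four tower measurements can yield a vegetation index, the sketch below computes a broadband NDVI from paired incoming and reflected PAR and pyranometer readings. It assumes one common broadband formulation (visible reflectance from the PAR pair, near-infrared reflectance from what remains of the solar radiation after subtracting the PAR band), which is not necessarily the exact methodology of [46–49]. The function name and the PAR-to-energy factor of 0.25 J/µmol are assumptions made for illustration and would need to be checked for a given sensor.

```python
def broadband_ndvi(sw_in, sw_out, par_in, par_out, j_per_umol=0.25):
    """Estimate NDVI from tower radiation fluxes (illustrative sketch).

    sw_in, sw_out   -- incoming / reflected solar radiation (W/m2)
    par_in, par_out -- incoming / reflected PAR (umol/m2/s)
    j_per_umol      -- assumed PAR quantum-to-energy factor (J/umol)
    """
    if sw_in <= 0 or par_in <= 0:
        raise ValueError("incoming fluxes must be positive (daytime readings only)")
    # Visible reflectance: ratio of reflected to incoming PAR.
    rho_vis = par_out / par_in
    # Near-infrared flux: solar radiation left after removing the PAR band.
    nir_in = sw_in - par_in * j_per_umol
    nir_out = sw_out - par_out * j_per_umol
    if nir_in <= 0:
        raise ValueError("PAR energy exceeds total solar radiation; check units")
    rho_nir = nir_out / nir_in
    if rho_nir + rho_vis == 0:
        raise ValueError("no reflected radiation measured")
    return (rho_nir - rho_vis) / (rho_nir + rho_vis)


ndvi = broadband_ndvi(sw_in=800.0, sw_out=160.0, par_in=1600.0, par_out=80.0)
print(round(ndvi, 2))
```

In this made-up reading the canopy reflects 5% of the visible band and 35% of the near-infrared band, which is the kind of contrast that drives NDVI towards high values for green vegetation.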

Most radiation flux sensors have view angles of up to 85° from zenith (when oriented up, i.e., measuring incoming radiation) or nadir (when oriented down), with a uniform 360° rotation. With that, the radius that affects the readings is up to around ten times the distance (h) between the sensor position and the surface being monitored (i.e., radius = tan(85°) × h). For our deployments, we usually have at least five meters between the top of the canopy and the sensor measuring the reflected radiation (8 to 15 m in total), leading to a coverage radius of at least 50 m in the monitored area.

Obstructions within the range of a sensor interfere with the readings and might not be easy to identify from the data alone; e.g., a higher canopy of adjacent ecosystems or a nearby tower with other instruments may interfere with a sensor measuring incoming radiation. A sensor measuring radiation reflected from the canopy is more susceptible to interference, e.g., from the positioning of solar panels, whose reflectiveness greatly affects readings. Large panels should be positioned outside of the interference radius, while smaller panels can be positioned at the same height as the sensor for no interference. Note that it is difficult to position radiation sensors and solar panels at different orientations, since both should use the optimal exposure angle to the sun, roughly North in southern latitudes, or South in northern latitudes.

Monitoring the conditions under the canopy level, i.e., understory deployments, allows assessing a different range of micro-climatic conditions and also soil conditions, e.g., temperature and moisture levels. Understory deployments are usually easier to access and, with that, they are useful for validating the readings observed in a tower and also as a backup for certain variables in case of sensor malfunction in a tower.
Using wireless systems substantially increases the spatial coverage of understory deployments with a fraction of the increase in cost and effort to retrieve data and maintain the system.

Figure 2 depicts an example of such a wireless deployment on its left side. On the right side, it shows a node deployed in the chaco ecosystem in Argentina. The height at which the sensors are installed in this case is also determined by the canopy's height, usually ranging from right on the ground (e.g., for grasslands) to 1.5 m for taller canopies. One example of an application that relies on the spatial coverage and resolution of understory wireless deployments is deriving Leaf Area Index (LAI); see [48, 50], for instance. LAI, along with Plant (PAI) and Wood (WAI) Area Indexes [51], are important indicators of vegetation productivity, being also used as a reference for crop growth rates. Combining readings from a phenology tower with understory readings of absorbed solar radiation fluxes, it is possible to derive NDVI for the location of each node. Using NDVI and knowing an appropriate conversion factor, characteristic of each ecosystem, it is possible to calculate LAI for each node [48]. This allows the creation of maps of very high spatial and temporal resolution for both NDVI and LAI.

Figure 2: Understory schematics (left) and a node in Argentina (right).
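The coverage radius relation given for tower-mounted radiation flux sensors (radius = tan(85°) × h) is simple enough to check numerically. The sketch below evaluates it for a couple of sensor-to-canopy distances; the function name and the sample heights are illustrative assumptions.

```python
import math


def coverage_radius(height_m, view_angle_deg=85.0):
    """Radius (m) of the surface seen by a radiation flux sensor.

    height_m       -- distance h between the sensor and the monitored surface
    view_angle_deg -- view angle measured from zenith or nadir
    """
    if not 0.0 < view_angle_deg < 90.0:
        raise ValueError("view angle must be between 0 and 90 degrees")
    return math.tan(math.radians(view_angle_deg)) * height_m


for h in (5.0, 10.0):
    print(f"h = {h:4.1f} m -> radius = {coverage_radius(h):6.1f} m")
```

For h = 5 m the result is a little over 57 m, consistent with the "at least 50 m" figure quoted for the towers.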

Having the option of deploying a large number of sensors in a given area also raises the question of how to distribute these sensors. We have adopted three different strategies to spatially distribute nodes and their sensors. Figure 3 illustrates these strategies. The first approach, shown on the left, is intended to monitor a linear region along a transect. This is particularly useful for monitoring transitions between ecosystems or exposure to different conditions within the same ecosystem. The center of the figure shows a distribution of nodes in concentric circles, which is sometimes called a "star" deployment. This type of deployment is used mostly to monitor conditions around a point of interest, usually corresponding to the footprint of phenology or carbon flux towers, allowing combination of measurements from both deployments. A third strategy is to deploy nodes in a grid, covering a potentially larger area of interest. Regularly spaced grids are useful for uniform monitoring throughout an area. However, irregular grids can also be useful when special conditions occur within a region of interest. Examples include part of an area that is also being monitored by other experiments (e.g., leaf collection for chlorophyll measurements), or patches affected by fire whose recovery is of interest.

Figure 3: Deployment strategies: (a) transects; (b) concentric circumferences; and (c) grids.

From a logistics perspective, installing tower and understory systems has fairly different characteristics. Phenology towers reached up to 15 m in one of our deployments, with 9 m being the most common height. Selecting the location for installing a tower that high must take into account the representativeness of the ecosystem, the impact of building it, and the accessibility to bring its parts to the site. Another important issue is the uniformity in the height of the canopy.
Too much variation in tree heights will lead to scaling problems in the data: an area with taller vegetation will contribute significantly less to the readings. When installing a phenology tower intended for long term data collection, the growth of the vegetation should also be taken into account. Younger ecosystems might grow considerably at intervals as short as one year, forcing height upgrades to a tower.

The height of the canopy is also a concern for understory deployments. Ecosystems with lower canopies, such as grasslands, require that solar radiation flux sensors be positioned almost adjacent to the ground, while taller canopies allow sensors in a higher position (0.60 to 0.10 m are common heights). For wireless deployments, the node is usually installed in a higher position to improve radio signal range, while the sensors are deployed at the appropriate heights.

Although it might seem like a trivial task at first, correctly positioning the sensors requires taking a number of factors into consideration. One issue is the creation of unnatural sources of shade (e.g., from the pole where the node sits) over the sensor. For deployments in the northern (southern) hemisphere, positioning radiation flux sensors South (North) of obstructions avoids this issue. Air temperature and relative humidity sensors are also affected by their positioning. Besides being housed in solar radiation shields and positioned so as to allow for air circulation, they should also be kept at some distance from radiation absorbing materials. Most weatherproofing cases, for instance, absorb non-negligible quantities of solar radiation. We had cases of temperature deviations of up to 20 °C because of a dark weatherproof case.

One crucial aspect of sensor system deployments in tropical ecosystems is the exposure to constantly high relative humidity. Values between 90% and 100% are common in these environments.
Combined with high temperatures, this condition transformed many weatherproof casings into humidity traps. The main problem was actually the difference between internal and external pressure in the cases. That made previously airtight cases absorb humidity while balancing the pressure, exposing the internal circuits and connectors. Both for loggers and sensors, even cases designed and tested to work underwater were susceptible to this problem. Adopting pressure relief valves significantly attenuated the problem, even though they can sometimes get clogged with dirt and stop working. Another adopted practice that also helped reduce this problem was to use silicone-based adhesives to seal borders and openings, around sensor cables and also around the sensors themselves.

For wireless sensor systems, testing the range of the radio system at the actual deployment site is essential. Vegetation distribution and terrain contours are difficult to predict beforehand and have a significant impact on the radio range. Two major aspects have proven to be of particular relevance when conducting this kind of test. Firstly, if the type of batteries used decreases the voltage offered to the system with time, the tests should not be conducted with new batteries. A more accurate test of radio range is achieved using more realistic battery levels, e.g., levels similar to those of a deployment that has been running for more than half of the expected battery life. In the case of rechargeable batteries, the charge level used should be the average level the batteries would have when going without charge for the maximum foreseeable period. For tropical dry forests, the maximum period without non-negligible sunlight exposure for charging batteries through solar panels is around two days.
It is worth noting that regular alkaline (zinc and manganese dioxide) batteries, widely adopted to power nodes in sensor networks, do change their voltages depending on their level of charge.

The second aspect interfering with radio range is related to the vegetation density, particularly to changes in density throughout the seasons. Radio range is greatly affected by branches and leaves in the line of sight of the signal. Ranges of up to 300 m in a level and open field can be reduced to as little as 15 m (a factor of 20 reduction) simply by having somewhat closed vegetation. In particular, radio frequencies at 2.4 GHz are severely attenuated by trees and leaves. This frequency is adopted by several wireless sensor systems, including the ZigBee Alliance communications protocols (based on the IEEE 802.15.4 Wireless Personal Area Network standard), widely used in these systems. Furthermore, it is very common for deployment campaigns to take place in the dry season, when rainfall is less of a concern for the schedule in deployment plans. However, foliage of deciduous vegetation can be at much lower levels than it will be in the wet season, which can significantly affect the range of the radio signal. There is no definite solution to address the vegetation changes, since simulating the conditions of a different season is difficult. Monitoring the overall network health, which can be done in its simplest form by detecting gaps in the data, and repositioning nodes when necessary has been the best measure to address this issue in new ecosystems. These, in turn, serve as a reference for future deployments in similar ecosystems.

Seasonal change can also have an unexpected impact on the visibility of nodes and sensors. When installing sensors in the dry season, there are few obstructions and less color variability in the landscape. This makes visibility reasonably good.
However, areas that significantly change their vegetation coverage or areas that have dense vegetation can become quite challenging from the point of view of visibility in the wet season. Using colorful markers (red or yellow ribbons or paint are effective for this) can save a lot of time when trying to find nodes and sensors that have been deployed for a while. Not relying solely on GPS to locate small pieces of equipment such as individual nodes and sensors can be the difference between returning to base camp before or after sunset. One aspect of using such markers that was not taken into account in this work is the increased attractiveness colored markers might exert on animals (particularly insects).

Data from ground-based sensor systems can be retrieved either in-situ or remotely. The former involves expeditions to the deployment sites, which can be very expensive. However, if the site is already being visited on a regular basis for other reasons (collecting leaf samples, for instance), this might become more feasible. Most of our current deployments are working in this scenario. This has proven to be quite an advantage from the perspective of maintenance of untested systems, allowing early detection of problems with equipment. With equipment proven to work well, using a remote solution is probably more cost and time effective.

Collecting data remotely might be achieved in a number of ways. One possibility is using a dedicated long range wireless communication system (e.g., a WiFi connection with repeaters) to transmit data at regular intervals to a computer installed in a location with permanent power supply. If there is also Internet connectivity, the data can be forwarded to on-line permanent archival systems.
This alternative usually has a significant overhead of maintaining the local computer and keeping the long range communications system running.

Another alternative is to use cellular networks with data capabilities. Although cellular coverage is not good in more remote areas, some regions have enough connectivity to allow data transmission in a fairly regular fashion. Using higher gain antennas improves signal reception, but the system must be prepared to go through reasonably long periods with no connectivity, preserving all data for delayed transmission. Since an actual Internet connection is provided with a cellular connection, the data can be transmitted directly to on-line archival servers.

A third type of remote data retrieval solution involves using a satellite up-link. This approach is also subject to communications failure (e.g., if there is too much cloud coverage). The connectivity provided here usually is not to the Internet, but to a service provider that receives the data from the satellite. This provider in turn makes the data accessible, often offering automated ways of retrieving the data from their on-line servers. In our case, systems that have proven to work consistently well have been equipped with a satellite transmitter.

Remote connectivity allows not only automated data retrieval, but also some level of remote operation of the equipment. Options for stopping and starting the logger and setting sampling and storage rates are often available. In a few cases, it might be interesting to be able to set other parameters remotely, particularly with wireless systems. Research projects have explored configuring deployments remotely [37], even reprogramming loggers and collection nodes in some cases [20, 36]. This level of flexibility in remote deployment configuration, however, is not yet commercially available.
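The requirement that a cellular-connected system preserve all data through long periods without connectivity amounts to a store-and-forward buffer. The sketch below is one minimal way to express that idea; the class name, the `send` callable and its use of `ConnectionError` to signal a lost link are hypothetical and do not describe any particular logger's firmware.

```python
from collections import deque


class StoreAndForward:
    """Keep readings until the uplink accepts them (hypothetical sketch)."""

    def __init__(self, send):
        self._send = send          # callable that raises ConnectionError when offline
        self._pending = deque()    # oldest reading first

    def record(self, reading):
        self._pending.append(reading)

    def flush(self):
        """Try to transmit pending readings in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                  # still offline: keep everything for later
            self._pending.popleft()    # drop a reading only after a successful send
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)


# Simulated link that is down at first and comes back later.
link = {"up": False}
delivered = []


def uplink(reading):
    if not link["up"]:
        raise ConnectionError("no cellular coverage")
    delivered.append(reading)


buffer = StoreAndForward(uplink)
buffer.record({"t": "2011-01-01T10:00", "temp_c": 28.4})
buffer.record({"t": "2011-01-01T10:15", "temp_c": 28.9})
buffer.flush()       # offline: nothing sent, both readings kept
link["up"] = True
buffer.flush()       # back online: both readings delivered in order
print(len(delivered), len(buffer))
```

A real deployment would also need the buffer to survive power loss (e.g., by writing to non-volatile storage), which this in-memory sketch leaves out.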
When considering the volume of data generated by current sensor systems, automation of data management related tasks within a proper computational infrastructure is of paramount importance [1, 3, 4, 10, 33, 52]. However, actual datasets generated by sensor systems might present a variety of problems and exceptions, which are often difficult to foresee. This is a severe drawback in attempts to automate the first data management phase: ingestion of data into any computational data management system. These sorts of problems are often dismissed as being "implementation details", but their implications can actually affect data quality parameters and the models used to store and distribute the data. In higher end (expensive) and/or homogeneous equipment these sorts of problems are usually easier to tackle. However, in a setting like ours, using equipment from different manufacturers, in a highly distributed effort, with an aim at low limits for equipment and maintenance costs, these issues are fairly commonplace.

Solutions for problems with raw datasets are usually implemented within a data pre-processing (or data cleaning) phase. Although these terms usually encompass explicit data quality verification or removal of erroneous readings (e.g., values outside the scale measured by a sensor), this section only considers problems that actually prevent (or are difficult to trace after) the ingestion of the data into a data management system. When compared to classification scales usually adopted in describing Earth observation data products, after the corrections in this section, the dataset should be treated as "raw" data, or as being at Data Processing Level 2 in the National Research Council (NRC) Committee on Data Management and Computation (CODMAC) [53] classification, or at Data Processing Level 0 as used by NASA (http://science.nasa.gov/earth-science/earth-science-data/data-processing-levels-for-eosdis-data-products/).
The next paragraphs discuss the problems we had to handle when preparing our datasets.

Keeping correct temporal information for timestamping readings from distributed sensors can be really challenging, not to mention correcting time deviations after recording the data [54]. Time synchronization is an issue both within a single deployment, with multiple collectors and/or loggers, and across deployments. Within a deployment, hardware imprecision and heterogeneous initial synchronization methods are the two main causes of synchronization problems. Time keeping in electronic equipment is based on crystal oscillators, which can deviate from their standard frequency with environmental conditions, especially temperature. This causes the time measurements to deviate as well, and affects almost all types of data logging equipment. In this case, the error is proportional to the sampling rate; for applications with high sampling rates, such as seismology, these errors are quite significant. For long term environmental monitoring, this can also be a problem. One solution is to have an accurate time reference and a mechanism to keep the loggers synchronized. Possible solutions include having more precise equipment kept at a less exposed location or using GPS time as a reference. A few wireless communication protocols have time synchronization features embedded within their message exchange mechanisms [55].

When dynamic time synchronization against a reliable reference is not feasible, the initial synchronization method is the basis for all time information within a deployment. This is the most common scenario for our current deployments, with the usual mechanism for synchronization being based on the time information from the computer with the control software used to start a deployment.
Therefore, the time information in that computer should be synchronized (e.g., by using the Network Time Protocol, IETF RFC 5905 (http://tools.ietf.org/html/rfc5905)).

Data comparison across different deployments at small temporal resolutions must take into account potential synchronization errors. However, since sampling rates are commonly higher than the desired temporal resolution, most data analysis is done with aggregated data instead of the entire dataset, which attenuates the effects of time synchronization errors, particularly when looking at hourly or even daily averages.

Similar to other reports [54], we also experienced power source related synchronization problems. Time measurement in some logging equipment can be affected by power outages or low voltages from the power source. Some types of equipment use the main power to keep time measurement running and, although time measurement usually requires very little power, if the supply is interrupted, the equipment's clock gets reset.

Current data logging equipment and control software offer poor support for addressing time synchronization problems. Many of them do not even let the user see the current time in the logging system to manually check for time drifts. But this is evolving in control software for wireless systems, since these suffer more noticeably from time related problems.

.2 Time Zones

When dealing with deployments that are geographically distributed throughout various time zones, establishing the correct local time can become an issue. Once again, relying on a computer's time as a reference to timestamp the readings is a major cause of errors. Different versions of operating systems have different levels of automation regarding time zone and daylight saving time configurations, often allowing users to change these manually.
Therefore, besides an incorrect computer time, as already discussed, wrong time zone configurations and changes to daylight saving time configurations can also lead to inconsistencies such as having data for a single deployment timestamped with different daylight saving times, or difficulties determining which is the correct local time when comparing data for deployments in different time zones.

For our deployments, when issuing field laptops, time configurations always adopt the local standard time for the site, disabling automatic changes to daylight saving time. However, even rugged field laptops fail, and temporary misconfigured replacements can be used. Or, an even less elaborate problem, which happens often: new users get confused by seemingly "wrong" time settings and change the time configurations.

It is possible, however, to check time zone and daylight saving time settings against sun time. This is done by comparing several days of sunrise times from data collected by solar radiation sensors to expected sunrise times for the location. This method is not accurate enough for correcting hardware time drift, for instance, but is good enough to correct for shifts of one or more hours in the timestamps. This verification is performed on all of our datasets before ingesting them into our data management system.
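The sunrise comparison described above can be sketched as follows. Observed sunrise is taken here as the first reading of a day whose solar radiation exceeds a small threshold, and the whole-hour shift is the rounded median difference from the expected sunrise. The threshold value, the function names and the assumption that expected sunrise times are supplied by the caller (e.g., from an almanac) are illustrative choices rather than the procedure actually used in the project.

```python
from datetime import datetime
from statistics import median


def observed_sunrise(readings, threshold=5.0):
    """First timestamp of the day whose radiation (W/m2) exceeds the threshold.

    readings -- time-ordered list of (datetime, radiation) pairs for one day
    """
    for stamp, radiation in readings:
        if radiation > threshold:
            return stamp
    return None


def hour_shift(observed, expected):
    """Whole-hour offset between logged and expected sunrise times.

    observed, expected -- equal-length lists of datetimes, one pair per day.
    A positive result means the logger timestamps run ahead of local time.
    """
    diffs = [(o - e).total_seconds() / 3600.0 for o, e in zip(observed, expected)]
    return round(median(diffs))


# Three days logged with the clock one hour ahead (e.g., daylight saving left on).
expected = [datetime(2010, 7, day, 6, 10) for day in (1, 2, 3)]
observed = [
    datetime(2010, 7, 1, 7, 15),
    datetime(2010, 7, 2, 7, 5),
    datetime(2010, 7, 3, 7, 20),
]
print(hour_shift(observed, expected))
```

Using the median over several days keeps a single cloudy morning, where the threshold is crossed late, from dominating the estimate.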

One burdensome problem of dealing with data from different types of equipment is handling changing data formats or a variety of possible formatting errors.

The first of such types of problems to be addressed is changes in the data format made by the equipment manufacturer. A considerable number of format changes from manufacturers are not documented adequately with new versions or software updates. Unfortunately, this type of problem needs to be addressed case by case.

One problem that was surprising to us is that some types of failures in the sensors themselves can generate errors in the data format by, for instance, changing the number of data columns in a record. As an example, this could make a record that should have three data columns (e.g., readings of temperature, humidity and solar radiation) actually have extra or missing columns. Similar effects can be caused by connectors designed to be generic and support different sensors: a sensor behaving in some unexpected way may cause the data collection node or the logger to perform incorrect conversions or even generate software errors that will affect the data format. In wireless equipment, we have also seen the data format being drastically changed by problems in the transmission of the data. In the presence of radio interference, usually created by the operation of higher powered wireless equipment in proximity to the deployment, the data transmission gets compromised, generating errors in both the values of the readings and the structure of a record.

All these types of errors can cause failures in the ingestion software or, worse, introduce errors in the data ingestion process without any warnings. Our solution to this was to make the ingestion software monitor for format changes and generate informative error messages, allowing problems to be identified before ingestion.
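One way to monitor for format changes during ingestion is to validate each record against the expected layout and report exactly where it deviates. The sketch below assumes a hypothetical comma-separated layout of a timestamp followed by three numeric readings; the column names, function names and error wording are illustrative only and are not taken from the Enviro-Net ingestion software.

```python
# Hypothetical record layout: a timestamp followed by three numeric readings.
EXPECTED_COLUMNS = ("timestamp", "temperature", "humidity", "solar_radiation")


def check_record(line, line_number):
    """Return (values, None) for a well-formed record or (None, message) otherwise."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != len(EXPECTED_COLUMNS):
        return None, (f"line {line_number}: expected {len(EXPECTED_COLUMNS)} columns, "
                      f"found {len(fields)}")
    values = {"timestamp": fields[0]}
    for name, text in zip(EXPECTED_COLUMNS[1:], fields[1:]):
        try:
            values[name] = float(text)
        except ValueError:
            return None, f"line {line_number}: column '{name}' is not numeric: {text!r}"
    return values, None


def check_file(lines):
    """Split lines into accepted records and a list of informative error messages."""
    accepted, errors = [], []
    for number, line in enumerate(lines, start=1):
        values, error = check_record(line, number)
        if error is None:
            accepted.append(values)
        else:
            errors.append(error)
    return accepted, errors


sample = [
    "2010-07-01 10:00,28.4,61.0,512.3",
    "2010-07-01 10:15,28.9,60.2",          # missing column
    "2010-07-01 10:30,29.1,5#.8,530.0",    # value corrupted in transmission
]
accepted, errors = check_file(sample)
for message in errors:
    print(message)
```

Collecting all messages before ingesting anything lets the person responsible for a deployment see the full extent of a format problem in one pass.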
Given the tailored nature of the data pre-processing steps described in this section, it is difficult to keep standardized provenance information and even harder to automate collection of this type of information, as also made evident in [40]. In our current data management solution, simple free text description fields are used to keep track of pre-processing steps and choices. Nonetheless, with a flexible metadata specification tool, such as the one created for our partner project GeoChronos [56], it would be possible to add specific fields to document evolving aspects of the pre-processing steps.

Although difficult to obtain and maintain, documentation of the pre-processing steps is important to identify not only errors in the pre-processing itself, but also problems with the deployments. For instance, the appearance of too many erroneous records from a wireless data collection node is potential evidence of problems with the sensors, the sensor connections, the node hardware or radio interference sources in the surroundings. The latter problem might even indicate affected readings from other nodes that would otherwise go unnoticed.

A resourceful and easy to use data management system is the last piece of our solution for large scale in-situ monitoring. The pre-processing step presented in the previous section allows the data to be ingested into the system and stored in an integrated representation. Users can then interact with the system, with access to data filtering, aggregation and other more specialized processing operations. Data visualization and retrieval are offered for data at all processing levels after pre-processing, providing a flexible mechanism for analyzing the data within the system or using other tools with the data already narrowed down to the parts of interest. This section discusses these issues, also considering data quality and user interaction aspects.

The task of data ingestion can be automated for datasets that require pre-processing steps known beforehand. Automated data ingestion methods are particularly useful with deployments that have automated data retrieval, as is the case for data retrieval using a satellite up-link and the respective Internet service for getting the data. However, new datasets or datasets that needed specialized pre-processing before being ready for ingestion need a flexible mechanism to map available data to the integrated representation of the data in the system. Properly handling errors and exceptions in the data ingestion process is necessary from both user experience and data quality perspectives. Automated data ingestion needs timely error generation so the user responsible for the deployment is kept informed and can take corrective action. Informative descriptions of errors help the user identify and diagnose the error causes, which is particularly important for manual ingestion of data that was pre-processed in any non-standard way.

Another aspect to be considered is the documentation of the process for every dataset upload. Metadata regarding date and time, data source, user, and pre-processing options, among others, help identify sources of error such as faulty time related corrections or application of outdated pre-processing methods. Most of these metadata can be collected automatically by the system, which unburdens the user and prevents missing information from less thorough users.

Only with an integrated data representation model is it possible to offer a common user interface, types of filters, aggregation options and any other operation for manipulation of data. Data from different instruments, deployment configurations, retrieval strategies, etc., need to be stored uniformly so all the system's features are available for all datasets.

Having the data uniformly stored in a repository, the users can start tailoring datasets to their needs.
The most basic functionality to allow this tailoring is being able to apply filters (e.g., only data within a range of values or with low error rates) and aggregation operators (e.g., showing daily or monthly values) to the datasets. Without adequate computational support, many researchers spend days to weeks on this trivial task. Figure 4 shows our interface for a few of these filters to achieve the target data, from the top: which sensors to include, which time span to consider, and which times of the day are of interest. The screen shown is used to extract and download a dataset to be used with other tools. Several other options are also available, including filtering out errors, showing raw values (e.g., of voltages, electrical current, or unconverted pulse counts), file and content formats, and including data derived from the sensor readings, among others.

Offering quick and easy access to the (corrected and quality checked) sensor readings from a deployment is one of the most essential features of our solution. However, also having data that can be derived from these readings as easily accessible is what shows the actual potential of data management systems like ours. Our current implementation offers the automatic calculation of vegetation indexes, NDVI and EVI, from solar radiation flux sensors using different methodologies [46–49]. Other products are currently being integrated into Enviro-Net, including LAI, Vapour Pressure Deficit (VPD), spatial distribution for Fraction of Absorbed Photosynthetically Active Radiation (fAPAR).
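Of the derived products mentioned, VPD is straightforward to compute from the air temperature and relative humidity readings. The sketch below uses the Tetens-type approximation for saturation vapour pressure in the form given in FAO Irrigation and Drainage Paper 56; this choice of formula and the function names are assumptions, since the text does not state which formulation Enviro-Net uses.

```python
import math


def saturation_vapour_pressure_kpa(temp_c):
    """Saturation vapour pressure (kPa) from air temperature (deg C), Tetens form."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vpd_kpa(temp_c, rel_humidity_pct):
    """Vapour pressure deficit (kPa) from temperature (deg C) and relative humidity (%)."""
    if not 0.0 <= rel_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be within 0-100 %")
    return saturation_vapour_pressure_kpa(temp_c) * (1.0 - rel_humidity_pct / 100.0)


print(round(vpd_kpa(25.0, 60.0), 2))
```

At the 90% to 100% relative humidity common in the monitored ecosystems, the deficit approaches zero, so sensor accuracy near saturation matters for this product.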

After tailoring a dataset to specific goals, adequate visualization tools allow easier understanding of events and trends within the monitored areas. The most basic visual tool is graphing the sensor readings of different variables, allowing visual comparison and insight into the measurements in one deployment. However, two features in our web system proved to be invaluable: graphing of datasets that went through transformations (filtering, aggregation and derived data) and graphing across deployments. These graphing options are depicted on the left side of Figure 5, which shows derived NDVI (using the methodology in [48]) for two different deployments in the Mata Seca State Park, in Brazil, within a specific time span, using only readings close to midday (between 10:00 AM and 2:00 PM local time), filtering out seemingly cloudy days (i.e., including only data records when the measured incoming PAR is more than 900 microeinsteins per second per square meter, µE/m²/s), and aggregating the data in daily averages.

The right side of Figure 5 shows another type of visualization strategy based on the spatial distribution of the readings. The graph on the left side uses a color scale to represent temperature variation across an area covered by 12 temperature sensors in the Chamela Reserve in Mexico. The graph on the right shows the coverage of the installed sensors (indicating the reliability of the scale), highlighting sensor failures when these occur.

Within the speciﬁed time span, the system generated a sequence of imageswhich are animated using the controls at the bottom to show the evolutionof the temperature and reliability distributions through time. This is auseful tool to observe cyclic (e.g., diurnal or seasonal) changes in themonitored areas.
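The transformation chain behind the left side of Figure 5 (midday window, clear-sky filter, daily averaging) can be sketched in a few lines. This is a minimal sketch, not the portal's code: the record layout (a dict with `timestamp`, `par_in`, and a value field such as `ndvi`) and the function name are assumptions made for illustration, and the window is treated as inclusive at both ends.

```python
from collections import defaultdict
from datetime import datetime, time
from statistics import mean

MIDDAY_START = time(10, 0)   # 10:00 AM local time
MIDDAY_END = time(14, 0)     # 2:00 PM local time
MIN_PAR_IN = 900.0           # µE/m²/s; records at or below this are treated as cloudy


def daily_midday_means(records, value_key="ndvi"):
    """Filter records to clear-sky midday readings and average them per day.

    records: iterable of dicts with 'timestamp' (local datetime),
             'par_in' (incoming PAR) and the field named by value_key.
    Returns a dict mapping each date to the mean of the retained values;
    days with no retained record are omitted.
    """
    by_day = defaultdict(list)
    for rec in records:
        local_time = rec["timestamp"].time()
        if not (MIDDAY_START <= local_time <= MIDDAY_END):
            continue  # outside the midday window
        if rec["par_in"] <= MIN_PAR_IN:
            continue  # seemingly cloudy
        by_day[rec["timestamp"].date()].append(rec[value_key])
    return {day: mean(values) for day, values in sorted(by_day.items())}
```

Applying the same function to the records of each deployment gives one daily series per deployment, which is the kind of input needed for a cross-deployment comparison like the one in the figure.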

This paper presented the Enviro-Net Project, which addresses a variety of issues related to in-situ (or ground-based) monitoring of ecosystems, from the deployment of sensors to the delivery of processed data products. A combination of factors makes this project unique: (i) acquisition of data at ecosystem level with high spatial and temporal resolutions; (ii) long-term, ground-based monitoring; (iii) use of heterogeneous, commercially available, and inexpensive equipment, including wireless sensing technologies; and (iv) an integrated data management solution, with Web-based user interaction with data products.

This scenario, which is increasingly being adopted by other research projects, is described in detail in the paper, discussing lessons learned and pointing out aspects that require attention and could go unnoticed before deployment efforts are well underway. The paper examines not only technical issues of deploying ground-based sensor systems, but also the logistics behind execution and maintenance of deployments, issues related to data retrieval, verification and quality, and publication of data products. The paper discussed evidence that this kind of research was needed, integrating solutions from a number of research efforts and offering a real solution for in-situ, long-term environmental monitoring at high temporal and spatial resolution.

Current efforts include: improving our deployment protocols to deal with arising problems and simplifying the maintenance-related tasks; extending our data management system in order to handle larger amounts of data; and adding new data manipulation operations to offer more derived data products. As future work, we intend to focus on data provenance visualization issues, to improve understanding of how data products were generated and to allow automation of reproducibility. Another aspect to be explored in future releases of our data management system is the integration of remote sensing data (from satellite and airborne instruments) into our common interface [57], allowing analysis and comparison of these types of data with ground-based data. We also plan to work on implementing programmatic interfaces to allow software-based access to our data by, for instance, using Open Geospatial Consortium protocols. Lastly, we have plans to include monitoring of equipment life expectancy, particularly of sensors and wireless collector node equipment, in order to create better models for maintenance of long-term deployments by, for instance, increasing the precision of required replacement rates for equipment.

Acknowledgements

The Enviro-Net Project is funded by the Canadian Foundation for Innovation and the Inter-American Institute for Global Change Research (IAI) CRN II

References

1. Rundel, P.W.; Graham, E.A.; Allen, M.F.; Fisher, J.C.; Harmon, T.C. Environmental sensor networks in ecological research. New Phytol., 589–607.
2. Porter, J.; Arzberger, P.; Braun, H.W.; Bryant, P.; Gage, S.; Hansen, T.; Hanson, P.; Lin, C.C.; Lin, F.P.; Kratz, T.; et al. Wireless sensor networks for ecology. BioScience, 561–572.
3. Hart, J.K.; Martinez, K. Environmental sensor networks: A revolution in the earth system science? Earth-Sci. Rev., 177–191.
4. Estrin, D.; Michener, W.; Bonito, G. Environmental Cyberinfrastructure Needs for Distributed Networks; Technical report for Long Term Ecological Research (LTER) Network; Scripps Institution of Oceanography: La Jolla, CA, USA, August 2003; Available online: (accessed on 20 April 2011).
5. Schwartz, M.D. Phenology: An Integrative Environmental Science; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2003.
6. Keller, M.; Schimel, D.S.; Hargrove, W.W.; Hoffman, F.M. A continental strategy for the National Ecological Observatory Network. Front. Ecol. Environ., 282–284.
7. Baldocchi, D.; Falge, E.; Gu, L.; Olson, R.; Hollinger, D.; Running, S.; Anthoni, P.; Bernhofer, C.; Davis, K.; Evans, R. FLUXNET: A new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull. Am. Meteorol. Soc., 2415–2434.
8. Humphrey, M.; Agarwal, D.; van Ingen, C. Fluxdata.org: Publication and Curation of Shared Scientific Climate and Earth Sciences Data. In Proceedings of the 5th IEEE International Conference on e-Science (e-Science’09), Oxford, UK, 9–11 December 2009; pp. 118–125.
9. Delin, K.A. The sensor web: A macro-instrument for coordinated sensing. Sensors, 270–285.
10. Teillet, P.M. Sensor webs: A geostrategic technology for integrated earth sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 473–480.
11. Botts, M.; Percivall, G.; Reed, C.; Davidson, J. OGC® Sensor Web Enablement: Overview and High Level Architecture. In GeoSensor Networks; Nittel, S., Labrinidis, A., Stefanidis, A., Eds.; Springer: Berlin, Germany, 2008; Volume 4540, pp. 175–190.
12. Chong, C.Y.; Kumar, S.P. Sensor networks: Evolution, opportunities, and challenges. Proc. IEEE, 1247–1256.
13. Martinez, K.; Hart, J.K.; Ong, R. Environmental Sensor Networks. IEEE Comput., 50–56.
14. Hamilton, M.P.; Graham, E.A.; Rundel, P.W.; Allen, M.F.; Kaiser, W.; Hansen, M.H.; Estrin, D.L. New approaches in embedded networked sensing for terrestrial ecological observatories. Environ. Eng. Sci., 192–204.
15. Porter, J.H.; Nagy, E.; Kratz, T.K.; Hanson, P.; Collins, S.L.; Arzberger, P. New eyes on the world: Advanced sensors for ecology. BioScience, 385–397.
16. Lee, W.; Alchanatis, V.; Yang, C.; Hirafuji, M.; Moshou, D.; Li, C. Sensing technologies for precision specialty crop production. Comput. Electron. Agric., 2–33.
17. Matese, A.; Di Gennaro, S.F.; Zaldei, A.; Genesio, L.; Vaccari, F.P. A wireless sensor network for precision viticulture: The NAV system. Comput. Electron. Agric., 51–58.
18. Aquino-Santos, R.; González-Potes, A.; Edwards-Block, A.; Virgen-Ortiz, R.A. Developing a new wireless sensor network platform and its application in precision agriculture. Sensors, 1192–1211.
19. Mainwaring, A.; Polastre, J.; Szewczyk, R.; Culler, D.; Anderson, J. Wireless Sensor Networks for Habitat Monitoring. In Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications (WSNA’02), Atlanta, GA, USA, September 2002; pp. 88–97.
20. Barrenetxea, G.; Ingelrest, F.; Schaefer, G.; Vetterli, M. The Hitchhiker’s Guide to Successful Wireless Sensor Network Deployments. In Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems (SenSys’08), Raleigh, NC, USA, 5–7 November 2008; pp. 43–56.
21. Ingelrest, F.; Barrenetxea, G.; Schaefer, G.; Vetterli, M.; Couach, O.; Parlange, M. SensorScope: Application-specific sensor network for environmental monitoring. ACM Trans. Sens. Netw. (TOSN), 17:1–17:32.
22. Wang, L.; Yang, Y.; Noh, D.K.; Le, H.K.; Liu, J.; Abdelzaher, T.F.; Ward, M. AdaptSens: An Adaptive Data Collection and Storage Service for Solar-Powered Sensor Networks. In Proceedings of the 30th IEEE Real-Time Systems Symp. (RTSS’09), Washington, DC, USA, 1–4 December 2009; pp. 303–312.
23. Mo, L.; He, Y.; Liu, Y.; Zhao, J.; Tang, S.J.; Li, X.Y.; Dai, G. Canopy closure estimates with GreenOrbs: Sustainable sensing in the forest. In Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems (SenSys’09), Berkeley, CA, USA, 4–6 November 2009; pp. 99–112.
24. Selavo, L.; Wood, A.; Cao, Q.; Sookoor, T.; Liu, H.; Srinivasan, A.; Wu, Y.; Kang, W.; Stankovic, J.; Young, D.; et al. LUSTER: Wireless sensor network for environmental research. In Proceedings of the 5th ACM Conference on Embedded Networked Sensor Systems (SenSys’07), Sydney, Australia, 4–9 November 2007; pp. 103–116.
25. Ming, X.; Yabo, D.; Dongming, L.; Ping, X.; Gang, L. A Wireless Sensor System for Long-Term Microclimate Monitoring in Wildland Cultural Heritage Sites. In Proceedings of the 6th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA’08), Sydney, Australia, 10–12 December 2008; pp. 207–214.
26. Wark, T.; Hu, W.; Corke, P.; Hodge, J.; Keto, A.; Mackey, B.; Foley, G.; Sikka, P.; Brünig, M. Springbrook: Challenges in developing a long-term, rainforest wireless sensor network. In Proceedings of the 4th International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP’08), Sydney, Australia, 15–18 December 2008; pp. 599–604.
27. Werner-Allen, G.; Lorincz, K.; Johnson, J.; Lees, J.; Welsh, M. Fidelity and yield in a volcano monitoring sensor network. In Proceedings of the 7th Symp. on Operating Systems Design and Implementation (OSDI’06), Seattle, WA, USA, 6–8 November 2006; pp. 381–396.
28. Welsh, M. Sensor networks for the sciences. Commun. ACM, 36–39.
29. Tolle, G.; Polastre, J.; Szewczyk, R.; Culler, D.; Turner, N.; Tu, K.; Burgess, S.; Dawson, T.; Buonadonna, P.; Gay, D.; et al. A Macroscope in the Redwoods. In Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems (SenSys’05), San Diego, CA, USA, 2–4 November 2005; pp. 51–63.
30. Langendoen, K.; Baggio, A.; Visser, O. Murphy loves potatoes: Experiences from a pilot sensor network deployment in precision agriculture. In Proceedings of the 20th International Parallel and Distributed Processing Symp. (IPDPS 2006), Rhodes Island, Greece, 25–29 April 2006.
31. Szewczyk, R.; Polastre, J.; Mainwaring, A.; Culler, D. Lessons from a Sensor Network Expedition. In Wireless Sensor Networks; Springer: Berlin, Germany, 2004; Volume 2920, pp. 307–322.
32. Jiménez, V.P.G.; Armada, A.G. Field measurements and guidelines for the application of wireless sensor networks to the environment and security. Sensors, 10309–10325.
33. Musăloiu-E., R.; Terzis, A.; Szlavecz, K.; Szalay, A.; Cogan, J.; Gray, J. Life under Your Feet: A Wireless Soil Ecology Sensor Network. In Proceedings of the 3rd Workshop on Embedded Networked Sensors (EmNets’06), Cambridge, MA, USA, 30–31 May 2006.
34. Olken, F.; Gruenwald, L. Data stream management: Aggregation, classification, modeling, and operator placement. IEEE Internet Comput., 9–12.
35. Ozer, S.; Gray, J.; Szalay, A.; Terzis, A.; Musăloiu-E., R.; Szlavecz, K.; Burns, R.; Cogan, J. Data analysis tools for sensor-based science. In Proceedings of the 4th International Conference on Embedded Networked Sensor Systems (SenSys’06), Boulder, CO, USA, 31 October–3 November 2006; pp. 341–342.
36. Aberer, K.; Hauswirth, M.; Salehi, A. Infrastructure for Data Processing in Large-Scale Interconnected Sensor Networks. In Proceedings of the 8th Int. Conf. on Mobile Data Management (MDM’07), Mannheim, Germany, 7–11 May 2007; pp. 198–205.
37. Stojkoska, B.; Davcev, D. Web Interface for Habitat Monitoring Using Wireless Sensor Network. In Proceedings of the 5th International Conference on Wireless and Mobile Communications (ICWMC’09), Cannes, France, 23–29 August 2009; pp. 157–162.
38. Karasti, H.; Baker, K.S. Digital data practices and the long term ecological research program growing global. Int. J. Digital Curation, 42–58.
39. Borgman, C.L.; Wallis, J.C.; Mayernik, M.S.; Pepe, A. Drowning in data: Digital library architecture to support scientific use of embedded sensor networks. In Proceedings of the 7th ACM/IEEE-CS Joint Conf. on Digital Libraries (JCDL’07), Vancouver, Canada, 17–22 June 2007; pp. 269–277.
40. Wallis, J.; Borgman, C.; Mayernik, M.; Pepe, A.; Ramanathan, N.; Hansen, M. Know Thy Sensor: Trust, Data Quality, and Data Integrity in Scientific Digital Libraries. In Research and Advanced Technology for Digital Libraries; Kovács, L., Fuhr, N., Meghini, C., Eds.; Springer: Berlin, Germany, 2007; Volume 4675, pp. 380–391.
41. Lukac, M.; Stubailo, I.; Guy, R.; Davis, P.; Puruhuaya, V.A.; Clayton, R.; Estrin, D. First-class meta-data: A step towards a highly reliable wireless seismic network in Peru. In Proceedings of the 1st Workshop on Sensor Networks for Earth and Space Science Applications (ESSA’10), San Francisco, CA, USA, April 2009.
42. Dawes, N.; Kumar, K.A.; Michel, S.; Aberer, K.; Lehning, M. Sensor Metadata Management and Its Application in Collaborative Environmental Research. In Proceedings of the 4th IEEE International Conference on eScience, Indianapolis, IN, USA, 7–12 December 2008; pp. 143–150.
43. Casola, V.; Gaglione, A.; Mazzeo, A. A Reference Architecture for Sensor Networks Integration and Management. In Proceedings of the 3rd Int. Conf. on GeoSensor Networks (GSN’09), Oxford, UK, July 2009; pp. 158–168.
44. Gupta, V.; Udupi, P.; Poursohi, A. Early lessons from building Sensor.Network: An open data exchange for the web of things. In Proceedings of the 8th IEEE Int. Conf. on Pervasive Computing and Communications Workshops (PERCOM’10 Workshops), Mannheim, Germany, 29 March–2 April 2010; pp. 738–744.
45. Raman, B.; Chebrolu, K. Censor networks: A critique of “sensor networks” from a systems perspective. ACM SIGCOMM Comput. Commun. Rev., 75–78.
46. Huemmrich, K.F.; Black, T.A.; Jarvis, P.G.; McCaughey, J.H.; Hall, F.G. High temporal resolution NDVI phenology from micrometeorological radiation sensors. J. Geophys. Res., 27935–27944.
47. Jenkins, J.P.; Richardson, A.D.; Braswell, B.H.; Ollinger, S.V.; Hollinger, D.Y.; Smith, M.L. Refining light-use efficiency calculations for a deciduous forest canopy using simultaneous tower-based carbon flux and radiometric measurements. Agric. For. Meteorol., 64–79.
48. Wilson, T.B.; Meyers, T.P. Determining vegetation indices from solar and photosynthetically active radiation fluxes. Agric. For. Meteorol., 160–179.
49. Rocha, A.V.; Shaver, G.R. Advantages of a two band EVI calculated from solar and photosynthetically active radiation fluxes. Agric. For. Meteorol., 1560–1563.
50. Fuchs, M.; Asrar, G.; Kanemasu, E.T.; Hipps, L.E. Leaf area estimates from measurements of photosynthetically active radiation in wheat canopies. Agric. For. Meteorol., 13–22.
51. Sánchez-Azofeifa, G.A.; Kalácska, M.; do Espírito-Santo, M.M.; Fernandes, G.W.; Schnitzer, S. Tropical dry forest succession and the contribution of lianas to wood area index (WAI). For. Ecol. Manage., 941–948.
52. Arzberger, P.; Farazdel, A.; Konagaya, A.; Ang, L.; Shimojo, S.; Stevens, R. Life sciences and cyberinfrastructure: Dual and interacting revolutions that will drive future science. New Gener. Comput., 97–110.
53. JPL–NASA. Planetary Data System Standards Reference; Technical Report JPL D-7669, Part 2; Jet Propulsion Laboratory: Pasadena, CA, USA, 2009. Available online: http://pds.nasa.gov/tools/standards-reference.shtml (accessed on 20 April 2011).
54. Gupchup, J.; Musăloiu-E., R.; Szalay, A.; Terzis, A. Sundial: Using Sunlight to Reconstruct Global Timestamps. In Wireless Sensor Networks; Roedig, U., Sreenan, C., Eds.; Springer: Berlin, Germany, 2009; Volume 5432, pp. 183–198.
55. Sundararaman, B.; Buy, U.; Kshemkalyani, A.D. Clock synchronization for wireless sensor networks: A survey. Ad Hoc Netw., 281–323.
56. Curry, R.; Kiddle, C.; Simmonds, R.; Pastorello, G.Z. An on-line collaborative data management system. In Proceedings of the Gateway Computing Environments Workshop (GCE’10), New Orleans, LA, USA, 14 November 2010; pp. 1–10.
57. Muraoka, H.; Koizumi, H. Satellite Ecology (SATECO)–linking ecology, remote sensing and micrometeorology, from plot to regional scale, for the study of ecosystem structure and function. J. Plant Res., 3–20.