Publication


Featured research published by Igor Terekhov.


High Performance Distributed Computing | 2001

Distributed data access and resource management in the D0 SAM system

Igor Terekhov; R. Pordes; Victoria White; L. Lueking; L. Carpenter; H. Schellman; J. Trumbo; Siniša Veseli; Matt Vranicar; S. White

SAM (Sequential Access through Meta-data) is the data access and job management system for the D0 high energy physics experiment at Fermilab. The SAM system is being developed and used to handle the petabyte-scale experiment data, accessed by hundreds of D0 collaborators scattered around the world. In this paper, we present solutions to some of the distributed data processing problems from the perspective of real experience dealing with mission-critical data. We concentrate on distributed disk caching, resource management and job control. The system has elements of Grid computing and has features applicable to data-intensive computing in general.


Nuclear Instruments & Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment | 2003

The SAM-GRID project: architecture and plan

A. Baranovski; G. Garzoglio; H. Koutaniemi; L. Lueking; S. Patil; R. Pordes; A. Rana; Igor Terekhov; S. Veseli; J. Yu; R. Walker; V. White

SAM is a robust distributed file-based data management and access service, fully integrated with the D0 experiment at Fermilab and in the evaluation phase at the CDF experiment. The goal of the SAM-Grid project is to fully enable distributed computing for the experiments. The architecture of the project is composed of three primary functional blocks: job handling, data handling, and monitoring and information services. Job handling and monitoring/information services are built on top of standard grid technologies (Condor-G/Globus Toolkit), which are integrated with the data handling system provided by SAM. The plan is devised to give users incrementally increasing levels of capability over the next 2 years.


Nuclear Instruments & Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment | 2003

Meta-computing at D0

Igor Terekhov

D0 Run II is one of the two large collider experiments at Fermilab and one of the largest currently running high energy physics experiments in the world. Its amount of data, throughput of data processing, and the size of the collaboration present a unique challenge for the experiment's meta-computing system. To meet the challenge, the SAMGrid system is being developed to allow globally distributed, high-throughput data processing with many Grid features. At the core of the system is the mature data handling system, SAM. We add the Job and Information Management to the data handling to arrive at a complete Grid.


Information Processing Letters | 1999

Time efficient deadlock resolution algorithms

Igor Terekhov; Tracy Camp

We present two efficient polynomial time resolution algorithms for the case of multiple resource units. The complexity of deadlock detection and resolution with our two resolution algorithms is O(Np Nr) and O(Np Nr + Np Nr Nmin), where Np is the number of processes, Nr is the number of resources, and Nmin = min(Np, Nr). We prove that one algorithm is optimal in the special case when every process is blocked on no more than one resource unit. We also present comparison studies of the two algorithms with randomly generated deadlock scenarios. The results illustrate that, on average, the number of aborts in both techniques exceeds the optimum by less than 10%.
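For readers unfamiliar with deadlock handling when resources have multiple units, the following is a minimal sketch of the general idea. It uses the textbook graph-reduction test for detection and a simple greedy victim choice for resolution; it is not either of the two algorithms from the paper, and the function names, data layout, and victim heuristic are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def deadlocked(available: List[int],
               allocation: Dict[str, List[int]],
               request: Dict[str, List[int]]) -> Set[str]:
    """Return the processes that can never finish (graph-reduction test)."""
    work = list(available)
    remaining = set(allocation)
    progress = True
    while progress:
        progress = False
        for p in sorted(remaining):
            # A process can finish if every outstanding request fits in `work`.
            if all(r <= w for r, w in zip(request[p], work)):
                # It then releases everything it holds.
                work = [w + a for w, a in zip(work, allocation[p])]
                remaining.remove(p)
                progress = True
                break  # restart the scan with the enlarged `work`
    return remaining


def resolve_greedy(available: List[int],
                   allocation: Dict[str, List[int]],
                   request: Dict[str, List[int]]) -> List[str]:
    """Abort deadlocked processes one at a time until no deadlock remains.

    Hypothetical heuristic: abort the stuck process holding the most units.
    """
    available = list(available)
    allocation = dict(allocation)
    request = dict(request)
    aborted: List[str] = []
    while True:
        stuck = deadlocked(available, allocation, request)
        if not stuck:
            return aborted
        victim = max(sorted(stuck), key=lambda p: sum(allocation[p]))
        available = [v + a for v, a in zip(available, allocation[victim])]
        del allocation[victim]
        del request[victim]
        aborted.append(victim)


# Two resource types with one unit each; P1 and P2 wait on each other.
alloc = {"P1": [1, 0], "P2": [0, 1], "P3": [0, 0]}
req = {"P1": [0, 1], "P2": [1, 0], "P3": [0, 0]}
stuck_now = deadlocked([0, 0], alloc, req)
victims = resolve_greedy([0, 0], alloc, req)
```

In this small scenario a single abort breaks the cycle. A greedy choice like this one carries no optimality guarantee; minimizing the number of aborts is the question the paper addresses.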


Grid Computing | 2001

The D0 Experiment Data Grid - SAM

L. Lueking; L. Loebel-Carpenter; Wyatt Merritt; C. Moore; R. Pordes; Igor Terekhov; Siniša Veseli; Matt Vranicar; S. White; Victoria White

SAM (Sequential Access through Meta-data) is a data grid and data cataloging system developed for the D0 high energy physics (HEP) experiment at Fermilab. Since March 2001, D0 has been acquiring data in real time from the detector and will archive up to 1/2 Petabyte a year of simulated, raw detector and processed physics data. SAM catalogs the event and calibration data, provides distributed file delivery and caching services, and manages the processing and analysis jobs for the hundreds of D0 collaborators around the world. The D0 applications are data-intensive and the physics analysis programs execute on the order of 1-1000 CPU seconds per 250 KByte of data. SAM manages the transfer of data between the archival storage systems through the globally distributed disk caches and delivers the data files to the users' batch and interactive jobs. Additionally, SAM handles the user job requests and execution scheduling, and manages the use of the available compute, storage and network resources to implement experiment resource allocation policies. D0 has been using early versions of the SAM system for two years for the management of the simulation and test data. The system is in production use with round-the-clock support. D0 is a participant in the Particle Physics Data Grid (PPDG) project. Aspects of the ongoing SAM developments are in collaboration with the computer science groups and other experiments on PPDG. The D0 emphasis is to develop the more sophisticated global grid job, resource management, authentication and information services needed to fully meet the needs of the experiment during the next 6 years of data taking and analysis.


Advanced Computing and Analysis Techniques in Physics Research: VII International Workshop; ACAT 2000 | 2002

SAM for D0—a fully distributed data access system

Igor Terekhov; Victoria White; L. Lueking; L. Carpenter; H. Schellman; J. Trumbo; Siniša Veseli; Matt Vranicar

The SAM (Sequential Access through Meta-data) system is being built as a distributed cache management and data access layer for the D0 experiment at Fermilab. The innovation of the project is the fully distributed architecture of the system which is designed to be deployable worldwide. It uses a central database for the meta-data and a hierarchy of CORBA servers for the actual data movement. SAM provides distributed disk caching, data routing and replication and therefore has components attractive to the Grid.


International Conference on Cluster Computing | 2004

Management of grid jobs and data within SAMGrid

Andrew Baranovski; G. Garzoglio; Igor Terekhov; Alain Roy; Todd Tannenbaum

When designing SAMGrid, a project for distributing high-energy physics computations on a grid, we discovered that it was challenging to decide where to place users' jobs. Jobs typically need to access hundreds of files, and each site has a different subset of the files. Our data system SAM knows what portion of a user's data may be at each site, but does not know how to submit grid jobs. Our job submission system Condor-G knows how to submit grid jobs, but originally it required users to choose grid sites and gave them no assistance in choosing. This work describes how we enhanced Condor-G to interact with SAM to make good decisions about where jobs should be executed, and thereby improve the performance of grid jobs that access large amounts of data. All these enhancements are general enough to be applicable to grid computing beyond the data-intensive computing with SAMGrid.
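The placement problem described in this abstract can be illustrated with a small sketch that ranks sites by the fraction of a job's input files already cached there. This is only an illustration of data-aware site selection under simplified assumptions; it does not reproduce the actual Condor-G or SAM interfaces, and the function name, site names, and file names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def rank_sites(job_files: Set[str],
               site_caches: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Return (site, fraction of job files cached) pairs, best site first."""
    if not job_files:
        # Nothing to stage: every site is equally good.
        return [(site, 0.0) for site in sorted(site_caches)]
    scores = [
        (site, len(job_files & cached) / len(job_files))
        for site, cached in site_caches.items()
    ]
    # Highest cached fraction first; ties broken by site name for determinism.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical job needing four files, and three hypothetical sites.
job = {"f1", "f2", "f3", "f4"}
caches = {
    "site_a": {"f1", "f2", "f3"},
    "site_b": {"f1"},
    "site_c": {"f9"},
}
ranking = rank_sites(job, caches)
best_site = ranking[0][0]
```

A real scheduler would also weigh queue lengths, network cost, and site policy; the cached fraction is just one plausible ranking input.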


Archive | 2004

The SAMGrid database server component: its upgraded infrastructure and future development path

L. Loebel-Carpenter; S. White; A. Baranovski; G. Garzoglio; R. Herber; R. Illingworth; R. Kennedy; A. Kreymer; A. Kumar; L. Lueking; A. L. Lyon; Wyatt Merritt; Igor Terekhov; J. Trumbo; S. Veseli; M. Burgon-Lyon; R. St. Denis; U Glasgow; S. Belforte; Trieste Infn; U. Kerzel; U Karlsruhe; V. Bartsch; M. Leslie; OxfordU.; Piscataway Rutgers U.; Texas Tech.

The SAMGrid Database Server encapsulates several important services, such as accessing the file metadata and replica catalog, keeping track of the processing information, and providing the runtime support for SAMGrid station services. Recent deployment of the SAMGrid system for CDF has resulted in unification of the database schema used by CDF and D0, and the complexity of changes required for the unified metadata catalog has warranted a complete redesign of the DB Server. We describe here the architecture and features of the new server. In particular, we discuss the new CORBA infrastructure that utilizes Python wrapper classes around IDL structs and exceptions. Such infrastructure allows us to use the same code on both server and client sides, which in turn results in significantly improved code maintainability and easier development. We also discuss future integration of the new server with an SBIR II project which is directed toward allowing the DB Server to access distributed databases, implemented in different DB systems and possibly using different schemas.


Archive | 2004

The SAMGrid monitoring service and its integration with MonALISA

A. L. Lyon; P. Vokac; M. Zimmler; G. Baranovski; G. Garzoglio; L. Loebel-Carpenter; R. Herber; R. Illingworth; R. Kennedy; A. Kreymer; A. Kumar; L. Lueking; Wyatt Merritt; Igor Terekhov; J. Trumbo; S. White; S. Veseli; M. Burgon-Lyon; R. St. Denis; U Glasgow; S. Belforte; Trieste Infn; U. Kerzel; U Karlsruhe; OxfordU.; Piscataway Rutgers U.; Texas Tech.

The SAMGrid team is in the process of implementing a monitoring and information service, which fulfills several important roles in the operation of the SAMGrid system, and will replace the first generation of monitoring tools in the current deployments. The first generation tools are in general based on text log-files and represent solutions which are not scalable or maintainable. The roles of the monitoring and information service are: (1) providing diagnostics for troubleshooting the operation of SAMGrid services; (2) providing support for monitoring at the level of user jobs; (3) providing runtime support for local configuration and other information which currently must be stored centrally (thus moving the system toward greater autonomy for the SAMGrid station services, which include cache management and job management services); (4) providing intelligent collection of statistics in order to enable performance monitoring and tuning. The architecture of this service is quite flexible, permitting input from any instrumented SAMGrid application or service. It will allow multiple backend storage for archiving of (possibly) filtered monitoring events, as well as real time information displays and active notification service for alarm conditions. This service will be able to export, in a configurable manner, information to higher level Grid monitoring services, such as MonALISA. We describe our experience to date with using a prototype version together with MonALISA.


Other Information: PBD: 2 Dec 2002 | 2002

Run II data analysis on the grid

Igor Mandrichenko; Igor Terekhov; F. Würthwein

In this document, we begin the technical design for the distributed Run II computing for CDF and D0. The present paper defines the three components of the data handling area of Run II computing, namely the Data Handling System, the Storage System and the Application. We outline their functionality and the interactions between them. We identify necessary and desirable elements of the interfaces.

Collaboration


Dive into Igor Terekhov's collaboration.

Top Co-Authors

S. White

Texas Tech University

A. L. Lyon

University of Rochester

S. Veseli

Texas Tech University
