Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Nikolay Simakov is active.

Publication


Featured research published by Nikolay Simakov.


Extreme Science and Engineering Discovery Environment | 2013

Using XDMoD to facilitate XSEDE operations, planning and analysis

Thomas R. Furlani; Barry L. Schneider; Matthew D. Jones; John Towns; David L. Hart; Steven M. Gallo; Robert L. DeLeon; Charng-Da Lu; Amin Ghadersohi; Ryan J. Gentner; Abani K. Patra; Gregor von Laszewski; Fugang Wang; Jeffrey T. Palmer; Nikolay Simakov

The XDMoD auditing tool provides, for the first time, a comprehensive tool to measure both utilization and performance of high-end cyberinfrastructure (CI), with initial focus on XSEDE. Here, we demonstrate, through several case studies, its utility for providing important metrics regarding resource utilization and performance of TeraGrid/XSEDE that can be used for detailed analysis and planning as well as improving operational efficiency and performance. Measuring the utilization of high-end cyberinfrastructure such as XSEDE helps provide a detailed understanding of how a given CI resource is being utilized and can lead to improved performance of the resource in terms of job throughput or any number of desired job characteristics. In the case studies considered here, a detailed historical analysis of XSEDE usage data using XDMoD clearly demonstrates the tremendous growth in the number of users, overall usage, and scale of the simulations routinely carried out. Not surprisingly, physics, chemistry, and the engineering disciplines are shown to be heavy users of the resources. However, as the data clearly show, the molecular biosciences are now a significant and growing user of XSEDE resources, accounting for more than 20 percent of all SUs consumed in 2012. XDMoD shows that the resources required by the various scientific disciplines are very different. Physics, the astronomical sciences, and the atmospheric sciences tend to solve large problems requiring many cores. Molecular biosciences applications, on the other hand, require many cycles but do not employ core counts that are as large. Such distinctions are important in guiding future cyberinfrastructure design decisions. XDMoD's implementation of a novel application kernel-based auditing system to measure overall CI system performance and quality of service is shown, through several examples, to provide a useful means to automatically detect underperforming hardware and software. This capability is especially critical given the complex composition of today's advanced CI. Examples include an application kernel based on a widely used quantum chemistry program that uncovered a software bug in the I/O stack of a commercial parallel file system, which was subsequently fixed by the vendor in the form of a software patch that is now part of their standard release. This error, which resulted in dramatically increased execution times as well as outright job failure, would likely have gone unnoticed for some time and was uncovered only as a result of the implementation of XDMoD's suite of application kernels.
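
As a hedged illustration of the application-kernel auditing idea described above — repeatedly running fixed benchmarks and automatically flagging runs that drift from their historical baseline — here is a minimal control-chart sketch in Python. The function name, window size, and threshold are our own assumptions, not XDMoD's implementation.

```python
# Hypothetical sketch (not XDMoD's actual code): flag application-kernel runs
# whose execution time exceeds a control limit derived from recent history.
from statistics import mean, stdev

def flag_regressions(times, window=30, k=3.0):
    """Return indices of runs whose time exceeds mean + k*stdev of the
    preceding `window` runs (a simple control-chart rule)."""
    flagged = []
    for i in range(window, len(times)):
        baseline = times[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if times[i] > mu + k * sigma:
            flagged.append(i)
    return flagged

# Example: a file-system bug that roughly doubles an I/O-bound kernel's runtime
history = [100.0 + d for d in (0.5, -1.2, 0.8, 1.1, -0.3) * 6] + [205.0]
print(flag_regressions(history))  # -> [30]
```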


Journal of Molecular Biology | 2013

pH-triggered conformational switching of the diphtheria toxin T-domain: the roles of N-terminal histidines.

Igor V. Kurnikov; Alexander Kyrychenko; Jose C. Flores-Canales; Mykola V. Rodnin; Nikolay Simakov; Mauricio Vargas-Uribe; Yevgen O. Posokhov; Maria Kurnikova; Alexey S. Ladokhin

pH-induced conformational switching is essential for functioning of diphtheria toxin, which undergoes a membrane insertion/translocation transition triggered by endosomal acidification as a key step of cellular entry. In order to establish the sequence of molecular rearrangements and side-chain protonation accompanying the formation of the membrane-competent state of the toxin's translocation (T) domain, we have developed and applied an integrated approach that combines multiple techniques of computational chemistry [e.g., long-microsecond-range, all-atom molecular dynamics (MD) simulations; continuum electrostatics calculations; and thermodynamic integration (TI)] with several experimental techniques of fluorescence spectroscopy. TI calculations indicate that protonation of H257 causes the greatest destabilization of the native structure (6.9 kcal/mol), which is consistent with our early mutagenesis results. Extensive equilibrium MD simulations with a combined length of over 8 μs demonstrate that histidine protonation, while not accompanied by the loss of structural compactness of the T-domain, nevertheless results in substantial molecular rearrangements characterized by the partial loss of secondary structure due to unfolding of helices TH1 and TH2 and the loss of close contact between the C- and N-terminal segments. The structural changes accompanying the formation of the membrane-competent state ensure an easier exposure of the internal hydrophobic hairpin formed by helices TH8 and TH9, in preparation for its subsequent transmembrane insertion.
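
A quick reference for the thermodynamic integration mentioned above: the protonation free-energy difference is obtained by integrating the λ-averaged derivative of the Hamiltonian along an alchemical path. This is the textbook estimator; the paper's exact alchemical setup is not restated here.

```latex
% Textbook TI estimator; H(\lambda) interpolates between the unprotonated
% (\lambda = 0) and protonated (\lambda = 1) states of the histidine.
\Delta G \;=\; \int_{0}^{1}
  \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda}
  \, d\lambda
```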


Journal of Physical Chemistry B | 2010

Soft Wall Ion Channel in Continuum Representation with Application to Modeling Ion Currents in α-Hemolysin

Nikolay Simakov; Maria Kurnikova

A soft repulsion (SR) model of short-range interactions between mobile ions and protein atoms is introduced in the framework of continuum representation of the protein and solvent. The Poisson-Nernst-Planck (PNP) theory of ion transport through biological channels is modified to incorporate this soft wall protein model. Two sets of SR parameters are introduced. The first is parametrized for all essential amino acid residues using all-atom molecular dynamics simulations; the second is a truncated Lennard-Jones potential. We have further designed an energy-based algorithm for the determination of the ion-accessible volume, which is appropriate for a particular system discretization. The effects of these models of short-range interactions were tested by computing current-voltage characteristics of the α-hemolysin channel. The introduced SR potentials significantly improve prediction of channel selectivity. In addition, we studied the effect of the choice of some space-dependent diffusion coefficient distributions on the predicted current-voltage properties. We conclude that the diffusion coefficient distributions largely affect total currents and have little effect on rectification, selectivity, or reversal potential. The PNP-SR algorithm is implemented in a new efficient parallel Poisson, Poisson-Boltzmann, and PNP equation solver, also incorporated in the graphical molecular modeling package HARLEM.
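
For orientation, the modification described above amounts to adding a short-range (SR) potential to the ionic potential of mean force in the standard steady-state PNP system. The notation below is generic, not copied from the paper:

```latex
% Generic steady-state PNP equations with an added short-range term U_i^SR;
% c_i, q_i, D_i are the concentration, charge, and position-dependent diffusion
% coefficient of ion species i, and \phi is the electrostatic potential.
\nabla \cdot \mathbf{J}_i = 0, \qquad
\mathbf{J}_i = -D_i(\mathbf{r})\!\left[\nabla c_i
  + \frac{c_i}{k_B T}\,\nabla\!\left(q_i\phi + U_i^{\mathrm{SR}}\right)\right],
\qquad
\nabla \cdot \left[\varepsilon(\mathbf{r})\,\nabla\phi\right]
  = -\rho_{\mathrm{fixed}} - \sum_i q_i c_i
```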


International Conference on Cluster Computing | 2015

Analysis of XDMoD/SUPReMM Data Using Machine Learning Techniques

Steven M. Gallo; Joseph P. White; Robert L. DeLeon; Thomas R. Furlani; Helen Ngo; Abani K. Patra; Matthew D. Jones; Jeffrey T. Palmer; Nikolay Simakov; Jeanette M. Sperhac; Martins Innus; Thomas Yearke; Ryan Rathsam

Machine learning techniques were applied to job accounting and performance data for application classification. Job data were accumulated using the XDMoD monitoring technology named SUPReMM; the data consist of job accounting information, application information from Lariat/XALT, and job performance data from TACC_Stats. The results clearly demonstrate that community applications have characteristic signatures which can be exploited for job classification. We conclude that machine learning can assist in classifying jobs of unknown application, in characterizing the job mixture, and in harnessing the variation in node and time dependence for further analysis.
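
To make the classification idea concrete, here is a minimal sketch with invented per-job features and scikit-learn's random forest; the real study used SUPReMM accounting and TACC_Stats performance data, and the abstract does not name the specific learner, so treat every name below as an assumption.

```python
# Toy sketch of supervised job classification from performance signatures
# (synthetic features, not the SUPReMM/TACC_Stats schema).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Invented per-job features: [cores, FLOP rate, memory bandwidth, I/O rate]
X = rng.normal(size=(n, 4))
# Pretend three community applications with characteristic signatures
y = (X[:, 1] + 0.8 * X[:, 2] > 0).astype(int) + (X[:, 3] > 1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```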


Scopus | 2014

Comprehensive, open-source resource usage measurement and analysis for HPC systems

James C. Browne; Robert L. DeLeon; Abani K. Patra; William L. Barth; John Hammond; Jones; Tom Furlani; Barry I. Schneider; Steven M. Gallo; Amin Ghadersohi; Ryan J. Gentner; Jeffrey T. Palmer; Nikolay Simakov; Martins Innus; Andrew E. Bruno; Joseph P. White; Cynthia D. Cornelius; Thomas Yearke; Kyle Marcus; G. von Laszewski; Fugang Wang

The important role high-performance computing (HPC) resources play in science and engineering research, coupled with their high cost (capital, power, and manpower), short life, and oversubscription, requires us to optimize their usage – an outcome that is only possible if adequate analytical data are collected and used to drive systems management at different granularities – job, application, user, and system. This paper presents a method for comprehensive job, application, and system-level resource use measurement and analysis, and its implementation. The steps in the method are: system-wide collection of comprehensive resource use and performance statistics at the job and node levels in a uniform format across all resources; mapping and storage of the resultant job-wise data in a relational database; and transformation of the data into the formats required by specific statistical and analytical algorithms. Analyses can be carried out at different levels of granularity: job, user, application, or system-wide. Measurements are based on a new lightweight job-centric measurement tool, TACC_Stats, which gathers a comprehensive set of resource use metrics on all compute nodes, and on data logged by the system scheduler. The data mapping and analysis tools are an extension of the XDMoD project. The method is illustrated with analyses of resource use for the Texas Advanced Computing Center's Lonestar4, Ranger, and Stampede supercomputers and the HPC cluster at the Center for Computational Research. The illustrations are focused on resource use at the system, job, and application levels and reveal many interesting insights into system usage patterns as well as anomalous behavior due to failure or misuse. The method can be applied to any system that runs the TACC_Stats measurement tool together with a tool to extract job execution environment data from the system scheduler.
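
A hedged sketch of the "map job-wise data into a relational database, then analyze at several granularities" step, using an invented schema (not XDMoD's actual one) and Python's built-in sqlite3:

```python
# Invented job schema and an application-level rollup query, illustrating
# one of the granularities (job, user, application, system) the paper analyzes.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE jobs (
    job_id INTEGER PRIMARY KEY, user TEXT, app TEXT,
    nodes INTEGER, cpu_hours REAL, mem_gb_max REAL)""")
con.executemany(
    "INSERT INTO jobs VALUES (?,?,?,?,?,?)",
    [(1, "alice", "namd", 8, 512.0, 42.0),
     (2, "bob",   "wrf",  64, 8192.0, 110.0),
     (3, "alice", "namd", 16, 1024.0, 40.0)])

for row in con.execute("""SELECT app, COUNT(*), SUM(cpu_hours)
                          FROM jobs GROUP BY app ORDER BY 3 DESC"""):
    print(row)
```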


Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale | 2016

A Quantitative Analysis of Node Sharing on HPC Clusters Using XDMoD Application Kernels

Nikolay Simakov; Robert L. DeLeon; Joseph P. White; Thomas R. Furlani; Martins Innus; Steven M. Gallo; Matthew D. Jones; Abani K. Patra; Benjamin D. Plessinger; Jeanette M. Sperhac; Thomas Yearke; Ryan Rathsam; Jeffrey T. Palmer

In this investigation, we study how application performance is affected when jobs are permitted to share compute nodes. A series of application kernels consisting of a diverse set of benchmark calculations were run in both exclusive and node-sharing modes on the Center for Computational Research's high-performance computing (HPC) cluster. Very little increase in runtime was observed due to job contention among application kernel jobs run on shared nodes. The small differences in runtime were quantitatively modeled in order to characterize the resource contention and attempt to determine the circumstances under which it would or would not be important. A machine learning regression model applied to the runtime data successfully fitted the small differences between the exclusive and shared node runtime data; it also provided insight into the contention for node resources that occurs when jobs are allowed to share nodes. Analysis of a representative job mix shows that runtime of shared jobs is affected primarily by the memory subsystem, in particular by the reduction in the effective cache size due to sharing; this leads to higher utilization of DRAM. Insights such as these are crucial when formulating policies proposing node sharing as a mechanism for improving HPC utilization.
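
As a sketch of the regression modeling described above — predicting the shared-versus-exclusive runtime difference from memory-subsystem contention proxies — the following uses invented features and scikit-learn's gradient boosting; the paper's actual feature set and model are not restated in the abstract.

```python
# Toy regression on invented contention proxies; the synthetic target assumes
# slowdown is driven mainly by effective-cache-size reduction, mirroring the
# abstract's finding, but the data here are fabricated for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 400
# Features: [co-runner cache footprint, co-runner DRAM bandwidth, own cache use]
X = rng.uniform(0, 1, size=(n, 3))
slowdown = 0.05 * X[:, 0] * X[:, 2] + 0.02 * X[:, 1] + rng.normal(0, 0.005, n)

model = GradientBoostingRegressor(random_state=0).fit(X, slowdown)
print("feature importances:", model.feature_importances_.round(2))
```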


Journal of Physical Chemistry B | 2016

Homolytic Cleavage of Both Heme-Bound Hydrogen Peroxide and Hydrogen Sulfide Leads to the Formation of Sulfheme

Hector D. Arbelo-López; Nikolay Simakov; Jeremy C. Smith; Juan López-Garriga; Troy Wymore

Many heme-containing proteins with a histidine in the distal E7 (HisE7) position can form sulfheme in the presence of hydrogen sulfide (H2S) and a reactive oxygen species such as hydrogen peroxide. For reasons unknown, sulfheme derivatives are formed specifically on solvent-excluded heme pyrrole B. Sulfhemes severely decrease the oxygen-binding affinity in hemoglobin (Hb) and myoglobin (Mb). Here, use of hybrid quantum mechanical/molecular mechanical methods has permitted characterization of the entire process of sulfheme formation in the HisE7 mutant of hemoglobin I (HbI) from Lucina pectinata. This process includes a mechanism for H2S to enter the solvent-excluded active site through a hydrophobic channel to ultimately form a hydrogen bond with H2O2 bound to Fe(III). Proton transfer from H2O2 to His64 to form compound (Cpd) 0, followed by hydrogen transfer from H2S to the Fe(III)-H2O2 complex, results in homolytic cleavage of the O-O and S-H bonds to form a reactive thiyl radical (HS(•)), ferryl heme Cpd II, and a water molecule. Subsequently, the addition of HS(•) to Cpd II, followed by three proton transfer reactions, results in the formation of a three-membered ring ferric sulfheme that avoids migration of the radical to the protein matrix, in contrast to that in other peroxidative reactions. The transformation of this three-membered episulfide ring structure to the five-membered thiochlorin ring structure occurs through a significant potential energy barrier, although both structures are nearly isoenergetic. Both three- and five-membered ring structures reveal longer NB-Fe(III) bonds compared with other pyrrole nitrogen-Fe(III) bonds, which would lead to decreased oxygen binding. Overall, these results are in agreement with a wide range of experimental data and provide fertile ground for further investigations of sulfheme formation in other heme proteins and additional effects of H2S on cell signaling and reactivity.
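
For readers unfamiliar with the hybrid method used above, one common additive QM/MM energy decomposition is shown below; this is a generic form, and the paper's exact coupling scheme is not restated in the abstract.

```latex
% Generic additive QM/MM decomposition: the reactive heme/H2O2/H2S region is
% treated quantum mechanically, the protein environment classically, plus a
% coupling term (electrostatic embedding is one common choice).
E_{\mathrm{QM/MM}} \;=\; E_{\mathrm{QM}} \;+\; E_{\mathrm{MM}}
  \;+\; E_{\mathrm{QM\text{-}MM}}
```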


Scopus | 2015

Application kernels: HPC resources performance monitoring and variance analysis

Nikolay Simakov; Joseph P. White; Robert L. DeLeon; Amin Ghadersohi; Tom Furlani; Jones; Steven M. Gallo; Abani K. Patra

Application kernels are computationally lightweight benchmarks or applications run repeatedly on high performance computing (HPC) clusters in order to track the Quality of Service (QoS) provided to the users. They have been successful in detecting a variety of hardware and software issues, some severe, that have subsequently been corrected, resulting in improved system performance and throughput. In this work, the application kernel performance monitoring module of eXtreme Data Metrics on Demand (XDMoD) is described. Through the XDMoD framework, the application kernels have been run repetitively on the Texas Advanced Computing Center's Stampede and Lonestar4 clusters for a total of over 14,000 jobs. This provides a body of data on the HPC clusters' operation that can be used to statistically analyze how the application performance, as measured by metrics such as execution time and communication bandwidth, is affected by the clusters' workload. We discuss metric distributions, carry out regression and correlation analyses, and use a principal component analysis (PCA) study to describe the variance and relate the variance to factors such as the spatial distribution of the application in the cluster. Ultimately, these types of analyses can be used to improve the application kernel mechanism, which in turn results in improved QoS of the HPC infrastructure that is delivered to the end users.
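
A minimal sketch of the PCA-based variance analysis described above, on synthetic per-run metrics (the real study used execution times, communication bandwidths, and similar metrics from XDMoD application-kernel runs):

```python
# Synthetic per-run metrics in which execution time and bandwidth co-vary with
# a node-spread proxy, echoing the spatial-distribution factor in the abstract.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 500
spread = rng.uniform(0, 1, n)                 # spatial-distribution proxy
metrics = np.column_stack([
    100 + 20 * spread + rng.normal(0, 2, n),  # time grows with node spread
    10 - 4 * spread + rng.normal(0, 0.5, n),  # bandwidth shrinks with it
    rng.normal(5, 1, n),                      # unrelated I/O-time metric
    spread])

pca = PCA(n_components=2).fit(StandardScaler().fit_transform(metrics))
print("explained variance ratio:", pca.explained_variance_ratio_.round(2))
```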


International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems | 2017

A Slurm Simulator: Implementation and Parametric Analysis

Nikolay Simakov; Martins Innus; Matthew D. Jones; Robert L. DeLeon; Joseph P. White; Steven M. Gallo; Abani K. Patra; Thomas R. Furlani

Slurm is an open-source resource manager for HPC that provides high configurability for inhomogeneous resources and job scheduling. Various Slurm parametric settings can significantly influence HPC resource utilization and job wait time; however, in many cases it is hard to judge how these options will affect the overall HPC resource performance. The Slurm simulator can be a very helpful tool to aid parameter selection for a particular HPC resource. Here, we report our implementation of a Slurm simulator and the impact of parameter choice on HPC resource performance. The simulator is based on a real Slurm instance with modifications to allow simulation of historical jobs and to improve the simulation speed. The simulator speed heavily depends on job composition, HPC resource size, and Slurm configuration. For an 8000-core heterogeneous cluster, we achieve about 100 times acceleration; for example, 20 days of workload can be simulated in about 5 hours. Several parameters affecting job placement were studied. Disabling node sharing on our 8000-core cluster showed a 45% increase in the time needed to complete the same workload. For a large system (>6000 nodes) composed of two distinct sub-clusters, using two separate Slurm controllers and enabling node sharing can cut waiting times nearly in half.
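
To illustrate what replaying a historical job trace through a scheduler means at its core, here is a toy FIFO simulator in Python; the actual simulator is a modified Slurm instance, not this sketch, and the trace below is invented.

```python
# Toy FIFO replay of a historical job trace: advance a virtual clock,
# start jobs when enough cores are free, and collect wait times.
import heapq

def simulate(jobs, total_cores):
    """jobs: list of (submit_time, runtime, cores), each job assumed to fit
    on the machine. Returns the mean wait time."""
    queue = sorted(jobs)            # FIFO by submit time
    free = total_cores
    ending = []                     # min-heap of (end_time, cores)
    clock, waits = 0.0, []
    while queue:
        submit, runtime, cores = queue[0]
        clock = max(clock, submit)
        while ending and ending[0][0] <= clock:   # release finished jobs
            free += heapq.heappop(ending)[1]
        if cores <= free:
            queue.pop(0)
            free -= cores
            heapq.heappush(ending, (clock + runtime, cores))
            waits.append(clock - submit)
        else:
            clock = ending[0][0]    # jump to the next job completion
    return sum(waits) / len(waits)

trace = [(0, 3600, 512), (10, 600, 256), (20, 600, 512)]
print(simulate(trace, 1000))        # third job must wait for the first
```

Swapping in different core counts or scheduling rules shows how such parameters shift the wait-time statistics, which is the kind of question the paper studies at full Slurm fidelity.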


Molecular Based Mathematical Biology | 2013

Graphical Processing Unit accelerated Poisson equation solver and its application for calculation of single ion potential in ion-channels

Nikolay Simakov; Maria Kurnikova

Poisson and Poisson-Boltzmann equations (PE and PBE) are widely used in molecular modeling to estimate the electrostatic contribution to the free energy of a system. In such applications, PE often needs to be solved multiple times for a large number of system configurations. This can rapidly become a highly demanding computational task. To accelerate such calculations, we implemented the graphical processing unit (GPU) PE solver described in this work. The GPU solver performance is compared to that of our central processing unit (CPU) implementation of the solver. During the performance analysis, the following three characteristics were studied: (1) precision associated with the modeled system discretization on the grid, (2) numeric precision associated with the floating point representation of real numbers (this is done via comparison of calculations with single precision (SP) and double precision (DP)), and (3) execution time. Two types of example calculations were carried out to evaluate the solver performance: (1) solvation energy of a single ion and a small protein (lysozyme), and (2) a single ion potential in a large ion channel (α-hemolysin). In addition, the influence of various boundary condition (BC) choices was analyzed, to determine the most appropriate BC for systems that include a membrane, typically represented by a slab with a low dielectric constant. The implemented GPU PE solver is overall about 7 times faster than the CPU-based version (using all four CPU cores). Therefore, a single computer equipped with multiple GPUs can offer a computational power comparable to that of a small cluster. Our calculations showed that DP versions of the CPU and GPU solvers provide nearly identical results. SP versions of the solvers have very similar behavior: in the grid-resolution range of 1-4 grid points/Å, the difference between SP and DP versions is less than the difference stemming from the system discretization. We found that for the membrane protein, the use of a focusing technique with periodic boundary conditions on the rough grid provides significantly better results than a focusing technique with the electric potential set to zero at the boundaries.
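
The numerical core of a PE solver of this kind can be sketched with a Jacobi finite-difference iteration; the toy below runs on the CPU with NumPy and omits the spatially varying dielectric, the focusing technique, and the GPU acceleration discussed above.

```python
# Jacobi finite-difference iteration for laplacian(phi) = -rho/eps on a
# uniform 3D grid with phi = 0 on the box boundary (a CPU/NumPy toy).
import numpy as np

def solve_poisson(rho, h=1.0, eps=1.0, tol=1e-8, max_iter=5000):
    phi = np.zeros_like(rho)
    for _ in range(max_iter):
        new = phi.copy()
        # Average of the six neighbors plus the scaled source term
        new[1:-1, 1:-1, 1:-1] = (
            phi[2:, 1:-1, 1:-1] + phi[:-2, 1:-1, 1:-1] +
            phi[1:-1, 2:, 1:-1] + phi[1:-1, :-2, 1:-1] +
            phi[1:-1, 1:-1, 2:] + phi[1:-1, 1:-1, :-2] +
            h * h * rho[1:-1, 1:-1, 1:-1] / eps) / 6.0
        diff = np.max(np.abs(new - phi))
        phi = new
        if diff < tol:
            break
    return phi

rho = np.zeros((33, 33, 33))
rho[16, 16, 16] = 1.0                    # unit point charge at the center
phi = solve_poisson(rho)
print(phi[16, 16, 16], phi[16, 16, 24])  # potential decays away from the charge
```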

Collaboration


Dive into Nikolay Simakov's collaborations.

Top Co-Authors

Jeffrey T. Palmer

State University of New York System


Thomas Yearke

State University of New York System
