Publication


Featured research published by Dmitrii N. Rassokhin.


Journal of Computational Chemistry | 2001

Multidimensional scaling and visualization of large molecular similarity tables

Dimitris K. Agrafiotis; Dmitrii N. Rassokhin; Victor S. Lobanov

Multidimensional scaling (MDS) is a collection of statistical techniques that attempt to embed a set of patterns described by means of a dissimilarity matrix into a low‐dimensional display plane in a way that preserves their original pairwise interrelationships as closely as possible. Unfortunately, current MDS algorithms are notoriously slow, and their use is limited to small data sets. In this article, we present a family of algorithms that combine nonlinear mapping techniques with neural networks, and make possible the scaling of very large data sets that are intractable with conventional methodologies. The method employs a nonlinear mapping algorithm to project a small random sample, and then “learns” the underlying transform using one or more multilayer perceptrons. The distinct advantage of this approach is that it captures the nonlinear mapping relationship in an explicit function, and allows the scaling of additional patterns as they become available, without the need to reconstruct the entire map. A novel encoding scheme is described, allowing this methodology to be used with a wide variety of input data representations and similarity functions. The potential of the algorithm is illustrated in the analysis of two combinatorial libraries and an ensemble of molecular conformations. The method is particularly useful for extracting low‐dimensional Cartesian coordinate vectors from large binary spaces, such as those encountered in the analysis of large chemical data sets.
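The abstract does not say which error function the nonlinear mapping step minimizes. As an illustrative assumption only, one widely used criterion is Sammon's stress, where $d^{*}_{ij}$ is the original dissimilarity between patterns $i$ and $j$ and $d_{ij}$ is their distance on the map:

```latex
E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}
```

Evaluating $E$ touches all $N(N-1)/2$ pairs, which is the quadratic cost that mapping only a small sample and then learning the transform is meant to avoid.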


Journal of Computational Chemistry | 2001

Nonlinear Mapping of Massive Data Sets by Fuzzy Clustering and Neural Networks

Dmitrii N. Rassokhin; Victor S. Lobanov; Dimitris K. Agrafiotis

Producing good low‐dimensional representations of high‐dimensional data is a common and important task in many data mining applications. Two methods that have been particularly useful in this regard are multidimensional scaling and nonlinear mapping. These methods attempt to visualize a set of objects described by means of a dissimilarity or distance matrix on a low‐dimensional display plane in a way that preserves the proximities of the objects to whatever extent is possible. Unfortunately, most known algorithms are of quadratic order, and their use has been limited to relatively small data sets. We recently demonstrated that nonlinear maps derived from a small random sample of a large data set exhibit the same structure and characteristics as those of the entire collection, and that this structure can be easily extracted by a neural network, making possible the scaling of data sets orders of magnitude larger than those accessible with conventional methodologies. Here, we present a variant of this algorithm based on local learning. The method employs a fuzzy clustering methodology to partition the data space into a set of Voronoi polyhedra, and uses a separate neural network to perform the nonlinear mapping within each cell. We find that this local approach offers a number of advantages, and produces maps that are virtually indistinguishable from those derived with conventional algorithms. These advantages are discussed using examples from the fields of combinatorial chemistry and optical character recognition.
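The partitioning step can be sketched as follows. This is a minimal illustration, assuming the standard fuzzy c-means membership formula with fuzzifier `m` and cluster centers that are already known; the abstract does not name the clustering variant the authors use, and the function names `fcm_memberships` and `voronoi_cell` are hypothetical.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (assumed formula)."""
    dists = [math.dist(point, c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point sits exactly on one or more centers: split membership among them.
        k = sum(zeros)
        return [1.0 / k if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def voronoi_cell(point, centers, m=2.0):
    """Index of the cluster with the highest membership, i.e. the nearest center."""
    u = fcm_memberships(point, centers, m)
    return max(range(len(centers)), key=u.__getitem__)


centers = [(0.0, 0.0), (10.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
cell = voronoi_cell((1.0, 0.0), centers)
```

In a local-learning setup like the one described, the cell index would select which per-cell network maps the point; that dispatch is left out here.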


Journal of Chemical Information and Modeling | 2011

Efficient substructure searching of large chemical libraries: the ABCD chemical cartridge

Dimitris K. Agrafiotis; Victor S. Lobanov; Maxim Shemanarev; Dmitrii N. Rassokhin; Sergei Izrailev; Edward P. Jaeger; Simson Alex; Michael Farnum

Efficient substructure searching is a key requirement for any chemical information management system. In this paper, we describe the substructure search capabilities of ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. The solution consists of several algorithmic components: 1) a pattern mapping algorithm for solving the subgraph isomorphism problem, 2) an indexing scheme that enables very fast substructure searches on large structure files, 3) the incorporation of that indexing scheme into an Oracle cartridge to enable querying large relational databases through SQL, and 4) a cost estimation scheme that allows the Oracle cost-based optimizer to generate a good execution plan when a substructure search is combined with additional constraints in a single SQL query. The algorithm was tested on a public database comprising nearly 1 million molecules using 4,629 substructure queries, the vast majority of which were submitted by discovery scientists over the last 2.5 years of user acceptance testing of ABCD. 80.7% of these queries were completed in less than a second and 96.8% in less than ten seconds on a single CPU, while on eight processing cores these numbers increased to 93.2% and 99.7%, respectively. The slower queries involved extremely generic patterns that returned the entire database as screening hits and required extensive atom-by-atom verification.
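To make the atom-by-atom verification step concrete, here is a naive backtracking matcher for the subgraph isomorphism problem. It is a sketch under simplifying assumptions (atoms compared by element label only, bond orders ignored) and is not the ABCD pattern mapping algorithm, which the abstract does not detail; `has_substructure` is a hypothetical name.

```python
def _adjacency(n_atoms, bonds):
    """Build neighbour sets from a list of (i, j) bonds."""
    adj = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def has_substructure(q_atoms, q_bonds, t_atoms, t_bonds):
    """Return True if the query graph can be mapped into the target graph."""
    q_adj = _adjacency(len(q_atoms), q_bonds)
    t_adj = _adjacency(len(t_atoms), t_bonds)
    mapping = {}  # query atom index -> target atom index

    def place(q):
        if q == len(q_atoms):
            return True  # every query atom has been placed
        for t in range(len(t_atoms)):
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Each already-mapped query neighbour must be bonded to t in the target.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if place(q + 1):
                    return True
                del mapping[q]  # backtrack
        return False

    return place(0)


# Heavy-atom graph of ethanol (C-C-O) and a six-membered carbon ring.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
ring6 = (["C"] * 6, [(i, (i + 1) % 6) for i in range(6)])
```

The worst-case cost of this search grows exponentially with query size, which is why a cheap index-based screen is normally run first and verification is reserved for the surviving candidates.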


PLOS Computational Biology | 2009

A Self-Organizing Algorithm for Modeling Protein Loops

Pu Liu; Fangqiang Zhu; Dmitrii N. Rassokhin; Dimitris K. Agrafiotis

Protein loops, the flexible short segments connecting two stable secondary structural units in proteins, play a critical role in protein structure and function. Constructing chemically sensible conformations of protein loops that seamlessly bridge the gap between the anchor points without introducing any steric collisions remains an open challenge. A variety of algorithms have been developed to tackle the loop closure problem, ranging from inverse kinematics to knowledge-based approaches that utilize pre-existing fragments extracted from known protein structures. However, many of these approaches focus on the generation of conformations that mainly satisfy the fixed end point condition, leaving the steric constraints to be resolved in subsequent post-processing steps. In the present work, we describe a simple solution that simultaneously satisfies not only the end point and steric conditions, but also chirality and planarity constraints. Starting from random initial atomic coordinates, each individual conformation is generated independently by using a simple alternating scheme of pairwise distance adjustments of randomly chosen atoms, followed by fast geometric matching of the conformationally rigid components of the constituent amino acids. The method is conceptually simple, numerically stable and computationally efficient. Very importantly, additional constraints, such as those derived from NMR experiments, hydrogen bonds or salt bridges, can be incorporated into the algorithm in a straightforward and inexpensive way, making the method ideal for solving more complex multi-loop problems. The remarkable performance and robustness of the algorithm are demonstrated on a set of protein loops of length 4, 8, and 12 that have been used in previous studies.


Journal of Molecular Graphics & Modelling | 2003

A modified update rule for stochastic proximity embedding

Dmitrii N. Rassokhin; Dimitris K. Agrafiotis

Recently, we described a fast self-organizing algorithm for embedding a set of objects into a low-dimensional Euclidean space in a way that preserves the intrinsic dimensionality and metric structure of the data [Proc. Natl. Acad. Sci. U.S.A. 99 (2002) 15869-15872]. The method, called stochastic proximity embedding (SPE), attempts to preserve the geodesic distances between the embedded objects, and scales linearly with the size of the data set. SPE starts with an initial configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates so that their distances on the map match more closely their respective proximities. Here, we describe an alternative update rule that drastically reduces the number of calls to the random number generator and thus improves the efficiency of the algorithm.
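The pairwise refinement loop described above can be sketched as below. This is an illustration only: the linearly decaying learning rate, the default parameter values, and the names `spe_embed` and `stress` are assumptions, the neighbourhood cutoff used to approximate geodesic distances is omitted, and the modified update rule itself is not shown because the abstract does not specify it.

```python
import math
import random


def spe_embed(dist, dim=2, cycles=50, steps_per_cycle=200,
              lr_start=1.0, lr_end=0.01, eps=1e-9, seed=0):
    """Embed objects by repeatedly adjusting random pairs toward target distances."""
    rng = random.Random(seed)
    n = len(dist)
    coords = [[rng.random() for _ in range(dim)] for _ in range(n)]
    for cycle in range(cycles):
        # Assumed schedule: learning rate decays linearly from lr_start to lr_end.
        lr = lr_start + (lr_end - lr_start) * cycle / max(cycles - 1, 1)
        for _ in range(steps_per_cycle):
            i, j = rng.sample(range(n), 2)
            delta = [a - b for a, b in zip(coords[i], coords[j])]
            d = math.sqrt(sum(c * c for c in delta))
            # Move both points along their connecting line toward the target distance.
            scale = lr * 0.5 * (dist[i][j] - d) / (d + eps)
            for k in range(dim):
                coords[i][k] += scale * delta[k]
                coords[j][k] -= scale * delta[k]
    return coords


def stress(dist, coords):
    """Normalized squared error between map distances and target distances."""
    num = den = 0.0
    n = len(dist)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            num += (d - dist[i][j]) ** 2
            den += dist[i][j] ** 2
    return num / den


# Three objects whose target distances form a 3-4-5 triangle.
triangle = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
embedded = spe_embed(triangle)
```

Each step costs a fixed amount of work regardless of data set size, and two random indices are drawn per update; the number of random-number calls is the overhead the paper's alternative rule is said to reduce.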


Journal of Chemical Information and Modeling | 2011

Power keys: a novel class of topological descriptors based on exhaustive subgraph enumeration and their application in substructure searching

Pu Liu; Dimitris K. Agrafiotis; Dmitrii N. Rassokhin

We present a novel class of topological molecular descriptors, which we call power keys. Power keys are computed by enumerating all possible linear, branch, and cyclic subgraphs up to a given size, encoding the connected atoms and bonds into two separate components, and recording the number of occurrences of each subgraph. We have applied these new descriptors for the screening stage of substructure searching on a relational database of about 1 million compounds using a diverse set of reference queries. The new keys can eliminate the vast majority (>99.9% on average) of nonmatching molecules within a fraction of a second. More importantly, for many of the queries the screening efficiency is 100%. A common feature was identified for the molecules for which power keys have perfect discriminative ability. This feature can be exploited to obviate the need for expensive atom-by-atom matching in situations where some ambiguity can be tolerated (fuzzy substructure searching). Other advantages over commonly used molecular keys are also discussed.
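The counting-and-screening idea can be illustrated with a much reduced variant. The sketch below enumerates linear paths only, keys them by atom labels alone, and ignores bond types; branched and cyclic subgraphs and the two-component encoding described in the abstract are not reproduced. The names `path_keys` and `may_contain` are hypothetical.

```python
from collections import Counter


def path_keys(atoms, bonds, max_atoms=3):
    """Count simple linear paths of up to max_atoms atoms, keyed by atom labels."""
    adj = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)

    keys = Counter()

    def extend(path):
        # Every path is reached from both ends; count it in one direction only.
        if path[0] <= path[-1]:
            labels = tuple(atoms[i] for i in path)
            keys[min(labels, labels[::-1])] += 1
        if len(path) < max_atoms:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return keys


def may_contain(target_keys, query_keys):
    """Screen: a match is possible only if the target has at least the query's counts."""
    return all(target_keys[k] >= n for k, n in query_keys.items())


# Heavy-atom graph of ethanol (C-C-O) screened against two small queries.
ethanol_keys = path_keys(["C", "C", "O"], [(0, 1), (1, 2)])
co_keys = path_keys(["C", "O"], [(0, 1)])
cn_keys = path_keys(["C", "N"], [(0, 1)])
```

A failed screen rules a molecule out without any atom-by-atom matching; a passed screen only marks it as a candidate, except in the cases the abstract identifies where the keys discriminate perfectly.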


Archive | 2001

Method, system, and computer program product for representing object relationships in a multidimensional space

Dimitris K. Agrafiotis; Dmitrii N. Rassokhin; Victor S. Lobanov; Francis R. Salemme


Journal of Chemical Information and Modeling | 2007

Advanced Biological and Chemical Discovery (ABCD): Centralizing Discovery Knowledge in an Inherently Decentralized World

Dimitris K. Agrafiotis; Simson Alex; Heng Dai; An Derkinderen; Michael Farnum; Peter Gates; Sergei Izrailev; Edward P. Jaeger; Paul Konstant; Albert Leung; Victor S. Lobanov; P. Marichal; Douglas Martin; Dmitrii N. Rassokhin; Maxim Shemanarev; Andrew Skalkin; John Stong; Tom Tabruyn; Marleen Vermeiren; Jackson S. Wan; Xiang-Yang Xu; Xiang Yao


Journal of Molecular Graphics & Modelling | 2000

Kolmogorov-Smirnov statistic and its application in library design

Dmitrii N. Rassokhin; Dimitris K. Agrafiotis


Archive | 2001

System, method, and computer program product for representing object relationships in a multidimensional space

Dimitris K. Agrafiotis; Dmitrii N. Rassokhin; Victor S. Lobanov; F. Raymond Salemme

Collaboration


Dive into Dmitrii N. Rassokhin's collaborations.

Top Co-Authors

Eric Y. Yang

Baylor College of Medicine
