Miguel A. Castro-García




Publication

Featured research published by Miguel A. Castro-García.

Lecture Notes in Computer Science | 2004

Easing Message-Passing Parallel Programming Through a Data Balancing Service

Graciela Román-Alonso; Miguel A. Castro-García; Jorge Buenabad-Chávez

The message passing model is now widely used for parallel computing, but is still difficult to use with some applications. The explicit data distribution or the explicit dynamic creation of parallel tasks can require a complex algorithm. In this paper, in order to avoid explicit data distribution, we propose a programming approach based on a data load balancing service for MPI-C. Using a parallel version of the merge sort algorithm, we show how our service avoids explicit data distribution completely, easing parallel programming. Some performance results are presented which compare our approach to a version of merge sort with explicit data distribution.

IEEE Transactions on Computers | 2014

Parallel Simulation of Pore Networks Using Multicore CPUs

J. Matadamas-Hernandez; G. Roman-Alonso; F. Rojas-Gonzalez; Miguel A. Castro-García; Azzedine Boukerche; M. Aguilar-Cornejo; S. Cordero-Sanchez

Pore networks can be simulated in silico by using the dual site-bond model. In this approach, a set of cavities (sites) are interconnected by means of a set of throats (bonds), with the constraint that each site should always be larger than any of its delimiting bonds. The NoMISS greedy algorithm has been implemented recently in order to address this task; nevertheless, even if this procedure is relatively fast, problems arise related to large memory consumption and long computing time as pore networks become larger. Here, three parallel methods are proposed to allow efficient construction of large pore networks. The first method is a parallel Monte Carlo procedure, which applies a number of exchanges among pore sizes in order to obtain a valid pore network. The other two methods are parallel versions of the pioneering NoMISS greedy algorithm. The first version uses static data partitioning to speed up the running time, whilst the second applies a dynamic data distribution policy to improve the pore network quality. The obtained results show the behavior of each proposed version with respect to performance and quality, employing the resources of a 125-core Linux cluster.
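The site-bond constraint described in this abstract can be illustrated with a small validity check. This is only a sketch under stated assumptions: the data layout (a dictionary of site sizes and a list of bond tuples) and the function names `count_violations` and `is_valid_network` are hypothetical and are not taken from the paper or from NoMISS.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: site id -> site size, and bonds as
# (site_a, site_b, bond_size) tuples. Not the paper's data structure.
Bond = Tuple[int, int, float]


def count_violations(site_sizes: Dict[int, float], bonds: List[Bond]) -> int:
    """Count bonds that are not strictly smaller than both sites they join."""
    violations = 0
    for site_a, site_b, bond_size in bonds:
        smallest_site = min(site_sizes[site_a], site_sizes[site_b])
        if bond_size >= smallest_site:
            violations += 1
    return violations


def is_valid_network(site_sizes: Dict[int, float], bonds: List[Bond]) -> bool:
    """A network is valid when every site is larger than all of its bonds."""
    return count_violations(site_sizes, bonds) == 0


if __name__ == "__main__":
    sites = {0: 5.0, 1: 4.0, 2: 3.0}
    print(is_valid_network(sites, [(0, 1, 3.5), (1, 2, 2.0)]))
    print(is_valid_network(sites, [(0, 1, 4.5)]))
```

A Monte Carlo procedure of the kind the abstract mentions could, in principle, use a violation count like this to decide whether a network has become valid after a series of size exchanges; how the paper actually does so is not stated here.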

international conference on electrical engineering, computing science and automatic control | 2011

Reducing communication overhead under parallel list processing in multicore clusters

Jorge Buenabad-Chávez; Miguel A. Castro-García; Jose Luis Quiroz-Fabian; Edgar F. Hernández-Ventura; Graciela Román-Alonso; Daniel M. Yellin; Manuel Aguilar-Cornejo

The Data List Management Library (DLML) processes data lists in parallel, balancing the workload transparently to programmers. Its first design was targeted at clusters of uniprocessor nodes, and based on multiprocess parallelism and on message-passing communication. This paper presents a multithreaded design of DLML aimed at clusters of multicore nodes to better capitalise on intra-node parallelism. On applications tested, MultiCore DLML runs twice as fast as DLML when message-passing communication is not excessive. Good performance was achieved only after addressing issues relating to MPI communication overhead, cache locality and memory consumption.

international conference of the ieee engineering in medicine and biology society | 2007

Segmentation of Brain Image Volumes Using the Data List Management Library

G. Roman-Alonso; J.R. Jimenez-Alaniz; Jorge Buenabad-Chávez; Miguel A. Castro-García; A.H. Vargas-Rodriguez

The segmentation of head images is useful to detect neuroanatomical structures and to follow and quantify the evolution of several brain lesions. 2D images correspond to brain slices. The more images are used, the higher the resolution obtained, but more processing power is required and parallelism becomes desirable. We present a new approach to segmentation of brain image volumes using DLML (data list management library), a tool developed by our team. We organise the integer numbers identifying images into a list, and our DLML version processes them both in parallel and with dynamic load balancing transparently to the programmer. We compare the performance of our DLML version to other typical parallel approaches developed with MPI (master-slave and static data distribution), using cluster configurations with 4-32 processors.
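The idea of placing image identifiers in a list and letting idle workers pull the next item can be sketched as follows. This is not the DLML API: a shared in-process queue and threads stand in for DLML's distributed list, and `process_list` and `segment_slice` are hypothetical names used only for illustration.

```python
import queue
import threading
from typing import Callable, Dict, Iterable


def process_list(items: Iterable[int],
                 handle_item: Callable[[int], object],
                 num_workers: int = 4) -> Dict[int, object]:
    """Process every item once; each worker takes the next item when idle."""
    work: "queue.Queue[int]" = queue.Queue()
    for item in items:
        work.put(item)

    results: Dict[int, object] = {}
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                item = work.get_nowait()
            except queue.Empty:
                return  # the list is empty, so this worker is done
            outcome = handle_item(item)
            with results_lock:
                results[item] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def segment_slice(image_id: int) -> str:
    """Hypothetical placeholder for segmenting one 2D brain slice."""
    return f"segmented slice {image_id}"


if __name__ == "__main__":
    print(process_list(range(8), segment_slice, num_workers=3))
```

Because workers fetch items on demand, a slice that takes longer to process does not hold up the rest of the list, which is the property that static data distribution lacks.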

Journal of Computational Science | 2016

Pore networks subjected to variable connectivity and geometrical restrictions: A simulation employing a multicore system

Salomón Cordero-Sánchez; Fernando Rojas-González; Graciela Román-Alonso; Miguel A. Castro-García; Manuel Aguilar-Cornejo; J. Matadamas-Hernandez

Pore networks considering variable connectivity and geometrical restrictions among voids of assorted sizes are simulated using an 8-multicore computing system. The topology of the resulting networks is visualized in terms of the sizes and connectivity of the pores through color graphics. Results allow the calculation of percolation thresholds, correlation lengths among pores, fractal dimensions of percolation clusters, and conditional probabilities among connected pore sizes. In addition, it is possible to observe disconnected pore islands of different sizes, depending on the structural correlation among pores.

distributed simulation and real-time applications | 2012

Pore Networks Simulation with Parallel Greedy Algorithms

Graciela Román-Alonso; Azzedine Boukerche; J. Matadamas-Hernandez; Miguel A. Castro-García

Porous media simulation is an important contribution to the study of many physical phenomena. The No MISS greedy algorithm stands out among the existing sequential algorithms for constructing a pore sub network relatively quickly. However, despite the time reduction achieved by No MISS, there are still problems related to the required processing time when very large networks need to be studied. In this work, a non-scalable parallel version of the No MISS algorithm is presented, and a new approach is proposed to alleviate this issue. In both versions, cluster cores work simultaneously on different porous sub network spaces. The first approach, named Unbounded-No MISS, allows the cores to go forward with the initialization of the porous sub network space, applying a balancing policy when a core needs more data. At the end, the cores require a sequential synchronization to finish the porous network construction. The second approach, named Bounded-No MISS, controls the porous sub network initialization by considering a site-size boundary, avoiding the final strong synchronization and improving scalability considerably. The results obtained using a 125-core cluster are presented.

parallel, distributed and network-based processing | 2010

Load Balancing Algorithms with Partial Information Management for the DLML Library

Juan Santana-Santana; Miguel A. Castro-García; Manuel Aguilar-Cornejo; Graciela Román-Alonso

Load balancing algorithms are an essential component of parallel computing, reducing the response time of applications. Frequently, balancing algorithms have a centralized behavior requiring a lot of messages to operate, thus causing scalability problems. A solution to improve scalability is to define a decentralized algorithm, avoiding the generation of bottlenecks. DLML (Data List Management Library) is a tool that, in a transparent way, allows the parallel processing of data that are organized in a list. One drawback of this tool is the global bidding algorithm used to distribute the data (work) generated during the execution. In this paper, two load balancing algorithms for DLML handling partial information are proposed. The first algorithm considers a logical Torus topology and the second one follows a Binary Tree topology for communications. Results show how the scalability of DLML was improved, using two clusters of 40 and 1024 processing units, and executing dynamic and static applications.
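With partial information, each process only exchanges load data with its neighbours in a logical topology. The sketch below computes those neighbour sets for a 2D torus and for a binary tree. The rank layouts are assumptions made for illustration (row-major ranks on the torus, heap-style numbering in the tree); the abstract does not say how the paper maps ranks onto either topology.

```python
from typing import List


def torus_neighbors(rank: int, rows: int, cols: int) -> List[int]:
    """Neighbours of `rank` on a rows x cols torus, assuming row-major ranks."""
    row, col = divmod(rank, cols)
    up = ((row - 1) % rows) * cols + col
    down = ((row + 1) % rows) * cols + col
    left = row * cols + (col - 1) % cols
    right = row * cols + (col + 1) % cols
    # A set removes duplicates that appear on very small tori.
    return sorted({up, down, left, right} - {rank})


def binary_tree_neighbors(rank: int, size: int) -> List[int]:
    """Parent and children of `rank`, assuming heap-style numbering from 0."""
    neighbors = []
    if rank > 0:
        neighbors.append((rank - 1) // 2)  # parent
    for child in (2 * rank + 1, 2 * rank + 2):
        if child < size:
            neighbors.append(child)
    return neighbors


if __name__ == "__main__":
    print(torus_neighbors(5, rows=4, cols=4))
    print(binary_tree_neighbors(1, size=7))
```

In both layouts a process talks to at most four (torus) or three (tree) peers regardless of the total number of processes, which is why such schemes need fewer messages than a global bidding round.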

mexican international conference on computer science | 2008

Model Checking for Integrating Dynamic Load Distribution into Parallel Applications

Jose Luis Quiroz-Fabian; Manuel Aguilar-Cornejo; Graciela Román-Alonso; Miguel A. Castro-García

Many parallel applications running on a distributed memory cluster generate data dynamically to process during their execution. In this case it is possible that some cluster nodes become overloaded. To improve performance we can integrate a dynamic data distribution algorithm. The integration of a dynamic load distribution policy into an application must consider the correct programming of several synchronisation/communication points in order to avoid deadlock or data loss problems. In this work we show how a model checking technique can be used to verify formally and automatically whether an application along with a load distribution algorithm works properly. We first propose a model for a parallel application that uses a dynamic load distribution policy to transfer its generated data to other processors (when there are some processors that can help with data processing). In our model in particular we defined a cyclic distribution policy. We also propose a set of functioning properties that our model and any parallel application that uses dynamic load distribution must fulfill. Then we apply a formal verification technique using the model checker SPIN to ensure that such properties are satisfied. To show an application of our model we used MPI to implement it and solve the N-Queens problem, where billions of possible solutions (data) are generated and processed. We show some results obtained using a 16-processor system.
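A rough feel for dynamically generated work being handed out cyclically can be had from the sequential simulation below. It is a sketch only: it does not use MPI or SPIN, the reading of "cyclic" as round-robin over the other workers is an assumption, and the names `is_safe` and `solve_n_queens_cyclic` are hypothetical rather than taken from the paper.

```python
from collections import deque
from typing import List, Tuple


def is_safe(placement: Tuple[int, ...], col: int) -> bool:
    """Check that a queen in the next row at `col` attacks no earlier queen."""
    row = len(placement)
    for earlier_row, earlier_col in enumerate(placement):
        same_column = earlier_col == col
        same_diagonal = abs(earlier_col - col) == row - earlier_row
        if same_column or same_diagonal:
            return False
    return True


def solve_n_queens_cyclic(n: int, workers: int) -> Tuple[int, List[int]]:
    """Simulate `workers` processes; new partial boards are sent round-robin."""
    queues = [deque() for _ in range(workers)]
    queues[0].append(())  # the empty board starts on worker 0
    next_target = [(w + 1) % workers for w in range(workers)]
    processed = [0] * workers
    solutions = 0

    while any(queues):
        for w in range(workers):
            if not queues[w]:
                continue
            placement = queues[w].popleft()
            processed[w] += 1
            if len(placement) == n:
                solutions += 1
                continue
            for col in range(n):
                if is_safe(placement, col):
                    # Hand each generated item to the next worker in the cycle.
                    target = next_target[w]
                    queues[target].append(placement + (col,))
                    next_target[w] = (target + 1) % workers
    return solutions, processed


if __name__ == "__main__":
    count, load = solve_n_queens_cyclic(6, workers=4)
    print(count, load)
```

In a real message-passing version, each append to another worker's queue becomes a send and a matching receive, and those are the synchronisation points where deadlock or lost data can occur and where model checking is useful.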

Proceedings of the 21st European MPI Users' Group Meeting on | 2014

A Graphical Environment for Development of MPI Applications

J. L. Quiroz-Fabián; Graciela Román-Alonso; Miguel A. Castro-García; J. Buenabad-Chávez; Manuel Aguilar-Cornejo

This paper presents GD-MPI: a Graphical environment for Development of parallel MPI applications. GD-MPI offers users a web browser-based GUI to graphically specify both workflows that represent a set of Java-MPI processes and the communication between these processes, including group creation, point-to-point and collective communications. GD-MPI also runs such processes remotely.

international conference on electrical engineering, computing science and automatic control | 2013

Greedily using GPU capacity for data list processing in multicore-GPU platforms

Carlos Alberto Martinez-Angeles; Jorge Buenabad-Chávez; Miguel A. Castro-García; Jose Luis Quiroz-Fabian

We have designed data list processing for multicore-GPU platforms and significantly improved the performance of both numerical and symbolic applications. For the latter, a novel aspect of our design was the management and processing of new data dynamically generated within GPUs. This paper presents various optimisations to our first design [1] aimed at making greater use of the GPU by reducing communication between the host (a multicore) and the GPU, in order to improve performance further. We present experimental results for three applications with different granularities and access patterns. Performance was improved again, significantly in some cases; using multicore-GPU platforms efficiently may involve complex changes to software.
