Inanc Senocak
Boise State University
Publications
Featured research published by Inanc Senocak.
48th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition | 2010
Dana A. Jacobsen; Julien C. Thibault; Inanc Senocak
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.
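The overlap strategy the abstract describes can be shown in a minimal sketch, assuming a one-dimensional slab decomposition along z and a seven-point Jacobi stencil; the kernel, buffer layout, and stream arrangement below are illustrative, not the authors' implementation. Boundary planes are staged to pinned host memory and exchanged over MPI while a separate CUDA stream advances the interior.

// Minimal sketch (not the paper's code) of overlapping MPI communication
// with GPU computation. Assumes a 1-D slab decomposition along z and a
// pinned host buffer h_buf holding four nx*ny planes:
// [outgoing bottom | outgoing top | incoming bottom | incoming top].
#include <mpi.h>
#include <cuda_runtime.h>

__global__ void jacobi(double *out, const double *in,
                       int nx, int ny, int k0, int k1)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z + k0;
    if (i < 1 || i >= nx - 1 || j < 1 || j >= ny - 1 || k >= k1) return;
    size_t p = (size_t)nx * ny, id = (size_t)k * p + (size_t)j * nx + i;
    out[id] = (in[id-1] + in[id+1] + in[id-nx] + in[id+nx]
             + in[id-p] + in[id+p]) / 6.0;
}

void sweep_overlapped(double *d_out, double *d_in, double *h_buf,
                      int nx, int ny, int nz, int up, int down,
                      cudaStream_t s_halo, cudaStream_t s_bulk)
{
    size_t p = (size_t)nx * ny, bytes = p * sizeof(double);
    dim3 tb(8, 8, 4);

    // Stage the outgoing boundary planes (k = 1 and k = nz-2) to the host.
    cudaMemcpyAsync(h_buf,     d_in + p,          bytes, cudaMemcpyDeviceToHost, s_halo);
    cudaMemcpyAsync(h_buf + p, d_in + p*(nz - 2), bytes, cudaMemcpyDeviceToHost, s_halo);

    // Interior planes [2, nz-2) advance concurrently on the bulk stream.
    dim3 gi((nx + 7)/8, (ny + 7)/8, (nz - 4 + 3)/4);
    jacobi<<<gi, tb, 0, s_bulk>>>(d_out, d_in, nx, ny, 2, nz - 2);

    // Exchange halos with neighbor ranks while the interior computes.
    cudaStreamSynchronize(s_halo);                 // staging copies are done
    MPI_Request req[4];
    MPI_Irecv(h_buf + 2*p, (int)p, MPI_DOUBLE, down, 0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(h_buf + 3*p, (int)p, MPI_DOUBLE, up,   1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(h_buf,       (int)p, MPI_DOUBLE, down, 1, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(h_buf + p,   (int)p, MPI_DOUBLE, up,   0, MPI_COMM_WORLD, &req[3]);
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);

    // Copy the received halos back, then finish the two boundary planes.
    cudaMemcpyAsync(d_in,              h_buf + 2*p, bytes, cudaMemcpyHostToDevice, s_halo);
    cudaMemcpyAsync(d_in + p*(nz - 1), h_buf + 3*p, bytes, cudaMemcpyHostToDevice, s_halo);
    dim3 gb((nx + 7)/8, (ny + 7)/8, 1);
    jacobi<<<gb, tb, 0, s_halo>>>(d_out, d_in, nx, ny, 1, 2);           // plane k = 1
    jacobi<<<gb, tb, 0, s_halo>>>(d_out, d_in, nx, ny, nz - 2, nz - 1); // plane k = nz-2
    cudaDeviceSynchronize();
}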
Parallel Computing | 2013
Dana A. Jacobsen; Inanc Senocak
We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA parallel implementations, in which all computations are done on the GPU using CUDA. We explore efficiency and scalability of incompressible flow computations using up to 256 GPUs on a problem with approximately 17.2 billion cells. Our work addresses some of the unique issues faced when merging fine-grain parallelism on the GPU using CUDA with coarse-grain parallelism that uses either MPI or MPI-OpenMP for communications. We present three different strategies to overlap computations with communications, and systematically assess their impact on parallel performance on two different GPU clusters. Our results for strong and weak scaling analysis of incompressible flow computations demonstrate that GPU clusters offer significant benefits for large data sets, and a dual-level MPI-CUDA implementation with maximum overlapping of computation and communication provides substantial benefits in performance. We also find that our tri-level MPI-OpenMP-CUDA parallel implementation does not offer a significant advantage in performance over the dual-level implementation on GPU clusters with two GPUs per node; on clusters with higher GPU counts per node or with different domain decomposition strategies, a tri-level implementation may exhibit higher efficiency than a dual-level implementation and warrants further investigation.
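A skeletal view of the tri-level layout, assuming one MPI rank per node with one OpenMP thread driving each GPU; this is a sketch of the decomposition described above, not the solver's actual code.

// Illustrative tri-level skeleton: MPI across nodes, one OpenMP thread
// per GPU within a node, CUDA kernels on each device.
#include <mpi.h>
#include <omp.h>
#include <cstdio>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    // In the full solver the threads may issue MPI calls themselves,
    // so full thread support is requested up front.
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);
    if (ngpus < 1) { MPI_Finalize(); return 0; }

    // Coarse grain: MPI across nodes. Medium grain: one OpenMP thread
    // per GPU inside the node. Fine grain: CUDA kernels on each device.
    #pragma omp parallel num_threads(ngpus)
    {
        int tid = omp_get_thread_num();
        cudaSetDevice(tid);   // pin this thread to its GPU for all CUDA calls
        printf("rank %d: thread %d bound to GPU %d\n", rank, tid, tid);

        // Each thread owns one subdomain. Intra-node halos can move
        // device-to-device; inter-node halos are staged through host
        // memory and exchanged with MPI, overlapped as in the dual-level case.
    }

    MPI_Finalize();
    return 0;
}

With two GPUs per node, the extra OpenMP layer largely replaces what a second MPI rank would do, which is consistent with the paper's finding that the tri-level version shows no clear advantage at that GPU count.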
49th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, 4-7 January 2011, Orlando, Florida | 2011
Dana A. Jacobsen; Inanc Senocak
Numerical computations of incompressible flow equations with pressure-based algorithms necessitate the solution of an elliptic Poisson equation, for which multigrid methods are known to be very efficient. In our previous work we presented a dual-level (MPI-CUDA) parallel implementation of the Navier-Stokes equations to simulate buoyancy-driven incompressible fluid flows on GPU clusters with simple iterative methods while focusing on the scalability of the overall solver. In the present study we describe the implementation and performance of a multigrid method to solve the pressure Poisson equation within our MPI-CUDA parallel incompressible flow solver. Various design decisions and algorithmic choices for multigrid methods are explored in light of NVIDIA's recent Fermi architecture. We discuss how unique aspects of an MPI-CUDA implementation for GPU clusters are related to the software choices made to implement the multigrid method. We propose a new coarse grid solution method of embedded multigrid with amalgamation and show that the parallel implementation retains the numerical efficiency of the multigrid method. Performance measurements on the NCSA Lincoln and TACC Longhorn clusters are presented for up to 64 GPUs.
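The structure of the method is easiest to see in a one-dimensional V-cycle. The sketch below (host-side, illustrative names) solves -u'' = f; in the GPU solver each of the smoother, residual, restriction, and prolongation steps runs as a CUDA kernel, and on the coarsest levels the amalgamation step mentioned in the abstract gathers the distributed subdomains onto fewer processes so that coarsening can continue to full depth.

// Algorithmic sketch of a multigrid V-cycle for -u'' = f (1-D, host side).
// All names are illustrative, not the solver's interface.
#include <vector>
using Vec = std::vector<double>;

static void smooth(Vec &u, const Vec &f, double h, int sweeps) {
    for (int s = 0; s < sweeps; ++s)               // Gauss-Seidel relaxation
        for (size_t i = 1; i + 1 < u.size(); ++i)
            u[i] = 0.5 * (u[i-1] + u[i+1] + h*h*f[i]);
}

static Vec residual(const Vec &u, const Vec &f, double h) {
    Vec r(u.size(), 0.0);                          // r = f - A u
    for (size_t i = 1; i + 1 < u.size(); ++i)
        r[i] = f[i] + (u[i-1] - 2.0*u[i] + u[i+1]) / (h*h);
    return r;
}

static Vec restrict_fw(const Vec &r) {             // full-weighting restriction
    Vec rc(r.size()/2 + 1, 0.0);
    for (size_t i = 1; i + 1 < rc.size(); ++i)
        rc[i] = 0.25*r[2*i-1] + 0.5*r[2*i] + 0.25*r[2*i+1];
    return rc;
}

static void prolong_add(Vec &u, const Vec &ec) {   // linear interpolation
    for (size_t i = 0; i + 1 < ec.size(); ++i) {
        u[2*i]   += ec[i];
        u[2*i+1] += 0.5*(ec[i] + ec[i+1]);
    }
}

static void vcycle(Vec &u, const Vec &f, double h) {
    if (u.size() <= 3) { smooth(u, f, h, 50); return; } // coarsest grid: solve
    smooth(u, f, h, 3);                                  // pre-smoothing
    Vec rc = restrict_fw(residual(u, f, h));
    Vec ec(rc.size(), 0.0);
    vcycle(ec, rc, 2.0*h);                               // coarse-grid correction
    prolong_add(u, ec);
    smooth(u, f, h, 3);                                  // post-smoothing
}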
Computing in Science and Engineering | 2013
Rey DeLeon; Dana A. Jacobsen; Inanc Senocak
A dual-level parallel incompressible flow solver accelerates turbulent flow computations on GPU clusters. This approach solves the pressure Poisson equation with a full-depth, amalgamated parallel geometric multigrid method, and implements a Lagrangian dynamic model for subgrid-scale turbulence modeling.
Journal of Atmospheric and Oceanic Technology | 2015
Inanc Senocak; Micah Sandusky; Rey DeLeon; Derek Wade; Kyle Felzien; Marianna Budnikova
There is growing interest in applying the immersed boundary method to compute wind fields over arbitrarily complex terrain. The computer implementation of an immersed boundary module into an existing flow solver can be accomplished with minor modifications to the rest of the computer program. However, a versatile preprocessor is needed in the first place to extract the essential geometric information pertinent to the immersion of an arbitrarily complex terrain inside a 3D Cartesian mesh. Errors in the geometric information can negatively impact the correct implementation of the immersed boundary method as part of the solution algorithm. Additionally, the distance field from the terrain is needed to implement various subgrid-scale turbulence models and to initialize wind fields over complex terrain. Despite the popularity of the immersed boundary method, procedures used in the geometric preprocessing stage have received less attention. The present study found that concave and convex regions of compl...
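A hypothetical fragment of such a preprocessor, assuming terrain surface samples are available as points and a per-node terrain height has been gathered; the brute-force nearest-point search stands in for the real geometric machinery. It also illustrates why the distance field matters: a simple vertical offset z - h(x,y) misstates the distance wherever the slope is steep, which is exactly where concave and convex terrain causes trouble.

// Hypothetical preprocessing kernel (not the paper's implementation):
// for every Cartesian node, compute a signed distance to the terrain
// surface (negative inside the solid terrain).
#include <cuda_runtime.h>
#include <math.h>

struct P3 { float x, y, z; };

__global__ void signed_distance(float *dist, const P3 *nodes, int nnodes,
                                const P3 *surf, int nsurf,
                                const float *hterr /* terrain height h(x,y) per node */)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= nnodes) return;
    P3 p = nodes[n];
    float d2min = 1e30f;
    for (int s = 0; s < nsurf; ++s) {   // brute-force nearest surface sample
        float dx = p.x - surf[s].x, dy = p.y - surf[s].y, dz = p.z - surf[s].z;
        float d2 = dx*dx + dy*dy + dz*dz;
        d2min = fminf(d2min, d2);
    }
    // Sign from a vertical in/out test: below the local terrain height => solid.
    float sign = (p.z < hterr[n]) ? -1.0f : 1.0f;
    dist[n] = sign * sqrtf(d2min);
}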
ASME 2012 Fluids Engineering Division Summer Meeting collocated with the ASME 2012 Heat Transfer Summer Conference and the ASME 2012 10th International Conference on Nanochannels, Microchannels, and Minichannels | 2012
Rey DeLeon; Kyle Felzien; Inanc Senocak
A short-term wind power forecasting capability can be a valuable tool in the renewable energy industry to address load-balancing issues that arise from intermittent wind fields. Although numerical weather prediction models have been used to forecast winds, their applicability to micro-scale atmospheric boundary layer flows and ability to predict wind speeds at turbine hub height with a desired accuracy are not clear. To address this issue, we develop a multi-GPU parallel flow solver to forecast winds over complex terrain at the micro-scale, where computational domain size can range from meters to several kilometers. In the solver, we adopt the immersed boundary method and the Lagrangian dynamic large-eddy simulation model and extend them to atmospheric flows. The computations are accelerated on GPU clusters with a dual-level parallel implementation that interleaves MPI with CUDA. We evaluate the flow solver components against test problems and obtain preliminary results of flow over Bolund Hill, a coastal hill in Denmark.
Computers & Chemical Engineering | 2013
Fabio P. Santos; Inanc Senocak; Jovani L. Favero; Paulo L.C. Lage
The Dual Quadrature Method of Generalized Moments (DuQMoGeM) is an accurate moment method for solving the population balance equation (PBE). The drawback of DuQMoGeM is the high computational cost of numerically integrating the PBE integral terms; each integrand, however, can be integrated independently and is therefore amenable to parallelization on GPUs. In this work, two parallel adaptive cubature algorithms were implemented on a hybrid architecture (CPU-GPU) to accelerate the DuQMoGeM. The speedup and scalability of these parallel algorithms were studied with different types of Genz's test functions. Then, we applied these parallel numerical integration algorithms in the DuQMoGeM solution of the PBE for three bivariate cases, obtaining speedups between 11 and 15.
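The parallelization argument can be seen in a schematic kernel, with a made-up integrand g standing in for a PBE integral term and a fixed eight-point rule standing in for the adaptive cubature actually used: each term is evaluated by an independent thread.

// Illustrative kernel only; the paper uses adaptive cubature. One thread
// integrates one independent integral term over [0, 1].
#include <cuda_runtime.h>
#include <math.h>

__constant__ double c_node[8], c_wgt[8];   // quadrature rule, uploaded by the
                                           // host with cudaMemcpyToSymbol

__device__ double g(double x, double a) {  // placeholder integrand
    return exp(-a * x) * x * x;
}

__global__ void integrate_terms(double *I, const double *a, int nterms)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= nterms) return;
    double s = 0.0;
    for (int q = 0; q < 8; ++q)            // quadrature is serial per thread
        s += c_wgt[q] * g(c_node[q], a[k]);
    I[k] = s;                              // each term computed independently
}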
50th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition | 2012
Rey DeLeon; Inanc Senocak
High performance computing clusters that are augmented with cost- and power-efficient graphics processing units (GPUs) provide new opportunities to broaden the use of the large-eddy simulation technique to study high Reynolds number turbulent flows in fluids engineering applications. In this paper, we extend our earlier work on multi-GPU acceleration of an incompressible Navier-Stokes solver to include a large-eddy simulation (LES) capability. In particular, we implement the Lagrangian dynamic subgrid-scale model and compare our results against existing direct numerical simulation (DNS) data of a turbulent channel flow at Reτ = 180. Overall, our LES results match fairly well with the DNS data. Our results show that the Reτ = 180 case can be simulated entirely on a single GPU, whereas higher Reynolds number cases can benefit from a GPU cluster.
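The distinctive step of the Lagrangian dynamic model is the pathline-relaxation update of the two averaged products I_LM and I_MM (after Meneveau, Lund & Cabot, 1996). Below is a sketch of that update, assuming the upstream values have already been interpolated at x - u*dt by a separate kernel; the field names are illustrative, not the solver's interface.

// Sketch of the Lagrangian-averaging step of the dynamic SGS model.
#include <cuda_runtime.h>
#include <math.h>

__global__ void lagrangian_update(double *ILM, double *IMM,
                                  const double *ILM_up,  // I_LM at x - u*dt
                                  const double *IMM_up,  // I_MM at x - u*dt
                                  const double *LM,      // L_ij M_ij at x
                                  const double *MM,      // M_ij M_ij at x
                                  double dt, double delta, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Relaxation time scale T = 1.5 * delta * (I_LM * I_MM)^(-1/8)
    double prod = fmax(ILM_up[i] * IMM_up[i], 1e-30);
    double T    = 1.5 * delta * pow(prod, -0.125);
    double eps  = (dt / T) / (1.0 + dt / T);
    // Exponential relaxation along the fluid pathline
    double ilm = eps * LM[i] + (1.0 - eps) * ILM_up[i];
    double imm = eps * MM[i] + (1.0 - eps) * IMM_up[i];
    ILM[i] = fmax(ilm, 0.0);   // clip so C_s^2 = ILM/IMM stays non-negative
    IMM[i] = imm;
}

The model coefficient then follows cell by cell as C_s^2 = I_LM / I_MM.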
Boundary-Layer Meteorology | 2017
Clancy Umphrey; Rey DeLeon; Inanc Senocak
We investigate a Cartesian-mesh immersed-boundary formulation within an incompressible flow solver to simulate laminar and turbulent katabatic slope flows. As a proof-of-concept study, we consider four different immersed-boundary reconstruction schemes for imposing a Neumann-type boundary condition on the buoyancy field. Prandtl’s laminar solution is used to demonstrate the second-order accuracy of the numerical solutions globally. Direct numerical simulation of a turbulent katabatic flow is then performed to investigate the applicability of the proposed schemes in the turbulent regime by analyzing both first- and second-order statistics of turbulence. First-order statistics show that turbulent katabatic flow simulations are noticeably sensitive to the specifics of the immersed-boundary formulation. We find that reconstruction schemes that work well in the laminar regime may not perform as well when applied to a turbulent regime. Our proposed immersed-boundary reconstruction scheme agrees closely with the terrain-fitted reference solutions in both flow regimes.
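As a concrete example of one such reconstruction (the paper compares four; this mirror-point variant and its interface are assumptions, not necessarily the proposed scheme), a ghost node at distance d below the immersed surface can enforce a prescribed normal gradient db/dn through its image point: a one-sided difference across the surface gives b_ghost = b_image - 2*d*(db/dn).

// Minimal sketch of a ghost-cell Neumann reconstruction for buoyancy b.
// Ghost indices, image-point values, and distances are precomputed by the
// preprocessor; all names are illustrative.
#include <cuda_runtime.h>

__global__ void neumann_ghost(double *b,
                              const int    *ghost_id,  // ghost-node indices
                              const double *b_image,   // b interpolated at image points
                              const double *d_surf,    // distance ghost -> surface
                              double dbdn, int nghost)
{
    int g = blockIdx.x * blockDim.x + threadIdx.x;
    if (g >= nghost) return;
    b[ghost_id[g]] = b_image[g] - 2.0 * d_surf[g] * dbdn;
}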
Volume 1D, Symposia: Transport Phenomena in Mixing; Turbulent Flows; Urban Fluid Mechanics; Fluid Dynamic Behavior of Complex Particles; Analysis of Elementary Processes in Dispersed Multiphase Flows; Multiphase Flow With Heat/Mass Transfer in Process Technology; Fluid Mechanics of Aircraft and Rocket Emissions and Their Environmental Impacts; High Performance CFD Computation; Performance of Multiphase Flow Systems; Wind Energy; Uncertainty Quantification in Flow Measurements and Simulations | 2014
Tyler B. Phillips; Inanc Senocak; Jake P. Gentle; Kurt S. Myers; Phil Anderson
Dynamic Line Rating (DLR) is a smart grid technology that allows the rating of a power line to be based on the real-time conductor temperature, which depends on local weather conditions. In current practice, overhead power lines are generally given a conservative rating based on worst-case weather conditions. Using historical weather data collected over a test bed area, we demonstrate that there is often additional transmission capacity not being utilized under the current static rating practice. We investigate a new dynamic line rating methodology that uses computational fluid dynamics (CFD) to determine wind conditions along transmission lines at dense intervals. Simulated results are used to determine the conductor temperature by calculating the transient thermal response of the conductor under variable environmental conditions. In calculating the conductor temperature, we use both a steady-state assumption and a transient calculation. Under low wind conditions, the steady-state assumption predicts higher conductor temperatures that could lead to curtailments, whereas the transient calculation produces conductor temperatures that are significantly lower, implying the availability of additional transmission capacity.
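The transient calculation amounts to integrating a per-span heat balance, sketched below with placeholder coefficient models in the spirit of IEEE Std 738; the correlations and constants here are stand-ins, not the paper's. Each thread advances one line span using the CFD-predicted local wind speed.

// Schematic transient conductor-temperature update (simplified heat balance;
// all coefficients are placeholders for illustration only).
#include <cuda_runtime.h>
#include <math.h>

__global__ void conductor_step(double *T, const double *wind, double I,
                               double dt, int nspans)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= nspans) return;
    const double mCp  = 1300.0;               // heat capacity per unit length [J/(m K)]
    const double Tamb = 25.0, qsun = 15.0;    // ambient temperature [C], solar gain [W/m]
    double R  = 7.3e-5 * (1.0 + 0.004 * (T[s] - 20.0));   // AC resistance [ohm/m]
    double qj = I * I * R;                                 // Joule heating [W/m]
    double hc = 5.0 + 4.0 * sqrt(fmax(wind[s], 0.0));      // crude convection coefficient
    double qc = hc * (T[s] - Tamb);                        // convective loss [W/m]
    double qr = 1.8e-8 * (pow(T[s] + 273.0, 4.0)
                        - pow(Tamb + 273.0, 4.0));         // radiative loss [W/m]
    T[s] += dt * (qj + qsun - qc - qr) / mCp;  // explicit Euler step
}

Because the conductor's thermal mass filters out short wind lulls, the transient temperature stays below the steady-state value computed from the same instantaneous conditions, which is the effect the abstract reports.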