Tariq Magdon-Ismail
VMware
Publication
Featured research published by Tariq Magdon-Ismail.
Technology Conference on Performance Evaluation and Benchmarking | 2014
Raghunath Nambiar; Meikel Poess; Akon Dey; Paul Cao; Tariq Magdon-Ismail; Da Qi Ren; Andrew Bond
The designation Big Data has become a mainstream buzz phrase across many industries as well as research circles. Today many companies are making performance claims that are not easily verifiable or comparable in the absence of a neutral industry benchmark. Instead, one of the test suites used to compare the performance of Hadoop-based Big Data systems is TeraSort. While it nicely defines the data set and tasks to measure Big Data Hadoop systems, it lacks a formal specification and enforcement rules that enable the comparison of results across systems. In this paper we introduce TPCx-HS, the industry’s first standard benchmark designed to stress both hardware and software that is based on Apache HDFS API compatible distributions. TPCx-HS extends the workload defined in TeraSort with formal rules for implementation, execution, metric, result verification, publication and pricing. It can be used to assess a broad range of system topologies and implementation methodologies of Big Data Hadoop systems in a technically rigorous, directly comparable, and vendor-neutral manner.
Technology Conference on Performance Evaluation and Benchmarking | 2017
Tariq Magdon-Ismail; Chinmayi Narasimhadevara; Dave Jaffe; Raghunath Nambiar
The TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace and has proven to be a successful industry standard benchmark for Hadoop systems. However, the Big Data landscape has changed rapidly since its initial release in 2014. Key technologies have matured, while new ones have risen to prominence in an effort to keep pace with the exponential expansion of datasets. For example, Hadoop has undergone a much-needed upgrade to the way that scheduling, resource management, and execution occur, while Apache Spark has risen to be the de facto standard for in-memory cluster compute for ETL, Machine Learning, and Data Science workloads. Moreover, enterprises are increasingly considering cloud infrastructure for Big Data processing. What has not changed since TPCx-HS was first released is the need for a straightforward, industry standard way in which these current technologies and architectures can be evaluated. In this paper, we introduce TPCx-HS v2, which is designed to address these changes in the Big Data technology landscape and to stress both the hardware and software stacks, including the execution engine (MapReduce or Spark) and Hadoop Filesystem API compatible layers, for both on-premises and cloud deployments.
Archive | 2013
Jayanth Gummaraju; Richard Mcdougall; Michael Nelson; Rean Griffith; Tariq Magdon-Ismail; Razvan Cheveresan; Junping Du
Archive | 2013
Tariq Magdon-Ismail; Razvan Cheveresan
Archive | 2013
Banit Agrawal; Rishi Bidarkar; Uday Kurkure; Tariq Magdon-Ismail; Hari Sivaraman; Lawrence Spracklen
Archive | 2013
Banit Agrawal; Rishi Bidarkar; Uday Kurkure; Tariq Magdon-Ismail; Hari Sivaraman; Lawrence Spracklen
Archive | 2014
Jayanth Gummaraju; Yunshan Lu; Tariq Magdon-Ismail
Archive | 2014
Tariq Magdon-Ismail; Duy Nguyen; Brian James Martin
Archive | 2014
Daniel J. Scales; Tariq Magdon-Ismail; Razvan Cheveresan; Michael Nelson; Richard Mcdougall
Archive | 2014
Tariq Magdon-Ismail; Duy Nguyen