ECE 227 Grid and Network Computing
Prof. Howie Huang
Department of Electrical and Computer Engineering
Fall 2008
Wednesday 7:10 - 9:40 PM
Office Hours: Wednesday 2 - 6 PM or by appointment

Introduction
What would you do with a computer that delivers more than 750 teraflops of computation and 30 petabytes of storage capability (as of today's TeraGrid)?  What are the challenges in building such a grid?  What is cloud computing, and why is it drawing billions of dollars in investment from companies like Google, Yahoo, and Amazon?  ...  These are the questions that this research-oriented class will try to answer through a combination of extensive reading, writing, and, most importantly, practical experience with clusters, grids, and clouds.  Each student will be expected to read and present papers, write short summaries, and lead and participate in discussions.  There will be one midterm exam.  In addition, students are required to complete a semester-long research project in groups.  Students are required to honor the GWU Code of Academic Integrity when completing all assignments, projects, and examinations.

Although a background in operating systems, distributed systems, and computer networking is highly desirable, a strong interest in the subject is what it takes to succeed in this class.  Graduate students in computer engineering and computer science are encouraged to take this course, especially if they are interested in doing research in the area of high performance computing.

Recommended textbook: The Grid 2: Blueprint for a New Computing Infrastructure, ed. by Ian Foster and Carl Kesselman, 2004, Morgan Kaufmann.

Grading
Participation:                                       10%

            Each student must be willing to participate fully in the class.  It is insufficient just to show up and do presentations.

Reading Summaries:                         10%

            We will read about five papers each week.  Each student is required to read them before class and write a short summary (template).  How to review a paper: "The Task of a Referee" by Alan Jay Smith.
Paper Presentation:                           20%
            Each student is expected to present and lead the discussion of a number of papers.  The student must email the presentation slides to the instructor by 8 AM on Monday each week.  Late or missing submissions will be penalized.  How to present a paper: link 1 by Leslie Lamport and link 2 by Ashwin Ram.
Mid-term Exam:                                 20%  (Week 12)
Course Project:                                   40%
   Initial Proposal                                    5%  (Week 4)
   Final Proposal                                     5%  (Week 5)
   Mid-term Report                               10%  (Week 10)
   Final Report                                       20%  (Week 15)
            Projects will be done in groups of two students.  Each group is required to submit a project proposal, a midterm report, and a final report.  Students are strongly encouraged to talk to the instructor often and seek help as soon as possible.  Each group will present its project in class.  It is expected that publishable results will come out of some projects.

All assignments are due in class on Wednesdays.  No late assignments.

TENTATIVE Course Schedule

Week 1
Sep 3

Grid Computing – Introduction

Ian Foster, What is The Grid, July 2002

Ian Foster, Grid, South Africa, March 2007

Andrew Grimshaw, Why Grids Have Failed to Cross the Chasm, keynote, HPDC 2008

 

 

Week 2

Sep 10

Grid Architecture

Ian Foster and Carl Kesselman, Computational Grids, The Grid, 1999

Ian Foster et al., The Anatomy of the Grid: Enabling Scalable Virtual Organizations, IJSA, 2001

Ian Foster et al., The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, 2002

Form groups and email to the instructor 

Week 3

Sep 17

Cloud Computing

BusinessWeek, Google and the Wisdom of Clouds, December '07

Raj Buyya, Market-Oriented Cloud Computing, HPCC '07

Sanjay Ghemawat et al., The Google File System, SOSP '03

Google, Introduction to Parallel Programming (Optional)

Google, Parallelization Models  (Optional)

Google, Remote Procedure Call (Optional)


Week 4

Sep 24

 

MapReduce Programming

Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, OSDI '04

Ralf Lämmel, Google's MapReduce Programming Model Revisited

Fay Chang et al., Bigtable: A Distributed Storage System for Structured Data, OSDI '06

Hadoop

Preliminary proposal due

Meet with the instructor

Week 5

Oct 1
3:30-6:00PM

Computational Grids

Project proposal presentations

Ian Foster et al., Globus Primer, 2005

Mark Morgan and Andrew Grimshaw, Genesis II, CCGrid ‘07

Project proposal due

Week 5

Oct 1
7:10-9:40PM

Data Grids

Ann Chervenak et al., The Data Grid

A. Grimshaw et al., Avaki Data Grid

M. Antonioletti et al., The Design and Implementation of Grid Database Services in OGSA-DAI, CCPE

Arcot Rajasekar et al., MySRB & SRB – Components of a Data Grid



Week 6

Oct 8

GridFTP

W. Allcock et al., The Globus Striped GridFTP Framework and Server, SC 2005 (Mike)

J. Bresnahan et al., Globus GridFTP: What's New in 2007, GridNets 2007 (Mike)
                                           


Week 7

Oct 15

Grid Resource Management and Programming

MPI introduction


Tutorial on MPI and Hadoop (Olivier)

K. Czajkowski et al., A Resource Management Architecture for Metacomputing Systems (GRAM)

K. Czajkowski et al., Grid Information Services for Distributed Resource Sharing (MDS)

N. Karonis et al., MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface (Lenny)

S. Dong et al., Cross-Site Computations on the TeraGrid (Lenny)



 

Week 8

Oct 22

File and Storage Systems

Frank Schmuck and Roger Haskin, GPFS: A Shared-Disk File System for Large Computing Clusters, FAST 2002 (Teng)

Exploring the hyper-grid idea with grand challenge applications: The DEISA-TERAGRID interoperability demonstration (Teng)

Lustre: A Scalable, High-Performance File System (Reserved)

Sun Microsystems, Lustre File System, December 2007 (Reserved)

Stephen Simms, Wide Area Filesystem Performance using Lustre on the TeraGrid, TeraGrid 2007

DataDirect Networks, Best Practices for Architecting a Lustre-Based Storage Environment, 2008

 

Week 9

Oct 29

Grid Application Domains

Service-Oriented Science (Carl)

Ian Foster, The Grid: A New Infrastructure for 21st Century Science (Carl)

Gabrielle Allen et al., Supporting Efficient Execution in Heterogeneous Distributed Computing Environments with Cactus and Globus (Craig)

Julia Andreeva et al., High Energy Physics on the Grid: the ATLAS and CMS experience (Craig)

The Open Science Grid (Lai)

New Science on the Open Science Grid (Lai)



Week 10

Nov 5

Grid Security


Globus Toolkit Version 4 Grid Security Infrastructure (David)

Mike Surridge, A Rough Guide to Grid Security (David)

Midterm project report due
Week 11
Nov 12
Grid Workflow
DAGMan (Ahsen)

Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation (Ahsen)

Y. Zhao et al., Swift: Fast, Reliable, Loosely Coupled Parallel Computation, 2007 (Alex)

Taverna (Alex)
                                                                                                                                                                     

 

Week 12

Nov 19 

Midterm Exam

Supercomputing 08, Austin TX

 

 

Week 13
Nov 26

Emerging Technologies

 


Week 15

Dec 10

 

Final project presentation

Final project report due