Today I realized that it has been a month since I last wrote an article for the blog. Until now I had tried to keep up a pace of one post per week, but I have been fairly distracted lately. One of the “distractions” is my Master's project. It is relatively easy to earn a Master's degree while enrolled in the PhD program at this university. So, by presenting part of my work, I can graduate in May with the shiny new title of “Master of Computer Science” awarded by the University of Virginia. Here is the email announcing my presentation to the members of the department:
UNIVERSITY OF VIRGINIA
SCHOOL OF ENGINEERING AND APPLIED SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
Date: April 18, 2008
MEMO TO: Faculty and Students
FROM: Marty Humphrey, Advisor
RE: Master’s Project Presentation by Arkaitz Ruiz-Alvarez
All faculty and students are cordially invited to attend the Master’s Project presentation
by Arkaitz Ruiz-Alvarez to be held on Thursday, April 24, 2008, in Olsson Hall Conference
Room 236D at 2:00 pm. The committee members are Andrew Grimshaw, Chair;
and Marty Humphrey, Advisor.
BES++: HPC Profile Open Source Implementation
The use of computational (HPC) clusters continues to increase among companies, enterprises, universities and research institutions. Several software products manage these computational clusters: among them, the Portable Batch System (PBS), the Sun Grid Engine (SGE), Platform’s LSF and, recently, Microsoft Windows HPC Server 2008. Each product uses its own specific format for job description, submission and management. Thus, an organization with more than one cluster must often support multiple, idiosyncratic interfaces to largely similar backend capabilities.
The HPC Basic Profile is a Web-services-based specification that aims to create a common interface to computational clusters by focusing on the basic use case of an HPC system. The specification is applicable to any cluster management system, since it relies only on a basic set of functions that every such system provides, despite the differences in interfaces and data formats. It is based on the OGSA Basic Execution Service and the Job Submission Description Language (JSDL), and offers an abstraction of the most basic interactions with a computational cluster: create a job, check the status and attributes of a job, delete the job and check the status of the cluster.
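As a rough illustration of the four basic interactions listed above, here is a minimal Python sketch. It is not the HPC Basic Profile API and not BES++ code: the class `InMemoryCluster`, its method names and the state names in `JobState` are all hypothetical, and the "cluster" is just a dictionary in memory.

```python
from enum import Enum
import itertools


class JobState(Enum):
    # Illustrative state names only; not taken from the specification text.
    PENDING = "Pending"
    RUNNING = "Running"
    FINISHED = "Finished"
    CANCELLED = "Cancelled"


class InMemoryCluster:
    """Toy stand-in for a cluster front end (hypothetical, for illustration)."""

    def __init__(self):
        self.accepting_jobs = True
        self._jobs = {}
        self._ids = itertools.count(1)

    def create_job(self, executable, arguments=()):
        """Register a new job and return its identifier."""
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {
            "executable": executable,
            "arguments": list(arguments),
            "state": JobState.PENDING,
        }
        return job_id

    def job_status(self, job_id):
        """Return the current state of one job."""
        return self._jobs[job_id]["state"]

    def job_attributes(self, job_id):
        """Return a copy of the job's description."""
        job = self._jobs[job_id]
        return {"executable": job["executable"], "arguments": list(job["arguments"])}

    def delete_job(self, job_id):
        """Mark the job as cancelled."""
        self._jobs[job_id]["state"] = JobState.CANCELLED

    def cluster_status(self):
        """Summarize the cluster: whether it accepts jobs and how many are active."""
        active = sum(
            1
            for job in self._jobs.values()
            if job["state"] in (JobState.PENDING, JobState.RUNNING)
        )
        return {"accepting_jobs": self.accepting_jobs, "active_jobs": active}
```

A client written against these four operations does not need to know which queuing system sits behind them, which is the point of the common interface described above.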
In this talk, I present BES++, an open-source project that we have created with Platform Computing (LSF). BES++ implements the HPC Basic Profile and is architected to layer on any existing queuing system. We currently support PBS, LSF and SGE clusters. In addition to the HPC Basic Profile features, BES++ supports several extensions, such as File Staging and Advanced Filter. We have also added job metascheduling and support for legacy client tools such as PBS’s qsub. We present a performance evaluation and an analysis of our implementation.
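To make the idea of job metascheduling concrete, the following Python sketch picks a target cluster from several that expose the same status information. The policy shown (send the job to the accepting cluster with the fewest active jobs) is an assumption chosen for illustration, not the policy BES++ uses; `ClusterInfo` and `pick_cluster` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ClusterInfo:
    """Status snapshot of one cluster (hypothetical structure)."""
    name: str
    queuing_system: str  # e.g. "PBS", "LSF" or "SGE"
    accepting_jobs: bool
    active_jobs: int


def pick_cluster(clusters):
    """Return the accepting cluster with the fewest active jobs.

    Ties are resolved in favor of the cluster listed first.
    """
    candidates = [c for c in clusters if c.accepting_jobs]
    if not candidates:
        raise RuntimeError("no cluster is accepting jobs")
    return min(candidates, key=lambda c: c.active_jobs)


clusters = [
    ClusterInfo("alpha", "PBS", accepting_jobs=True, active_jobs=12),
    ClusterInfo("beta", "LSF", accepting_jobs=False, active_jobs=0),
    ClusterInfo("gamma", "SGE", accepting_jobs=True, active_jobs=3),
]
target = pick_cluster(clusters)
print(target.name, target.queuing_system)
```

Because every cluster is described through the same fields regardless of its queuing system, the selection logic needs no PBS-, LSF- or SGE-specific code.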