Report ID
2005-26
Report Authors
John Brevik, Daniel Nurmi, Rich Wolski
Report Date
Abstract
Most space-sharing resources presently operated by high-performance computing centers employ some sort of batch queuing system to manage resource allocation to multiple users. In this work, we explore a new method for providing end users with predictions of the bounds on the queuing delay that individual jobs will experience when waiting to be scheduled to a machine partition. We evaluate this method using scheduler logs that cover a 9-year period from 7 large HPC centers. Our results show that it is possible to predict delay bounds with specified confidence levels for jobs in different queues and for jobs requesting different ranges of processor counts.
Document
2005-26.pdf (372.89 KB)