In this paper we describe a new, efficient predictive scheduling methodology for saving power in computing infrastructure run as private clouds. Our approach, termed “QPRED,” estimates quantiles of the distribution of future machine usage so that unneeded machines can be powered down. A cloud administrator sets a bound on the probability that all available machines will be powered down when a cloud request arrives. This target probability is the basis of a Service Level Agreement (SLA) between the cloud administrator and all cloud users covering the start-up delay that results from power savings. Our results, validated using activity traces from several private clouds used in commercial production, indicate that QPRED substantially reduces power consumption while maintaining the SLAs specified by the cloud administrator.
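
To make the quantile idea concrete, the following is a minimal sketch and not the QPRED estimator itself. It assumes a history of observed machine counts is available and uses a plain empirical quantile as a stand-in for a predicted one. It also simplifies the SLA event to "demand exceeds the machines left powered on." The function names, parameters, and sample data are hypothetical.

```python
import math


def machines_to_keep_on(usage_history, miss_probability):
    """Return how many machines to leave powered on.

    usage_history: past observations of machines in use (hypothetical input).
    miss_probability: administrator's bound on the chance that demand
        exceeds the machines left powered on (a simplification of the SLA).
    """
    if not usage_history:
        raise ValueError("usage_history must not be empty")
    if not 0.0 < miss_probability < 1.0:
        raise ValueError("miss_probability must be in (0, 1)")
    ordered = sorted(usage_history)
    # Smallest observed value whose empirical CDF is at least
    # 1 - miss_probability (a plain empirical quantile, assumed here).
    rank = math.ceil((1.0 - miss_probability) * len(ordered))
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]


def machines_to_power_down(total_machines, usage_history, miss_probability):
    """Return how many machines could be powered down under the bound."""
    keep = machines_to_keep_on(usage_history, miss_probability)
    return max(total_machines - keep, 0)


# Hypothetical trace: machines in use during eight past intervals.
history = [2, 3, 3, 4, 5, 6, 8, 10]
keep = machines_to_keep_on(history, 0.25)
spare = machines_to_power_down(12, history, 0.25)
```

A tighter bound (smaller `miss_probability`) selects a higher quantile, so more machines stay on and less power is saved. This is the trade-off the administrator controls through the target probability.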