MapReduce is a promising parallel programming model for processing large data sets. Hadoop is an up-and-coming open-source implementation of MapReduce. It uses the Hadoop Distributed File System (HDFS) to store input and output data. Due to a lack …
I/O is the critical bottleneck for data-intensive scientific applications on HPC systems and leadership-class machines. Applications running on these systems may encounter bottlenecks because the I/O systems cannot handle the overwhelming intensity …
In a cluster of multiple processors or CPU cores, many processes may run on each compute node. Each process tends to issue contiguous I/O requests for snapshots, checkpointing, and similar tasks; however, if a large number of processes enter the I/O phase at the …
An on-demand file staging system, Catwalk, is proposed. Catwalk is designed so that it can run on any Linux cluster without any special or additional hardware. By having hook functions on the system calls of file operations, a file staging system …
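The idea of staging a file the first time it is opened can be illustrated with a minimal sketch. This is not Catwalk's implementation: the abstract describes hooks on file-operation system calls, whereas the hypothetical `StagingOpen` class below only mimics the behavior at application level in Python, and it assumes read-only stage-in from a "remote" directory to a "local" one.

```python
import os
import shutil
import tempfile


class StagingOpen:
    """Hypothetical open() wrapper that copies a file to local storage on first use."""

    def __init__(self, remote_root, local_root):
        self.remote_root = remote_root  # assumed shared (slow) storage
        self.local_root = local_root    # assumed node-local (fast) storage
        self.staged = []                # files copied so far, in order

    def open(self, relpath, mode="r"):
        local = os.path.join(self.local_root, relpath)
        if not os.path.exists(local):
            # First access: stage the file in before handing back a handle.
            os.makedirs(os.path.dirname(local), exist_ok=True)
            shutil.copyfile(os.path.join(self.remote_root, relpath), local)
            self.staged.append(relpath)
        return open(local, mode)


# Demo with temporary directories standing in for the two storage tiers.
with tempfile.TemporaryDirectory() as remote, tempfile.TemporaryDirectory() as local:
    with open(os.path.join(remote, "input.dat"), "w") as f:
        f.write("payload")
    hook = StagingOpen(remote, local)
    with hook.open("input.dat") as f:
        first = f.read()
    with hook.open("input.dat") as f:   # second open hits the local copy
        second = f.read()
    staged_files = list(hook.staged)
```

The application code only calls `open`; whether a copy happens is decided inside the hook, which is the property that makes staging "on demand".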
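Spread of multi-core CPUs * Commodity parts: Intel Core 2 Duo, AMD Athlon 64 X2 * Also commonly used in clusters * The number of compute processes running within a cluster increases
Growth in the amount of data handled by applications * Greater computing power generates large-scale data * Disks are very slow compared with CPU and memory speeds * Disk I/O becomes the bottleneck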
Multiple processors and multi-core CPUs are now common, and the number of processes running concurrently in a cluster is increasing. Each process issues contiguous I/O requests individually, but they can be interrupted by the requests of other …
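A toy model can make the interruption effect concrete. The sketch below is an illustration under stated assumptions, not a method from the abstract: each process reads a private region in fixed-size contiguous requests, the storage device sees either one process at a time or a round-robin mix, and a "seek" is counted whenever a request does not start where the previous one ended. All function names and sizes are hypothetical.

```python
def count_seeks(requests):
    """Count requests that do not start where the previous request ended."""
    seeks = 0
    next_expected = None
    for offset, length in requests:
        if offset != next_expected:
            seeks += 1
        next_expected = offset + length
    return seeks


def process_requests(pid, n_requests, length, region_size):
    """Contiguous requests issued by one process inside its own region."""
    base = pid * region_size
    return [(base + i * length, length) for i in range(n_requests)]


def interleave(streams):
    """Round-robin merge: one request from each process in turn."""
    return [req for group in zip(*streams) for req in group]


# Three processes, four 64-byte requests each, regions 1024 bytes apart.
streams = [process_requests(pid, 4, 64, 1024) for pid in range(3)]
serialized = [req for s in streams for req in s]   # one process at a time
interleaved = interleave(streams)                  # requests mixed at the device

print(count_seeks(serialized), count_seeks(interleaved))
```

In this model the serialized order needs one seek per process, while the interleaved order needs one seek per request, even though every process issued perfectly contiguous requests.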