COMPUTERS & ELECTRICAL ENGINEERING, vol. 36, no. 1, pp. 114-131, 2010 (SCI-Expanded)
Recent advances in optical technologies suggest the possible emergence of broadcast-based optical interconnects within cache-coherent distributed shared memory (DSM) multiprocessor architectures. The cache-coherence protocol is a critical issue in designing such architectures because it directly affects memory latencies. In this paper, we evaluate via simulation the performance of three directory-based cache-coherence protocols (strict request-response, intervention forwarding and reply forwarding) on the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus), a low-latency, high-bandwidth, broadcast-based fiber-optic interconnection network supporting DSM. The simulated system contains 64 nodes, each of which has a processor, a cache controller, a directory controller and an output channel. For each protocol, simulations measure the average processor utilization, the average network latency and the average number of packets transferred over the network for varying values of the important DSM parameters: the ratio of the mean channel service time to the mean thread run time (T/R), the probability of a cache block being in the modified state, P(M), the fraction of write misses, P(W), and the home node contention rate. The results reveal that in all cases, except for low values of P(M), intervention forwarding gives the worst performance (lowest processor utilization and highest latency). The performance of strict request-response and reply forwarding is comparable for several values of the DSM parameters and contention rate. For a contention rate of 0%, increasing P(M) makes reply forwarding perform better than strict request-response. The performance of all protocols decreases as P(W) and the contention rate increase. However, strict request-response is the protocol least affected by the negative impact of increasing P(W) and contention rate.
Therefore, in the full-contention case (i.e. a contention rate of 100%), strict request-response performs better than reply forwarding for low values of P(M), or for mid values of P(M) combined with high values of P(W). These results are significant in that they give multiprocessor architecture designers insight into how different directory-based cache-coherence protocols compare on a broadcast-based interconnection network for different values of the DSM parameters and varying rates of contention. (C) 2009 Elsevier Ltd. All rights reserved.