Highly Fault-tolerant NoC Routing with Application-aware Congestion Management
Silicon devices are becoming less reliable as technology moves to smaller feature sizes. As a result, digital systems are increasingly likely to experience permanent failures during their lifetime. To overcome this problem, networks-on-chip (NoCs) should be designed to, not only fulfill performance requirements, but also be robust to many fault occurrences. This paper proposes a fault- and application-aware routing framework called FATE: it leverages the diversity of communication patterns in applications for highly faulty NoCs to reduce congestion during execution. We propose a set of novel route-enabling rules that greatly reduce the search for deadlock-free, maximally-connected routes for any faulty 2D mesh topology, by preventing early on the exploration of routing configuration options that lead eventually to unviable solutions. Our experimental results show a 33% improvement on average saturation throughput for synthetic traffic patterns, and a 59% improvement on average packet latency for SPLASH-2 benchmarks, over state-of-the-art fault-tolerant solutions.
Sunday, Sept. 20, 2015, 8 a.m. — Tuesday, Sept. 22, 2015, 10 p.m. CT
Austin, TX, United States
Technical conference and networking event for SRC members and students.