distributed-systems.
3 writings found
Latest Archives
Meta's RCCLX: Why AMD's GPU Communication Stack Just Got Interesting
Meta open-sources RCCLX with Direct Data Access and FP8 collectives for AMD GPUs. A deep look at what this means for multi-GPU AI workloads.
Meta's RCCLX: Why AMD GPU Communication Just Got Interesting
Meta open-sources RCCLX with Direct Data Access and low-precision collectives, potentially reshaping distributed AI workloads on AMD hardware.
Meta Open Sources RCCLX: AMD Gets Serious Performance Boosts for AI Workloads
Meta's RCCLX brings Direct Data Access and low-precision collectives to AMD GPUs, delivering 10-50% speedups for LLM inference on MI300X hardware.