Google Replaces BERT Self-Attention with Fourier Transform: 92% Accuracy, 7 Times Faster on GPUs | Synced

A research team from Google shows that replacing transformers’ self-attention sublayers with Fourier Transform achieves 92 percent of BERT accuracy on the GLUE benchmark with training times seven t...

By · · 1 min read

Source: Synced | AI Technology & Industry Review

A research team from Google shows that replacing transformers’ self-attention sublayers with Fourier Transform achieves 92 percent of BERT accuracy on the GLUE benchmark with training times seven times faster on GPUs and twice as fast on TPUs.