Description
In this article, the authors at EleutherAI present Rotary Position Embedding (RoPE), a new type of position encoding that unifies absolute and relative approaches in transformers. Developed by Jianlin Su, RoPE has already garnered interest for its ability to work with both vanilla and efficient attention. The authors test RoPE against the learned absolute positional embeddings used in GPT-3 and the learned relative positional embeddings used in T5, finding that it outperforms both. They also test it with Performer, an alternative attention mechanism designed to avoid the quadratic bottleneck with respect to sequence length. Finally, they note that the runtime cost of rotary embeddings is fairly negligible, imposing a 1-3% overhead across a range of transformer sizes.
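As a rough illustration of the mechanism described above, here is a minimal NumPy sketch of rotary embeddings: each consecutive pair of query/key features is rotated by an angle proportional to the token's position, so attention scores end up depending only on relative offsets. The function name `rotary_embed`, the base of 10000, and the shapes are illustrative assumptions, not the article's implementation.

```python
import numpy as np

def rotary_embed(x, base=10000):
    """Apply a rotary position embedding to x of shape (seq_len, dim).

    Each consecutive (even, odd) pair of feature dimensions is rotated by an
    angle proportional to the token's position, so the dot product of two
    rotated vectors depends only on their relative offset.
    """
    seq_len, dim = x.shape
    # Per-pair rotation frequencies, decreasing geometrically across dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    # Rotation angle for every (position, frequency) combination.
    angles = np.outer(np.arange(seq_len), inv_freq)   # (seq_len, dim // 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    # Standard 2-D rotation applied to each feature pair.
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# Queries and keys are rotated before computing attention scores;
# the score between positions m and n then depends only on m - n.
q = rotary_embed(np.random.randn(8, 64))
k = rotary_embed(np.random.randn(8, 64))
scores = q @ k.T
```

Because the rotation is applied elementwise to queries and keys before the score computation, the same trick also composes with linear-attention variants such as Performer, which is what the article evaluates.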