Posits, a New Kind of Number, Improves the Math of AI

Training the large neural networks behind many modern AI tools requires serious computing power: OpenAI’s most advanced language model, GPT-3, for example, took an astonishing million billion billion operations to train and cost about US $5 million in compute time. Engineers think they have found a way to lighten the burden by using a different way of representing numbers.

Back in 2017, John Gustafson, subsequently jointly appointed at the A*STAR Computational Resources Center and the National University of Singapore, and Isaac Yonemoto, then at Interplanetary Robot and Electric Brain Co., developed a new way to represent numbers. These numbers, called posits, were proposed as an improvement over the standard floating-point arithmetic that processors use today.

Now, a team of researchers from the Complutense University of Madrid has developed the first processor core implementing the posit standard in hardware and shown that, bit for bit, the accuracy of a basic computational task increased by up to four orders of magnitude compared with computing using standard floating-point numbers. They presented their results last week at the IEEE Symposium on Computer Arithmetic.

“Today, it seems that Moore’s law is starting to fade,” said David Mallasén Quintana, a graduate researcher in the ArTeCS group at Complutense. “So we need to find some other ways to get more performance out of the same machines. One of the ways to do that is to change the way we encode the real numbers, how we represent them.”

The Complutense team isn’t alone in pushing the boundaries of number representation. Last week, Nvidia, Arm, and Intel agreed on a specification for using 8-bit floating-point numbers instead of the usual 32-bit or 16-bit ones for machine-learning applications. Using the smaller, less precise format improves efficiency and memory usage, at the expense of computational accuracy.
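
To get a feel for that trade-off, the short NumPy sketch below squeezes a set of 32-bit weights into a narrower format and measures what the rounding costs. NumPy has no 8-bit float type, so 16-bit floats stand in here for the "smaller, less precise" format.

```python
# A rough look at the precision-for-memory trade-off behind low-bit formats.
# NumPy has no FP8 type, so float16 stands in for a smaller, less precise format.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal(1_000_000).astype(np.float32)

small = weights.astype(np.float16)   # round every weight to the narrow format
back = small.astype(np.float32)      # widen again to measure the rounding error

print("storage, float32:", weights.nbytes, "bytes")
print("storage, float16:", small.nbytes, "bytes")   # half the memory
print("mean relative rounding error:", np.mean(np.abs(back - weights) / np.abs(weights)))
```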

Real numbers cannot be represented perfectly in hardware simply because there are infinitely many of them. To fit in a fixed number of bits, many real numbers have to be rounded. The advantage of posits comes from exactly how the numbers they represent are distributed along the number line. In the middle of the number line, around 1 and -1, there are more posit representations than floating-point ones. And at the wings, toward large negative and positive numbers, posit accuracy falls off more gracefully than floating-point accuracy does.

“It fits better with the natural distribution of numbers in a calculation,” Gustafson says. “It’s the right dynamic range and it’s the right accuracy where you need more accuracy. There are tons of bit patterns in floating point arithmetic that no one ever uses. And that is a sin.”

Posits achieve this improved accuracy around 1 and -1 thanks to an extra component in their representation. Floats consist of three parts: a sign bit (0 for positive, 1 for negative), several “mantissa” (fraction) bits that denote what comes after the binary equivalent of a decimal point, and the remaining bits defining the exponent (2^exp).

This chart shows the components of the floating-point representation [top] and the posit representation [middle]. The accuracy equation shows the advantage of posits when the exponent is close to 0. Complutense University of Madrid/IEEE
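
For concreteness, here is a small Python sketch that pulls those three fields out of a standard 32-bit float (1 sign bit, 8 exponent bits stored with a bias of 127, and 23 mantissa bits). It ignores zeros, subnormals, infinities, and NaNs.

```python
import struct

def float32_fields(x: float):
    """Return the sign, unbiased exponent, and mantissa bits of a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign     = bits >> 31            # 1 bit: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 fraction bits after the implicit leading 1
    return sign, exponent - 127, mantissa

# 6.5 = +1.625 x 2^2, so the exponent field decodes to 2
print(float32_fields(6.5))   # (0, 2, 5242880)
```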

Posits keep all the components of a float but add an extra “regime” section, an exponent of an exponent. The beauty of the regime is that its length can vary. For small numbers, it can take as few as two bits, leaving more precision for the mantissa. This allows for the higher accuracy of posits in their sweet spot around 1 and -1.
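
The toy decoder below sketches that layout for a 16-bit posit with a 2-bit exponent field. It skips the standard’s two’s-complement handling of negative values and its special zero and NaR codes, but it shows the key effect: a short regime near 1 leaves many fraction bits, while a long regime for large numbers leaves only a few.

```python
def decode_posit(bits: int, nbits: int = 16, es: int = 2):
    """Toy decoder for a non-negative posit: sign | regime | exponent | fraction.

    Illustrative only: it skips negative values (two's complement in the standard)
    and the special encodings for zero and NaR.
    """
    assert bits >> (nbits - 1) == 0, "this sketch handles non-negative posits only"
    body = [(bits >> i) & 1 for i in range(nbits - 2, -1, -1)]  # everything after the sign bit

    # The regime is a run of identical bits, terminated by the opposite bit.
    run = 1
    while run < len(body) and body[run] == body[0]:
        run += 1
    k = run - 1 if body[0] == 1 else -run        # regime value
    rest = body[run + 1:]                        # drop the terminating bit

    # Up to `es` exponent bits follow; anything left over is the fraction.
    exp_bits = (rest[:es] + [0] * es)[:es]       # missing exponent bits count as zero
    exponent = int("".join(map(str, exp_bits)), 2)
    frac_bits = rest[es:]
    fraction = sum(b * 2.0 ** -(i + 1) for i, b in enumerate(frac_bits))

    value = 2.0 ** (k * 2 ** es + exponent) * (1 + fraction)
    return {"regime": k, "exponent": exponent, "fraction_bits": len(frac_bits), "value": value}

print(decode_posit(0b0100000000000000))  # value 1.0: short regime, 11 fraction bits left
print(decode_posit(0b0111111110000000))  # value 2**28: long regime, only 4 fraction bits left
```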

Deep neural networks usually work with normalized parameters called weights, making them perfect candidates to take advantage of posits’ strengths. A large part of neural-net computation consists of multiply-accumulate operations. Each time such a calculation is performed, every partial sum has to be truncated anew, leading to a loss of accuracy. With posits, a special accumulation register called a quire can perform the accumulation step efficiently and reduce that loss of accuracy. But today’s hardware implements floats, and so far the computational benefits of using posits in software have been largely outweighed by the losses from converting between the formats.
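
The hardware quire is a wide fixed-point register; the Python sketch below only emulates its round-once idea, using exact rational arithmetic as the accumulator and 16-bit floats as a stand-in for a narrow format, to show why deferring the rounding helps.

```python
from fractions import Fraction
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(10_000).astype(np.float16)
b = rng.standard_normal(10_000).astype(np.float16)

# Conventional path: every partial sum is rounded back to the narrow format.
rounded = np.float16(0.0)
for x, y in zip(a, b):
    rounded = np.float16(rounded + np.float16(x * y))

# Quire-style path: accumulate the products exactly, round only once at the end.
exact = sum(Fraction(float(x)) * Fraction(float(y)) for x, y in zip(a, b))

reference = float(exact)
print("error with rounding at every step:", abs(float(rounded) - reference))
print("error with a single final rounding:", abs(float(np.float16(reference)) - reference))
```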

With their new hardware implementation, which was synthesized on a field-programmable gate array (FPGA), the Complutense team was able to compare computations with 32-bit floats and 32-bit posits side by side. They assessed the accuracy of each by comparing its results against those of the much more accurate but computationally expensive 64-bit floating-point format. Posits showed an astonishing four-orders-of-magnitude improvement in the accuracy of matrix multiplication, a series of multiplications fundamental to neural-network training. They also found that the improved accuracy didn’t come at the cost of computation time, only a somewhat larger chip area and power consumption.
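
Posit arithmetic isn’t available in NumPy, but the same style of measurement can be sketched in software by pitting a 32-bit matrix multiplication against a 64-bit reference. The snippet below illustrates the methodology, not the team’s FPGA results.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 512
a = rng.standard_normal((n, n))   # float64 inputs
b = rng.standard_normal((n, n))

reference = a @ b                                           # 64-bit "ground truth"
low_precision = a.astype(np.float32) @ b.astype(np.float32) # the format under test

err = np.linalg.norm(low_precision - reference) / np.linalg.norm(reference)
print(f"normwise relative error of the 32-bit product: {err:.2e}")
```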

While the numerical accuracy gains are undeniable, it remains to be seen exactly how this would affect the training of big AIs like GPT-3.

“It’s possible for posits to speed up training because you don’t lose as much information along the way,” Mallasén says. “But these are things we don’t know. Some people have tried it in software, but now we want to try that in hardware as well.”

Other teams are working on their own hardware implementations to advance posits. “It does exactly what I hoped it would; it’s being adopted like crazy,” Gustafson says. “The posit number format has caught fire, and there are dozens of groups, both companies and universities, using it.”
