Editor's Note: Users often ask, "What separates HPC from AI? They both do a lot of number crunching." While that is true, one big difference is the precision required for a valid answer. HPC often requires the highest available precision (typically 64-bit double precision floating point), while many AI applications work with 8-bit integers or 8-bit floating point numbers. Lower precision often allows faster CPU/GPU arithmetic and a good enough result for many AI applications. The following article explains the trend toward lower precision computing in AI.
A grand competition of numerical representation is shaping up as some companies promote floating point data types in deep learning, while others champion integer data types.
Artificial intelligence (AI) is proliferating into every corner of our lives. The demand for products and services powered by AI algorithms has skyrocketed alongside the popularity of large language models (LLMs) like ChatGPT, and image generation models like Stable Diffusion. With this increase in popularity, however, comes an increase in scrutiny over the computational and environmental costs of AI, and particularly the subfield of deep learning.
The primary factors influencing the costs of deep learning are the size and structure of the deep learning model, the processor it is running on, and the numerical representation of the data. State-of-the-art models have been growing in size for years now, with the compute requirements doubling every 6-10 months [1] for the last decade. Processor compute power has increased as well, but not nearly fast enough to keep up with the growing costs of the latest AI models. This has led researchers to delve deeper into numerical representation in attempts to reduce the cost of AI. Choosing the right numerical representation, or data type, has significant implications for the power consumption, accuracy, and throughput of a given model. There is, however, no single answer to which data type is best for AI. Data type requirements vary between the two distinct phases of deep learning: the initial training phase and the subsequent inference phase.
When it comes to increasing AI efficiency, the method of first resort is quantization of the data type. Quantization reduces the number of bits required to represent the weights of a network. Reducing the number of bits not only makes the model smaller, but reduces the total computation time, and thus reduces the power required to do the computations. This is an essential technique for those pursuing efficient AI.
AI models are typically trained using single precision 32-bit floating point (FP32) data types. It was found, however, that all 32 bits aren't always needed to maintain accuracy. Attempts at training models using half precision 16-bit floating point (FP16) data types showed early success, and the race to find the minimum number of bits that maintains accuracy was on. Google introduced its 16-bit brain float (BF16), and models being primed for inference were often quantized to 8-bit floating point (FP8) and integer (INT8) data types. There are two primary approaches to quantizing a neural network: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT). Both methods aim to reduce the numerical precision of the model to improve computational efficiency, memory footprint, and energy consumption, but they differ in how and when the quantization is applied, and in the resulting accuracy.
Post-Training Quantization (PTQ) occurs after training a model with higher-precision representations (e.g., FP32 or FP16). It converts the model's weights and activations to lower-precision formats (e.g., FP8 or INT8). Although simple to implement, PTQ can result in significant accuracy loss, particularly in low-precision formats, because the model isn't trained to handle quantization errors. Quantization-Aware Training (QAT) incorporates quantization during training, allowing the model to adapt to reduced numerical precision. Forward and backward passes simulate quantized operations, computing gradients with respect to the quantized weights and activations. Although QAT generally yields better model accuracy than PTQ, it requires modifications to the training process and can be more complex to implement.
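As a rough illustration of the PTQ idea described above, the following sketch quantizes a tensor of FP32 weights to INT8 and converts it back. It assumes symmetric, per-tensor scaling, which is only one of several possible schemes and is not taken from the article; the function names and the random weights are hypothetical. The same quantize-then-dequantize round trip is what QAT simulates in its forward pass.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization of FP32 weights to INT8 (assumed scheme)."""
    max_abs = float(np.max(np.abs(weights)))
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate FP32 values from the INT8 codes."""
    return q.astype(np.float32) * scale

# Hypothetical weights standing in for one layer of a trained model.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=1000).astype(np.float32)

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = float(np.max(np.abs(weights - restored)))

print("storage bytes:", weights.nbytes, "->", q.nbytes)
print("scale:", scale, "max abs error:", max_error)
```

The stored tensor shrinks by a factor of four, and the round-trip error of each weight is bounded by about half of the scale factor. Because a single scale covers the whole tensor, one large weight makes every other weight coarser, which is one reason PTQ can lose accuracy.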
The AI industry has begun coalescing around two preferred candidates for quantized data types: INT8 and FP8. Every hardware vendor seems to have taken a side. In mid-2022, a paper by Graphcore and AMD [2] floated the idea of an IEEE standard FP8 data type. A joint paper with a similar proposal from Intel, Nvidia, and Arm [3] followed shortly after. Other AI hardware vendors like Qualcomm [4, 5] and Untether AI [6] also wrote papers promoting FP8 and reviewing its merits versus INT8. But the debate is far from settled. While there is no single answer for which data type is best for AI in general, there are superior and inferior data types when it comes to various AI processors and model architectures with specific performance and accuracy requirements.
Floating point and integer data types are two ways to represent and store numerical values in computer memory. There are a few key differences between the two formats that translate to advantages and disadvantages for various neural networks in training and inference.
The differences all stem from their representation. Floating point data types are used to represent real numbers, which include both integers and fractions. These numbers are represented in a form of scientific notation, with a significand (mantissa) and an exponent.
On the other hand, integer data types are used to represent whole numbers (without fractions). The two representations differ greatly in precision and dynamic range. Floating point numbers have a wider dynamic range than their integer counterparts. Integer numbers have a smaller range and can only represent whole numbers with a fixed level of precision.
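To make the difference in dynamic range concrete, here is a small sketch that uses NumPy's built-in type information. It compares INT8 with FP16 because NumPy has no native FP8 type; the definition of dynamic range used here (largest magnitude divided by smallest nonzero magnitude) is an assumption chosen for illustration.

```python
import numpy as np

int8 = np.iinfo(np.int8)
fp16 = np.finfo(np.float16)

# Dynamic range here: largest magnitude divided by the smallest nonzero magnitude.
int8_ratio = int8.max / 1                          # smallest nonzero INT8 magnitude is 1
fp16_ratio = float(fp16.max) / float(fp16.tiny)    # tiny is the smallest normal value

print("INT8 range:", int8.min, "to", int8.max, "ratio:", int8_ratio)
print("FP16 max:", float(fp16.max), "smallest normal:", float(fp16.tiny))
print("FP16 ratio: %.3g" % fp16_ratio)
```

Even before counting subnormal values, FP16 spans roughly nine orders of magnitude, while INT8 spans about two. The integer type, in exchange, has evenly spaced values across its whole range.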
In deep learning, the numerical representation requirements differ between the training and inference phases due to the unique computational demands and priorities of each stage. During the training phase, the primary focus is on updating the model's parameters through iterative optimization, which typically necessitates higher dynamic range to ensure the accurate propagation of gradients and the convergence of the learning process. Consequently, floating-point representations, such as FP32, FP16, and even FP8 lately, should be employed during training to maintain sufficient dynamic range. On the other hand, the inference phase is concerned with the efficient evaluation of the trained model on new input data, where the priority shifts towards minimizing computational complexity, memory footprint, and energy consumption. In this context, lower-precision numerical representations, such as 8-bit integer (INT8), become an option in addition to FP8. The ultimate decision depends on the specific model and underlying hardware.
The best data type for inference will vary depending on the application and the target hardware. Real-time and mobile inference services tend to use the smaller 8-bit data types to reduce memory footprint, compute time, and energy consumption while maintaining enough accuracy.
FP8 is growing increasingly popular, as every major hardware vendor and cloud service provider has addressed its use in deep learning. There are three primary flavors of FP8, defined by how the bits are split between exponent and mantissa. Having more exponent bits increases the dynamic range of a data type, so FP8 E3M4, consisting of 1 sign bit, 3 exponent bits, and 4 mantissa bits, has the smallest dynamic range of the bunch. This FP8 representation sacrifices range for precision by reserving more bits for the mantissa, which increases the accuracy. FP8 E4M3 has an extra exponent bit, and thus a greater range. FP8 E5M2 has the highest dynamic range of the trio, making it the preferred target for training, which requires greater dynamic range. Having a collection of FP8 representations allows for a tradeoff between dynamic range and precision, as some inference applications would benefit from the increased accuracy offered by an extra mantissa bit.
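The tradeoff between the three FP8 flavors can be worked out from the bit counts alone. The sketch below assumes IEEE-754-style conventions (an exponent bias of 2^(e-1) - 1, the top exponent code reserved for infinity and NaN, and subnormal values supported). The published FP8 proposals deviate from these conventions in places, so the limits of a real format can differ; the function name is hypothetical.

```python
def fp8_range(exp_bits, man_bits):
    """Largest normal and smallest subnormal value, IEEE-754-style (assumed)."""
    bias = 2 ** (exp_bits - 1) - 1
    # Assumption: the top exponent code is reserved for infinity/NaN.
    max_exp = (2 ** exp_bits - 2) - bias
    max_normal = (2 - 2.0 ** -man_bits) * 2.0 ** max_exp
    min_subnormal = 2.0 ** (1 - bias - man_bits)
    return max_normal, min_subnormal

formats = {"E3M4": (3, 4), "E4M3": (4, 3), "E5M2": (5, 2)}
ranges = {name: fp8_range(e, m) for name, (e, m) in formats.items()}

for name, (largest, smallest) in ranges.items():
    # Relative spacing between neighbouring normal values is 2**-man_bits.
    rel_step = 2.0 ** -formats[name][1]
    print(f"{name}: max={largest:g} min_subnormal={smallest:g} relative_step={rel_step:g}")
```

Under these assumptions the largest values are 15.5 for E3M4, 240 for E4M3, and 57344 for E5M2, while the relative spacing between neighbouring values coarsens from 1/16 to 1/8 to 1/4. Each bit moved from mantissa to exponent buys range at the cost of precision.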
INT8, on the other hand, effectively has 1 sign bit, 1 exponent bit, and 6 mantissa bits. This sacrifices much of its dynamic range for precision. Whether or not this translates into better accuracy compared to FP8 depends on the AI model in question. And whether or not it translates into better power efficiency will depend on the underlying hardware. Research from Untether AI [6] shows that FP8 outperforms INT8 in terms of accuracy and, on their hardware, in performance and efficiency as well. In contrast, Qualcomm research [5] found that the accuracy gains of FP8 are not worth the loss of efficiency compared to INT8 on their hardware. Ultimately, the choice of data type when quantizing for inference will often come down to what is best supported in hardware, as well as the model itself.
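One way to see why the answer depends on the model is to round-trip two differently shaped tensors through each format and compare the error. The sketch below is a toy experiment on synthetic data, not a reproduction of the cited research. It assumes per-tensor scaling, an IEEE-754-style E4M3 grid, and round-to-nearest; all function names are hypothetical.

```python
import numpy as np

def fake_quant_int8(x):
    """Round-trip x through symmetric per-tensor INT8 (assumed scheme)."""
    scale = np.max(np.abs(x)) / 127.0
    return np.clip(np.round(x / scale), -127, 127) * scale

def fake_quant_fp8(x, exp_bits=4, man_bits=3):
    """Round-trip x through an IEEE-754-style FP8 grid (assumed conventions)."""
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias
    max_normal = (2 - 2.0 ** -man_bits) * 2.0 ** max_exp
    # Scale the tensor so its largest magnitude lands on the largest FP8 value.
    scale = np.max(np.abs(x)) / max_normal
    y = x / scale
    # frexp gives |y| = m * 2**e with m in [0.5, 1), so floor(log2|y|) = e - 1.
    _, e = np.frexp(np.abs(y))
    exp = np.maximum(e - 1, 1 - bias)     # subnormals share the lowest exponent
    step = 2.0 ** (exp - man_bits)        # spacing of representable values
    return np.round(y / step) * step * scale

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
uniform = rng.uniform(-1.0, 1.0, size=10_000)
with_outliers = rng.normal(0.0, 1.0, size=10_000)
with_outliers[:10] = 100.0                # a few large outliers

results = {}
for name, data in [("uniform", uniform), ("with_outliers", with_outliers)]:
    results[name] = (mse(data, fake_quant_int8(data)),
                     mse(data, fake_quant_fp8(data)))
    print(name, "INT8 MSE:", results[name][0], "FP8 E4M3 MSE:", results[name][1])
```

On this synthetic data, INT8 gives the lower error for the evenly spread tensor, because its values are evenly spaced, while FP8 gives the lower error for the tensor with outliers, because its finer spacing near zero preserves the many small values. Real models fall somewhere between these two extremes.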
References
[1] Compute Trends Across Three Eras Of Machine Learning, https://arxiv.org/pdf/2202.05924.pdf
[2] 8-bit Numerical Formats for Deep Neural Networks, https://arxiv.org/abs/2206.02915
[3] FP8 Formats for Deep Learning, https://arxiv.org/abs/2209.05433
[4] FP8 Quantization: The Power of the Exponent, https://arxiv.org/pdf/2208.09225.pdf
[5] FP8 versus INT8 for Efficient Deep Learning Inference, https://arxiv.org/abs/2303.17951
[6] FP8: Efficient AI Inference Using Custom 8-bit Floating Point Data Types, https://www.untether.ai/content-request-form-fp8-whitepaper
About the Author
Waleed Atallah is a Product Manager responsible for silicon, boards, and systems at Untether AI. Currently, he is rolling out Untether AI's second generation silicon product, the speedAI family of devices. He was previously a Product Manager at Intel, where he was responsible for high-end FPGAs with high bandwidth memory. His interests span all things compute efficiency, particularly the mapping of software to new hardware architectures. He received a B.S. degree in Electrical Engineering from UCLA.
Read more:
The Great 8-bit Debate of Artificial Intelligence - HPCwire