Building apps with GPT-3? Here’s what devs need to know about cost and performance

Last week, OpenAI removed the waitlist for the application programming interface to GPT-3, its flagship language model. Now, any developer who meets the conditions for using the OpenAI API can apply and start integrating GPT-3 into their applications.

Since the beta release of GPT-3, developers have built hundreds of applications on top of the language model. But building successful GPT-3 products presents unique challenges. You must find a way to leverage the power of OpenAI's advanced deep learning models to provide the best value to your users while keeping your operations scalable and cost-efficient.

Fortunately, OpenAI provides a variety of options that can help you make the best use of your money when using GPT-3. Here's what the people who have been developing applications with GPT-3 have to say about best practices.

OpenAI offers four versions of GPT-3: Ada, Babbage, Curie, and Davinci. Ada is the fastest, least expensive, and lowest-performing model. Davinci is the slowest, most expensive, and highest-performing. Babbage and Curie sit between the two extremes.

OpenAI's website doesn't provide architectural details on each of the models, but the original GPT-3 paper includes a list of different versions of the language model. The main difference between the models is the number of parameters and layers, going from 12 layers and 125 million parameters to 96 layers and 175 billion parameters. Adding layers and parameters improves the model's learning capacity but also increases the processing time and costs.

OpenAI calculates the pricing of its models based on tokens. According to OpenAI, one token generally corresponds to about four characters of common English text. This translates to roughly three-quarters of a word (so 100 tokens is about 75 words).
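The rule of thumb above can be turned into a quick estimator for budgeting. This is only a sketch: the function names are hypothetical, and the real count depends on OpenAI's tokenizer, so treat the result as a rough approximation for common English text.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb: ~4 characters per token
WORDS_PER_TOKEN = 0.75   # rule of thumb: ~75 words per 100 tokens


def estimate_tokens_from_chars(text: str) -> int:
    """Approximate token count as character count / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Approximate token count as whitespace-separated word count / 0.75, rounded up."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)
```

The two estimates will usually differ a little; taking the larger of the two gives a more conservative budget.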

Here's an example from OpenAI's Tokenizer tool:

In general, if you use good English (avoid jargon, use simple words with few syllables, etc.), you'll get better token-to-word ratios. In the example below, every word other than "GPT-3" counts as one token.

One of the benefits of GPT-3 is its few-shot learning capabilities. If you're not satisfied with the model's response to a prompt, you can guide it by giving it a longer prompt that includes correct examples. These examples will work like real-time training and improve GPT-3's results without the need to readjust its parameters.

It is worth noting that OpenAI charges you for the total tokens in your input prompt plus the output tokens GPT-3 returns. Therefore, long prompts with few-shot learning examples will increase the cost of using GPT-3.

With a 75x cost difference between the cheapest and most expensive GPT-3 models, it is important to know which option best suits your application.
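To see how prompt length and model choice interact, here is a small cost sketch. The per-1,000-token prices are illustrative figures chosen to match the 75x spread mentioned above; they are assumptions, not a quote of current pricing, so check OpenAI's pricing page before relying on them. The function name is hypothetical.

```python
# Illustrative prices in USD per 1,000 tokens (assumed, not authoritative).
PRICE_PER_1K_TOKENS = {
    "ada": 0.0008,
    "babbage": 0.0012,
    "curie": 0.0060,
    "davinci": 0.0600,
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one request: prompt and completion tokens are billed together."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS[model]


if __name__ == "__main__":
    # A zero-shot prompt versus the same task with few-shot examples added.
    for label, prompt_tokens in (("zero-shot", 50), ("few-shot", 900)):
        for model in PRICE_PER_1K_TOKENS:
            cost = request_cost(model, prompt_tokens, completion_tokens=100)
            print(f"{label:9s} {model:8s} ${cost:.6f}")
```

Because the prompt is billed on every call, a few-shot prompt that is many times longer than the completion dominates the per-request cost.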

Matt Shumer, the co-founder and CEO of OthersideAI, has used GPT-3 to develop AI-powered writing tools. HyperWrite, OthersideAI's main product, uses GPT-3 for text generation, autocomplete, and rephrasing.

When choosing between different GPT-3 models, Shumer starts by considering the complexity of the intended use case, he told TechTalks.

"If it's something simple, like binary classification, I might start with Ada or Babbage. If it's something very complex, like conditional generation where high-quality output and reliability is necessary, I start with Davinci," he said.

When unsure of complexity, Shumer starts by trying the biggest model, Davinci. Then, he works his way down toward the smaller models.

"When I get it working with Davinci, I try to modify the prompt to use Curie. This typically means adding more examples, refining the structure, or both. If it works on Curie, I move to Babbage, then Ada," he said.
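The step-down procedure Shumer describes can be sketched as a simple loop. This is an illustration under assumptions: `passes` is a hypothetical callback that stands in for whatever prompt rework and quality check you apply to each model, and the model order is taken from the article.

```python
from typing import Callable, Optional, Sequence

MODELS_LARGEST_FIRST = ("davinci", "curie", "babbage", "ada")


def smallest_passing_model(
    passes: Callable[[str], bool],
    models: Sequence[str] = MODELS_LARGEST_FIRST,
) -> Optional[str]:
    """Walk from the largest model down and stop at the first failure.

    Returns the smallest model that still passed, or None if even the
    largest model failed the check.
    """
    chosen = None
    for model in models:
        if not passes(model):
            break
        chosen = model
    return chosen
```

In practice the check would run a fixed set of test inputs through the prompt adapted for each model and compare the outputs against your own acceptance criteria.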

For some applications, he uses a multi-step system that includes a mix of different models.

"For example, if it's a generative task that requires some classification as a precursor step, I might use Babbage for the classification, then Curie or Davinci for the generative step," he said. "After using it for a while, you get a feel for what might be useful for different use cases."
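A two-step setup like this might look as follows. It is a sketch, not OthersideAI's implementation: `complete` is a hypothetical function you supply that sends a prompt to a named model and returns its text, and the prompts and labels are made up for illustration.

```python
from typing import Callable

# (model_name, prompt) -> completion text; supplied by the caller.
Complete = Callable[[str, str], str]


def classify_then_generate(
    text: str,
    complete: Complete,
    classifier_model: str = "babbage",
    generator_model: str = "davinci",
) -> str:
    """Use a small model to label the request, then a large model to write."""
    classify_prompt = (
        "Classify the request as 'question' or 'story'.\n"
        f"Request: {text}\nLabel:"
    )
    label = complete(classifier_model, classify_prompt).strip().lower()

    generate_prompt = (
        f"Write a {label} response to the following request.\n"
        f"Request: {text}\nResponse:"
    )
    return complete(generator_model, generate_prompt)
```

Passing `complete` in as an argument keeps the pipeline independent of any particular client library and makes it easy to test with a stub.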

Paul Bellow, author and developer of LitRPG Adventures, used Davinci for his GPT-3-powered RPG content generator.

"I wanted to generate the highest quality output possible for later fine-tuning," Bellow told TechTalks. "Davinci is the slowest and most expensive, but the tradeoff is higher quality output, which was important to me at this stage of development. I've spent a premium, but I now have over 10,000 generations that I can use for future fine-tuning. Datasets have value." (More on fine-tuning later.)

Bellow says that the best way to find out if another model is going to work for a task is to run some tests on Playground, a tool you can use to directly try prompts on different GPT-3 models (note that OpenAI bills you for using Playground).

"A lot of the time, a well-thought-out prompt can get good content out of the Curie model. It all just depends on the use-case," Bellow said.

When choosing a model for your application, you'll have to weigh cost against value. Choosing a high-performing model might provide better quality output, but the improved results might not justify the price difference.

"You have to build a business model around your product that supports the engines you're using," Shumer said. "If you want high-quality outputs for your users, it'll be worth it to use Davinci; you can pass off the costs to your users. If you're looking to build a large-scale free product, and your users are okay with mediocre results, you can use a smaller engine. It all depends on your product goals."

OthersideAI has developed a solution that uses a mix of different GPT-3 models to enable different use cases, Shumer said. Paid users enjoy the power of large GPT-3 models, while free-tier users get access to the smaller models.
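Routing by plan can be as simple as a lookup table. The tier names and the model assigned to each tier below are assumptions for illustration; the article does not say which models OthersideAI maps to which plan.

```python
# Hypothetical mapping from subscription tier to GPT-3 model.
MODEL_BY_TIER = {
    "free": "curie",
    "paid": "davinci",
}


def model_for_user(tier: str) -> str:
    """Return the model for a tier, falling back to the free-tier model."""
    return MODEL_BY_TIER.get(tier, MODEL_BY_TIER["free"])
```

Defaulting unknown tiers to the cheaper model keeps a misconfigured account from silently running up costs on the largest model.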

For LitRPG Adventures, quality is paramount, which is why Bellow initially stuck to the Davinci model. He used the base Davinci model with one- or two-shot prompts, which increased the costs but made sure GPT-3 provided quality output.

"OpenAI API Davinci model is a bit expensive at this time, but I see the cost going down eventually," he said. "What provides flexibility right now is the ability to fine-tune the Curie and lower models, or Davinci with permission. This will bring my costs per generation down quite a bit while hopefully maintaining high quality."

He has been able to develop a business model that maintains a profit margin while using Davinci.

"While not a huge money-maker, the LitRPG Adventures project is paying for itself and just about ready to scale up," he said.

OpenAI's scientists initially introduced GPT-3 as a task-agnostic language model. According to their initial tests, GPT-3 rivaled state-of-the-art models on specific tasks without the need for further training. But they also mentioned fine-tuning as a promising direction of future work.

In the months that followed the beta release of GPT-3, OpenAI and Microsoft fine-tuned the model for a number of different tasks, including database query and source-code generation.

As with other deep learning architectures, fine-tuning has several benefits for GPT-3. The OpenAI API allows customers to create fine-tuned versions of GPT-3 for a premium. You can create your own training dataset, upload it to OpenAI's servers, and use it to create a fine-tuned GPT-3 model. OpenAI will host your model and make it available to you through its API.
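Preparing the training dataset mostly means serializing examples into the file format the service expects. The sketch below assumes a JSON Lines layout with one `prompt`/`completion` pair per line, which is the layout OpenAI's GPT-3 fine-tuning guide described; treat the field names as an assumption and confirm them against the current documentation. The function name is hypothetical.

```python
import json
from typing import Iterable, Tuple


def to_jsonl(examples: Iterable[Tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as JSON Lines, one object per line."""
    lines = [
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in examples
    ]
    # Trailing newline only when there is at least one record.
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    data = [
        ("Describe a goblin:", " A small, sly creature with green skin."),
        ("Describe a dragon:", " A vast winged reptile that breathes fire."),
    ]
    print(to_jsonl(data), end="")
```

The resulting string can be written to a file and uploaded with whatever tooling the API provides for fine-tuning jobs.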

Fine-tuning will enable you to tackle problems that are impossible to solve with the basic models.

"The vanilla models are highly capable and are usable for many tasks. However, some tasks (i.e., multi-step generation) are too complex for a vanilla model, even Davinci, to complete with high accuracy," Shumer said. "In cases like this, you have two options: 1) create a prompt chain that feeds outputs from one prompt into another prompt, or 2) fine-tune a model. I typically first try to create a prompt chain, and if that doesn't work, I then move to fine-tuning."
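A prompt chain of the kind Shumer mentions can be sketched as a loop in which each step's output becomes the next step's input. As before, `complete` is a hypothetical caller-supplied function that sends a prompt to a model and returns text; the `{previous}` placeholder convention is an assumption of this example.

```python
from typing import Callable, Sequence

# (model_name, prompt) -> completion text; supplied by the caller.
Complete = Callable[[str, str], str]


def run_prompt_chain(
    templates: Sequence[str],
    initial_input: str,
    complete: Complete,
    model: str = "davinci",
) -> str:
    """Run prompts in order, substituting the prior output into {previous}."""
    previous = initial_input
    for template in templates:
        prompt = template.format(previous=previous)
        previous = complete(model, prompt).strip()
    return previous
```

For example, a first template might ask for an outline of `{previous}` and a second might ask for a full draft based on that outline. Templates that need literal braces must escape them as `{{` and `}}` because the sketch relies on `str.format`.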

If done properly, fine-tuning can also reduce the costs of using GPT-3. If you'll be using GPT-3 for a specific application, a fine-tuned small model can produce results that are as good as those provided by a large vanilla model. Fine-tuned models also reduce the size of prompts, which further slashes your token usage.

"One other case where I tend to fine-tune is when I can get something working with a vanilla model, but the prompt ends up being so long that it is costly to serve to users. In cases like these, I fine-tune, as it actually can reduce the overall serving costs," Shumer said.
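The serving-cost argument is simple arithmetic, sketched below. All prices and token counts are parameters you supply; the article gives no price for fine-tuned models, so none is assumed here, and the function names are hypothetical.

```python
def cost_per_request(
    prompt_tokens: int, completion_tokens: int, price_per_1k: float
) -> float:
    """Prompt and completion tokens are billed together at one rate."""
    return (prompt_tokens + completion_tokens) / 1000 * price_per_1k


def serving_savings(
    long_prompt_tokens: int,
    short_prompt_tokens: int,
    completion_tokens: int,
    vanilla_price_per_1k: float,
    finetuned_price_per_1k: float,
) -> float:
    """Fraction saved per request by moving from a long few-shot prompt on a
    vanilla model to a short prompt on a fine-tuned model (negative if the
    fine-tuned option costs more)."""
    vanilla = cost_per_request(
        long_prompt_tokens, completion_tokens, vanilla_price_per_1k
    )
    tuned = cost_per_request(
        short_prompt_tokens, completion_tokens, finetuned_price_per_1k
    )
    return 1 - tuned / vanilla
```

With a 1,900-token few-shot prompt cut to 100 tokens and a 100-token completion, the request shrinks from 2,000 to 200 billed tokens, so at equal per-token prices the saving is 90 percent; a higher fine-tuned rate eats into that, which is why the function takes both prices.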

But fine-tuning isn't without challenges. Without a quality training dataset, fine-tuning can have adverse effects.

"Clean your dataset as much as you can. 'Garbage in, garbage out' is one of my big mantras now when it comes to prompt engineering," Bellow said.
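Bellow does not say what his cleaning involves, so the following is only one plausible baseline: normalize whitespace, drop records with an empty prompt or completion, and remove duplicates. The record layout (`prompt` and `completion` keys) and the function name are assumptions of this sketch.

```python
from typing import Dict, Iterable, List


def clean_examples(examples: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Collapse whitespace, drop empty records, and remove duplicates."""
    seen = set()
    cleaned = []
    for example in examples:
        # Collapse runs of whitespace and trim both fields.
        prompt = " ".join(example.get("prompt", "").split())
        completion = " ".join(example.get("completion", "").split())
        if not prompt or not completion:
            continue
        # Case-insensitive key so trivial variants count as duplicates.
        key = (prompt.lower(), completion.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"prompt": prompt, "completion": completion})
    return cleaned
```

Real datasets usually need task-specific checks on top of this, such as filtering low-quality generations by hand or by rule.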

If you manage to gather a sizeable dataset of quality examples, however, fine-tuning can do wonders. After starting LitRPG with the Davinci model, Bellow gathered and cleaned a dataset of around 4,000 samples in a 7-megabyte JSON file. While he is still experimenting, the initial results show that he can move from Davinci to Curie without a noticeable change in quality, which reduces the costs of GPT-3 queries by 90 percent.

Another consideration is the time it takes to fine-tune GPT-3, which grows with the size of the model and the training dataset.

"It can take as little as five minutes to fine-tune a smaller model on a few hundred examples," Shumer said. "I've also seen cases where it takes upwards of five hours to train a larger model on thousands of examples."

There's also an inverse correlation between the size of the model and the amount of data you need to fine-tune GPT-3, according to Shumer's experiments. Larger models require less data for fine-tuning.

"For many tasks, you can think of increasing base model size as a way to reduce how much data you'll need to fine-tune a quality model," Shumer said. "A Curie fine-tuned on 100 examples may have similar results to a Babbage fine-tuned on 2,000 examples. The larger models can do remarkable things with very little data."

OpenAI received a lot of criticism for deciding not to release GPT-3 as an open-source model. Subsequently, other developers released GPT-3 alternatives and made them available to the public. One very popular project is GPT-J by EleutherAI. Like other open-source projects, GPT-J requires technical effort on the part of application developers to set up and run. It also doesn't benefit from the ease of use and scalability that come with hosting and fine-tuning your models on Microsoft's Azure cloud.

But open-source models are nonetheless useful and are worth considering if you have the in-house talent to set them up and they meet your application's requirements.

"GPT-J isn't the same as full-scale GPT-3, but it is useful if you know how to work with it. It's exponentially harder to get a complex prompt working on GPT-J, as compared with Davinci, but it is possible for most use-cases," Shumer said. "You won't get the same super high-quality output, but you can likely get to something passable with some time and effort. Plus, these models can be cheaper to run, which is a big plus, considering the cost of Davinci. We have successfully used models like these at Otherside."

"In my experience, they operate at about the level of the Curie model from OpenAI," Bellow said. "I've also been looking into Cohere AI, but they're not giving details on the size of their model, so I imagine it's around the same as GPT-J, et al. I do think (hope) that there will be even more options soon from other players. Competition between suppliers is good for consumers like me."

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article here.

See the original post here:
Building apps with GPT-3? Here's what devs need to know about cost and performance - TNW