I Tested The New OpenAI GPT-3 Davinci Model
OpenAI announced the release of a new GPT-3 model called ‘text-davinci-003’, and there are marked improvements from a generative perspective.
Below I do a side-by-side comparison of Davinci models 1, 2 and the latest addition, Davinci 3.
Firstly, prompt engineering as a skill will need to evolve as new models, along with new model settings, are introduced.
Secondly, the generated responses differ considerably from one generation request to the next, even though the engineered prompt, model and settings remain the same.
So whilst all the responses are relevant and accurate, certain implementation types might require a more consistent response.
Thirdly, the level of detail the generated response contains and how sequential points are crafted are truly astounding, with each point building on the previous one.
The responses in general are much longer, and the level of coherence and fluency is what we have become accustomed to from OpenAI.
More about that later…
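On the second observation above: one setting that commonly influences how much responses vary between runs is the sampling temperature. The sketch below is a toy illustration only, not OpenAI's implementation. The candidate words and their scores are made up, and it simply shows how dividing scores by a lower temperature concentrates probability on the top candidate, which makes repeated samples more alike.

```python
import math
import random

# Hypothetical next-word candidates and made-up scores (logits).
TOKENS = ["API", "framework", "dataset", "platform"]
LOGITS = [2.0, 1.5, 1.0, 0.5]


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; a lower temperature sharpens them."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(tokens, logits, temperature, n, seed):
    """Draw n tokens from the temperature-adjusted distribution."""
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=n)


if __name__ == "__main__":
    for temperature in (1.0, 0.2):
        probs = softmax_with_temperature(LOGITS, temperature)
        draws = sample_tokens(TOKENS, LOGITS, temperature, n=10, seed=0)
        print(f"temperature={temperature}: top probability={max(probs):.2f}")
        print("  samples:", draws)
```

In practice, this would translate to trying a lower temperature value in the model settings when an implementation needs more repeatable output, at the expense of some variety.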
OpenAI’s new text-davinci-003 model has improved performance in the following aspects:
▪️ Higher quality writing with clearer, more engaging, and more compelling content.
▪️ Better handling of complex instructions, which allows for more creativity in prompt engineering.
▪️ Better long-form content generation, unlocking tasks which would have been impossible previously.
⭐️ Follow me on LinkedIn for the best Conversational AI Content ⭐️
I performed a straight-up comparison between the three Davinci models, namely:
1️⃣ text-davinci-001
2️⃣ text-davinci-002
3️⃣ text-davinci-003
For all three models, I used the generative aspect of the model, with this engineered prompt:
I want to create an intelligent chatbot people can get weather information from. How do I create such a chatbot?
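To make the setup concrete, here is a minimal sketch of how the same prompt could be prepared for all three models. The field names (`model`, `prompt`, `max_tokens`, `temperature`) follow the OpenAI completions request format. The `max_tokens` and `temperature` values are assumptions for illustration, not the settings used for the results in this article, and `build_requests` is a hypothetical helper. The sketch only builds the request bodies and does not send them.

```python
import json

MODELS = ["text-davinci-001", "text-davinci-002", "text-davinci-003"]

PROMPT = (
    "I want to create an intelligent chatbot people can get weather "
    "information from. How do I create such a chatbot?"
)


def build_requests(prompt, models, max_tokens=256, temperature=0.7):
    """Return one request body per model, identical except for the model name.

    max_tokens and temperature are assumed values, chosen for illustration.
    """
    return [
        {
            "model": model,
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
        }
        for model in models
    ]


if __name__ == "__main__":
    for body in build_requests(PROMPT, MODELS):
        print(json.dumps(body, indent=2))
```

Keeping everything but the model name identical is what makes the comparison side-by-side: any difference in output then comes from the model, or from run-to-run variation.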
Below, you see the generated content from the text-davinci-001 model:
And below is the generated content from the newer text-davinci-002 model:
And lastly, below are two examples from the new text-davinci-003 model:
When re-generating or re-running the query, the results can differ quite a bit. Also consider the technologies listed below, which I found interesting:
Conclusion
From the results above it is evident that the new Davinci model yields much higher quality writing with longer output. In addition to these improvements, the writing style is instructive, with actionable and sequenced points.
Cost will obviously be a consideration: in a production setting, the intended tasks and expected outputs will need to be weighed against the cost and performance of other models.
I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.