Large Language Model: GPT-3

Explore GPT-3 alternatives

Jimmy (xiaoke) Shen
Nov 6, 2022

This article is not about what GPT-1/2/3 are. If you don’t know those models yet, first check other online blogs and videos to learn what they do. Here is a quick summary (the reader also needs to know what a transformer is):

In short, GPT-1/2/3 use the transformer’s decoder to pretrain language models. One of the most successful and surprising of these is GPT-3. All of GPT-1/2/3 are from OpenAI. By training GPT-3 or a similar architecture on GitHub’s codebase, OpenAI built Codex, which can help programmers auto-generate code samples.
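To make “pretraining with the transformer’s decoder” concrete, here is a minimal sketch of the next-token (causal language modeling) objective these models optimize. It uses the open GPT-2 checkpoint from Hugging Face as a stand-in; this is an illustration, not OpenAI’s actual training code:

```python
# Minimal sketch of the causal LM pretraining objective, using the open
# "gpt2" checkpoint as a stand-in for the GPT family.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

text = "Large language models are trained to predict the next token."
inputs = tokenizer(text, return_tensors="pt")

# With labels=input_ids, the model shifts the labels internally and returns
# the next-token cross-entropy loss used during pretraining.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss.item())
```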

About GPT-3

As GPT-3 is one of the most successful models but is not open source, let’s try to find some alternatives on Hugging Face.

On Hugging Face we can find the GPT-1/2 models (from here); a quick way to try one of them is sketched below.
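For example, a minimal sketch of generating text with the GPT-2 checkpoint via the `pipeline` API (this assumes `transformers` and a backend such as PyTorch are installed):

```python
# Minimal sketch: generate text with the "gpt2" checkpoint from the Hub.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("GPT-2 is a language model that", max_new_tokens=20)
print(result[0]["generated_text"])
```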

We can also find more GPT-related implementations:

GPT-3 alternatives

It seems GPT-J is an alternative implementation, as you can see from the following comparison between GPT-3 and GPT-J:

More detail can be found here.
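Since GPT-J’s weights are openly available, here is a minimal sketch of loading it for generation, assuming the `EleutherAI/gpt-j-6B` checkpoint id on the Hugging Face Hub (note the full-precision 6B weights are large, roughly 24 GB, so treat this as illustrative):

```python
# Sketch: load GPT-J-6B from the Hugging Face Hub and generate a completion.
# The checkpoint id "EleutherAI/gpt-j-6B" is assumed; the fp32 weights are
# large (~24 GB), so enough RAM/GPU memory is required.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# A code prompt, echoing the Codex use case mentioned above.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(inputs["input_ids"], max_new_tokens=48, do_sample=True)
print(tokenizer.decode(output_ids[0]))
```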

GPT-J author info

The author is Ben Wang, from EleutherAI, an AI research group.

From EleutherAI’s website, we can see that GPT-NeoX is also contributed by this group.
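EleutherAI’s checkpoints are likewise hosted on the Hugging Face Hub, so the same loading pattern applies; a sketch assuming the `EleutherAI/gpt-neox-20b` checkpoint id (the 20B model is far too large for most single GPUs, so this is illustrative only):

```python
# Sketch: the GPT-NeoX-20B checkpoint id "EleutherAI/gpt-neox-20b" is assumed.
# The 20B weights are very large; loading them needs substantial hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")
```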
