LLMs are everywhere, and it’s worth remembering that there isn’t just one way of interacting with them.
Historically, the first common way of interacting with LLMs was using them as text completion models. That’s what the original GPT-3 was at its launch in 2020. To get anything useful out of it, you had to compose a prefix: an imaginary scene designed so that its continuation would contain the answer to your problem or question. Without careful prompting, the results could be unexpected:
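As an illustration, here is a minimal sketch of completion-style prompting with the OpenAI Python SDK. The model name `davinci-002` and the prompt itself are assumptions, chosen only to show the pattern: stage a scene whose most plausible continuation is the answer you actually want.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A completion model only continues text, so the "question" is staged
# as a scene whose natural continuation contains the answer.
prompt = (
    "The following is a conversation between a tourist and a knowledgeable "
    "local guide in Paris.\n\n"
    "Tourist: Which museum should I visit if I only have one free afternoon?\n"
    "Guide:"
)

response = client.completions.create(
    model="davinci-002",   # assumption: any base (non-instruct) completion model
    prompt=prompt,
    max_tokens=64,
    temperature=0.7,
    stop=["\nTourist:"],   # stop before the model invents the next question
)

print(response.choices[0].text)
```

Without the staging (and the stop sequence), a base model is just as likely to continue your question with more questions as it is to answer it.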
The original GitHub Copilot, launched in 2021, acted as a code-completion system: a developer would type in code (often just a function signature or a docstring), and GitHub Copilot would complete the body of the function, generating the logic or algorithm. In reality, it is a fill-in-the-middle model, because its prediction depends not only on the text preceding the cursor position but also on the code following it. Fill-in-the-middle generates text while being aware of both what precedes and what follows it:
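As a sketch, this is roughly how a fill-in-the-middle prompt can be assembled from the code before and after the cursor. The sentinel tokens below follow the StarCoder convention and are an assumption for illustration, not a description of Copilot’s internals.

```python
# Sentinel tokens differ between models; these follow the StarCoder convention.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Interleave the code before and after the cursor so the model
    generates the missing middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Code before the cursor: just a signature and a docstring.
prefix = (
    "def median(values: list[float]) -> float:\n"
    '    """Return the median of a non-empty list."""\n'
)
# Code after the cursor: a call site the generated body must satisfy.
suffix = "\n\nprint(median([3.0, 1.0, 2.0]))\n"

# The model is asked to continue from <fim_middle>, producing the function
# body that fits between the prefix and the suffix.
print(build_fim_prompt(prefix, suffix))
```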
In 2022, OpenAI released InstructGPT, an evolution of the original GPT-3. Its killer feature was the ability to understand a direct command and respond with an answer. The UX mode of this model was no longer completion; it was single-turn instruction following. The model assumed that you were making a competent request and attempted to provide the most competent answer. The system was not conversational and wasn’t designed for follow-ups, but it was easy to use for practical tasks:
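A minimal sketch of the single-turn pattern, assuming the OpenAI Python SDK and an instruct-tuned completion model such as `gpt-3.5-turbo-instruct` (the original InstructGPT models are no longer served); the instruction text is a placeholder.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Single-turn instruction following: one direct command in, one answer out.
# No conversation state is kept between calls.
instruction = (
    "Summarize the following paragraph in one sentence:\n\n"
    "Large language models can be used as completion engines, "
    "single-turn instruction followers, or multi-turn chat assistants, "
    "and each mode suits different tasks."
)

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumption: any instruct-tuned completion model
    prompt=instruction,
    max_tokens=60,
)

print(response.choices[0].text.strip())
```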
At the end of 2022, ChatGPT launched and popularized a new LLM UX approach: multi-turn instruction following. By mimicking the interaction pattern humans use naturally, it became an instant hit.
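A sketch of the multi-turn pattern with the OpenAI chat API; the model name and the messages are placeholders. The key point is that the whole message history is resent with every follow-up, which is what lets the model resolve requests like “make the second one shorter”:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Multi-turn instruction following: the full history travels with each call.
messages = [
    {"role": "user", "content": "Give me three title ideas for a post about LLM UX modes."},
]

first = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# A follow-up that only makes sense in the context of the previous turn.
messages.append({"role": "user", "content": "Make the second one shorter and punchier."})
second = client.chat.completions.create(model="gpt-4o-mini", messages=messages)

print(second.choices[0].message.content)
```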
This distinction is still meaningful in Q3 2025, because complex long-context follow-up requests work most reliably when expressed as single-turn interactions, even with frontier LLMs. Image retrieval from a large pool is one such example:
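Sketched with hypothetical captions, the follow-up formulation looks something like this: the large pool arrives in one turn, and the actual retrieval request only in a later one, so the model has to resolve the question against earlier conversation state.

```python
# Follow-up (multi-turn) formulation. The captions and the query are
# hypothetical placeholders standing in for a real image pool.
captions = [f"image_{i:04d}: <caption text>" for i in range(1, 501)]

messages = [
    {"role": "user", "content": "Here are 500 image captions:\n" + "\n".join(captions)},
    {"role": "assistant", "content": "Got it. What would you like to know?"},
    {"role": "user", "content": "Which images show a red bicycle near a fountain?"},
]
```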
The same request can be expressed as a single-turn pattern:
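Again with hypothetical captions, the single-turn formulation bundles the pool and the instruction into one message:

```python
# Single-turn formulation: the full pool of captions and the retrieval
# instruction travel together in one message.
captions = [f"image_{i:04d}: <caption text>" for i in range(1, 501)]

single_turn_prompt = (
    "You are given 500 image captions, one per line:\n"
    + "\n".join(captions)
    + "\n\nList the IDs of all images that show a red bicycle near a fountain, "
    "and nothing else."
)

messages = [{"role": "user", "content": single_turn_prompt}]
```

Because the question and the full context arrive in the same turn, nothing has to be recovered from earlier conversation state, which is what makes this formulation the more reliable one for long contexts.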