LLM Training and Inference

A follow-up to Manifolds and Representations

This article is the culmination of the Gaining AI Intuition introductory series. The previous five posts explained the scientific underpinnings and building blocks behind the Generative AI magic:

  1. AI Intuition
  2. Importance of Language
  3. World Knowledge: Compression and Representation
  4. LLM foundations and Brain-DNA analogy
  5. Manifolds and Representations

Here we will talk about what is achieved during Large Language Model (LLM) training and inference.

Main objectives of LLM Training

We learn two main things during the LLM training phase:

  • the embeddings of the words in the domains of interest. That is, the LLM learns an efficient codification of each word’s essence across different contexts (e.g. a cat is an animal and a pet, can live in the streets or at home, eats meat and drinks milk, was admired by Egyptian pharaohs, etc.; all captured in the one embedding of cat)
  • the parameters that, together with the fixed architecture of the LLM (number of layers, width of the reasoning embedding, otherwise called the bottleneck), form a sort of efficiently configured landscape of learned domains, ready to map inputs to outputs

These outputs resemble intelligence at the inference phase, i.e. the process of completing a prompt.
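
To make these two components concrete, here is a minimal toy sketch in Python. Everything in it (the vocabulary, the sizes, the single weight matrix standing in for a deep multi-layer network) is an illustrative assumption, not a real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) Learned embeddings: one vector per vocabulary word, capturing
#    that word's "essence" across contexts.
vocab = ["cat", "dog", "pizza", "street", "home"]
embed_dim = 8                                  # toy "bottleneck" width
embeddings = rng.normal(size=(len(vocab), embed_dim))

# 2) Learned parameters: a single weight matrix standing in for the
#    billions of weights that shape a real LLM's "landscape".
W = rng.normal(size=(embed_dim, len(vocab)))

def word_scores(word: str) -> np.ndarray:
    """Map one word's embedding through the parameter landscape,
    producing a score for every word in the vocabulary."""
    vec = embeddings[vocab.index(word)]
    return vec @ W

print(word_scores("cat"))   # raw scores (random weights, so meaningless)
```

In a real LLM both the embedding table and the parameters are learned jointly during training; here they are random, which is exactly why the scores are meaningless until training calibrates them.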

Why inference works so beautifully

Let’s see what happens during inference:

  • we ask for an idea completion, for example with the natural-language prompt “A dog devours pizza in…”
  • the prompt is first translated into a sequence of the respective word embeddings (retrieved from the LLM vocabulary, talked about here)
  • the codified prompt is then fed into and processed by the LLM, which has a specific architecture and learned parameters, i.e. it is a Deep Neural Network
  • figuratively speaking, the prompt embedding gets laid over (mapped onto) the hills and plains of the LLM landscape, which leads us to the desired “intelligent” outcome: the next word
  • then this “guessed” word gets appended to the previous prompt and the process enters the next iteration, and so on until the idea completion satisfies us (see the sketch after this list)
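
Here is a minimal, self-contained sketch of that loop in the same toy style as above. Averaging the prompt embeddings is only a crude stand-in for the attention machinery of a real LLM, and all names and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["a", "dog", "devours", "pizza", "in", "the", "street", "."]
embed_dim = 8
embeddings = rng.normal(size=(len(vocab), embed_dim))   # toy, untrained
W = rng.normal(size=(embed_dim, len(vocab)))            # toy, untrained

def next_token(prompt_tokens: list[str]) -> str:
    """One inference step: embed the prompt, map it through the
    parameters, and greedily pick the highest-scoring next word."""
    vecs = np.stack([embeddings[vocab.index(t)] for t in prompt_tokens])
    context = vecs.mean(axis=0)          # crude stand-in for attention
    scores = context @ W
    return vocab[int(np.argmax(scores))]

tokens = ["a", "dog", "devours", "pizza", "in"]
for _ in range(3):                       # guess a word, append it, repeat
    tokens.append(next_token(tokens))
print(" ".join(tokens))                  # gibberish here: weights are untrained
```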

The role of the model parameters is to direct, or nudge, the flow of prediction toward the likely (a tiny numerical illustration follows this list):

  • sub-contexts (e.g. animals, food, locations, etc.)
  • words for completing the idea
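
One common way this nudging toward the likely is realized is by turning parameter-shaped scores into a probability distribution with a softmax. The scores below are made up purely for illustration:

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(scores - scores.max())   # subtract max for numerical stability
    return e / e.sum()

# Hypothetical parameter-shaped scores for completing "...pizza in a ___"
words  = ["street", "park", "kitchen", "quantum"]
scores = np.array([3.1, 2.4, 1.8, -2.0])

for word, p in zip(words, softmax(scores)):
    print(f"{word:>8}: {p:.3f}")        # "street" gets the highest probability
```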

What happens during many iterations of inference? In the first iteration, the initial input prompt embedding helps localize the “closest” sub-context. Then a rather precise choice of words for the idea completion happens through meandering, either within the chosen sub-context or by jumping to other semantically close sub-contexts. E.g. a dog is an animal (sub-context “animals”), the dog eats pizza (sub-context “food”), it eats pizza in a street (sub-context “location”). This meandering, or “reasoning” if you wish, is achieved thanks to the parameters (the w’s in the diagram) that were properly calibrated during the training of the LLM. These w’s act as traffic signs.
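
The notion of “semantically close” can be made concrete with cosine similarity between embeddings. The 2-D vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions:

```python
import numpy as np

# Made-up 2-D embeddings: nearby directions mean related sub-contexts.
emb = {
    "dog":     np.array([0.9, 0.1]),   # "animals" sub-context
    "cat":     np.array([0.8, 0.2]),
    "pizza":   np.array([0.1, 0.9]),   # "food" sub-context
    "sausage": np.array([0.2, 0.8]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means same direction, near 0 unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["dog"], emb["cat"]))    # high: staying within a sub-context
print(cosine(emb["dog"], emb["pizza"]))  # lower: a jump between sub-contexts
```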

Notice that during training we captured the cat and dog traits in different contexts. The LLM also knew that “A cat eats sausage at home”. But it had not encountered the statement “A dog devours pizza”.

During inference, when we ask for the prompt “A dog devours pizza in…” to be completed, the DNN completes it to the semantically plausible “A dog devours pizza in a street”, thanks to the w’s directing the inference flow. Amazing! Of course, this illustration is rather high-level and essentially conceptual; there is a lot of very complex machinery involved in achieving this magic.


Well, Dear Friend, this concludes the introduction to gaining AI intuition. I hope by now you have attained a solid grasp of why Large Language Models work, and have a good mental map of the areas you would like to dive deeper into. Continue exploring AI with confidence!

I would love to hear your feedback, suggestions, ideas, and a wish list of topics for me to cover in the near future. The footer below shows how you can reach me. I look forward to hearing from you! 🙂

