A model context supports storage and retrieval of Chat Completion messages. It is always used together with a model client to generate LLM-based responses. For example, `BufferedChatCompletionContext` is a bounded context that keeps only the most recent `buffer_size` messages, discarding older ones as new messages arrive. This is useful for avoiding context-window overflow in many LLMs. Let's see an example that uses `BufferedChatCompletionContext`.
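Before looking at the real class, the buffering behavior itself can be sketched in a few lines. The snippet below is a simplified, synchronous stand-in (the actual `BufferedChatCompletionContext` lives in `autogen_core.model_context` and exposes async methods); the class and message shapes here are illustrative, not the library's API:

```python
from collections import deque


class SimpleBufferedContext:
    """Minimal sketch of a buffered chat-completion context:
    it retains at most `buffer_size` of the most recent messages."""

    def __init__(self, buffer_size: int) -> None:
        # deque with maxlen silently evicts the oldest entry on overflow.
        self._messages: deque = deque(maxlen=buffer_size)

    def add_message(self, message: dict) -> None:
        self._messages.append(message)

    def get_messages(self) -> list[dict]:
        return list(self._messages)


context = SimpleBufferedContext(buffer_size=2)
context.add_message({"role": "user", "content": "Hello"})
context.add_message({"role": "assistant", "content": "Hi!"})
context.add_message({"role": "user", "content": "What is 2+2?"})

# Only the two most recent messages survive; "Hello" was evicted.
print([m["content"] for m in context.get_messages()])
```

The eviction happens implicitly via `deque(maxlen=...)`, which is the simplest way to express a fixed-size sliding window of messages in Python.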