ORKG predicate: http://orkg.org/orkg/predicate/P163009

context length (in tokens)

It refers to the maximum number of tokens (words, characters, or subword units) that the model can process in a single input sequence. It determines how much prior information the model can draw on when generating or predicting the next tokens. A longer context length lets the model take more extensive context into account, which is crucial for complex tasks such as summarization, dialogue, and document-level analysis.
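A minimal sketch of how a fixed context length constrains input in practice: when the input exceeds the window, the oldest tokens are typically dropped so the most recent context fits. The `MAX_CONTEXT` value and whitespace tokenization here are illustrative assumptions only; real models use subword tokenizers (e.g. BPE) and much larger windows.

```python
MAX_CONTEXT = 8  # hypothetical context length, in tokens

def truncate_to_context(text: str, max_tokens: int = MAX_CONTEXT) -> list[str]:
    """Keep only the most recent tokens that fit in the context window.

    Whitespace splitting stands in for a real subword tokenizer;
    actual models count subword units, not words.
    """
    tokens = text.split()
    return tokens[-max_tokens:]  # drop the oldest tokens first

history = "the quick brown fox jumps over the lazy dog near the river bank"
window = truncate_to_context(history)
print(len(window))   # → 8
print(window[-1])    # → bank
```

Dropping from the front keeps the most recent context, which is usually the most relevant for predicting the next tokens; other strategies (e.g. summarizing older context) trade fidelity for coverage.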