Fix inaccurate Ollama context length for qwen2.5 models (#20933)
Authored by Peter Tripp and Patrick Samson.
Since Ollama/llama.cpp do not currently support YaRN for context length
extension, the effective context length is limited to `32768` tokens.
This can be confirmed on the Ollama model card.
See the corresponding issue on the Ollama repo:
https://github.com/ollama/ollama/issues/6865
Co-authored-by: Patrick Samson <1416027+patricksamson@users.noreply.github.com>