Use the newer, more space-efficient Llama 3.2 model and avoid maxing out the context window by default.
The context window comes with sensible defaults. We should not set it to the maximum without good reason, as doing so significantly increases memory consumption.
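
For a rough sense of the memory cost, here is a back-of-the-envelope sketch of how the key/value cache grows with context length. The layer count, KV-head count, head dimension, and 16-bit cache precision are illustrative assumptions for a 3B-class model with grouped-query attention; they are not taken from this change, and real runtimes may quantize or page the cache differently.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV-cache size for one sequence.

    One key vector and one value vector (hence the factor 2) are stored
    per token, per layer, per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_len


# Assumed, illustrative model dimensions (hypothetical, not read from any config).
ASSUMED = dict(n_layers=28, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for ctx in (4096, 131072):
        gib = kv_cache_bytes(ctx, **ASSUMED) / 2**30
        print(f"{ctx:>7} tokens -> {gib:.2f} GiB")
```

Under these assumptions the cache grows linearly with the context window: roughly 0.44 GiB at 4,096 tokens versus 14 GiB at 131,072 tokens, on top of the model weights.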