# #3179 Ask Chat: How Can I Get Consistent, Deterministic Results from my LLM?

How can I realistically get consistent, deterministic results from my LLM?

“Deterministic” means every run of the same prompt returns the identical token sequence. With today’s LLM stacks you can get very close, but there are three different randomness layers you have to lock down:

| Layer | Where the randomness creeps in | What you can realistically do |
| --- | --- | --- |
| Decoder / sampling | Temperature, top-p, top-k, nucleus/beam sampling | Force a greedy decode: `temperature = 0`, `top_p = 1`, `top_k = 1` (or simply omit the samplers entirely). This removes algorithmic randomness. (GitHub) |
| Seeder / RNG state | Each call seeds its random-number generator | Pass an explicit, fixed seed. OpenAI exposes a `seed` field in ChatCompletion (beta) and returns a `system_fingerprint` so you can verify the backend hasn’t changed. (OpenAI Cookbook; OpenAI Platform) |
| Implementation / hardware | Non-deterministic kernels, multithreading, model updates | • Clo... |
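Here is a minimal sketch that locks down the first two layers together, assuming the official `openai` Python SDK; the model name `gpt-4o-mini`, the seed value, and the prompt are placeholders, not part of the original answer:

```python
# Sketch: greedy decode + fixed seed with the OpenAI Chat Completions API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(prompt: str, seed: int = 12345) -> tuple[str, str | None]:
    """Run one request with sampling disabled and a fixed seed.

    Returns (text, system_fingerprint).
    """
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # layer 1: no sampling temperature (greedy decode)
        top_p=1,        # layer 1: disable nucleus sampling
        seed=seed,      # layer 2: fixed RNG seed (beta feature)
        # Note: top_k is not exposed by the OpenAI Chat API; set top_k = 1
        # in stacks (e.g. local inference servers) that do expose it.
    )
    return resp.choices[0].message.content, resp.system_fingerprint


a_text, a_fp = ask("List three prime numbers.")
b_text, b_fp = ask("List three prime numbers.")

# Identical fingerprints mean the same backend served both calls; only then
# is it meaningful to compare the two outputs for determinism.
print("same backend:", a_fp == b_fp)
print("same output: ", a_text == b_text)
```

Note that `system_fingerprint` changes whenever the backend is updated, so even with a greedy decode and a fixed seed, identical outputs are only expected between calls that share the same fingerprint.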