Gradient-free optimization of language models works by iteratively improving the context instruction (the prompt) used to perform a given task, leaving the model weights unchanged. In each iteration, the model's outputs are evaluated against reference data and criteria, and the reasoning behind that evaluation serves as the signal for revising the instruction. The loop repeats until an end condition is met.
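The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `optimize_instruction`, `run_task`, `evaluate`, and `revise`, the `max_rounds` and `target_score` end conditions, and the averaging of scores are all hypothetical choices made for the example. In practice, `run_task`, `evaluate`, and `revise` would typically each call a language model; here they are deterministic toy functions so that the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    input: str
    reference: str  # reference output used for evaluation


def optimize_instruction(
    instruction: str,
    examples: List[Example],
    run_task: Callable[[str, str], str],
    evaluate: Callable[[str, str], Tuple[float, str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 5,        # assumed end condition: round budget
    target_score: float = 1.0,  # assumed end condition: score threshold
) -> Tuple[str, float]:
    """Iteratively revise an instruction without touching model weights."""
    if not examples:
        raise ValueError("at least one example is required")
    best_instruction, best_score = instruction, float("-inf")
    for _ in range(max_rounds):
        scores, feedback = [], []
        for ex in examples:
            output = run_task(instruction, ex.input)
            score, reasoning = evaluate(output, ex.reference)
            scores.append(score)
            feedback.append(reasoning)  # evaluation reasoning is the signal
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_instruction, best_score = instruction, mean_score
        if mean_score >= target_score:
            break
        instruction = revise(instruction, feedback)
    return best_instruction, best_score


# Toy stand-ins (hypothetical) for the model calls.
def toy_run_task(instruction: str, text: str) -> str:
    # Pretend the "model" uppercases only when the instruction asks for it.
    return text.upper() if "uppercase" in instruction.lower() else text


def toy_evaluate(output: str, reference: str) -> Tuple[float, str]:
    if output == reference:
        return 1.0, "matches reference"
    return 0.0, "Use uppercase."


def toy_revise(instruction: str, feedback: List[str]) -> str:
    # Fold the distinct pieces of critical feedback into the instruction.
    hints = sorted({f for f in feedback if f != "matches reference"})
    return (instruction + " " + " ".join(hints)).strip()


examples = [Example("hello", "HELLO"), Example("world", "WORLD")]
best, score = optimize_instruction(
    "Repeat the text.", examples, toy_run_task, toy_evaluate, toy_revise
)
print(best, score)
```

Keeping the best-scoring instruction seen so far, rather than simply returning the last one, guards against a revision that makes things worse; this is a design choice of the sketch and not something the description above prescribes.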