I tried GPT-5.4, and most answers were really good - but a few had me concerned ...
An AI agent reads its own source code, forms a hypothesis for improvement (such as changing a learning rate or an architecture depth), modifies the code, runs the experiment, and evaluates the results ...