Country-Policy LLMs: Co-Evolving near a Nash Equilibrium

The problem I’m trying to solve is whether LLM responses contain biases that are not explicit in any single response, but only surface when the models are asked to iterate on complex tasks.

An interesting task I’m using to probe for these biases is choosing policy decisions for a country.

Changes to national and international policies can have surprising, counterintuitive results, and it would be unrealistic to try to model them with any accuracy. Instead I use an NKC tuneable fitness landscape to stand in for the complex interactions between countries and policy decisions.
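As a rough sketch of the NKC idea (my own illustration, not the code behind this post): each of a country's N policy loci contributes a random fitness component that depends on its own value, on K neighbouring loci in the same genome, and on C loci in another country's genome. The `C` coupling is what makes the landscapes co-evolutionary, because another country changing its policies reshuffles your fitness.

```python
import random

def make_nkc_fitness(n, k, c, seed=0):
    """Build a random NKC fitness function: each locus's contribution is a
    random value keyed on its own state, its K neighbours (wrapping), and
    C loci of another species' genome. Contributions are memoised so the
    landscape is fixed once sampled."""
    rng = random.Random(seed)
    table = {}  # lazily sampled lookup of random contributions

    def fitness(genome, other):
        total = 0.0
        for i in range(n):
            key = (i,
                   tuple(genome[(i + j) % n] for j in range(k + 1)),
                   tuple(other[(i + j) % n] for j in range(c)))
            if key not in table:
                table[key] = rng.random()
            total += table[key]
        return total / n  # mean contribution, in [0, 1]

    return fitness
```

Raising `k` makes a country's own landscape more rugged; raising `c` makes it more sensitive to what the other countries do.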

Quick summary of the process:

  1. Create an NKC co-evolutionary system in which each “country” (species) is controlled by an LLM that optimises its policy genome.
  2. Initialise and iterate the co-evolutionary participants until they reach a weak Nash equilibrium.
  3. Back off the Nash equilibrium by randomising a few elements of each policy genome.
  4. Assign an LLM to each genome, and a name to each policy.
  5. Repeatedly ask each LLM in turn whether it wants to make a specific change to its genome, given the resulting changes to its own fitness and to the other countries’ fitnesses.
  6. Watch where it ends up: stability, chaos, oscillation, etc.
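Steps 1–2 can be sketched as a round-robin hill climb (again my own illustration; the wiring between countries is an assumption, here a simple ring). Each species in turn tries single-policy flips and keeps the first one that raises its own fitness; when a full round passes with no species improving, no country can gain by a unilateral change, which is exactly a weak Nash equilibrium.

```python
import random

def coevolve_to_equilibrium(genomes, fitness_fns, max_rounds=10_000):
    """Round-robin hill climbing toward a weak Nash equilibrium:
    each species keeps the first single-bit flip that raises its own
    fitness; stop when a full round produces no improvement."""
    for _ in range(max_rounds):
        improved = False
        for s, genome in enumerate(genomes):
            # couple each species to the next one in a ring (an assumption;
            # the post doesn't say how the countries are wired together)
            other = genomes[(s + 1) % len(genomes)]
            base = fitness_fns[s](genome, other)
            for i in range(len(genome)):
                genome[i] ^= 1                      # try flipping policy i
                if fitness_fns[s](genome, other) > base:
                    improved = True
                    break                           # keep the improving flip
                genome[i] ^= 1                      # revert
        if not improved:
            break  # no species can improve unilaterally: weak Nash equilibrium
    return genomes
```

For example, with a toy fitness where each of two countries wants to match the other's policies, the loop converges to identical genomes, after which no single flip helps either side.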

I’m interested to see whether an LLM playing a country has biases that can only be winkled out by these co-evolving, iterative strategies.
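The turn-taking in step 5 might look like the loop below, with the actual LLM call stubbed out behind a `decide` callback (all names here are hypothetical): each country in turn is offered one specific policy flip, shown everyone's fitness before and after, and accepts or declines.

```python
import random

def run_llm_turns(genomes, fitness_fn, decide, n_turns=16, seed=0):
    """Offer each country's LLM, in turn, one specific policy flip along
    with everyone's fitness before and after the change. `decide` stands
    in for the LLM call and returns True to accept the flip."""
    rng = random.Random(seed)
    n = len(genomes)

    def all_fitness():
        # ring coupling between countries, as an illustrative assumption
        return [fitness_fn(genomes[s], genomes[(s + 1) % n]) for s in range(n)]

    for t in range(n_turns):
        s = t % n                            # whose turn it is
        i = rng.randrange(len(genomes[s]))   # the policy offered for change
        before = all_fitness()
        genomes[s][i] ^= 1                   # tentatively apply the flip
        after = all_fitness()
        if not decide(s, i, before, after):
            genomes[s][i] ^= 1               # declined: revert
    return genomes
```

A purely greedy stand-in for the LLM would be `decide = lambda s, i, before, after: after[s] > before[s]`; the interesting question is where a real LLM's decisions diverge from that baseline.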

Here is a quick look at four fitness landscapes of varying complexity from the NKC system:

And here is a picture of 16 random countries optimising their fitness in a co-evolutionary setting:

Once the first weak Nash equilibrium is reached, I pause the system, randomise 3 policies, and set it running again. This is the point at which I would hand full control to the LLMs, but at the moment I can only run it for a few steps on my laptop.
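The perturbation step is small but important: it knocks the system off the equilibrium so there is something left to decide. A minimal sketch (illustrative names, not the post's code):

```python
import random

def back_off_equilibrium(genome, n_randomise=3, rng=None):
    """Re-randomise a few policy loci so the system leaves the weak Nash
    equilibrium and the climb (or the LLMs' turn-taking) can restart."""
    rng = rng or random.Random()
    for i in rng.sample(range(len(genome)), n_randomise):
        genome[i] = rng.randint(0, 1)  # may land back on the same value
    return genome
```

Note that re-randomising a locus can leave it unchanged, so "randomise 3 policies" changes at most 3 loci, sometimes fewer.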
