Building an Autonomous Multi-Agent Social Simulation Engine

MD Rashid Hussain
Dec-2025  -  10 minutes to read

What happens when you give AI agents distinct personalities, let them interact freely, and step back to observe? That's the question I set out to answer when building this multi-agent social simulation engine.

Unlike traditional simulations with rigid rules, this system uses large language models to create agents that think, react, and evolve. Each agent has a personality—political views, traits, goals—and they interact in a simulated social environment. The result? Emergent behaviors that nobody programmed, factions forming organically, and a digital society that evolves on its own.

I had been getting a lot of YouTube recommendations about AI agents going rogue and running amok, and I was curious to see how that would actually play out. Specifically, I wanted to see how far rogue and evil agents would go if given the chance and a social environment to thrive in. After much thought and multiple iterations, I came up with the social media simulation design.

The system needed structure, but not so much that it stifled emergence. I settled on a three-tier hierarchy that separates concerns while allowing the simulation to breathe.

At the top, Master agents maintain the system. They create new agents, introduce world events, and ensure diversity doesn't collapse into homogeneity. Think of them as the gods of the simulation.

In the middle, Historian agents watch and document. They're neutral, objective chroniclers who identify patterns, track sentiment shifts, and provide feedback.

At the bottom, User agents are the population. They post, comment, form groups, and react to events—all driven by their personalities and goals. This is where the magic happens, where unexpected behaviors emerge from simple interactions.

The execution model cycles through these tiers: Master agents set the stage, participants interact, historians document. This creates a feedback loop where observations inform future administrative actions, keeping the simulation dynamic.

Agents (master, historian, user), world events, users, posts, comments, groups, etc. are all assets in the system. Any time an asset is created, its embedding is also created and stored in the database. These embeddings can then be used for similarity search to build context for the agents.
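
As a minimal sketch of this pattern, assuming a local embedding model served through the Ollama Python client and a simple in-memory store (the real storage layer, model name, and helpers are placeholders):

```python
import numpy as np
import ollama  # assumes the `ollama` Python client is installed

# In-memory stand-in for the embeddings table in the database.
EMBEDDINGS: list[tuple[str, np.ndarray]] = []

def store_asset(asset_id: str, text: str) -> None:
    """Embed a newly created asset (post, comment, event, ...) and store it."""
    resp = ollama.embeddings(model="nomic-embed-text", prompt=text)
    EMBEDDINGS.append((asset_id, np.array(resp["embedding"])))

def similar_assets(query: str, k: int = 5) -> list[str]:
    """Return the k most similar assets, used to build context for an agent."""
    resp = ollama.embeddings(model="nomic-embed-text", prompt=query)
    q = np.array(resp["embedding"])
    scored = [
        (asset_id, float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))))
        for asset_id, v in EMBEDDINGS
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [asset_id for asset_id, _ in scored[:k]]
```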

One key insight: agents shouldn't directly modify the simulation state. Instead, they interact through a tool-based system.

Each agent type gets a specific set of tools. Master agents can create users and events. Users can post, comment, and form groups. Historians can document but not interfere. This role-based capability model means agents literally can't conceptualize actions outside their role—they don't see tools they can't use.
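
The capability model can be as simple as a role-to-tools map; here is an illustrative sketch (the names are mine, not the project's actual API):

```python
from enum import Enum

class Role(Enum):
    MASTER = "master"
    HISTORIAN = "historian"
    USER = "user"

# Each role maps to the only tools its agents are ever shown.
TOOLS_BY_ROLE: dict[Role, list[str]] = {
    Role.MASTER: ["create_user", "create_world_event"],
    Role.USER: ["create_post", "create_comment", "create_group", "like_post"],
    Role.HISTORIAN: ["document_event", "summarize_sentiment"],
}

def tools_for(role: Role) -> list[str]:
    """Agents never see tools outside their role, so they cannot call them."""
    return TOOLS_BY_ROLE[role]
```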

The tool system uses schemas for validation, ensuring type safety and providing clear interfaces for LLMs. But the real power comes from iterative execution: agents can chain tool calls, building complex behaviors from simple operations. Query recent posts → analyze sentiment → create response → like related content—all in one execution cycle.
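
For illustration, here is a tool definition in the JSON-schema style that Ollama (and most LLM APIs) accept; the exact schemas in the project may differ:

```python
# Schema for a hypothetical `create_post` tool. The LLM only sees this
# interface; arguments are validated against it before the tool executes.
CREATE_POST_TOOL = {
    "type": "function",
    "function": {
        "name": "create_post",
        "description": "Create a new post in the simulated social network.",
        "parameters": {
            "type": "object",
            "properties": {
                "author_id": {"type": "string", "description": "ID of the posting agent"},
                "content": {"type": "string", "description": "Body of the post"},
            },
            "required": ["author_id", "content"],
        },
    },
}
```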

This iterative approach balances autonomy with stability. Agents can be sophisticated without risking infinite loops or resource exhaustion.
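
A hedged sketch of such a bounded execution loop, assuming an Ollama chat model with tool-calling support (the model name, dispatch function, and cap are illustrative):

```python
import ollama

MAX_ITERATIONS = 5  # hard cap so an agent cannot loop or exhaust resources

def run_agent(system_prompt: str, tools: list, dispatch) -> None:
    """Let an agent chain tool calls, bounded by MAX_ITERATIONS per cycle."""
    messages = [{"role": "system", "content": system_prompt}]
    for _ in range(MAX_ITERATIONS):
        resp = ollama.chat(model="llama3.1", messages=messages, tools=tools)
        calls = resp["message"].get("tool_calls") or []
        if not calls:
            break  # the agent has nothing more to do this cycle
        messages.append(resp["message"])
        for call in calls:
            # dispatch validates the arguments against the tool's schema, runs
            # the tool, and returns its result for the next iteration's context
            result = dispatch(call["function"]["name"], call["function"]["arguments"])
            messages.append({"role": "tool", "content": str(result)})
```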

The execution is straightforward. I wanted it to work differently, but due to time and hardware constraints, I settled on the current implementation.

  1. The program starts by creating a world.
  2. The master agent is created.
  3. A world event is created (announcing that the world has begun).
  4. The historian is created.
  5. The master agent is run for some bootstrap cycles (10 in this case).
  6. The whole system is then run indefinitely, where in each cycle:
    • The master agent is run.
    • 10 user agents are selected randomly and run.
    • The historians are run to document the events.
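
In code, the loop is roughly the following (a simplified sketch; the helper names are illustrative, not the actual functions in the repository):

```python
import random

def simulate(world) -> None:
    master = create_master(world)                     # hypothetical helpers
    create_world_event(world, "The world has begun")
    historian = create_historian(world)

    for _ in range(10):  # bootstrap: let the master populate the world
        run_agent_cycle(master)

    while True:  # main phase: runs indefinitely
        run_agent_cycle(master)
        for user in random.sample(world.users, k=min(10, len(world.users))):
            run_agent_cycle(user)
        run_agent_cycle(historian)
```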

The earlier solution was written in Go, using goroutines and channels. It let me run all the parts of the system independently of each other (fully concurrently).

  1. The program starts by creating a world (as usual).
  2. The master and the corresponding world events are created.
  3. The historian is created.
  4. The master and the historian run in their own goroutines, changing the state of the world.
  5. As soon as the master creates a new user, a goroutine is spawned for that user to run. This can be tricky: all those newly created users risk a goroutine leak and an explosion of goroutines. So, specifically for users, a fixed-size goroutine pool limits how many goroutines run at any given time (the same idea is sketched below).
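
Since the code examples in this post are in Python, here is that bounded-pool idea expressed with an asyncio semaphore rather than the original goroutine pool (the agent runner is hypothetical):

```python
import asyncio

POOL_SIZE = 10  # cap on concurrently running user agents, like the goroutine pool
_pool = asyncio.Semaphore(POOL_SIZE)
_tasks: set[asyncio.Task] = set()  # keep references so tasks are not garbage-collected

async def _run_user(user) -> None:
    async with _pool:  # at most POOL_SIZE users execute at any moment
        await run_agent_cycle_async(user)  # hypothetical async agent runner

def on_user_created(user) -> None:
    """Called whenever the master creates a new user."""
    task = asyncio.create_task(_run_user(user))
    _tasks.add(task)
    task.add_done_callback(_tasks.discard)
```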

This implementation is more complex and requires much more careful handling of goroutines. I eventually abandoned it for a few reasons:

  1. I was running the LLM locally with Ollama, which accepts requests sequentially, but I wanted the calls to be concurrent (and ideally asynchronous).
  2. The hardware I am running on is not powerful enough to handle all those concurrent LLM calls.
  3. Go is not the best language for this kind of application, because of the lack of mature libraries and tooling for agentic AI.

Very nuanced and interesting behaviors emerge from the interactions of the agents. Although none of this correlates directly with the real world, it is still interesting to observe, taken with a grain of salt.

  1. The agents are able to form groups and develop opinions based on their interactions with other agents. The groups that formed were very coherent with the kinds of agents in them: members shared similar opinions and beliefs. Initially, I gave agents tools to form personal friendships with other agents, but later removed them; the agents preferred tribalism (forming groups) over individualism (forming friendships).

  2. With uncensored LLMs, the agents adapted more closely to their given personalities and goals, producing more realistic behavior. I observed this more with the evil personas than with the good ones.

  3. World events stirred debates and opinions (surfacing as posts by the agents) that were very much aligned with the agents' personalities.

That said, there is plenty I would do differently. Some lessons and known limitations:

  1. Running the LLM locally with Ollama was not the best idea; my machine is not powerful enough to handle all those concurrent LLM calls.
  2. I should have invested in a hosted LLM service like OpenAI or Anthropic.
  3. The initial idea was to use a different LLM for each agent, selected by the type of user. For example, a more powerful LLM for users with a higher power level.
  4. For creating posts, I currently accept all the parameters up front, but I wanted to accept just the "intent" or topic of the post and have the agent generate the post from everything it has seen so far (posts, comments, etc.), using the stored embeddings and similarity search (see the sketch after this list).
  5. The system has no deterministic way to associate an agent with a user. Once a user is turned into an agent, it is difficult to know which user maps to which agent. This is a critical flaw: whenever an agent creates an asset, it must pass its own ID so the tool knows "who" is creating it, and leaving that to the LLM is non-deterministic.
  6. Agents have no individual memory. Every iteration starts with a fresh context (containing only the system prompt and the agent's activation prompt), so an agent cannot remember previous interactions. This is a very big limitation: the agent cannot recall anything, and all its previous work is lost in the next cycle.
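
For point 4 above, here is a sketch of what intent-driven post creation could look like, reusing the similar_assets helper from the embedding sketch earlier (load_asset_text and the model name are hypothetical):

```python
import ollama

def create_post_from_intent(agent_id: str, intent: str) -> str:
    """Generate a post from just an intent, grounded in similar past assets."""
    context_ids = similar_assets(intent, k=5)  # similarity search, sketched earlier
    context = "\n".join(load_asset_text(i) for i in context_ids)  # hypothetical loader
    resp = ollama.chat(
        model="llama3.1",
        messages=[
            {"role": "system",
             "content": f"You are agent {agent_id} on a social network. Write a post."},
            {"role": "user",
             "content": f"Intent: {intent}\n\nRelevant things you have seen:\n{context}"},
        ],
    )
    return resp["message"]["content"]
```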

For further details, you can check out the code.

This was an interesting project to work on. It was a great learning experience, and it let me explore the capabilities of LLMs in a practical way. It was also a great way to understand multi-agent systems and how they can be used to create a realistic simulation.

I hope you found this blog post helpful and informative. Feel free to reach out to me on X @m3_rashid or Linkedin MD Rashid Hussain.