Notes On Entropic Forcing

Someone introduced the concept of Causal Entropic Forces to me.

One of the authors’ TED talks seems to be a publicity stunt.

A third-party Python implementation helped me understand how it works. The project applies entropic forcing to a particle-in-a-box system. I patched it slightly so that it runs on Python 3.

The idea can be described as follows:

  • Seeking the most diverse future.
  • Seeking to be in a position to have the most options available.

Short explanation with bad $\KaTeX$:

$$
M: Nat \space \text{(number of steps to look ahead)} \\ S_?: State \\ S_? \cdot F_?: State \\ F_r \space \text{(each instance is a random action)}; \space {F_r}^2 = F_r \cdot F_r
$$

Given the current state $S$, propose some actions $F_1 ... F_n$ randomly or uniformly, then calculate the future state of each action $M$ steps into the future (future actions are taken randomly): $S'_i = S \cdot F_i \cdot {F_r}^M$.

Approximate the future state density: $kernel = gaussian\_kde(S'_i)$.

The final action to take is $F = \sum_i \space F_i \cdot -pdf(kernel, S'_i)$. Note the negative sign. Basically, if a subset of the proposed actions results in similar outcomes, the agent will take the opposite of those actions.
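The steps above can be sketched in Python for a 1-D particle in a box. This is a minimal sketch under my own assumptions, not the third-party implementation: actions are Gaussian displacements, the box is the interval [0, 1] with clamping walls, and `gaussian_kde_pdf` is a hand-rolled, dependency-free stand-in for the `gaussian_kde` mentioned above (with a Scott-style bandwidth). The names `step`, `rollout`, `combine_actions` and `entropic_action`, and all parameter values, are hypothetical.

```python
import math
import random


def gaussian_kde_pdf(samples, bandwidth=None):
    """Return a 1-D Gaussian kernel density estimate as a function x -> density."""
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        var = sum((s - mean) ** 2 for s in samples) / (n - 1)
        # Scott-style rule of thumb; fall back to a small constant
        # if all samples coincide (zero variance).
        bandwidth = math.sqrt(var) * n ** (-1 / 5) or 1e-3
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def step(state, action, lo=0.0, hi=1.0):
    """S . F: apply a displacement and clamp the particle to the box (assumed walls)."""
    return min(hi, max(lo, state + action))


def rollout(state, first_action, horizon, rng, sigma):
    """S'_i = S . F_i . F_r^M: the proposed action, then M random actions."""
    s = step(state, first_action)
    for _ in range(horizon):
        s = step(s, rng.gauss(0.0, sigma))
    return s


def combine_actions(actions, futures):
    """F = sum_i F_i * -pdf(kernel, S'_i): crowded outcomes push the other way."""
    pdf = gaussian_kde_pdf(futures)
    return sum(a * -pdf(f) for a, f in zip(actions, futures))


def entropic_action(state, rng, n_actions=200, horizon=20, sigma=0.05):
    """Propose random actions, roll each one out, and weight by negative density."""
    actions = [rng.gauss(0.0, sigma) for _ in range(n_actions)]
    futures = [rollout(state, a, horizon, rng, sigma) for a in actions]
    return combine_actions(actions, futures)


if __name__ == "__main__":
    rng = random.Random(0)
    # One (unnormalized) force estimate for a particle close to the left wall.
    print(entropic_action(0.05, rng))
```

The returned value is an unnormalized sum, so in a simulation loop it would probably need to be scaled by a step size before being applied with `step`. A single estimate is also noisy, since both the proposals and the rollouts are random.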

In the particle-in-a-box system, the particle’s (agent’s) position is its state. The agent gravitates towards positions from which random walks end up spatially spread out. In other words, the agent’s state gravitates towards having more “interesting” outcomes instead of “boring” (more common) ones. This “force” is called the entropic force.

In the particle-in-a-box system, the particle ends up at the center of the box, and stays approximately there.

If past outcomes are taken into account (with some discount) when calculating $kernel$, the algorithm turns into novelty search.
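One way to read "with some discount" is a weighted density estimate in which older outcomes count for less. The sketch below is only an illustration of that reading: the geometric discount, the fixed bandwidth, and the names `weighted_kde_pdf` and `novelty_kernel` are all my assumptions.

```python
import math


def weighted_kde_pdf(samples, weights, bandwidth):
    """1-D Gaussian KDE where each sample contributes in proportion to its weight."""
    total = sum(weights)
    norm = 1.0 / (total * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            w * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s, w in zip(samples, weights)
        )

    return pdf


def novelty_kernel(current_futures, past_outcomes, discount=0.9, bandwidth=0.1):
    """Build the density from current rollouts plus discounted past outcomes.

    past_outcomes is ordered oldest first; the newest one has age 1
    and therefore weight discount ** 1.
    """
    samples = list(current_futures)
    weights = [1.0] * len(samples)
    for age, outcome in enumerate(reversed(past_outcomes), start=1):
        samples.append(outcome)
        weights.append(discount ** age)
    return weighted_kde_pdf(samples, weights, bandwidth)
```

With this kernel, a future state that resembles something already visited gets a higher density, so the negative sign in the action formula steers the agent away from it.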

Novelty search is a really old idea. In one top-down stealth game, the enemy AI uses it to search for the player: each grid cell on the map gets assigned a last-seen time, and the algorithm is stable.
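A last-seen grid of this kind might look like the sketch below. I do not know how the game actually implements it, so everything here is an assumption: the searcher sees a square neighbourhood, always heads for the cell it has seen least recently, and moves one cell per tick. The names `patrol` and `sign` are hypothetical.

```python
def sign(d):
    """Return -1, 0 or 1 depending on the sign of d."""
    return (d > 0) - (d < 0)


def patrol(width, height, steps, start=(0, 0), view_radius=1):
    """Walk the grid, always heading for the least recently seen cell."""
    # -1 means the cell has never been seen.
    last_seen = {(x, y): -1 for x in range(width) for y in range(height)}
    pos = start
    for now in range(steps):
        # Stamp every cell inside the (assumed) square field of view.
        for cell in last_seen:
            if max(abs(cell[0] - pos[0]), abs(cell[1] - pos[1])) <= view_radius:
                last_seen[cell] = now
        # The stalest cell is the most "novel" place to look next.
        target = min(last_seen, key=last_seen.get)
        # Move one cell (diagonals allowed) towards the target.
        pos = (pos[0] + sign(target[0] - pos[0]), pos[1] + sign(target[1] - pos[1]))
    return last_seen, pos


if __name__ == "__main__":
    seen, final_pos = patrol(4, 4, 50)
    print(final_pos, min(seen.values()))
```

Because the target is always the stalest cell, no cell stays unseen forever, which is one plausible sense in which the search is stable.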