Talk:Free energy principle

Citation [3]
In the introduction it says: "AI implementations based on the active inference principle have shown advantages over other methods.[3]".

This citation leads to a Wired article. I think a better source is needed. 134.184.26.82 (talk) 10:42, 10 July 2020 (UTC)

Typo in formula?
I suspect there is a typo in the formula:


 * $$ a(t) = \underset{a}{\operatorname{arg\,min}}   \{ F(s(t),\mu(t)) \}$$

The parameter "a" of the argmin does not appear in the expression that it should minimize. — Preceding unsigned comment added by 193.171.142.191 (talk) 08:39, 21 October 2014 (UTC)

No typo
The sensory states are a probabilistic mapping from actions (and hidden states). See the definition for S and the Brain picture.


 * $$ s(t) = f_z(\psi,a) + \omega$$

So, actually:


 * $$ s(t,a) = f_z(\psi,a) + \omega$$

And you can start sweeping over parameter a if you like that, or in other ways consider that a impacts s. Especially if s is the perception of a belly when someone is trying to look at their feet. ;-)

Anne van Rossum (talk) 13:39, 4 January 2015 (UTC)
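
A minimal sketch of the point above, that a enters the minimisation only through s. The mapping `sensed` and the quadratic `free_energy` are hypothetical stand-ins chosen for illustration, not the article's definitions, and the noise term ω is omitted:

```python
def sensed(a, psi=1.0):
    # Hypothetical sensory mapping s = f(psi, a); the noise term omega is left out.
    return psi - a

def free_energy(s, mu):
    # Stand-in for F(s, mu): a squared prediction error, not the article's functional.
    return 0.5 * (s - mu) ** 2

def best_action(mu, actions, psi=1.0):
    # argmin over a of F(s(a), mu): a never appears in F directly, only via s(a).
    return min(actions, key=lambda a: free_energy(sensed(a, psi), mu))

# Sweep over candidate actions from -1.0 to 1.0 in steps of 0.1.
actions = [i / 10 for i in range(-10, 11)]
a_star = best_action(mu=0.5, actions=actions)
```

With these assumed functions the sweep picks the action that makes the sensed value match the internal state μ, which is the sense in which the argmin over a is well defined.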

A funny little comment on Friston's models and priors
According to Karl Friston (see Friston 2010), all unconscious living systems have to maximize p(s│m) [with s: sensory state, m: model (or map)] in order to be able to live, i.e., to fight against dissolving or disintegration.

Interestingly, in this implicit equation above (with a probabilistic function or map!), m is given or presupposed, as is the separation between m and s.

However, only a "conscious" living system (e.g., I) "knows" of "having" "maps" (including differentiated "sensory maps") and that I may be a "map-maker". But even "conscious" living systems have to live, so even for them the equation above still holds, albeit now with a slight difference: maximize p(s*│m*) AND p(s│m), and that is why scientists have had to make up artificial experimental tests in order to test their "conscious" maps (e.g., functional hypotheses, etc.).

It is clear that the body (including "genomic maps" "inherited" from "the past") is far better at this optimization process than scientists, because the body does this constantly and full time (whereas most scientists only work part time nowadays). That is why professional soccer players (i.e., unconscious Bayesian machines) are so favoured and paid in "our" world -- because they have (nearly) "made it" on an unconscious level, whereas understanding and testing scientific maps or models may be much more difficult (and only in existence for some 400 years or so).

But all these probabilities mentioned above are smaller than 1, so the only option for scientists (given a messianic prior) is to wait until Judgement Day (where the whole truth will become unveiled anyway) while working and earning money endlessly...

Only for mapologists -- having become "conscious of" (i.e., having been able to map) all maps and biases and priors WITHOUT having to "act" upon some seemingly "outer world" -- the following equation holds:

p(m│m*) = p(m*│m) = p(m│m) = p(m) = p(s*│s) = p(s│s) = p(s)

This means: having reached the horizon (where "life" ≡ "death" ≡ Nirvana ≡ Samsara ≡ COSMOS ≡ I )...

FRISTON, Karl J. (2010): The free-energy principle: a unified brain theory? Nature Reviews Neuroscience 11, 127-138.

— Preceding unsigned comment added by 2A02:1205:C68D:47C0:2052:B3F5:1909:DEB1 (talk) 10:22, 16 June 2014 (UTC)

Too esoteric
This is not close to appropriate for a general audience. I'm a neuroscientist and have written mathematical papers about Bayesian inference and information theory, and this is gobbledygook to me. All due respect, but this needs to be written in a language that isn't just for True Believers. Comment from May 2017 by User:152.16.191.113

Agreed. This might be a good starting point: https://www.youtube.com/watch?v=NIu_dJGyIQI -- Akvadrako (talk) 17:59, 31 March 2018 (UTC)

Same. I'm a PhD neural engineer and this is nonsensical to me. — Preceding unsigned comment added by Xfzki12j33 (talk • contribs) 23:53, 14 February 2023 (UTC)

I agree too. I'm a PhD philosopher and clinical psychologist, and can make close to zero sense of this page. I also find that when I read papers in this area they are chock-full of jargon. Mentalistic metaphors (talk of predictions, errors, models, representations, hypotheses, inferences etc) are used, presumably to try to make the raw neuropsychology less unapproachable, but they are rarely cashed out in a perspicuous way. Furthermore one is always left with the worry that they are not always meant as metaphors - as if the brain really were magically supposed to be involved in making 'inferences' or 'hypotheses' or in generating 'percepts' etc. (Helmholtz might not have known better and we can forgive him; today, though, it's not ok!) Please please please could someone who is cognisant of this area rewrite the page for us?! 84.68.66.212 (talk) 09:54, 8 July 2019 (UTC)


 * I'm from the physics side of things. Here's a quote from Friston 2003:


 * "EM provides a useful procedure for density estimation that helps relate many different models within a framework that has direct connections with statistical mechanics. Both steps of the EM algorithm involve maximising a function of the densities that corresponds to the negative free energy in physics"


 * Here's a book review concerning a book titled "evolution as entropy":


 * "Since C. E. Shannon introduced the information measure in 1948 and showed a formal analogy between the information measure ($$-\sum p_i \log_2 (p_i)$$) and the entropy measure of statistical mechanics ($$-k\sum f_i \ln (f_i)$$), a number of works have appeared trying to relate "entropy" to all sorts of academic disciplines. Many of these theories involve profound confusion about the underlying thermal physics and their authors use the language and formulae of the physical sciences to bolster otherwise trivial and vacuous theories."


 * As far as I can tell, the "free energy principle" is based on this confusion. Specifically, two formulas look very similar, so there's a strong intuition that they must have some deep connection. The mathematical analogy is obvious; the conceptual connection is actually rather dubious. The main Wikipedia article is entropy in thermodynamics and information theory, which addresses the controversial assumption of treating them like the same concept. $$\langle$$ Forbes72 | Talk $$\rangle$$ 03:48, 8 February 2020 (UTC)

Agreed. I work in active perception and Bayesian inference. This is basically nonsense and presents a very narrow and specific view of the field of active perception and inference, specifically one that is not workable for any solution which someone might want to implement. — Preceding unsigned comment added by 207.151.223.182 (talk) 16:48, 23 June 2022 (UTC)

The FEP is not based on that confusion, otherwise it would work with the Boltzmann constant, I suppose (the only difference between Shannon entropy and Boltzmann/Gibbs entropy). Looking at Shannon's work from 1948, which demonstrates information theory via Markov processes, the FEP is just an extension going from exact Bayes / conditional probability to approximate Bayes, and it uses Markov blankets instead of Markov chains (and works with non-equilibrium steady states, not only equilibrium). I mean, the last commenter in particular claims to have worked with Bayesian inference? I don't understand the fuss. The math behind information theory is essentially Bayesian inference, just with some logarithms for convenience. It doesn't even deviate a lot from RL, and it was shown that classic ML can be looked at from that perspective. Doing some rearrangements, you get a predictive processing scheme for a neural processing theory. — Preceding unsigned comment added by 95.90.241.15 (talk) 01:10, 22 November 2022 (UTC)
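
A small sketch of the formal relationship discussed in this thread, assuming a made-up three-outcome distribution `p`. It only shows that the two formulas differ by the constant factor k ln 2; it says nothing about whether the conceptual connection holds:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def shannon_entropy_bits(p):
    # Shannon information measure: -sum p_i log2 p_i, in bits.
    return -sum(x * math.log2(x) for x in p if x > 0)

def gibbs_entropy(p):
    # Statistical-mechanics entropy: -k sum f_i ln f_i, in J/K.
    return -K_B * sum(x * math.log(x) for x in p if x > 0)

p = [0.5, 0.25, 0.25]        # illustrative distribution, not taken from any source
h = shannon_entropy_bits(p)  # 1.5 bits
s = gibbs_entropy(p)         # equals K_B * ln(2) * h up to rounding
```

The change of logarithm base accounts for the ln 2 and the Boltzmann constant supplies the physical units, so the formal analogy is exact while the interpretation remains the contested part.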

Type signature of definition section unclear
In the definition section, the type signatures of some of these quantities are stated in an unclear way. For example, it states "Hidden or external states $$\Psi:\Psi\times A \times \Omega \to \mathbb{R}$$". But if $$\Psi$$ is a function, then it cannot also be a set. The topic is already confusing enough, so I suggest using a different symbol for the set $$\Psi$$ than for the distribution $$\Psi$$.
 * From the lead of the Wikipedia article Function (mathematics): "A function is uniquely represented by the set of all [its] pairs (x, f (x))..." (copied and pasted April 25, 2022; I inserted [its] for clarity). So, indeed, a set can be a function (in the appropriate context). Whether that mathematical fact is appropriate here (in the article's gobbledygook), I can't say. Also note that the mathematical term 'distribution' needs careful attention (lol, irony intended), since its definition differs depending on context (and I'd guess it is used/confused both as it is in statistical mechanics and in finite-state automata/computer science/coding). 207.155.85.22 (talk) 02:29, 26 April 2022 (UTC)


 * The definition of the type signature is definitely unclear, and there are some other issues here, including an incompatibility between these definitions and the examples in the schematic. This needs to be fixed really. Alpacaswamp (talk) 11:32, 23 May 2022 (UTC)

Time average is action?
Near the end we read "Finally, because the time average of energy is action..."

Shouldn't that be "the time integral of energy is action..."? Wouldn't the time average of energy just be energy, not action? Vaughan Pratt (talk) 18:33, 11 April 2021 (UTC)
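
For reference, a minimal sketch of the distinction raised here, writing $$E(t)$$ for the energy over an interval $$[0, T]$$ (these symbols are assumptions for illustration, not the article's notation):


 * $$ \bar{E} = \frac{1}{T}\int_0^T E(t)\,dt, \qquad \int_0^T E(t)\,dt = T\,\bar{E} $$

The time average $$\bar{E}$$ still has units of energy, whereas the time integral has units of energy × time, the units of action. For a fixed $$T$$ the two differ only by the constant factor $$T$$, so minimising one minimises the other, but they are not the same quantity.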

Definition
The Definition section is awful. 1. $$\mathbb{R}$$ usually means the set of real numbers. That doesn't make sense here. 2. ": →" is used as the core structure of four of the clauses and A $$\times$$ B is used 6 times, without explanation! 3. I don't understand $$S: \Psi \times A \times \Omega \to \mathbb{R}$$ and $$A: S \times R \to \mathbb{R}$$. Are these circular? How can you define S in terms of A and A in terms of S? 4. The (Bayesian?) expressions $$p(s, \psi \mid m)$$ and $$q(\psi, \mu \mid m)$$ are two types (?) of "density". Again, these terms are not defined. 5. $$\omega$$ is described first but never used again. 6. If the "generative model m" is a fundamental building block, then shouldn't it be part of the "tuple"? 7. And shouldn't there be a clear link to its meaning? 8. Finally, a trivial point, but I assume s $$\in$$ S should be included like $$\psi$$ and $$\mu$$ are. 207.155.85.22 (talk) 04:33, 26 April 2022 (UTC)


 * Yes the definition section needs to be fixed but is there anyone editing this page who really understands the issues? Alpacaswamp (talk) 11:34, 23 May 2022 (UTC)

Parsimony
I've been struggling with an interpretation of this theory, and it seems to me that the main idea here is that the loss function is parsimony. That is to say, nature selects on the condition that whatever uses the least energy to obtain a goal always wins. I wouldn't add this section myself, in case I am misinterpreting the theory, but I put it to the community. If this parsimony is Dr. Friston's loss function, shouldn't it be explicitly stated in this article? MikeBee2020 (talk) 09:06, 5 March 2023 (UTC)

Confusing Intro
I came here to learn what the free energy principle is, but after reading the intro I still had no idea... Can we condense it, so that a few sentences actually summarize the principle?

Something like

''The Free Energy Principle is a theory that suggests that the brain reduces surprise or uncertainty by making predictions based on internal models and updating them using sensory input. It highlights the brain's objective of aligning its internal model with the external world to enhance prediction accuracy. This principle integrates Bayesian inference with active inference, where actions are guided by predictions and sensory feedback refines them. It has wide-ranging implications for comprehending brain function, perception, and action.''

Thoughts? Kolma8 (talk) 02:01, 14 June 2023 (UTC)


 * @Mechachleopteryx, what are your thoughts on this? Kolma8 (talk) 22:20, 9 July 2023 (UTC)
 * Tried to add a simplified introduction like the one suggested. Mechachleopteryx (talk) 22:59, 9 July 2023 (UTC)
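
A toy illustration that might help a simplified intro, assuming a hypothetical two-state model (the numbers in `prior` and `likelihood` are invented). It shows the standard identity that variational free energy is an upper bound on surprise, −ln p(s), and reaches it when the belief q equals the exact posterior:

```python
import math

# Hypothetical generative model over two hidden states psi.
prior = [0.5, 0.5]       # p(psi)
likelihood = [0.9, 0.2]  # p(s | psi) for the one observation s that was made

def free_energy(q):
    # F(q) = sum_psi q(psi) * [ln q(psi) - ln p(s, psi)]
    return sum(qi * (math.log(qi) - math.log(li * pi))
               for qi, li, pi in zip(q, likelihood, prior) if qi > 0)

evidence = sum(li * pi for li, pi in zip(likelihood, prior))  # p(s)
surprise = -math.log(evidence)                                # -ln p(s)
posterior = [li * pi / evidence for li, pi in zip(likelihood, prior)]

f_guess = free_energy([0.5, 0.5])  # an uninformed belief: F is above the surprise
f_exact = free_energy(posterior)   # the exact posterior: F equals the surprise
```

In this sketch, "updating the internal model" corresponds to moving q toward the posterior, which lowers F toward the surprise; the action side of active inference is not modelled here.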

Lack of critical discussion
Reading through the article, it seems to me that many important contributions that question the claims of the FEP have not been included. Here's a (non-exhaustive) list of open problems/discussions that I think should be mentioned:

- Quite generally, the FEP has been questioned in its applicability to any non-trivial (or even some trivial) system - mostly because the presupposed requirements are not met.

- A substantial problem of FEP and related theories seems to be the 'dark room problem'. That is, why should an agent not prefer to be in a dark room, since this environment is perfectly predictable? I understand that solving this problem via prior expectations (e.g., not being in a dark room) introduces other issues; mainly perception will be biased and not 'true' to the environment.

- Some of its claims towards explaining cognitive processes have been questioned. For example, attention has been argued to be not explained well by precision weighting (or, at least, not only by it).

Maybe someone can think of more ongoing discussions or knows more/better sources for the mentioned issues. I think it would be good to add this to the article to give a more comprehensive picture of FEP. Apoptheosis (talk) 09:45, 29 October 2023 (UTC)


 * I think you're right. A couple of important critical discussions: 1. Bruineberg, J., Dołęga, K., Dewhurst, J., & Baltieri, M. (2022). The emperor's new Markov blankets. Behavioral and Brain Sciences, 45, e183. 2. Colombo, M. (2022). Nothing but a Useful Tool? (F)utility and the free energy principle. Commentary on Bruineberg et al. Behavioral and Brain Sciences, 45, e191. 3. Colombo, M., & Wright, C. (2021). First Principles in the Life Sciences: The free-energy principle, organicism, and mechanism. Synthese, 198, 3463–3488.
 * I think the issues raised in these papers should be discussed in the main entry. 5.132.29.20 (talk) 19:53, 11 November 2023 (UTC)