User:Cosmia Nebula/sandbox


Pendulum
Analysis based on the virial theorem. The pendulum with length $$L$$ has equation of motion $$\ddot x = - \omega_0^2 \sin x$$, where $$\omega_0 = \sqrt{g/L}$$. By the virial theorem, we have

$$\langle \dot x^2 \rangle = \omega_0^2 \langle x \sin x \rangle = \omega_0^2 \left(\langle x^2 \rangle - \frac 16 \langle x^4 \rangle + \frac{1}{120} \langle x^6 \rangle + \dots \right)$$

Since the pendulum's motion is symmetric around its center, the motion of the oscillator is also symmetric around its peaks and troughs. Thus, if we shift the origin of time and perform a Fourier expansion, we have

$$x = A(\cos \omega t + \epsilon \cos 3\omega t + c \epsilon^2 \cos 5 \omega t + \dots)$$

where $$A$$ is the amplitude, $$\omega$$ is the angular frequency, and $$\epsilon$$ is a small perturbation parameter. It remains to perform a perturbation analysis.
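The virial identity above can be checked numerically by integrating the pendulum and time-averaging over a long window. This is a sketch with arbitrary test values ($$\omega_0 = 1$$, release from rest at 0.5 rad) and a hand-rolled RK4 step; the averages agree up to the boundary term $$[x\dot x]_0^T/T$$, which vanishes as $$T \to \infty$$.

```python
import math

# Numerical check of the virial identity <xdot^2> = omega0^2 <x sin x>
# for the pendulum xddot = -omega0^2 sin x.
# omega0 = 1 and the 0.5 rad initial amplitude are arbitrary test values.
omega0 = 1.0

def accel(x):
    return -omega0**2 * math.sin(x)

x, v = 0.5, 0.0          # released from rest at 0.5 rad
dt, T = 0.001, 200.0
sum_v2 = sum_xsinx = 0.0
t = 0.0
while t < T:
    # One classical RK4 step for the system (x' = v, v' = accel(x)).
    k1x, k1v = v, accel(x)
    k2x, k2v = v + 0.5*dt*k1v, accel(x + 0.5*dt*k1x)
    k3x, k3v = v + 0.5*dt*k2v, accel(x + 0.5*dt*k2x)
    k4x, k4v = v + dt*k3v, accel(x + dt*k3x)
    x += dt*(k1x + 2*k2x + 2*k3x + k4x)/6
    v += dt*(k1v + 2*k2v + 2*k3v + k4v)/6
    # Accumulate the two time averages.
    sum_v2 += v*v*dt
    sum_xsinx += x*math.sin(x)*dt
    t += dt

lhs = sum_v2 / T                  # <xdot^2>
rhs = omega0**2 * sum_xsinx / T   # omega0^2 <x sin x>
print(abs(lhs - rhs) / rhs < 0.02)
```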

Expanding the virial equation to order $$\epsilon^2$$, we have

$$\frac{\omega^2}{\omega_0^2} (1 + 9 \epsilon^2) = (1 + \epsilon^2 ) - \frac 18 A^2 (1 + 4\epsilon/3 + 4\epsilon^2)$$

For the power series to match, we must have $$A^2 = z \epsilon$$ and $$\omega = \omega_0 (1 + a \epsilon + b \epsilon^2 + \cdots)$$ for some constants $$z, a, b, \dots$$. Plugging these into the equation, and matching the coefficients of $$\epsilon, \epsilon^2$$, we have

$$\begin{cases} 2a + z/8 &= 0 \\ 8 + a^2 + 2 b + z/6 &= 0 \end{cases}$$

As before, to close the system we need another equation, which we can obtain from the total energy of the system:

$$\frac 12 m \dot x_{max}^2 L^2 = mgL(1 - \cos x_{max}) \implies \dot x_{max} = \omega_0 \sqrt{2 (1 - \cos x_{max})} = \omega_0 x_{max} (1 - x_{max}^2/24 + x_{max}^4/1920 + \cdots)$$

Plugging in $$\dot x_{max} = A\omega (1 + 3\epsilon + 5 c \epsilon^2 + \cdots)$$ and $$x_{max} = A (1 + \epsilon + c \epsilon^2 + \cdots)$$, and expanding to order $$\epsilon^2$$, we have

$$\begin{cases} a+\frac{z}{24}+2 &= 0\\ 3 a+b+4 c+\frac{z}{8}-\frac{z^2}{1920} &= 0 \end{cases}$$
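As a consistency check, the two first-order conditions ($$2a + z/8 = 0$$ from the virial expansion and $$a + z/24 + 2 = 0$$ from the energy expansion) form a linear system that can be solved exactly. The solution $$a = -6$$, $$z = 96$$ gives $$\omega \approx \omega_0(1 - 6\epsilon) = \omega_0(1 - A^2/16)$$, matching the familiar first-order amplitude correction to the pendulum frequency. A sketch using exact rational arithmetic:

```python
from fractions import Fraction

# Solve the first-order system  2a + z/8 = 0,  a + z/24 + 2 = 0  exactly,
# written as [[2, 1/8], [1, 1/24]] @ [a, z] = [0, -2], via Cramer's rule.
a11, a12, b1 = Fraction(2), Fraction(1, 8), Fraction(0)
a21, a22, b2 = Fraction(1), Fraction(1, 24), Fraction(-2)

det = a11*a22 - a12*a21
a = (b1*a22 - a12*b2) / det
z = (a11*b2 - b1*a21) / det
print(a, z)  # prints: -6 96
```

With $$z = 96$$, the perturbation parameter is tied to the amplitude by $$\epsilon = A^2/96$$, so the frequency shift $$-6\epsilon$$ is $$-A^2/16$$.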

AI mathematics
Notes on a 2021 review article on the progress of AI mathematicians.

Premise selection
Given a theorem to be proved or a calculation to be solved, one does not want to recall all known statements (axioms, assumptions, and propositions), but only recall those that are likely to contribute.

One approach uses vector retrieval. Specifically, a neural network embeds statements as vectors; a given problem is embedded as a vector in the same way, and the statements whose vectors are closest to the problem's are selected.

Conjecturing
> Inductive logic programming (ILP) (Muggleton, 1991; Muggleton & De Raedt, 1994) is a form of machine learning (ML). As with other forms of ML, the goal is to induce a hypothesis that generalises training examples. However, whereas most forms of ML use vectors/tensors to represent data, ILP uses logic programs (sets of logical rules).

GPT
GPT-f found new short proofs that were accepted into the main Metamath library.

774M parameters, GPT architecture.

Scaling laws
A pioneering paper in measuring neural scaling laws empirically. Most previous papers were theoretical, and they predicted $$L \propto D^{-\alpha}$$ with $$\alpha \in [0.5, 1]$$. However, the empirical measurements found $$\alpha \in [0.07, 0.35]$$. The multiplicative factor depends on the precise algorithm and optimizer used, but the exponent depends only on the task itself.
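The exponent $$\alpha$$ in $$L \propto D^{-\alpha}$$ is typically estimated as the (negated) slope of a least-squares line on a log-log plot of loss against dataset size. A sketch on synthetic data, where $$C = 10$$ and $$\alpha = 0.3$$ are made-up illustration values:

```python
import math

# Synthetic losses following L = C * D^(-alpha); fitting the log-log
# slope should recover alpha. C and alpha are made-up illustration values.
C, alpha = 10.0, 0.3
sizes = [10**k for k in range(3, 9)]
losses = [C * D**-alpha for D in sizes]

# Ordinary least squares on (log D, log L): slope = -alpha, intercept = log C.
xs = [math.log(D) for D in sizes]
ys = [math.log(L) for L in losses]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx)**2 for x in xs)
print(round(-slope, 6))  # prints: 0.3
```

With real measurements the points scatter around the line, and the fit is restricted to the power-law region (see below), excluding the small-data and irreducible-error regimes.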

Power laws found for machine translation, language modeling, image classification, and speech recognition.

Models transition from a small-training-set region dominated by best guessing to a region dominated by power-law scaling. With sufficiently large training sets, models saturate in a region dominated by irreducible error (e.g., Bayes error).

Pruning
On the Predictability of Pruning Across Scales

RL
https://arxiv.org/pdf/2104.03113.pdf, Figures 6, 8, and 9.