The only difference is that fwdFDeriv' is an uncurried version of fwdFDeriv.
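For concreteness, such a definition might look as follows. This is only a sketch: the exact instance arguments and the `noncomputable` marker should match however fwdFDeriv is set up in your development.

```lean
variable (R : Type) [RealScalar R]
  {X Y : Type}
  [NormedAddCommGroup X] [NormedSpace R X]
  [NormedAddCommGroup Y] [NormedSpace R Y]

@[fun_trans]
noncomputable
def fwdFDeriv' (f : X → Y) : X×X → Y×Y :=
  -- uncurried: the point `x` and the direction `dx` come as one pair
  fun xdx => fwdFDeriv R f xdx.1 xdx.2
```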
By marking the definition with the @[fun_trans] attribute we let Lean know that it is a function transformation. For fun_trans to work properly with this newly defined function transformation, we have to state its transformation rules/theorems. There are a couple of types:
Lambda theorems.
Theorems about basic lambda terms like the identity fun x => x, the constant function fun x => y, the composition fun x => f (g x), and a few more.
Function theorems.
Theorems about functions like addition, multiplication etc.
Free variable theorems.
These theorems usually state a transformation rule under some special condition, for example that the derivative of a linear function is the function itself.
Morphism theorems.
Theorems about bundled morphisms. These are rather special theorems and we discuss them separately below.
Lambda Theorems
The tactic fun_trans transforms a complicated function by first decomposing it into a sequence of function compositions and then transforming each of those functions separately. For example, when we want to transform the function (fun (x : ℝ) => ((2*x) + x)^2), fun_trans effectively rewrites it as a composition of simpler functions.
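Schematically, such a decomposition might look like this (the exact shape fun_trans produces internally may differ):

```lean
-- (fun (x : ℝ) => ((2*x) + x)^2)
-- is treated as the composition
--   (fun y => y^2) ∘ (fun (y,z) => y + z) ∘ (fun x => (2*x, x))
example : (fun (x : ℝ) => ((2*x) + x)^2)
    = (fun y => y^2) ∘ (fun (yz : ℝ×ℝ) => yz.1 + yz.2) ∘ (fun x => (2*x, x)) := by
  rfl
```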
Therefore fun_trans needs to know how to transform a composition, but to transform (fun x => (2*x, x)) it also needs to know how to transform the pairing of the two functions fun x => 2*x and fun x => x.
Lambda theorems are responsible for reducing the problem of transforming a function like (fun (x : ℝ) => ((2*x) + x)^2) to the transformation of the elementary functions (fun x => x^2), (fun (x,y) => x+y) and (fun x => 2*x). Those are handled by function theorems, which we will talk about a bit later.
There are six basic lambda theorems saying how lambda terms are transformed:
Identity rule for transforming fun (x : X) => x
Constant rule for transforming fun (x : X) => y
Composition rule for transforming fun (x : X) => f (g x)
Let rule for transforming fun (x : X) => let y := g x; f x y
Apply rule for transforming fun (f : X → Y) => f x
Pi rule for transforming fun (x : X) (y : Y) => f x y
There are also three theorems about the product type which technically speaking fall under the category of "function theorems", but they are so fundamental that we list them here:
Product rule for transforming fun (x : X) => (f x, g x)
First projection rule for transforming fun (xy : X×Y) => xy.1
Second projection rule for transforming fun (xy : X×Y) => xy.2
The collection of these theorems allows fun_trans to transform any complicated lambda expression if it can transform all the atomic functions in the lambda expression.
Let us give an example of a few lambda theorems for fwdFDeriv'.
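For instance, the identity and composition rules might be stated as follows. This is a hedged sketch: the theorem names, the variables R, X, Y, Z with their normed-space instances, and the closing proofs are assumed to follow the setup from earlier in the chapter.

```lean
@[fun_trans]
theorem fwdFDeriv'.id_rule :
    fwdFDeriv' R (fun x : X => x) = fun xdx => xdx := by
  unfold fwdFDeriv'; fun_trans

@[fun_trans]
theorem fwdFDeriv'.comp_rule
    (f : Y → Z) (g : X → Y)
    (hf : Differentiable R f) (hg : Differentiable R g) :
    fwdFDeriv' R (fun x => f (g x))
    =
    fun xdx =>
      -- transform `g` first, then feed its value/derivative pair into `f`
      let ydy := fwdFDeriv' R g xdx
      fwdFDeriv' R f ydy := by
  unfold fwdFDeriv'; fun_trans
```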
Write down the product and projection lambda theorems of fwdFDeriv'. Hint: The product rule should be formulated for (fun x : X => (g x, f x)) where g and f are differentiable functions. The projection rule can be formulated for (fun (xy : X×Y) => xy.1).
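One possible shape of the solution (a sketch; names and proofs are illustrative):

```lean
@[fun_trans]
theorem fwdFDeriv'.prod_rule
    (g : X → Y) (f : X → Z)
    (hg : Differentiable R g) (hf : Differentiable R f) :
    fwdFDeriv' R (fun x => (g x, f x))
    =
    fun xdx =>
      let ydy := fwdFDeriv' R g xdx
      let zdz := fwdFDeriv' R f xdx
      -- values paired together, directions paired together
      ((ydy.1, zdz.1), (ydy.2, zdz.2)) := by
  unfold fwdFDeriv'; fun_trans

@[fun_trans]
theorem fwdFDeriv'.fst_rule :
    fwdFDeriv' R (fun xy : X×Y => xy.1)
    =
    fun xydxy => (xydxy.1.1, xydxy.2.1) := by
  unfold fwdFDeriv'; fun_trans
```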
Write down the apply lambda theorem of fwdFDeriv'.
Hint: Write the rule for the function fun (f : I → X) => f i where I has an instance of [IndexType I].
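A sketch of how this rule could look, assuming [IndexType I] provides the normed-space structure on I → X that fwdFDeriv' needs:

```lean
@[fun_trans]
theorem fwdFDeriv'.apply_rule {I : Type} [IndexType I] (i : I) :
    fwdFDeriv' R (fun f : I → X => f i)
    =
    -- evaluation at `i` is linear, so it acts componentwise
    fun fdf => (fdf.1 i, fdf.2 i) := by
  unfold fwdFDeriv'; fun_trans
```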
Write down the let lambda theorem of fwdFDeriv'.
Pay attention to the use of let bindings on the right-hand side. Careful use of let bindings is important for the efficiency of the resulting code. Also, you do not want to differentiate f w.r.t. x and y separately but rather w.r.t. x and y simultaneously.
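A hedged sketch of such a rule, differentiating f w.r.t. x and y in one go through its uncurried form:

```lean
@[fun_trans]
theorem fwdFDeriv'.let_rule
    (g : X → Y) (f : X → Y → Z)
    (hg : Differentiable R g)
    (hf : Differentiable R (fun xy : X×Y => f xy.1 xy.2)) :
    fwdFDeriv' R (fun x => let y := g x; f x y)
    =
    fun xdx =>
      -- differentiate `g` once and reuse both components
      let ydy := fwdFDeriv' R g xdx
      -- differentiate `f` w.r.t. `x` and `y` simultaneously
      fwdFDeriv' R (fun xy : X×Y => f xy.1 xy.2)
        ((xdx.1, ydy.1), (xdx.2, ydy.2)) := by
  unfold fwdFDeriv'; fun_trans
```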
Once we postulate all the lambda theorems for a new function transformation, we also have to postulate theorems for functions like addition, multiplication etc.
Let's start with a function theorem for negation, as it is a function of a single argument.
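It might read as follows (the `Neg.neg.arg_a0` naming mimics SciLean's convention but is illustrative here):

```lean
@[fun_trans]
theorem Neg.neg.arg_a0.fwdFDeriv'_rule :
    fwdFDeriv' R (fun x : X => -x)
    =
    -- negation is linear: negate both the value and the direction
    fun xdx => (-xdx.1, -xdx.2) := by
  unfold fwdFDeriv'; fun_trans
```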
For a function of two or more arguments we just uncurry the function sufficiently and formulate the theorem as for a function of a single argument. For example, the transformation rule for addition is:
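A sketch of the uncurried addition rule, in the same match style the goals below use:

```lean
@[fun_trans]
theorem HAdd.hAdd.arg_a0a1.fwdFDeriv'_rule :
    fwdFDeriv' R (fun xy : X×X => xy.1 + xy.2)
    =
    fun x =>
      match x with
      | ((x, y), dx, dy) => (x + y, dx + dy) := by
  unfold fwdFDeriv'; fun_trans
```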
The proof goal for such a rule, in context

  R : Type
  inst✝ : RealScalar R
  y : R

reads

  ⊢ (fwdFDeriv' R fun xz => fun1 xz.1 y xz.2)
      =
      fun x =>
        match x with
        | ((x, z), dx, dz) =>
          (fun1 x y z, exp x * (dz * cos (z + y)) + dx * exp x * sin (z + y) + dz)

After unfolding fun1 and fwdFDeriv' the goal becomes

  ⊢ (fun xdx =>
        ((fun xz => exp xz.1 * sin (xz.2 + y) + xz.2) xdx.1,
         (∂ (xz:=xdx.1), (exp xz.1 * sin (xz.2 + y) + xz.2)) xdx.2))
      =
      fun x =>
        match x with
        | ((x, z), dx, dz) =>
          (exp x * sin (z + y) + z, exp x * (dz * cos (z + y)) + dx * exp x * sin (z + y) + dz)

and fun_trans closes it: All goals completed! 🐙
Writing rules for user-defined functions like fun1 is tedious, and because fun1 is defined in terms of known functions, the right-hand side of this rule should be automatically generated anyway. We will discuss how to do this a bit later. Here we just wanted to demonstrate that you can indeed formulate the transformation rule only for a subset of the input arguments. The important rule is that the order of the arguments should be maintained, e.g. the left-hand side should not be fwdFDeriv' R (fun (zx : R×R) => fun1 zx.2 y zx.1).
In some cases the rule can get a bit tricky and there might be additional requirements on the arguments. The canonical example is division, x / y, which can be differentiated only if we divide by a nonzero y. In this case, the theorem can't be stated as an equality of functions in ((x,y),(dx,dy)) anymore. We have to fix a particular value and require that y is nonzero.
The proof state for this rule, in context

  R : Type
  inst✝ : RealScalar R
  xydxy : (R × R) × R × R
  h : xydxy.1.2 ≠ 0

reads

  ⊢ fwdFDeriv' R (fun xy => xy.1 / xy.2) xydxy
      =
      match xydxy, h with
      | ((x, y), dx, dy), h => (x / y, (dx * y - x * dy) / y ^ 2)

After unfolding fwdFDeriv' the goal is

  ⊢ ((fun xy => xy.1 / xy.2) xydxy.1, (∂ (xy:=xydxy.1), xy.1 / xy.2) xydxy.2)
      =
      match xydxy, h with
      | ((x, y), dx, dy), h => (x / y, (dx * y - x * dy) / y ^ 2)

and it is closed by fun_trans (disch := ...), where the discharger tactic proves the side condition that the denominator is nonzero: All goals completed! 🐙
Note that the argument of the theorem is (xydxy : (R×R)×(R×R)) instead of (x y dx dy : R). Not writing the product pattern explicitly as ((x,y),(dx,dy)) on the left-hand side of the rule makes it easier for fun_trans to apply this transformation rule.
User Defined Functions
Exercises
Define
Free Variable Theorems
Often we can write down a function transformation rule for a whole class of functions at once. The canonical example is that the derivative of a linear function is the function itself. For fwdFDeriv' this would be:
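A hedged sketch of this free variable theorem (name and proof are illustrative):

```lean
@[fun_trans]
theorem fwdFDeriv'.linear_rule
    (f : X → Y) (hf : IsContinuousLinearMap R f) :
    -- a (continuous) linear map is its own derivative
    fwdFDeriv' R f = fun xdx => (f xdx.1, f xdx.2) := by
  unfold fwdFDeriv'; fun_trans
```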
The characteristic feature of these theorems is that the left-hand side of the rule is about a single free-variable function f and the theorem has some nontrivial conditions on the function f.
These theorems are potentially dangerous as the left-hand side of the theorem can match an arbitrary function. The tactic fun_trans would be incredibly slow if it tried applying these theorems at every step. Therefore "free variable theorems" are applied only in cases when the function that is being transformed can't be written as a non-trivial composition of two functions.
Exercises
Write down a free variable theorem for differentiating an affine function.
The statement to prove, in context

  R : Type
  inst✝⁴ : RealScalar R
  X : Type
  inst✝³ : NormedAddCommGroup X
  inst✝² : NormedSpace R X
  Y : Type
  inst✝¹ : NormedAddCommGroup Y
  inst✝ : NormedSpace R Y
  f : X → Y
  hf : IsAffineMap R f
  hf' : Continuous f

is

  ⊢ fwdFDeriv' R f = fun xdx => (f xdx.1, f xdx.2 - f 0)

After unfolding fwdFDeriv' and introducing (x, dx), the goal reads

  ⊢ (f (x, dx).1, (∂ f (x, dx).1) (x, dx).2) = (f (x, dx).1, f (x, dx).2 - f 0)

The proof introduces the linear part of f,

  g : X → Y := fun x => f x - f 0

and first establishes

  have hf'' : IsContinuousLinearMap R g := ...

by proving the two subgoals

  ⊢ IsLinearMap R (fun x => f x - f 0)   -- All goals completed! 🐙
  ⊢ Continuous (fun x => f x - f 0)      -- All goals completed! 🐙

Next it records

  have h : f = fun x => g x + f 0 := ... -- All goals completed! 🐙

and rewrites with h, turning the goal into

  ⊢ ((fun x => g x + f 0) (x, dx).1, (∂ (x:=(x, dx).1), (g x + f 0)) (x, dx).2)
      =
      ((fun x => g x + f 0) (x, dx).1, (fun x => g x + f 0) (x, dx).2 - (fun x => g x + f 0) 0)

which reduces to

  ⊢ (∂ (x:=x), (g x + f 0)) dx = g dx - g 0

  -- this does not seem to work :(
  All goals completed! 🐙
Having the appropriate theorems registered beforehand would make the proof trivial, i.e. unfold fwdFDeriv' R f; fun_trans.
Morphism Theorems
Advanced Function Theorems
Compositional form
There is also an alternative way of formulating function theorems, in so-called "compositional form". The idea is that we introduce an auxiliary type W and parametrize every argument of the function with this type. The "compositional form" of the negation theorem:
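It might look like this, with every argument parametrized by (w : W) through a function f (a sketch; W with its instances is assumed from the ambient variables):

```lean
@[fun_trans]
theorem Neg.neg.arg_a0.fwdFDeriv'_comp_rule
    (f : W → X) (hf : Differentiable R f) :
    fwdFDeriv' R (fun w => -(f w))
    =
    fun wdw =>
      -- first transform the argument function, then negate both components
      let xdx := fwdFDeriv' R f wdw
      (-xdx.1, -xdx.2) := by
  unfold fwdFDeriv'; fun_trans
```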
The original way of formulating "function theorems" is called "simple form" or "uncurried form", as a function of multiple arguments has to be uncurried.
In simple cases it does not really matter which form you use, but in more complicated scenarios, like dealing with higher-order functions, the "compositional form" is the only way of formulating the function transformation rules. Also, certain function transformations seem to support only the "compositional form". At the end of this chapter we give an example of such a function transformation, called vectorize.
The general idea behind the "compositional form" is that we parametrize every argument of the function with some auxiliary type W and then state the function transformation rule in terms of an argument (w : W).
The theorem for addition in "simple form":
This form is also called "uncurried form" as the left-hand side contains the uncurried version of addition, fun (x : X×X) => x.1 + x.2.
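In compositional form, the same rule would instead be parametrized over (w : W) with two argument functions (again a hedged sketch):

```lean
@[fun_trans]
theorem HAdd.hAdd.arg_a0a1.fwdFDeriv'_comp_rule
    (f g : W → X) (hf : Differentiable R f) (hg : Differentiable R g) :
    fwdFDeriv' R (fun w => f w + g w)
    =
    fun wdw =>
      -- transform both argument functions, then add componentwise
      let xdx := fwdFDeriv' R f wdw
      let ydy := fwdFDeriv' R g wdw
      (xdx.1 + ydy.1, xdx.2 + ydy.2) := by
  unfold fwdFDeriv'; fun_trans
```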
set_default_scalar R
-- TODO: make this work
-- def_fun_trans (n : Nat) (f : X → X) (hf : Differentiable R f) :
-- (∂> x, applyNTimes n f x) by
-- induction n n' f' eq
-- · fun_trans [applyNTimes]
-- · fun_trans [applyNTimes]
variable (n : ℕ)

#check (∂> x, applyNTimes n (fun y : R => y^2) x)
  rewrite_by
    induction n n' f' eq
    · -- base case; conv goal: | fun x dx => (x, dx)
      ...
    · -- inductive step, with eq : ∂> x, applyNTimes n' (fun y => y ^ 2) x = f'
      -- conv goal: | fun x dx => f' (x ^ 2) (2 * dx * x)
      ...

The result of the rewrite is

  Nat.recOn (motive := fun a => R → R → R × R) n
    (fun x dx => (x, dx))
    (fun n' f' x dx => f' (x ^ 2) (2 * dx * x)) : R → R → R × R
-- variable (n : ℕ) (f : R → R) (f' : R → R → R)
-- (hf : Differentiable R f)
-- (hf' : (fwdFDeriv R f) = f')
-- #check (∂> x, applyNTimes n f x)
-- rewrite_by
-- induction n n' f' eq
-- · fun_trans [applyNTimes]
-- · fun_trans [applyNTimes]
-- variable (n : ℕ)
-- #check (fderiv ℝ (fun x => applyNTimes n (fun x : ℝ => x^2) x))
-- rewrite_by
-- induction n n' f' eq
-- · fun_trans [applyNTimes]
-- · fun_trans [applyNTimes]
Making applyNTimes differentiable in f is still work in progress.