This is the second post in my Green Book problem-solving notes series.

Today's material covers calculus and linear algebra, which I have not reviewed in a long time, so it was a bit more of a struggle. Let's first organize all the mathematical knowledge points covered in the book.

Math Knowledge Review

Limits and Derivatives

  • Derivative: Let $y = f(x)$, then
    $$f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

  • The product rule: If $u = u(x)$ and $v = v(x)$ and their respective derivatives exist,
    $$\frac{d(uv)}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}, \quad (uv)' = u'v + uv'$$

  • The quotient rule:
    $$\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{v\frac{du}{dx} - u\frac{dv}{dx}}{v^2}, \quad \left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2}$$

  • The chain rule: If $y = f(u(x))$ and $u = u(x)$, then $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$

  • The generalized power rule: $\frac{d(y^n)}{dx} = n y^{n-1} \frac{dy}{dx}$ for $n \neq 0$

  • Some useful equations:

    • $a^x = e^{x \ln a}$, $\ln(ab) = \ln a + \ln b$, $e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n$

    • $\lim_{x \to 0} \frac{\sin x}{x} = 1$; for any $k$, $(1 + x)^k \approx 1 + kx$ as $x \to 0$

    • $\lim_{x \to \infty} \frac{\ln x}{x^r} = 0$ for any $r > 0$

    • $\lim_{x \to \infty} x^r e^{-x} = 0$ for any $r$

    • $\frac{d}{dx} e^u = e^u \frac{du}{dx}$, $\frac{d(a^u)}{dx} = (a^u \ln a) \frac{du}{dx}$, $\frac{d}{dx} \ln u = \frac{1}{u} \frac{du}{dx} = \frac{u'}{u}$

    • $\frac{d}{dx} \sin x = \cos x$, $\frac{d}{dx} \cos x = -\sin x$, $\frac{d}{dx} \tan x = \sec^2 x$

  • Local maximum or minimum: suppose that $f(x)$ is differentiable at $c$ and is defined on an open interval containing $c$. If $f(c)$ is either a local maximum value or a local minimum value of $f(x)$, then $f'(c) = 0$.

  • Second derivative test: Suppose the second derivative of $f(x)$, $f''(x)$, is continuous near $c$. If $f'(c) = 0$ and $f''(c) > 0$, then $f(x)$ has a local minimum at $c$; if $f'(c) = 0$ and $f''(c) < 0$, then $f(x)$ has a local maximum at $c$.

  • L’Hospital’s rule: Suppose that functions $f(x)$ and $g(x)$ are differentiable near $x = a$ and that $g'(x) \neq 0$ near $a$. Further suppose that $\lim_{x \to a} f(x) = 0$ and $\lim_{x \to a} g(x) = 0$, or that $\lim_{x \to a} f(x) \to \pm\infty$ and $\lim_{x \to a} g(x) \to \pm\infty$; then $\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$. L’Hospital’s rule converts the limit from an indeterminate form to a determinate form.
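The rules above are easy to sanity-check numerically. Here is a minimal Python sketch of my own (not from the book) that compares a central finite difference against a few of the closed-form derivatives, and tabulates $\sin x / x$ near $0$ to illustrate L’Hospital’s rule:

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
# d/dx sin x = cos x
assert abs(numeric_derivative(math.sin, x) - math.cos(x)) < 1e-8
# d/dx tan x = sec^2 x = 1 / cos^2 x
assert abs(numeric_derivative(math.tan, x) - 1 / math.cos(x) ** 2) < 1e-6
# d/dx a^x = a^x ln a  (the u = x case of the formula above)
a = 3.0
assert abs(numeric_derivative(lambda t: a ** t, x) - a ** x * math.log(a)) < 1e-5

# L'Hospital: sin(x)/x is 0/0 at x = 0; lim cos(x)/1 = 1
for x in (1e-1, 1e-3, 1e-6):
    print(x, math.sin(x) / x)  # the ratio tends to 1
```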

Integrals

  • If $f(x) = F'(x)$,
    $$\int_a^b f(x)\,dx = \int_a^b F'(x)\,dx = [F(x)]_a^b = F(b) - F(a)$$

    $$\frac{dF(x)}{dx} = f(x)$$

    $$F(a) = y_a \implies F(x) = y_a + \int_a^x f(t)\,dt$$

  • The generalized power rule in reverse:
    $$\int u^k\,du = \frac{u^{k+1}}{k+1} + c \quad (k \neq -1)$$

    where $c$ is any constant.

  • Integration by substitution:
    $$\int f(g(x))g'(x)\,dx = \int f(u)\,du \quad \text{with } u = g(x),\; du = g'(x)\,dx$$

  • Substitution in definite integrals:
    $$\int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du$$

  • Integration by parts: $\int u\,dv = uv - \int v\,du$
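As a quick numerical check of substitution and integration by parts, here is a short Python sketch of mine using a composite Simpson's rule:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Substitution with u = x^2: int_0^1 2x e^{x^2} dx = int_0^1 e^u du = e - 1
assert abs(simpson(lambda x: 2 * x * math.exp(x * x), 0, 1) - (math.e - 1)) < 1e-9

# Integration by parts: int_0^1 x e^x dx = [x e^x]_0^1 - int_0^1 e^x dx = 1
assert abs(simpson(lambda x: x * math.exp(x), 0, 1) - 1.0) < 1e-9
```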

Partial Derivatives and Multiple Integrals

  • Partial derivative:
    $$w = f(x, y) \implies f_x(x_0, y_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = \frac{\partial f}{\partial x}$$
  • Second-order partial derivatives:
    $$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \quad \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)$$
  • The general chain rule: Suppose that $w = f(x_1, x_2, \ldots, x_m)$ and that each of the variables $x_1, x_2, \ldots, x_m$ is a function of the variables $t_1, t_2, \ldots, t_n$. If all these functions have continuous first-order partial derivatives, then
    $$\frac{\partial w}{\partial t_i} = \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_i} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial w}{\partial x_m}\frac{\partial x_m}{\partial t_i}$$
    for each $i$, $1 \leq i \leq n$.
  • Changing Cartesian integrals into polar integrals: The variables in a two-dimensional plane can be mapped into polar coordinates: $x = r\cos\theta$, $y = r\sin\theta$. The integration over a continuous polar region $R$ is converted to
    $$\iint_R f(x, y)\,dx\,dy = \iint_R f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$$
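As a concrete check of the polar substitution, the following Python sketch (my own) evaluates $\iint e^{-(x^2 + y^2)}\,dx\,dy$ over the unit disk with a midpoint rule in $(r, \theta)$; the exact value is $\pi(1 - e^{-1})$:

```python
import math

# In polar coordinates the integrand e^{-(x^2+y^2)} becomes e^{-r^2},
# and the area element dx dy becomes the Jacobian factor r times dr dtheta.
n = 400
dr, dtheta = 1.0 / n, 2 * math.pi / n
total = 0.0
for i in range(n):
    r = (i + 0.5) * dr                             # midpoint in r
    for j in range(n):
        total += math.exp(-r * r) * r * dr * dtheta

exact = math.pi * (1 - math.exp(-1))
print(total, exact)
```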

Important Calculus Methods

  • Taylor’s series: The one-dimensional Taylor’s series expands function $f(x)$ as the sum of a series using the derivatives at a point $x = x_0$:
    $$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \cdots$$
    If $x_0 = 0$,
    $$f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(n)}(0)}{n!}x^n + \cdots$$
    Taylor’s series are often used to represent functions in power series terms. For example, the Taylor’s series for three common transcendental functions, $e^x$, $\sin x$ and $\cos x$, at $x_0 = 0$ are
    $$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$
    $$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,$$
    $$\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
    The Taylor’s series can also be expressed as the sum of the $n$th-degree Taylor polynomial
    $$T_n(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$$
    and a remainder $R_n(x)$: $f(x) = T_n(x) + R_n(x)$.
    For some $\tilde{x}$ between $x_0$ and $x$, $R_n(x) = \frac{f^{(n+1)}(\tilde{x})}{(n+1)!}(x - x_0)^{n+1}$. Let $M$ be the maximum of $|f^{(n+1)}(\tilde{x})|$ for all $\tilde{x}$ between $x_0$ and $x$; we get the bound $|R_n(x)| \leq \frac{M \times |x - x_0|^{n+1}}{(n+1)!}$.
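A small Python sketch of my own that evaluates a Taylor polynomial of $e^x$ and confirms the actual error stays within the remainder bound above:

```python
import math

def exp_taylor(x, n):
    """n-th degree Taylor polynomial of e^x at x0 = 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

x, n = 1.5, 8
error = abs(exp_taylor(x, n) - math.exp(x))
# Remainder bound |R_n| <= M |x - x0|^{n+1} / (n+1)!,
# with M = max of |f^{(n+1)}| = e^x on [0, 1.5]
bound = math.exp(x) * abs(x) ** (n + 1) / math.factorial(n + 1)
print(error, bound)
assert error <= bound
```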

  • Newton’s method: Newton’s method, also known as the Newton-Raphson method or the Newton-Fourier method, is an iterative process for solving the equation $f(x) = 0$. It begins with an initial value $x_0$ and applies the iterative step $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$ to solve $f(x) = 0$ if $x_1, x_2, \ldots$ converge.

    Convergence of Newton’s method is not guaranteed, especially when the starting point is far away from the correct solution. For Newton’s method to converge, it is often necessary that the initial point is sufficiently close to the root, and $f(x)$ must be differentiable around the root. When it does converge, the convergence rate is quadratic, which means $\frac{|x_{n+1} - x_f|}{(x_n - x_f)^2} \leq \delta < 1$, where $x_f$ is the solution to $f(x) = 0$.
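The iteration above can be sketched in a few lines of Python (mine, applied here to $x^2 - 2 = 0$ as an example):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

# Solve x^2 - 2 = 0, i.e. find sqrt(2)
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)  # converges to sqrt(2) in a handful of iterations
```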

  • Bisection method: an intuitive root-finding algorithm. It starts with two initial values $a_0$ and $b_0$ such that $f(a_0) < 0$ and $f(b_0) > 0$. Since $f(x)$ is continuous, there must be an $x$ between $a_0$ and $b_0$ that makes $f(x) = 0$. At each step, we check the sign of $f((a_n + b_n)/2)$. If $f((a_n + b_n)/2) < 0$, we set $b_{n+1} = b_n$ and $a_{n+1} = (a_n + b_n)/2$; if $f((a_n + b_n)/2) > 0$, we set $a_{n+1} = a_n$ and $b_{n+1} = (a_n + b_n)/2$; if $f((a_n + b_n)/2) = 0$, or its absolute value is within the allowable error, the iteration stops and $x = (a_n + b_n)/2$. The bisection method converges linearly, $\frac{|x_{n+1} - x_f|}{|x_n - x_f|} \leq \delta < 1$, which means it is slower than Newton’s method. But once you find an $a_0/b_0$ pair, convergence is guaranteed.
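A direct Python sketch of the bisection steps just described, again on $x^2 - 2 = 0$ (my own example):

```python
def bisect(f, a, b, tol=1e-10):
    """Bisection as described above: requires f continuous and f(a) < 0 < f(b)."""
    assert f(a) < 0 < f(b)
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < 0:
            a = mid   # root lies in the upper half
        else:
            b = mid   # root lies in the lower half (or mid is a root)
    return (a + b) / 2

root = bisect(lambda x: x * x - 2, 0.0, 2.0)
print(root)  # approaches sqrt(2), halving the bracket each step
```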

  • Secant method: It starts with two initial values $x_0, x_1$ and applies the iterative step
    $$x_{n+1} = x_n - \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} f(x_n)$$
    It replaces the $f'(x_n)$ in Newton’s method with the linear approximation $\frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}$. Compared with Newton’s method, it does not require the calculation of the derivative $f'(x_n)$, which makes it valuable when $f'(x)$ is difficult to calculate. Its convergence rate is $(1 + \sqrt{5})/2 \approx 1.618$, which makes it faster than the bisection method but slower than Newton’s method. Similar to Newton’s method, convergence is not guaranteed if the initial values are not close to the root.
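The same $x^2 - 2 = 0$ example, solved with a minimal secant-method sketch of mine:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant iteration: Newton's step with f'(x_n) replaced by a secant slope."""
    for _ in range(max_iter):
        denom = f(x1) - f(x0)
        if denom == 0:        # flat secant; cannot improve further
            return x1
        x2 = x1 - (x1 - x0) / denom * f(x1)
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    raise RuntimeError("secant method did not converge")

root = secant(lambda x: x * x - 2, 1.0, 2.0)
print(root)  # converges to sqrt(2) without evaluating f'
```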

  • Lagrange multipliers: The method of Lagrange multipliers is a common technique used to find local maxima/minima of a multivariate function with one or more constraints.

    Let $f(x_1, x_2, \ldots, x_n)$ be a function of $n$ variables $x = (x_1, x_2, \ldots, x_n)$ with gradient vector $\nabla f(x) = \left\langle \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right\rangle$. The necessary condition for maximizing or minimizing $f(x)$ subject to a set of $k$ constraints
    $$g_1(x_1, x_2, \ldots, x_n) = 0,\; g_2(x_1, x_2, \ldots, x_n) = 0,\; \ldots,\; g_k(x_1, x_2, \ldots, x_n) = 0$$
    is that $\nabla f(x) + \lambda_1 \nabla g_1(x) + \lambda_2 \nabla g_2(x) + \cdots + \lambda_k \nabla g_k(x) = 0$, where $\lambda_1, \ldots, \lambda_k$ are called the Lagrange multipliers.
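A worked example of my choosing: minimize $f(x, y) = x^2 + y^2$ subject to $g(x, y) = x + y - 1 = 0$. Stationarity gives $2x + \lambda = 0$ and $2y + \lambda = 0$, so $x = y$, and the constraint then forces $x = y = 1/2$, $\lambda = -1$. The Python sketch below just verifies this solution satisfies the Lagrange condition:

```python
# Candidate solution of: minimize x^2 + y^2 subject to x + y - 1 = 0
x, y, lam = 0.5, 0.5, -1.0

grad_f = (2 * x, 2 * y)   # gradient of f(x, y) = x^2 + y^2
grad_g = (1.0, 1.0)       # gradient of g(x, y) = x + y - 1

# Check grad f + lambda * grad g = 0, and the constraint g = 0
assert all(abs(gf + lam * gg) < 1e-12 for gf, gg in zip(grad_f, grad_g))
assert abs(x + y - 1) < 1e-12
print("Lagrange condition holds at (1/2, 1/2) with lambda = -1")
```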

Ordinary Differential Equations

  • Separable differential equations: A separable differential equation has the form $\frac{dy}{dx} = g(x)h(y)$. Since it is separable, we can express the original equation as $\frac{dy}{h(y)} = g(x)\,dx$. Integrating both sides, we have the solution $\int \frac{dy}{h(y)} = \int g(x)\,dx$.
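An example of my own: $\frac{dy}{dx} = xy$ with $y(0) = 1$ separates into $\frac{dy}{y} = x\,dx$, giving $\ln y = x^2/2$ and hence $y = e^{x^2/2}$. The sketch below checks this closed form against a crude forward-Euler integration:

```python
import math

def euler_solve(x_end, n=100_000):
    """Forward Euler for dy/dx = x * y, y(0) = 1."""
    x, y = 0.0, 1.0
    h = x_end / n
    for _ in range(n):
        y += h * x * y
        x += h
    return y

approx = euler_solve(1.0)
exact = math.exp(0.5)   # y(1) = e^{1/2} from the separated solution
print(approx, exact)    # the two values agree closely
```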

  • First-order linear differential equations: A first-order linear differential equation has the form $\frac{dy}{dx} + P(x)y = Q(x)$. The standard approach to solving a first-order linear differential equation is to identify a suitable function $I(x)$, called an integrating factor, such that
    $$I(x)(y' + P(x)y) = I(x)y' + I(x)P(x)y = (I(x)y)'$$
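To make the integrating factor concrete, take an example of my own: $y' + 2y = x$ with $y(0) = 0$, so $P(x) = 2$ and $I(x) = e^{\int P\,dx} = e^{2x}$. Then $(e^{2x}y)' = x e^{2x}$; integrating by parts and applying the initial condition gives $y = \frac{x}{2} - \frac{1}{4} + \frac{1}{4}e^{-2x}$. The sketch below checks this against a forward-Euler integration:

```python
import math

def exact(x):
    """Closed-form solution of y' + 2y = x, y(0) = 0, via I(x) = e^{2x}."""
    return x / 2 - 0.25 + 0.25 * math.exp(-2 * x)

# Crude forward-Euler integration of y' = x - 2y from x = 0 to 1
x, y, h = 0.0, 0.0, 1e-5
while x < 1.0:
    y += h * (x - 2 * y)
    x += h

print(y, exact(1.0))  # numeric and closed-form values agree closely
```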