Baidu PaddlePaddle Hackathon, Third Edition: Sharing My Experience Developing the Laplace Distribution Operator

Starting a journey of growth! This is day 1 of my participation in the "Daily Update Plan: February Writing Challenge".

1. Some thoughts on this open-source contribution

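I had done other kinds of open-source work before, but this was my first hackathon (I joined because I found out there was prize money, hhh). In my view, open-source contribution, this hackathon included, does have a threshold, but it is not a high one. To submit the design document you only need to know how to push to the corresponding repository and open a PR, which is much like day-to-day code development at work. Beyond that, you need debugging skills during development, because you will have to track down and fix bugs in the code.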

2. Task analysis

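Task description: Laplace is used for probability statistics and random sampling of the Laplace distribution. The goal of this task is to extend the existing probability distribution scheme in the Paddle framework with a new Laplace API, called as paddle.distribution.Laplace. The class signature and the method signatures should be designed after surveying Paddle and common industry implementations. The code style and design approach must stay consistent with the existing probability distributions.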

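All of that boils down to one thing: implement the Laplace distribution operator. So first we need to know what the Laplace distribution is. In probability theory and statistics, the Laplace distribution is a continuous probability distribution. Because it can be viewed as two exponential distributions, shifted by a location parameter and spliced together back to back, it is also called the double exponential distribution. Compared with the normal distribution: the normal density is expressed in terms of the squared difference from the mean, whereas the Laplace density is expressed in terms of the absolute difference from the mean. As the code below shows, the Laplace density curve looks somewhat like the normal one, and its formula is similar in form as well.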

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
def laplace_function(x, lambda_):
    return (1 / (2 * lambda_)) * np.exp(-np.abs(x) / lambda_)  # Laplace pdf with loc=0 and scale=lambda_
x = np.linspace(-5, 5, 10000)
y1 = laplace_function(x, 1)  # NumPy evaluates the whole array in one call
y2 = laplace_function(x, 2)
y3 = laplace_function(x, 0.5)
plt.plot(x, y1, color='r', label="lambda:1")
plt.plot(x, y2, color='g', label="lambda:2")
plt.plot(x, y3, color='b', label="lambda:0.5")
plt.title("Laplace distribution")
plt.legend()
plt.show()

3. Writing the design document

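The design document captures the thinking behind the API design and is a necessary part of the whole development effort. From the task description above, we know that developing this API mainly means developing the Laplace distribution together with a set of corresponding methods. First we need to understand the mathematics of the Laplace distribution; I suggest reading the Wikipedia article on it until the math is clear. We can also refer to the implementations in NumPy, SciPy, PyTorch, and TensorFlow when writing the design document.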

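First, we need to know the Laplace distribution's probability density function, cumulative distribution function, and inverse cumulative distribution function, and then develop the code from those formulas.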
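
For reference, these are the standard forms of the three functions, written with location μ and scale σ (the same symbols the docstrings in section 4 use). They are reconstructed here from the textbook definitions and agree term by term with the expressions in section 3.1; the symbol p for a probability in (0, 1) is a notational choice made for this sketch.

```latex
f(x;\mu,\sigma) = \frac{1}{2\sigma}\exp\!\left(-\frac{|x-\mu|}{\sigma}\right)

F(x;\mu,\sigma) = \frac{1}{2} + \frac{1}{2}\,\operatorname{sign}(x-\mu)\left(1-\exp\!\left(-\frac{|x-\mu|}{\sigma}\right)\right)

F^{-1}(p;\mu,\sigma) = \mu - \sigma\,\operatorname{sign}\!\left(p-\tfrac{1}{2}\right)\ln\!\left(1-2\left|p-\tfrac{1}{2}\right|\right)
```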

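With the NumPy, SciPy, PyTorch, and TensorFlow implementations as references, the code for these formulas is straightforward to write. The implementation plan is given in section 3.1 below.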

3.1 API implementation plan

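The API is implemented as paddle.distribution.Laplace and is built on the paddle.distribution API base class. The concrete implementation inside the class is listed below (some methods had already been developed, so the source code is used directly). The API has two parameters: the location parameter self.loc and the scale parameter self.scale. It includes the following methods: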

mean: computes the mean:
```
  self.loc
```
stddev: computes the standard deviation:
```
  (2 ** 0.5) * self.scale
```
variance: computes the variance:
```
  self.stddev.pow(2)
```
sample: random sampling (following PyTorch, it reuses the result of reparameterized sampling):
```
  self.rsample(shape)
```
rsample: reparameterized sampling:
```
  self.loc - self.scale * u.sign() * paddle.log1p(-u.abs())
```
where u = paddle.uniform(shape=shape, min=eps - 1, max=1), and eps is determined by the dtype.
prob: probability density (takes the argument value):
```
  self.log_prob(value).exp()
```
This method is inherited directly from the parent class implementation.

log_prob: log probability density (takes value):

  -paddle.log(2 * self.scale) - paddle.abs(value - self.loc) / self.scale

entropy: computes the entropy:
```
  1 + paddle.log(2 * self.scale)
```

cdf: cumulative distribution function (takes value):

  0.5 - 0.5 * (value - self.loc).sign() * paddle.expm1(-(value - self.loc).abs() / self.scale)

icdf: inverse cumulative distribution function (takes value):

  self.loc - self.scale * (value - 0.5).sign() * paddle.log1p(-2 * (value - 0.5).abs())

kl_divergence: KL divergence between two Laplace distributions (other is an instance of the Laplace class):
```
  (self.scale * paddle.exp(-paddle.abs(self.loc - other.loc) / self.scale) + paddle.abs(self.loc - other.loc)) / other.scale + paddle.log(other.scale / self.scale) - 1
```
Reference: openaccess.thecvf.com/content/CVP…

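In addition, the function _kl_laplace_laplace is registered in paddle/distribution/kl.py, so that kl_divergence can be called directly to compute the KL divergence between Laplace distributions.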

3.2 Testing and acceptance considerations

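Once the code is written, how do we show that it is correct? That is what unit tests are for. So what is a unit test? A unit test case is essentially a pairing of "input data" and "expected output". Given the input data, you state the expected output based on the functional logic, meaning the output you can work out from the requirements document alone, not the output you derive from the code that has already been written. This is the point most easily overlooked. If you write unit tests but infer the expected output from your own code, and the code logic is wrong to begin with, then the expected output is wrong too and the test is meaningless. This is arguably the most important part of the whole job and also one of the harder ones: we have to work out the expected output ourselves and then check that the implemented code reproduces it. Only when the unit tests pass is the development task basically complete.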
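
As a small illustration of that principle, the sketch below tests two stand-in functions against expected values worked out by hand from the formulas rather than by calling the code under test. The function names laplace_pdf and laplace_entropy and the chosen inputs are assumptions made for this example; they are not part of the Paddle test suite.

```python
import math
import unittest


def laplace_pdf(x, loc, scale):
    """Density under test (a stand-in for the real implementation)."""
    return math.exp(-abs(x - loc) / scale) / (2 * scale)


def laplace_entropy(scale):
    """Entropy under test (a stand-in for the real implementation)."""
    return 1 + math.log(2 * scale)


class TestFromSpec(unittest.TestCase):
    def test_pdf_at_loc(self):
        # From the formula: at x == loc the exponent is 0, so pdf = 1 / (2 * scale).
        self.assertAlmostEqual(laplace_pdf(3.0, loc=3.0, scale=2.0), 0.25)

    def test_entropy_unit_scale(self):
        # 1 + ln(2), written out by hand instead of calling the code under test.
        self.assertAlmostEqual(laplace_entropy(1.0), 1.6931471805599453)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFromSpec)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```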

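Because the methods of the API class take different arguments, the unit tests are split into three parts: testing the distribution's properties (no extra arguments), testing the distribution's probability density functions (a value must be passed in), and testing the KL divergence (an instance must be passed in).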

Testing the properties of the Laplace distribution

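Test method: this part mainly tests characteristics of the distribution such as the mean, variance, and entropy. The class TestLaplace inherits from unittest.TestCase and implements setUp (initialization), test_mean (mean unit test), test_variance (variance unit test), test_stddev (stddev unit test), test_entropy (entropy unit test), and test_sample (sample unit test).
- The mean, variance, and standard deviation are computed with NumPy and compared with the return values of the corresponding properties of the Laplace class; if they agree, the implementation is correct;
- For the sampling method, besides verifying that the returned data type and shape are valid, we also need to show that the samples follow the Laplace distribution. The strategy is: draw 30000 samples from the Laplace distribution, compute the sample mean and variance, and compare them with the mean and variance returned by scipy.stats.laplace for the same distribution, checking that they fall within a reasonable error range; in addition, a Kolmogorov-Smirnov test further verifies that the samples follow the Laplace distribution: if the computed KS statistic is below 0.02, the samples are accepted as coming from the same distribution;
- Entropy is verified by comparing the value of scipy.stats.laplace.entropy with the value returned by the class method.
Test cases: the unit tests need to cover both the one-dimensional and the multi-dimensional Laplace distribution, so two sets of initialization parameters are used:
- 'one-dim': loc=parameterize.xrand((2, )), scale=parameterize.xrand((2, ));
- 'multi-dim': loc=parameterize.xrand((5, 5)), scale=parameterize.xrand((5, 5)).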
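
The sampling check described above can be sketched with NumPy and SciPy alone. This is a rough stand-in for the Paddle test, not the test itself: the samples are drawn with the inverse-transform formula from section 3.1 instead of calling Laplace.sample, and the seed, the parameter values, and the tolerances on mean and variance are assumptions chosen for illustration. Only the sample count of 30000 and the KS threshold of 0.02 come from the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
loc, scale, n = 0.5, 2.0, 30000         # illustrative parameters, 30000 samples

# Inverse-transform sampling with the rsample formula from section 3.1:
# u is uniform on [eps - 1, 1), so log1p(-|u|) stays finite.
eps = np.finfo(np.float64).eps
u = rng.uniform(low=eps - 1.0, high=1.0, size=n)
samples = loc - scale * np.sign(u) * np.log1p(-np.abs(u))

# Reference distribution with the same parameters.
ref = stats.laplace(loc=loc, scale=scale)

# Compare sample moments with the reference moments.
mean_err = abs(samples.mean() - ref.mean())
var_rel_err = abs(samples.var() - ref.var()) / ref.var()

# Kolmogorov-Smirnov statistic against the reference CDF.
ks_stat, p_value = stats.kstest(samples, ref.cdf)

print(mean_err, var_rel_err, ks_stat)
```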

Testing the probability density functions of the Laplace distribution

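Test method: this part mainly tests the distribution's probability functions. The class TestLaplacePDF inherits from unittest.TestCase and implements setUp (initialization), test_prob (prob unit test), test_log_prob (log_prob unit test), test_cdf (cdf unit test), and test_icdf (icdf unit test). All of these functions are implemented in scipy.stats.laplace, so for a given input value we compare the SciPy result with the Paddle result under the same parameters; if the error is within tolerance, the implementation is correct.
Test cases: without loss of generality, the tests initialize the Laplace class with multi-dimensional location and scale parameters and cover both int and float inputs.
- 'value-float': loc=np.array([0.2, 0.3]), scale=np.array([2, 3]), value=np.array([2., 5.]);
- 'value-int': loc=np.array([0.2, 0.3]), scale=np.array([2, 3]), value=np.array([2, 5]);
- 'value-multi-dim': loc=np.array([0.2, 0.3]), scale=np.array([2, 3]), value=np.array([[4., 6], [8, 2]]).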
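
The comparison can be tried out without Paddle by re-expressing the formulas from section 3.1 in NumPy and checking them against scipy.stats.laplace. This is only a sketch of the idea under that assumption: the helper functions below are NumPy translations written for this example, and in SciPy the inverse CDF is exposed as ppf.

```python
import numpy as np
from scipy import stats

# The 'value-multi-dim' case from the list above.
loc = np.array([0.2, 0.3])
scale = np.array([2.0, 3.0])
value = np.array([[4.0, 6.0], [8.0, 2.0]])


def log_prob(v):
    # NumPy translation of the log_prob expression in section 3.1.
    return -np.log(2 * scale) - np.abs(v - loc) / scale


def cdf(v):
    # NumPy translation of the cdf expression in section 3.1.
    return 0.5 - 0.5 * np.sign(v - loc) * np.expm1(-np.abs(v - loc) / scale)


def icdf(p):
    # NumPy translation of the icdf expression; p is a probability in (0, 1).
    return loc - scale * np.sign(p - 0.5) * np.log1p(-2 * np.abs(p - 0.5))


ref = stats.laplace(loc=loc, scale=scale)
probs = ref.cdf(value)  # valid probabilities to feed into icdf / ppf

checks = {
    "prob": np.allclose(np.exp(log_prob(value)), ref.pdf(value)),
    "log_prob": np.allclose(log_prob(value), ref.logpdf(value)),
    "cdf": np.allclose(cdf(value), ref.cdf(value)),
    "icdf": np.allclose(icdf(probs), ref.ppf(probs)),
}
print(checks)
```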

Testing the KL divergence between Laplace distributions

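Test method: this part tests the KL divergence between two Laplace distributions. The class TestLaplaceAndLaplaceKL inherits from unittest.TestCase and implements setUp (initialization) and test_kl_divergence (kl_divergence unit test). In SciPy, scipy.stats.entropy can be used to compute the divergence between two distributions. We therefore compare the divergence of two Laplace distributions computed by paddle.distribution.kl_divergence with the one computed from scipy.stats.laplace; if the results agree within the error range, the method is implemented correctly.
Test case: distribution 1: loc=np.array([0.0]), scale=np.array([1.0]); distribution 2: loc=np.array([1.0]), scale=np.array([0.5]).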
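
One way to get an independent reference value for this test case is to integrate the definition of KL divergence numerically. Note that this sketch swaps in scipy.integrate.quad in place of the scipy.stats.entropy approach described above, and the integration limits of -40 to 40 are an assumption (the density of distribution 1 is negligible beyond them). The closed form is the kl_divergence expression from section 3.1, written for scalars.

```python
import numpy as np
from scipy import integrate, stats

loc0, scale0 = 0.0, 1.0   # distribution 1
loc1, scale1 = 1.0, 0.5   # distribution 2


def kl_closed_form(loc_p, scale_p, loc_q, scale_q):
    # Closed-form KL(p || q) for two Laplace distributions (section 3.1).
    t = abs(loc_p - loc_q)
    return ((scale_p * np.exp(-t / scale_p) + t) / scale_q
            + np.log(scale_q / scale_p) - 1)


p = stats.laplace(loc=loc0, scale=scale0)
q = stats.laplace(loc=loc1, scale=scale1)


def integrand(x):
    # Definition of KL divergence: p(x) * (log p(x) - log q(x)).
    return p.pdf(x) * (p.logpdf(x) - q.logpdf(x))


# Tell quad about the kinks at the two location parameters.
kl_numeric, _ = integrate.quad(integrand, -40, 40, points=[loc0, loc1])
kl_exact = kl_closed_form(loc0, scale0, loc1, scale1)

print(kl_numeric, kl_exact)
```

Both numbers should agree with the value shown in the kl_divergence docstring example in section 4 up to float32 precision.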

4. Code development

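The code was developed mainly with reference to PyTorch. This step also involves writing the unit test code, registering the KL divergence, and so on, so you need to read carefully how the other distributions in PaddlePaddle are implemented.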

import numbers
import numpy as np
import paddle
from paddle.distribution import distribution
from paddle.fluid import framework as framework
class Laplace(distribution.Distribution):
    r"""
    Creates a Laplace distribution parameterized by :attr:`loc` and :attr:`scale`.
    Mathematical details
    The probability density function (pdf) is
    .. math::
        pdf(x; \mu, \sigma) = \frac{1}{2 * \sigma} * e^{\frac {-|x - \mu|}{\sigma}}
    In the above equation:
    * :math:`loc = \mu`: is the location parameter.
    * :math:`scale = \sigma`: is the scale parameter.
    Args:
        loc (scalar|Tensor): The mean of the distribution.
        scale (scalar|Tensor): The scale of the distribution.
    Examples:
        .. code-block:: python
                        import paddle
                        m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                        m.sample()  # Laplace distributed with loc=0, scale=1
                        # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, 
                        # [3.68546247])
    """
    def __init__(self, loc, scale):
        if not isinstance(loc, (numbers.Real, framework.Variable)):
            raise TypeError(
                f"Expected type of loc is Real|Variable, but got {type(loc)}")
        if not isinstance(scale, (numbers.Real, framework.Variable)):
            raise TypeError(
                f"Expected type of scale is Real|Variable, but got {type(scale)}"
            )
        if isinstance(loc, numbers.Real):
            loc = paddle.full(shape=(), fill_value=loc)
        if isinstance(scale, numbers.Real):
            scale = paddle.full(shape=(), fill_value=scale)
        if (len(scale.shape) > 0 or len(loc.shape) > 0) and (loc.dtype
                                                             == scale.dtype):
            self.loc, self.scale = paddle.broadcast_tensors([loc, scale])
        else:
            self.loc, self.scale = loc, scale
        super(Laplace, self).__init__(self.loc.shape)
    @property
    def mean(self):
        """Mean of distribution.
        Returns:
            Tensor: The mean value.
        """
        return self.loc
    @property
    def stddev(self):
        r"""Standard deviation.
        The stddev is 
        .. math::
            stddev = \sqrt{2} * \sigma
        In the above equation:
        * :math:`scale = \sigma`: is the scale parameter.
        Returns:
            Tensor: The std value.
        """
        return (2**0.5) * self.scale
    @property
    def variance(self):
        r"""Variance of distribution.
        The variance is 
        .. math::
            variance = 2 * \sigma^2
        In the above equation:
        * :math:`scale = \sigma`: is the scale parameter.
        Returns:
            Tensor: The variance value.
        """
        return self.stddev.pow(2)
    def _validate_value(self, value):
        """Argument dimension check for distribution methods such as `log_prob`,
        `cdf` and `icdf`. 
        Args:
          value (Tensor|Scalar): The input value, which can be a scalar or a tensor.
        Returns:
          loc, scale, value: The broadcasted loc, scale and value, with the same dimension and data type.
        """
        if isinstance(value, numbers.Real):
            value = paddle.full(shape=(), fill_value=value)
        if value.dtype != self.scale.dtype:
            value = paddle.cast(value, self.scale.dtype)
        if len(self.scale.shape) > 0 or len(self.loc.shape) > 0 or len(
                value.shape) > 0:
            loc, scale, value = paddle.broadcast_tensors(
                [self.loc, self.scale, value])
        else:
            loc, scale = self.loc, self.scale
        return loc, scale, value
    def log_prob(self, value):
        r"""Log probability density/mass function.
        The log_prob is
        .. math::
            log\_prob(value) = -log(2 * \sigma) - \frac{|value - \mu|}{\sigma}
        In the above equation:
        * :math:`loc = \mu`: is the location parameter.
        * :math:`scale = \sigma`: is the scale parameter.
        Args:
          value (Tensor|Scalar): The input value, can be a scalar or a tensor.
        Returns:
          Tensor: The log probability, whose data type is same with value.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            value = paddle.to_tensor([0.1])
                            m.log_prob(value) 
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [-0.79314721])
        """
        loc, scale, value = self._validate_value(value)
        log_scale = -paddle.log(2 * scale)
        return (log_scale - paddle.abs(value - loc) / scale)
    def entropy(self):
        r"""Entropy of Laplace distribution.
        The entropy is:
        .. math::
            entropy() = 1 + log(2 * \sigma)
        In the above equation:
        * :math:`scale = \sigma`: is the scale parameter.
        Returns:
            The entropy of distribution.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            m.entropy()
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [1.69314718])
        """
        return 1 + paddle.log(2 * self.scale)
    def cdf(self, value):
        r"""Cumulative distribution function.
        The cdf is
        .. math::
            cdf(value) = 0.5 - 0.5 * sign(value - \mu) * (e^{\frac{-|value - \mu|}{\sigma}} - 1)
        In the above equation:
        * :math:`loc = \mu`: is the location parameter.
        * :math:`scale = \sigma`: is the scale parameter.
        Args:
            value (Tensor): The value to be evaluated.
        Returns:
            Tensor: The cumulative probability of value.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            value = paddle.to_tensor([0.1])
                            m.cdf(value)
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [0.54758132])
        """
        loc, scale, value = self._validate_value(value)
        iterm = (0.5 * (value - loc).sign() *
                 paddle.expm1(-(value - loc).abs() / scale))
        return 0.5 - iterm
    def icdf(self, value):
        r"""Inverse Cumulative distribution function.
        The icdf is 
        .. math::
            cdf^{-1}(value)= \mu - \sigma * sign(value - 0.5) * ln(1 - 2 * |value-0.5|)
        In the above equation:
        * :math:`loc = \mu`: is the location parameter.
        * :math:`scale = \sigma`: is the scale parameter.
        Args:
            value (Tensor): The value to be evaluated.
        Returns:
            Tensor: The value whose cumulative probability equals the input.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            value = paddle.to_tensor([0.1])
                            m.icdf(value)
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [-1.60943794])
        """
        loc, scale, value = self._validate_value(value)
        term = value - 0.5
        return (loc - scale * (term).sign() * paddle.log1p(-2 * term.abs()))
    def sample(self, shape=()):
        """Generate samples of the specified shape.
        Args:
            shape(tuple[int]): The shape of generated samples.
        Returns:
            Tensor: A sample tensor that fits the Laplace distribution.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            m.sample()  # Laplace distributed with loc=0, scale=1
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [3.68546247])
        """
        if not isinstance(shape, tuple):
            raise TypeError(
                f'Expected shape should be tuple[int], but got {type(shape)}')
        with paddle.no_grad():
            return self.rsample(shape)
    def rsample(self, shape):
        """Reparameterized sample.
        Args:
            shape(tuple[int]): The shape of generated samples.
        Returns:
            Tensor: A sample tensor that fits the Laplace distribution.
        Examples:
            .. code-block:: python
                            import paddle
                            m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            m.rsample((1,))  # Laplace distributed with loc=0, scale=1
                            # Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [[0.04337667]])
        """
        eps = self._get_eps()
        shape = self._extend_shape(shape) or (1, )
        uniform = paddle.uniform(shape=shape,
                                 min=float(np.nextafter(-1, 1)) + eps / 2,
                                 max=1. - eps / 2,
                                 dtype=self.loc.dtype)
        if len(self.scale.shape) == 0 and len(self.loc.shape) == 0:
            loc, scale, uniform = paddle.broadcast_tensors(
                [self.loc, self.scale, uniform])
        else:
            loc, scale = self.loc, self.scale
        return (loc - scale * uniform.sign() * paddle.log1p(-uniform.abs()))
    def _get_eps(self):
        """
        Get the eps of certain data type.
        Note: 
            Since paddle.finfo is temporarily unavailable, we 
            use hard-coding style to get eps value.
        Returns:
            Float: An eps value by different data types.
        """
        eps = 1.19209e-07
        if (self.loc.dtype == paddle.float64
                or self.loc.dtype == paddle.complex128):
            eps = 2.22045e-16
        return eps
    def kl_divergence(self, other):
        r"""Calculate the KL divergence KL(self || other) with two Laplace instances.
        The kl_divergence between two Laplace distribution is
        .. math::
            KL\_divergence(\mu_0, \sigma_0; \mu_1, \sigma_1) = \frac{\sigma_0 * e^{-diff / \sigma_0} + diff}{\sigma_1} + \ln {ratio} - 1
        .. math::
            ratio = \frac{\sigma_1}{\sigma_0}
        .. math::
            diff = |\mu_1 - \mu_0|
        In the above equation:
        * :math:`loc = \mu_0`: is the location parameter of self.
        * :math:`scale = \sigma_0`: is the scale parameter of self.
        * :math:`loc = \mu_1`: is the location parameter of the reference Laplace distribution.
        * :math:`scale = \sigma_1`: is the scale parameter of the reference Laplace distribution.
        * :math:`ratio`: is the ratio between the two scale parameters.
        * :math:`diff`: is the absolute difference between the two location parameters.
        Args:
            other (Laplace): An instance of Laplace.
        Returns:
            Tensor: The kl-divergence between two laplace distributions.
        Examples:
            .. code-block:: python
                            import paddle
                            m1 = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
                            m2 = paddle.distribution.Laplace(paddle.to_tensor([1.0]), paddle.to_tensor([0.5]))
                            m1.kl_divergence(m2)
                            # Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
                            # [1.04261160])
        """
        var_ratio = other.scale / self.scale
        t = paddle.abs(self.loc - other.loc)
        term1 = ((self.scale * paddle.exp(-t / self.scale) + t) / other.scale)
        term2 = paddle.log(var_ratio)
        return term1 + term2 - 1

5. Summary

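The contribution for this API has now been locked in. Looking back, developing the API itself was not hard. The main difficulty was the unit tests, that is, proving that the API is correct, plus a few related details such as registering the KL divergence. I also took a detour at the start: I followed the development style of Normal and wrote the API in the 2.0 style, which cost some time. In the final unit tests I also ran into some bugs in how Uniform is implemented, and debugging those took a while. Overall, most of the time went into the unit tests. Weighing the prize money against the time spent, it is not a great deal for someone who is in it for the money; for most students, though, it is well worth taking part in contests like this, since the work is much like everyday development work.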
