From 65b0044b143f371439853b883cfd87cc94f4ebe0 Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 00:49:57 -0800 Subject: [PATCH 1/6] Correct the Histogram expected value formula The confusing E_i is swapped for w_i for weights. --- mef/stochastic_layer.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index d142ea3..d23f8fe 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -665,17 +665,17 @@ Beta Deviates The default value of the beta distribution is its mean, i.e., :math:`\alpha/(\alpha + \beta)`. Histograms - Histograms are lists of pairs :math:`(x_1, E_1), \ldots, (x_n, E_n)`, + Histograms are lists of pairs :math:`(x_1, w_1), \ldots, (x_n, w_n)`, where the :math:`x_i`'s are numbers such that :math:`x_i < x_{i+1} \text{ for } i=1, \ldots, n-1` - and the :math:`E_i`'s are expressions. + and the :math:`w_i`'s are weights. The :math:`x_i`'s represent upper bounds of successive intervals. The lower bound of the first interval :math:`x_0` is given apart. The drawing of a value according to a histogram is a two-step process. First, a value :math:`z` is drawn uniformly in the range :math:`[x_0, x_n]`. - Then, a value is drawn at random by means of the expression :math:`E_i`, + Then, a value is drawn at random by means of the expression :math:`w_i`, where :math:`i` is the index of the interval such that :math:`x_{i-1} < z \leq x_i`. @@ -683,7 +683,7 @@ Histograms .. math:: - \mathbf{E}(X) = \frac{1}{x_n - x_0} \times \sum_{i=1}^{n}(x_i - x_{i-1})\mathbf{E}(E_i) + E(x) = \dfrac{\sum_{i=1}^{n}\tfrac{1}{2}(x_i + x_{i-1}) \cdot w_i}{\sum_{i=1}^{n}w_i} Both Cumulative Distribution Functions and Density Probability Distributions can be translated into histograms. 
From e42058fcc692974318e5a72488b7f1b815b166a0 Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 02:02:18 -0800 Subject: [PATCH 2/6] Swap 'x' with 'b' for Histogram boundary notation The x is confusing in this context due to conventional use of x as the random variable (result of sampling). --- mef/stochastic_layer.rst | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index d23f8fe..2e3a28a 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -665,25 +665,25 @@ Beta Deviates The default value of the beta distribution is its mean, i.e., :math:`\alpha/(\alpha + \beta)`. Histograms - Histograms are lists of pairs :math:`(x_1, w_1), \ldots, (x_n, w_n)`, - where the :math:`x_i`'s are numbers - such that :math:`x_i < x_{i+1} \text{ for } i=1, \ldots, n-1` + Histograms are lists of pairs :math:`(b_1, w_1), \ldots, (b_n, w_n)`, + where the :math:`b_i`'s are numbers + such that :math:`b_i < b_{i+1} \text{ for } i=1, \ldots, n-1` and the :math:`w_i`'s are weights. - The :math:`x_i`'s represent upper bounds of successive intervals. - The lower bound of the first interval :math:`x_0` is given apart. + The :math:`b_i`'s represent upper bounds of successive intervals. + The lower bound of the first interval :math:`b_0` is given apart. The drawing of a value according to a histogram is a two-step process. - First, a value :math:`z` is drawn uniformly in the range :math:`[x_0, x_n]`. + First, a value :math:`z` is drawn uniformly in the range :math:`[b_0, b_n]`. Then, a value is drawn at random by means of the expression :math:`w_i`, where :math:`i` is the index of the interval - such that :math:`x_{i-1} < z \leq x_i`. + such that :math:`b_{i-1} < z \leq b_i`. By default, the value of a histogram is its mean, i.e., .. 
math:: - E(x) = \dfrac{\sum_{i=1}^{n}\tfrac{1}{2}(x_i + x_{i-1}) \cdot w_i}{\sum_{i=1}^{n}w_i} + E(x) = \dfrac{\sum_{i=1}^{n}\tfrac{1}{2}(b_i + b_{i-1}) \cdot w_i}{\sum_{i=1}^{n}w_i} Both Cumulative Distribution Functions and Density Probability Distributions can be translated into histograms. @@ -697,9 +697,9 @@ Histograms second, they are presented in a cumulative way. The histogram that corresponds to a Cumulative Distribution Function :math:`(p_1, v_1), \ldots, (p_n, v_n)` - is the list of pairs :math:`(x_1, v_1), \ldots, (x_n, v_n)`, + is the list of pairs :math:`(b_1, v_1), \ldots, (b_n, v_n)`, with the initial value - :math:`x_0 = 0, x_1 = p_1, \text{ and } x_i = p_i - p_{i-1} \text{ for all } i>1`. + :math:`b_0 = 0, b_1 = p_1, \text{ and } b_i = p_i - p_{i-1} \text{ for all } i>1`. A Discrete Probability Distribution is a list of pairs :math:`(d_1, m_1), \ldots, (d_n, m_n)`. @@ -709,9 +709,9 @@ Histograms and are such that :math:`m_1 < m_2 < \ldots < m_n < 1`. The histogram that corresponds to a Discrete Probability Distribution :math:`(d_1, m_1), \ldots, (d_n, m_n)` - is the list of pairs :math:`(x_1, d_1), \ldots, (x_n, d_n)`, + is the list of pairs :math:`(b_1, d_1), \ldots, (b_n, d_n)`, with the initial value - :math:`x_0 = 0, x_1 = 2m_1, \text{ and } x_i = x_{i-1} + 2(m_i - x_{i-1})`. + :math:`b_0 = 0, b_1 = 2m_1, \text{ and } b_i = b_{i-1} + 2(m_i - b_{i-1})`. XML Representation From 1a32eb9bd2adca863d96f95b74e72c09773b0634 Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 02:13:51 -0800 Subject: [PATCH 3/6] Add Histogram distribution density function --- mef/stochastic_layer.rst | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index 2e3a28a..7126c85 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -679,7 +679,19 @@ Histograms where :math:`i` is the index of the interval such that :math:`b_{i-1} < z \leq b_i`. 
- By default, the value of a histogram is its mean, i.e., + The probability density function of the histogram (or piece-wise constant) distribution: + + .. math:: + + f(x;b_0,\ldots,b_n, w_1,\ldots,w_n) = \dfrac{w_k}{(b_k - b_{k-1})\cdot\sum_{i=1}^{n}w_i} + + Where :math:`k` is such that + + .. math:: + + b_{k - 1} < x \leq b_k \quad \forall k \in \mathbb{Z} : 1 \leq k \leq n + + By default, the value of the histogram distribution is its mean, i.e., .. math:: From aa4546cffb670e4a62c5a504a5c46b3ec8d8380c Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 02:30:24 -0800 Subject: [PATCH 4/6] Update the definition of Histogram distribution The range is fixed to be right-exclusive as customary in other distribution sampling. --- mef/stochastic_layer.rst | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index 7126c85..780b0ce 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -554,7 +554,7 @@ As for arithmetic operators and built-ins, this list can be extended on demand. 
+-----------------------+------------+-------------------------------------------------------------------------------------------------------------+ | **beta-deviate** | 2 | beta distributions defined by two shape parameters :math:`\alpha` and :math:`\beta` | +-----------------------+------------+-------------------------------------------------------------------------------------------------------------+ - | **histograms** | any | discrete distributions defined by means of a list of pairs | + | **histogram** | >1 | piecewise-constant distributions defined by means of a list of pairs | +-----------------------+------------+-------------------------------------------------------------------------------------------------------------+ Uniform Deviates @@ -666,18 +666,16 @@ Beta Deviates Histograms Histograms are lists of pairs :math:`(b_1, w_1), \ldots, (b_n, w_n)`, - where the :math:`b_i`'s are numbers - such that :math:`b_i < b_{i+1} \text{ for } i=1, \ldots, n-1` - and the :math:`w_i`'s are weights. - - The :math:`b_i`'s represent upper bounds of successive intervals. + where the :math:`b_i`'s are upper bounds of successive, contiguous intervals + such that :math:`b_i < b_{i+1} \text{ for } i=0, \ldots, n-1`, + and the :math:`w_i`'s are non-negative weights for the intervals :math:`[b_{i-1}, b_i)`. The lower bound of the first interval :math:`b_0` is given apart. The drawing of a value according to a histogram is a two-step process. First, a value :math:`z` is drawn uniformly in the range :math:`[b_0, b_n]`. Then, a value is drawn at random by means of the expression :math:`w_i`, where :math:`i` is the index of the interval - such that :math:`b_{i-1} < z \leq b_i`. + such that :math:`b_{i-1} \leq x < b_i`. The probability density function of the histogram (or piece-wise constant) distribution: @@ -689,7 +687,7 @@ Histograms .. 
math:: - b_{k - 1} < x \leq b_k \quad \forall k \in \mathbb{Z} : 1 \leq k \leq n + b_{k - 1} \leq x < b_k \quad \forall k \in \mathbb{Z} : 1 \leq k \leq n By default, the value of the histogram distribution is its mean, i.e., From 9b58b6bfbd9ef804973b40bc8b51d54ff0c15e42 Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 02:56:03 -0800 Subject: [PATCH 5/6] Fix Histogram distribution sampling description The two steps were confused in the sampling description. The steps are rearranged and clarified. --- mef/stochastic_layer.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index 780b0ce..e902901 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -671,11 +671,11 @@ Histograms and the :math:`w_i`'s are non-negative weights for the intervals :math:`[b_{i-1}, b_i)`. The lower bound of the first interval :math:`b_0` is given apart. - The drawing of a value according to a histogram is a two-step process. - First, a value :math:`z` is drawn uniformly in the range :math:`[b_0, b_n]`. - Then, a value is drawn at random by means of the expression :math:`w_i`, - where :math:`i` is the index of the interval - such that :math:`b_{i-1} \leq x < b_i`. + The drawing of a value according to a histogram distribution is a two-step process. + First, the interval :math:`i` is drawn at random + from a discrete distribution with the corresponding weights :math:`w_i`'s; + then, a random value :math:`x` is drawn uniformly from the range :math:`[b_{i-1}, b_i)`. + The sampling of the intervals and random values must be independent. The probability density function of the histogram (or piece-wise constant) distribution: From 7dc1a812c820482a84358edaca756b1d8dc700e0 Mon Sep 17 00:00:00 2001 From: rakhimov Date: Wed, 21 Dec 2016 03:36:56 -0800 Subject: [PATCH 6/6] Fix (Cumulative|Discrete) => Histogram description The =>Histogram formulas are fixed. 
The text looks very confusing and out-of-place, being a sign that these distributions should be provided separately. --- mef/stochastic_layer.rst | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/mef/stochastic_layer.rst b/mef/stochastic_layer.rst index e902901..4c96462 100644 --- a/mef/stochastic_layer.rst +++ b/mef/stochastic_layer.rst @@ -696,32 +696,31 @@ Histograms E(x) = \dfrac{\sum_{i=1}^{n}\tfrac{1}{2}(b_i + b_{i-1}) \cdot w_i}{\sum_{i=1}^{n}w_i} Both Cumulative Distribution Functions - and Density Probability Distributions can be translated into histograms. + and Discrete Probability Distributions can be translated into histograms. A Cumulative Distribution Function is a list of pairs - :math:`(p_1, v_1), \ldots, (p_n, v_n)`, + :math:`(p_1, b_1), \ldots, (p_n, b_n)`, where the :math:`p_i`'s are - such that :math:`p_i < p_{i+1} \text{ for } i=1, \ldots, n \text{ and } p_n=1`. + such that :math:`p_i < p_{i+1} \text{ for } i=0, \ldots, n-1 \text{ and } p_n=1, p_0=0`. It differs from histograms in two ways. - First, :math:`X` axis values are normalized (to spread between 0 and 1); + First, :math:`Y` axis values are normalized (to spread between 0 and 1); second, they are presented in a cumulative way. The histogram that corresponds to a Cumulative Distribution Function - :math:`(p_1, v_1), \ldots, (p_n, v_n)` - is the list of pairs :math:`(b_1, v_1), \ldots, (b_n, v_n)`, - with the initial value - :math:`b_0 = 0, b_1 = p_1, \text{ and } b_i = p_i - p_{i-1} \text{ for all } i>1`. + :math:`(p_1, b_1), \ldots, (p_n, b_n)` + is the list of pairs :math:`(b_1, w_1), \ldots, (b_n, w_n)`, + where :math:`w_i = p_i - p_{i-1}`. A Discrete Probability Distribution is a list of pairs :math:`(d_1, m_1), \ldots, (d_n, m_n)`. The :math:`d_i`'s are probability densities. - However, they could be any kind of values. + However, they could be any kind of non-negative values. 
The :math:`m_i`'s are midpoints of intervals - and are such that :math:`m_1 < m_2 < \ldots < m_n < 1`. + and are such that :math:`0 < m_1 < m_2 < \ldots < m_n`. The histogram that corresponds to a Discrete Probability Distribution :math:`(d_1, m_1), \ldots, (d_n, m_n)` is the list of pairs :math:`(b_1, d_1), \ldots, (b_n, d_n)`, - with the initial value - :math:`b_0 = 0, b_1 = 2m_1, \text{ and } b_i = b_{i-1} + 2(m_i - b_{i-1})`. + with the initial boundary :math:`b_0 = 0`, + :math:`b_i = b_{i-1} + 2(m_i - b_{i-1})`. XML Representation
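Review note on the series as a whole: the definitions the six patches converge on can be checked against a small sketch. The Python below is only an illustration of the text as patched, not code from the MEF or from any tool that implements it. The function names (`histogram_mean`, `histogram_pdf`, `histogram_sample`, `cdf_to_histogram`, `midpoints_to_histogram`) and the `(b0, pairs)` calling convention are assumptions made for this note; `pairs` is the list :math:`(b_1, w_1), \ldots, (b_n, w_n)`.

```python
import bisect
import random


def histogram_mean(b0, pairs):
    """Weighted average of the interval midpoints (the patched E(x) formula)."""
    lowers = [b0] + [b for b, _ in pairs[:-1]]
    total = sum(w for _, w in pairs)
    return sum(0.5 * (lo + hi) * w for lo, (hi, w) in zip(lowers, pairs)) / total


def histogram_pdf(x, b0, pairs):
    """w_k / ((b_k - b_{k-1}) * sum(w)) for b_{k-1} <= x < b_k, and 0 elsewhere."""
    bounds = [b0] + [b for b, _ in pairs]
    if not bounds[0] <= x < bounds[-1]:
        return 0.0
    k = bisect.bisect_right(bounds, x)  # smallest k with x < b_k
    total = sum(w for _, w in pairs)
    return pairs[k - 1][1] / ((bounds[k] - bounds[k - 1]) * total)


def histogram_sample(b0, pairs, rng=random):
    """Two independent draws: an interval by weight, then a uniform value in it."""
    bounds = [b0] + [b for b, _ in pairs]
    weights = [w for _, w in pairs]
    i = rng.choices(range(1, len(bounds)), weights=weights)[0]
    # Note: floating-point rounding may occasionally return the upper bound itself.
    return rng.uniform(bounds[i - 1], bounds[i])


def cdf_to_histogram(cdf_pairs):
    """(p_i, b_i) pairs -> (b_i, w_i) pairs with w_i = p_i - p_{i-1}, p_0 = 0."""
    out, prev = [], 0.0
    for p, b in cdf_pairs:
        out.append((b, p - prev))
        prev = p
    return out


def midpoints_to_histogram(dpd_pairs, b0=0.0):
    """(d_i, m_i) pairs -> (b_i, d_i) pairs with b_i = b_{i-1} + 2(m_i - b_{i-1})."""
    out, prev = [], b0
    for d, m in dpd_pairs:
        prev = prev + 2 * (m - prev)
        out.append((prev, d))
    return out
```

For example, with `b0 = 0.0` and `pairs = [(1.0, 1.0), (3.0, 3.0)]`, the interval :math:`[0, 1)` carries a quarter of the mass and :math:`[1, 3)` the rest, so `histogram_pdf(1.0, b0, pairs)` falls in the second interval, consistent with the right-exclusive bounds introduced in patch 4.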