ZipfDistribution

class ZipfDistribution(val numberOfElements: Int, val exponent: Double) : DiscreteDistribution

Represents the Zipf distribution (finite support variant), a discrete power-law distribution over the integers 1, 2, ..., numberOfElements.

The probability of observing rank k is proportional to 1/k^exponent. This distribution models phenomena where a few items are very frequent and the rest are increasingly rare, such as word frequencies in natural language (Zipf's law), city population sizes, and website traffic distributions.

This is the finite-support parameterization matching Apache Commons Math: all moments are finite and the CDF is an exact finite sum. The normalization constant is the generalized harmonic number H(numberOfElements, exponent).
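The definition above can be written out directly in terms of the generalized harmonic number. The following is a minimal standalone sketch, not the library's implementation; the helper names `generalizedHarmonic`, `zipfPmf`, and `zipfCdf` are hypothetical and exist only for illustration.

```kotlin
import kotlin.math.pow

// Hypothetical helper: generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k^s.
fun generalizedHarmonic(n: Int, s: Double): Double =
    (1..n).sumOf { k -> 1.0 / k.toDouble().pow(s) }

// PMF under the definition above: P(X = k) = (1 / k^s) / H(n, s) for k in 1..n, else 0.
fun zipfPmf(k: Int, n: Int, s: Double): Double =
    if (k < 1 || k > n) 0.0 else (1.0 / k.toDouble().pow(s)) / generalizedHarmonic(n, s)

// CDF as an exact finite sum: P(X <= k) = H(k, s) / H(n, s), clamped outside the support.
fun zipfCdf(k: Int, n: Int, s: Double): Double = when {
    k < 1 -> 0.0
    k >= n -> 1.0
    else -> generalizedHarmonic(k, s) / generalizedHarmonic(n, s)
}

fun main() {
    println(zipfPmf(1, 10, 1.0)) // approximately 0.3414
    println(zipfCdf(5, 10, 1.0)) // approximately 0.7796
}
```

This sketch recomputes the normalization constant on every call for clarity; an actual implementation would presumably cache it at construction time.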

Example:

val dist = ZipfDistribution(numberOfElements = 10, exponent = 1.0)
dist.pmf(1) // 0.3414 (most probable rank)
dist.pmf(10) // 0.0341 (least probable rank)
dist.cdf(5) // 0.7796
dist.mean // 3.4142
dist.quantileInt(0.5) // 2 (median)
dist.sample(Random(42)) // a single random draw

Constructors

constructor(numberOfElements: Int, exponent: Double)

Properties

open override val entropy: Double

The Shannon entropy of this distribution in nats, computed by summing over the entire support.

val exponent: Double

The exponent characterizing the distribution. Must be positive.

open override val kurtosis: Double

The excess kurtosis (Fisher definition) of this distribution.

open override val mean: Double

The mean of this distribution: H(n, s-1) / H(n, s).

val numberOfElements: Int

The number of elements (upper bound of the support). Must be at least 1.

open override val skewness: Double

The skewness of this distribution.

open override val variance: Double

The variance of this distribution.
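To make the moment properties concrete, the mean, variance, and entropy can be sketched from the generalized harmonic number alone. This is an illustrative approximation under the definitions on this page, not the library's code; `harmonic`, `zipfMean`, `zipfVariance`, and `zipfEntropy` are hypothetical names, with n standing for numberOfElements and s for exponent.

```kotlin
import kotlin.math.ln
import kotlin.math.pow

// Hypothetical helper: generalized harmonic number H(n, s) = sum over k = 1..n of k^(-s).
fun harmonic(n: Int, s: Double): Double =
    (1..n).sumOf { k -> k.toDouble().pow(-s) }

// Mean as documented: H(n, s - 1) / H(n, s).
fun zipfMean(n: Int, s: Double): Double = harmonic(n, s - 1.0) / harmonic(n, s)

// Variance via E[X^2] - E[X]^2, where E[X^2] = H(n, s - 2) / H(n, s).
fun zipfVariance(n: Int, s: Double): Double {
    val mean = zipfMean(n, s)
    return harmonic(n, s - 2.0) / harmonic(n, s) - mean * mean
}

// Shannon entropy in nats: minus the sum of p_k * ln(p_k) over the support 1..n.
fun zipfEntropy(n: Int, s: Double): Double {
    val h = harmonic(n, s)
    return -(1..n).sumOf { k ->
        val p = k.toDouble().pow(-s) / h
        p * ln(p)
    }
}

fun main() {
    println(zipfMean(10, 1.0)) // approximately 3.4142
    println(zipfVariance(10, 1.0))
    println(zipfEntropy(10, 1.0))
}
```

Skewness and excess kurtosis follow the same pattern using the third and fourth raw moments, H(n, s - 3) / H(n, s) and H(n, s - 4) / H(n, s).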

Functions

open override fun cdf(k: Int): Double

Returns the cumulative distribution function value at k.

open override fun logPmf(k: Int): Double

Returns the natural logarithm of the probability mass at k.

open override fun pmf(k: Int): Double

Returns the probability mass at k, the probability of observing rank k.

open override fun quantileInt(p: Double): Int

Returns the quantile (inverse CDF) for the given probability p as an Int.

open override fun sample(random: Random): Int

Draws a single random value from this Zipf distribution using inverse transform sampling.

open override fun sf(k: Int): Double

Returns the survival function value at k, equal to 1 - cdf(k).
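The relationship between quantileInt and sample can be illustrated with a small standalone sketch of inverse transform sampling. It assumes the quantile is the smallest rank whose cumulative probability reaches p; the library's exact edge-case behavior (for example at p = 0 or p = 1) is not stated on this page, and `zipfQuantile` and `zipfSample` are hypothetical names.

```kotlin
import kotlin.math.pow
import kotlin.random.Random

// Hypothetical helper: smallest k in 1..n whose cumulative probability reaches p.
fun zipfQuantile(p: Double, n: Int, s: Double): Int {
    require(p in 0.0..1.0) { "p must be in [0, 1]" }
    // Normalization constant H(n, s).
    val norm = (1..n).sumOf { k -> k.toDouble().pow(-s) }
    var cumulative = 0.0
    for (k in 1..n) {
        cumulative += k.toDouble().pow(-s) / norm
        if (cumulative >= p) return k
    }
    return n // guards against floating-point round-off when p is close to 1
}

// Inverse transform sampling: draw u uniformly from [0, 1) and map it through the quantile function.
fun zipfSample(random: Random, n: Int, s: Double): Int =
    zipfQuantile(random.nextDouble(), n, s)

fun main() {
    println(zipfQuantile(0.5, 10, 1.0)) // 2, the median for n = 10, s = 1
    println(zipfSample(Random(42), 10, 1.0)) // a single draw in 1..10
}
```

The linear scan costs O(n) per draw; for a large numberOfElements, a precomputed cumulative table with binary search would be a natural alternative.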