Tuesday, June 25, 2019

Relative frequency distributions of multisets

Let $A$ be a multiset. Then we can add $A$ to itself repeatedly to get $A+A$, $A+A+A$, and so on. The multisets {x,x,y},{x,x,x,x,y,y}, and {x,x,x,x,x,x,y,y,y} for example are all produced by repeatedly adding some multiset together. This means that they are equivalent to scale. The multisets {a,b,c,d,d} and {a,a,b,b,c,c,d,d,d,d} are also equivalent to scale for the same reason. This produces an equivalence relation of multisets. We can use the concept of relative frequencies to characterize this equivalence relation of multisets.

Definition: the relative frequency of some element of the support of a multiset is equal to $m/n$ where $m$ is equal to the multiplicity of the element and $n$ is equal to the cardinality of the multiset. The relative frequency distribution is the support set of the multiset with each element with each element weighted by its relative frequencies.

Definition: the exponent of a multiset is equal to the greatest common divisor of its set of multiplicities.

Using the concept of relative frequencies so defined, we can see that the relative frequency distribution of the first class of multisets is {x 2/3, y 1/3} and the relative frequency distribution of the second class of multisets is {a 1/5, b 1/5, c 1/5, d 2/5}. Multisets with a given distribution can be scaled in an infinite number of ways to get different multisets. These different multisets all belong to the same equivalence class. The multiset exponent describes the scale of the multiset.

Theorem: the product function from a multiset to the pair (relative frequency distribution, exponent) is reversible.

Proof: suppose that we have a pair (relative frequency distribution, exponent) then we can construct the corresponding multiset by scaling each relative frequency of the multiset by $lcm(D)*n$ where $D$ is the set of denominators of the relative frequency distribution and $n$ is the exponent. To see this consider first what happens when we scale the relative frequency distribution by $lcm(D)$. What we want to show is that this produces a coprime multiset, that is a multiset with exponent one.

Firstly, in order for the resulting set of multiplicities to be integers they must be multiplied by a common divisor of the set of denominators of the relative frequency distribution. If they are not then there is some element in the denominator that will not be multiplied by a value that is a multiple of it, so the resulting multiset will not consist of integers. But the least common multiple (LCM), is the smallest common multiple of the integers in the divisibility lattice which means the gcd of the resulting multiplicities will be one because if it wasn't then it wouldn't be the least common multiple but rather a greater common multiple. To then get the proper multiset scale that further by the multiset exponent in the second part of the pair so that the gcd is $n$.

This product function isn't only one to one it is a complete destructuring of multisets because the equivalence class of multisets up to scale and the equivalence class of multisets with the same exponent are both independent of one another in the lattice of equivalence relations related by coarsification and refinement. This makes it especially valuable type of reversible function.

Postulate: the regularity, antiregularity, and the order of a multiset are determined by its relative frequency distribution

Actually, these are all determined by the support partition of the multiset which is the partition of the support of the multiset by equivalent multiplicites. Two relative frequencies of a given multiset are only equal if their corresponding multiplicities are because $m_1/n = m_2/n$ can be simplified by multiplying both sides by $n$ to get $m_1=m_2$ so all the properties of the support partition are preserved by the relative frequency distribution. Regular multisets produce regular distributions and antiregular multisets produce antiregular distributions.

No comments:

Post a Comment