Wednesday, April 10, 2019

Core data structures

The core data structures of any reasoning system capable of dealing with symbolic expressions will be relatively Lisp-like: it will have lists and related collections as well as atoms. I will talk about these data structures and how they fit into a mathematical context. In the purest sense of the original Lisp programming can be made to be virtually indistinguishable from mathematics.

Collection data structures:
The most important collection structures are sets, multisets, and lists. All of these can be described mathematically as corresponding to a collection monoid. The collection monoid for sets is union, for multisets it is sum, and for lists it is concatenation. A partial order emerges from each of these monoids, describing how each collection can be decomposed and recombined in the collection monoid. These are the subsets, submultisets, and consecutive subsequences ordering.
  • Set
  • Multisets
  • Lists
All of these things are mathematical data structures which are also used in computation. Lists may have multiple implementations like vectors and sequences, but then we can get their meaning with a mathematical equality relation. A vector and a sequence with the same values are mathematically equal if they have the same elements. As a consequence, we can get to the mathematical meaning of lists. Multisets are underused, but they are worth including as well because they have a monoid associated with them.

As I described previously, ontologically these collection data structures belong to the broader class of structural multisets. Actually, all of these collection structures are partially ordered multisets or partitioned posets. The isomorphism type of a multiset is the isomorphism type of a partition of a set, and the isomorphism type of a list is the isomorphism type of the partition of a total order.

One thing that is worth commenting on is the fact that hash maps are not listed here. Clojure implements these and hash a literal syntax for them. Mathematically, however, they can be better understood as special types of binary relations which can already be described with sets and lists. Then hash maps belong to the broader mathematical ontological category of binary relations, which is well understood. Hash maps are a specialized binary relation data structure. When you call seq on one you even get a collection of ordered pairs.

Atomic data structures:
There isn't much to comment on here as the atomic data structures are roughly what you get from Clojure and related languages. There are two types of atomic data structures: the numeric data structures and the text related data structures. In order to get the mathematical meaning from a numeric data structure, it is necessary again to have a mathematical equivalence relation so that a floating point number with the same value is equal to a rational with the same value. Text related data structures like symbols have no mathematical meaning here, but they are going to be useful eventually for the symbolic representation of data structures like polynomials.

No comments:

Post a Comment