Trie
Functional maps
Functional maps (and sets) whose representation is "canonical", and independent of their operation history (unlike other popular search trees).
Background
The representation we use here comes from Section 6 of ["Incremental computation via function caching", Pugh & Teitelbaum](https://dl.acm.org/citation.cfm?id=75305).
AssocList
type AssocList<K, V> = AssocList.AssocList<K, V>
Key
type Key<K> = { hash : Hash.Hash; key : K }
isValid
func isValid<K, V>(t : Trie<K, V>, enforceNormal : Bool) : Bool
checks the invariants of the trie structure, including the placement of keys at trie paths
empty
func empty<K, V>() : Trie<K, V>
An empty trie.
size
func size<K, V>(t : Trie<K, V>) : Nat
Get the number of key-value pairs in the trie, in constant time.
merge
merge tries, preferring the right trie where there are collisions in common keys.
note: the `disj` operation generalizes this `merge` operation in various ways, and does not (in general) lose information; this operation is a simpler, special case.
See also:
-
disj -
join -
prod
disj
This operation generalizes the notion of "set union" to finite maps.
Produces a "disjunctive image" of the two tries, where the values of matching keys are combined with the given binary operator.
For unmatched key-value pairs, the operator is still applied to create the value in the image. To accomodate these various situations, the operator accepts optional values, but is never applied to (null, null).
Implements the database idea of an ["outer join"](https://stackoverflow.com/questions/38549/what-is-the-difference-between-inner-join-and-outer-join).
See also:
-
join -
merge -
prod
join
This operation generalizes the notion of "set intersection" to finite maps. Produces a "conjuctive image" of the two tries, where the values of matching keys are combined with the given binary operator, and unmatched key-value pairs are not present in the output.
Implements the database idea of an ["inner join"](https://stackoverflow.com/questions/38549/what-is-the-difference-between-inner-join-and-outer-join).
See also:
-
disj -
merge -
prod
foldUp
func foldUp<K, V, X>(t : Trie<K, V>, bin : (X, X) -> X, leaf : (K, V) -> X, empty : X) : X
This operation gives a recursor for the internal structure of tries. Many common operations are instantiations of this function, either as clients, or as hand-specialized versions (e.g., see , map, mapFilter, some and all below).
prod
Conditional catesian product, where the given
operation op conditionally creates output elements in the
resulting trie.
The keyed structure of the input tries are not relevant for this operation: all pairs are considered, regardless of keys matching or not. Moreover, the resulting trie may use keys that are unrelated to these input keys.
See also:
-
disj -
join -
merge
Build
let Build
Represent the construction of tries as data.
This module provides optimized variants of normal tries, for more efficient join queries.
The central insight is that for (unmaterialized) join query results, we do not need to actually build any resulting trie of the resulting data, but rather, just need a collection of what would be in that trie. Since query results can be large (quadratic in the DB size), avoiding the construction of this trie provides a considerable savings.
To get this savings, we use an ADT for the operations that would build this trie, if evaluated. This structure specializes a rope: a balanced tree representing a sequence. It is only as balanced as the tries from which we generate these build ASTs. They have no intrinsic balance properties of their own.
fold
func fold<K, V, X>(t : Trie<K, V>, f : (K, V, X) -> X, x : X) : X
Fold over the key-value pairs of the trie, using an accumulator. The key-value pairs have no reliable or meaningful ordering.
some
func some<K, V>(t : Trie<K, V>, f : (K, V) -> Bool) : Bool
Test whether a given key-value pair is present, or not.
all
func all<K, V>(t : Trie<K, V>, f : (K, V) -> Bool) : Bool
Test whether all key-value pairs have a given property.
toArray
func toArray<K, V, W>(t : Trie<K, V>, f : (K, V) -> W) : [W]
Gather the collection of key-value pairs into an array of a (possibly-distinct) type.
Implementation notes:
we use this function repeatedly in the Produce Exchange example application, often on very large tries.
Performance Profiling shows that it is important that this be memory efficient, and reasonably time efficient, at large scales.
To do so, we use a single array allocation (for the returned array) and we
sacrifice some efficiency in reading the input trie, and instead use function nth to
project each element with an independent trie traversal.
This approach is somewhat forced on us by the type signature of A.tabulate, and the desire to only allocate one array; that requirement rules out iterative mutation of an optionally-null array, since an imperative approach which would give us the wrong return type.
Since we want to statically rule out null output elements, and since the AS type system
cannot do that for an imperative approach unless we assume more about
the type W (e.g., the existence of "default values"), we settle for using nth.
isEmpty
func isEmpty<K, V>(t : Trie<K, V>) : Bool
Test for "deep emptiness": subtrees that have branching structure, but no leaves. These can result from naive filtering operations; filter uses this function to avoid creating such subtrees.
equalStructure
Test for equality, but naively, based on structure.
Does not attempt to remove "junk" in the tree;
For instance, a "smarter" approach would equate
#bin {left = #empty; right = #empty}
with
#empty.
We do not observe that equality here.