Generic top-down discrimination for sorting and partitioning in linear time
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Generic top-down discrimination for sorting and partitioning in linear time. / Henglein, Fritz.
In: Journal of Functional Programming, Vol. 22, No. 3 , 2012, p. 300-374.Research output: Contribution to journal › Journal article › Research › peer-review
Harvard
APA
Vancouver
Author
Bibtex
}
RIS
TY - JOUR
T1 - Generic top-down discrimination for sorting and partitioning in linear time
AU - Henglein, Fritz
PY - 2012
Y1 - 2012
N2 - We introduce the notion of discrimination as a generalization ofboth sorting and partitioning and show that discriminators (discriminationfunctions) can be dened generically, by structural recursionon order and equivalence expressions denoting a rich class of total preordersand equivalence relations, respectively.Discriminators improve the asymptotic performance of generic comparison-based sorting and partitioning, yet do not expose more informationthan the underlying ordering relation, respectively equivalence.For a large class of order and equivalence expressions, including allstandard orders for rst-order recursive types, the discriminators executein worst-case linear time.The generic discriminators can be coded compactly using list comprehensions,with order expressions specied using Generalized AlgebraicData Types (GADTs). We give some examples of the uses of discriminators,including a new most-signicant-digit lexicographic sortingalgorithm and type isomorphism with an associative-commutativeoperator. Full source code of discriminators and their applications isincluded.1We argue discriminators should be basic operations for primitiveand abstract types with equality. The basic multiset discriminator forreferences, originally due to Paige et al., is shown to be both ecientand fully abstract: it nds all duplicates of all references occurringin a list in linear time without leaking information about their representation.In particular, it behaves deterministically in the presenceof garbage collection and nondeterministic heap allocation even whenreferences are represented as raw machine addresses. In contrast, havingonly a binary equality test as in ML requires (n2) time, andallowing hashing for performance reasons as in Java, makes executionnondeterministic and complicates garbage collection.
AB - We introduce the notion of discrimination as a generalization ofboth sorting and partitioning and show that discriminators (discriminationfunctions) can be dened generically, by structural recursionon order and equivalence expressions denoting a rich class of total preordersand equivalence relations, respectively.Discriminators improve the asymptotic performance of generic comparison-based sorting and partitioning, yet do not expose more informationthan the underlying ordering relation, respectively equivalence.For a large class of order and equivalence expressions, including allstandard orders for rst-order recursive types, the discriminators executein worst-case linear time.The generic discriminators can be coded compactly using list comprehensions,with order expressions specied using Generalized AlgebraicData Types (GADTs). We give some examples of the uses of discriminators,including a new most-signicant-digit lexicographic sortingalgorithm and type isomorphism with an associative-commutativeoperator. Full source code of discriminators and their applications isincluded.1We argue discriminators should be basic operations for primitiveand abstract types with equality. The basic multiset discriminator forreferences, originally due to Paige et al., is shown to be both ecientand fully abstract: it nds all duplicates of all references occurringin a list in linear time without leaking information about their representation.In particular, it behaves deterministically in the presenceof garbage collection and nondeterministic heap allocation even whenreferences are represented as raw machine addresses. In contrast, havingonly a binary equality test as in ML requires (n2) time, andallowing hashing for performance reasons as in Java, makes executionnondeterministic and complicates garbage collection.
U2 - 10.1017/S0956796812000160
DO - 10.1017/S0956796812000160
M3 - Journal article
VL - 22
SP - 300
EP - 374
JO - Journal of Functional Programming
JF - Journal of Functional Programming
SN - 0956-7968
IS - 3
ER -
ID: 15294694