Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1411204.1411220acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
research-article

Generic discrimination: sorting and paritioning unshared data in linear time

Published: 20 September 2008 Publication History

Abstract

We introduce the notion of discrimination as a generalization of both sorting and partitioning and show that worst-case linear-time discrimination functions (discriminators) can be defined generically, by (co-)induction on an expressive language of order denotations. The generic definition yields discriminators that generalize both distributive sorting and multiset discrimination. The generic discriminator can be coded compactly using list comprehensions, with order denotations specified using Generalized Algebraic Data Types (GADTs). A GADT-free combinator formulation of discriminators is also given.
We give some examples of the uses of discriminators, including a new most-significant-digit lexicographic sorting algorithm.
Discriminators generalize binary comparison functions: They operate on n arguments at a time, but do not expose more information than the underlying equivalence, respectively ordering relation on the arguments. We argue that primitive types with equality (such as references in ML) and ordered types (such as the machine integer type), should expose their equality, respectively standard ordering relation, as discriminators: Having only a binary equality test on a type requires Θ(n2) time to find all the occurrences of an element in a list of length n, for each element in the list, even if the equality test takes only constant time. A discriminator accomplishes this in linear time. Likewise, having only a (constant-time) comparison function requires Θ(n log n) time to sort a list of n elements. A discriminator can do this in linear time.

Supplementary Material

JPG File (1411220.jpg)
index.html (index.html)
Slides from the presentation
ZIP File (p91-slides.zip)
Supplemental material for: Generic discrimination: sorting and paritioning unshared data in linear time
Audio only (1411220.mp3)
Video (1411220.mp4)

References

[1]
M. Ajtai, J. Komlos, and E. Szemeredi. Sorting in c log n parallel steps. Combinatorica, 3:1--19, 1983.
[2]
Thomas Ambus. Multiset discrimination for internal and external data management. Master's thesis, DIKU, University of Copenhagen, July 2004. http://plan-x.org/projects/msd/msd.pdf.
[3]
A. Andersson and S. Nilsson. A new efficient radix sort. In Proc. 35th Anniual IEEE Symposium on Foundations of Computer Science (FOCS), pages 714--721, 1994.
[4]
Arne Andersson and Stefan Nilsson. Implementing radixsort. J. Exp. Algorithmics, 3:7, 1998. ISSN 1084-6654.
[5]
Arne Andersson, Torben Hagerup, Stefan Nilsson, and Rajeev Raman. Sorting in linear time? Journal of Computer and System Sciences (JCSS), 57(1):74--93, August 1998.
[6]
K. E. Batcher. Sorting networks and their applications. In Proc. AFIPS Spring Joint Computer Conference, volume 32, pages 307--314, 1968.
[7]
Jon Bentley. Aha! Algorithms. Communications of the ACM, 26(9):623--627, September 1983. Programming Pearls.
[8]
J. Cai and R. Paige. Look ma, no hashing, and no arrays neither. In Jan., editor, Proc. 18th Annual ACM Symp. on Principles of Programming Languages (POPL), Orlando, Florida, pages 143--154, 1991.
[9]
Jiazhen Cai and Robert Paige. Using multiset discrimination to solve language processing problems without hashing. Theoretical Computer Science (TCS), 145(1--2), July 1995.
[10]
Gianni Franceschini, S. Muthukrishnan, and Mihai Patrascu. Radix sorting with no extra space. In Proc. European Symposium on Algorithms (ESA), volume 4698 of Lecture Notes in Computer Science (LNCS), pages 194--205. Springer, 2007.
[11]
M.L. Fredman and D.E. Willard. Surpassing the information-theoretic bound with fusion trees. Journal of Computer and System Sciences (JCSS), 47:424--436, 1993.
[12]
Glasgow Haskell. The Glasgow Haskell Compiler. http://www.haskell.org/ghc/, 2005.
[13]
Yijie Han and Mikkel Thorup. Integer sorting in o(n√log log n expected time and linear space. In Proceedings of the 43d Annual IEEE Sympositum on Foundations of Computer Science (FOCS), pages 135--144. IEEE Computer Society, 2002.
[14]
Fritz Henglein. Multiset discrimination. Unpublished manuscript. See http://plan-x.org/msd/multiset-discrimination.pdf, September 2003.
[15]
Fritz Henglein. A language for total preorders. Unfinished manuscript, March 2008.
[16]
Fritz Henglein and Jesper Jørgensen. Formally optimal boxing. In Proc. 21st ACM Symp. on Principles of Programming Languages (POPL), Portland, Oregon, P.O.Box 64145, Baltimore, MD 21264, Jan. 1994. ACM, ACM Press.
[17]
Ralf Hinze. Generalizing generalized tries. Journal of Functional Programming, 10(4):327--351, 2000.
[18]
C. A. R. Hoare. Algorithm 63: partition. Commun. ACM, 4(7):321, 1961. ISSN 0001-0782.
[19]
Johan Jeuring and Patrik Jansson. Polytypic programming. In Advanced Functional Programming, Lecture Notes in Computer Science, pages 68--114. Springer-Verlag, 1996.
[20]
Sian Jha, Jens Palsberg, Tian Zhao, and Fritz Henglein. Efficient type matching. In Olivier Danvy, Fritz Henglein, Harry Mairson, and Alberto Pettorossi, editors, Automatic Program Development-A Tribute to Robert Paige. Springer, 2008. ISBN 978-1-4020-6584-2.
[21]
Donald Knuth. The Art of Computer Programming: Sorting and Searching, volume 3. Addison Wesley, 2nd edition, 1998.
[22]
K. Mehlhorn. Data Structures and Algorithms 1: Sorting and Searching, volume I of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1984.
[23]
R. Paige. Optimal translation of user input in dynamically typed languages. Draft, July 1991.
[24]
Robert Paige. Efficient translation of external input in a dynamically typed language. In Proc. 13th World Computer Congress. Elsevier, February 1994.
[25]
Robert Paige and Robert E. Tarjan. Three partition refinement algorithms. SIAM Journal of Computing, 16(6):973--989, December 1987.
[26]
Robert Paige and Zhe Yang. High level reading and data structure compilation. In Proc. 24th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL), Paris, France, pages 456--469, http://www.acm.org, January 1997. ACM, ACM Press.
[27]
D. L. Shell. A high-speed sorting procedure. Communications of the ACM, 2(7), 1959.
[28]
Ranjan Sinha and Justin Zobel. Efficient trie-based sorting of large sets of strings. In Michael Oudshoorn, editor, Proc. 26th Australasian Computer Science Conference (ACSC), Adelaide, Australia, volume 16 of Conferences in Research and Practice in Information Technology, 2003.
[29]
J. W. J. Williams. Algorithm 232 - heapsort. Communications of the ACM, 7(6):347--348, 1964.
[30]
Yoav Zibin, Joseph Gil, and Jeffrey Considine. Efficient algorithms for isomorphisms of simple types. In Proc. 2003 ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 160--171. ACM, ACM Press, January 2003. SIGPLAN Notices, Vol. 38, No. 1.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICFP '08: Proceedings of the 13th ACM SIGPLAN international conference on Functional programming
September 2008
422 pages
ISBN:9781595939197
DOI:10.1145/1411204
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 43, Issue 9
    ICFP '08
    September 2008
    399 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1411203
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 September 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. discrimination
  2. discriminator
  3. equivalence
  4. functional
  5. generic
  6. multiset discrimination
  7. order
  8. partitioning
  9. sorting
  10. total preorder

Qualifiers

  • Research-article

Conference

ICFP08
Sponsor:

Acceptance Rates

Overall Acceptance Rate 333 of 1,064 submissions, 31%

Upcoming Conference

ICFP '25
ACM SIGPLAN International Conference on Functional Programming
October 12 - 18, 2025
Singapore , Singapore

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)12
  • Downloads (Last 6 weeks)4
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2018)Generic top-down discrimination for sorting and partitioning in linear time*Journal of Functional Programming10.1017/S095679681200016022:3(300-374)Online publication date: 24-Dec-2018
  • (2018)Generic multiset programming with discrimination-based joins and symbolic Cartesian productsHigher-Order and Symbolic Computation10.1007/s10990-011-9078-823:3(337-370)Online publication date: 14-Dec-2018
  • (2013)Sorting and Searching by DistributionProceedings of the 11th Asian Symposium on Programming Languages and Systems - Volume 830110.1007/978-3-319-03542-0_23(315-332)Online publication date: 9-Dec-2013
  • (2010)Generic multiset programming for language-integrated queryingProceedings of the 6th ACM SIGPLAN workshop on Generic programming10.1145/1863495.1863503(49-60)Online publication date: 26-Sep-2010
  • (2010)Optimizing relational algebra operations using generic equivalence discriminators and lazy productsProceedings of the 2010 ACM SIGPLAN workshop on Partial evaluation and program manipulation10.1145/1706356.1706372(73-82)Online publication date: 19-Jan-2010
  • (2010)Reasoning about CodataCentral European Functional Programming School10.1007/978-3-642-17685-2_3(42-93)Online publication date: 2010
  • (2009)Reasoning about codataProceedings of the Third summer school conference on Central European functional programming school10.5555/1939128.1939131(42-93)Online publication date: 21-May-2009
  • (2009)What is a Sorting Function?The Journal of Logic and Algebraic Programming10.1016/j.jlap.2008.12.00378:7(552-572)Online publication date: Aug-2009

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media