In search of clusters (2nd ed.)

January 1998

Author:
Gregory F. Pfister
IBM, Austin, TX

Publisher:

Prentice-Hall, Inc.
Division of Simon and Schuster, One Lake Street, Upper Saddle River, NJ
United States

ISBN: 978-0-13-899709-0

Published: 01 January 1998

Pages: 578

Contributors

Gregory F Pfister
International Business Machines
- Publication Years: 1987 - 2005
- Publication counts: 10
- Citation count: 135
- Available for Download: 1
- Downloads (cumulative): 244
- Downloads (12 months): 36
- Downloads (6 weeks): 11
- Average Downloads per Article: 244
- Average Citation per Article: 14

Reviews

Reviewer: Jason Gait

A true cluster is built around a single system image, a cluster-wide realization of services that are necessary to make the cluster an application platform, such as a clock, a file system, high-performance intermachine communication, interprocessor synchronization, caching, work queues for load balancing, a unified name space for all resources, the ability to administer the cluster as though it were a single machine, and transaction logging. A true cluster is a bunch of sheep decked out so the flock looks and acts like a wolf, and thinks it is one. The author describes one way of realizing a single system image, as used by IBM, that designates a particular machine as an initial “floating master” responsible for maintaining the single system image, and fails over the master functionality to another machine in the cluster when necessary.

It is hard to find cluster-ready applications. The only application I know of that is really cluster-friendly is database or transaction processing. The highly organized database partitions naturally, so it is easy to execute queries in parallel, with each machine in the cluster working on a piece of the database. It is as though clusters and databases were made for each other.

With the current proliferation of symmetric multiprocessing (SMP) machines, together with the relative rarity of clusters, the proponents of clusters are becoming defensive. The author sets out to prove that there are reasons why the present favors clusters over SMPs and that the sooner we all recognize these reasons, the sooner clusters will drive SMPs to oblivion.

The first reason advanced is that faster processors and relatively slower off-the-shelf memory lead to SMP architectures that do not scale across more than two processors. I do not buy this because any machine architect upgrading to faster processors is going to find a way to upgrade to faster memory at the same time: that is what the NUMA architecture is all about.

The second reason is that high-speed interconnects have become common and cheap, making clusters easier to design and cheaper. I do not give this much weight either. Serious clusters have always sported high-speed interconnects whose cost was commensurate with the cost of the cluster. The cost of a cluster is only a small part of the cost of computing.

The third reason is that tools for distributed computing have become ubiquitous, a positive development for clusters. One of the examples advanced by the author of a new tool for distributed computing is TCP, which has been around since the mid-1970s.

Fourth, the market needs high availability to support rapidly growing markets in database warehousing and Web service. Here is where the cluster shines: high availability is the design center. The author makes a good case here. Businesses are recentralizing resources to regain control, the Internet is a 24-hour-a-day, seven-day-a-week window to the world, and service must be provided continuously. A critical aspect of cluster behavior is graceful failover when a machine leaves the cluster and equally graceful task redistribution when a machine is added to the cluster. The chapter on high availability via failover is the high point of the book. The tricky part about failover is dealing with false alarms, when the cluster thinks a machine is dead but it is not. After failover, there are two machines beating against one another to perform the same task, perhaps disastrously. One fix is to convert a potential false alarm to the real thing by disconnecting an apparently failed machine from the cluster.

In contrast to a generally lively and authoritative, if long-winded, description of clusters, Pfister provides a simplistic and uninteresting overview of SMP and NUMA machines; he does not like them, and it shows. The book's flaw is bad editing. The author is exuberant and was allowed free rein for more than 500 pages. A good editor would have kept the book down to 200-odd pages, making it much better. In the end, the book is fun to read but is a weak technical contribution.
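
The false-alarm fix mentioned in the review (disconnecting an apparently failed machine before its work is taken over) can be illustrated with a minimal Python sketch. It is not taken from the book or from any IBM product: `Node`, `Cluster`, the heartbeat timeout, and the least-loaded reassignment rule are all hypothetical names and choices made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    last_heartbeat: float = 0.0
    connected: bool = True              # False once the node has been fenced
    tasks: set = field(default_factory=set)


class Cluster:
    """Toy failover model: a silent node is fenced before its tasks move."""

    def __init__(self, timeout):
        self.timeout = timeout          # seconds of silence tolerated (assumed policy)
        self.nodes = {}
        self.unassigned = []            # tasks waiting for a surviving node

    def add(self, name, tasks=(), now=0.0):
        self.nodes[name] = Node(name, now, True, set(tasks))

    def heartbeat(self, name, now):
        node = self.nodes[name]
        if node.connected:              # a fenced node can no longer report in
            node.last_heartbeat = now

    def check(self, now):
        """Fence every silent node, then hand its tasks to survivors."""
        suspects = [n for n in self.nodes.values()
                    if n.connected and now - n.last_heartbeat > self.timeout]
        for node in suspects:
            # Fence first: whether the node is dead or merely slow, it is now
            # out of the cluster, so it cannot compete for the same tasks.
            node.connected = False
            self.unassigned.extend(sorted(node.tasks))
            node.tasks.clear()

        survivors = [n for n in self.nodes.values() if n.connected]
        moved = {}
        while self.unassigned and survivors:
            task = self.unassigned.pop(0)
            # Illustrative rule: give the task to the least-loaded survivor.
            target = min(survivors, key=lambda n: (len(n.tasks), n.name))
            target.tasks.add(task)
            moved[task] = target.name
        return moved


cluster = Cluster(timeout=3.0)
cluster.add("a", tasks={"db-1", "db-2"})
cluster.add("b", tasks={"web-1"})
cluster.heartbeat("b", now=4.0)         # "a" stays silent: dead, or only slow
moved = cluster.check(now=5.0)          # "a" is fenced, its tasks go to "b"
cluster.heartbeat("a", now=6.0)         # a late heartbeat from "a" is ignored
```

The ordering inside `check` is the point of the sketch: the suspect is disconnected before any task is reassigned, so a false alarm cannot leave two machines performing the same task.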

Recommendations

An approach to reshaping clusters for nearest neighbor search
IDEAL'12: Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning

In this paper, we present our research on similarity search and clustering problems. Similarity search problems define the distances between data points and a given query point Q, efficiently and effectively selecting data points which are closest to Q. ...

Unsupervised fuzzy clustering with multi-center clusters
Clustering and modeling

A new unsupervised fuzzy clustering algorithm is provided in this paper to cluster the data patterns without a priori information about the number of clusters. The initial guesses of the locations of the cluster centers or the initial guesses of the ...

Labeling of Web Search Result Clusters Using Heuristic Search and Frequent Itemset

Clustering of search result is undoubtedly a tool that can provide the summarization of the millions of documents in a way where a user can easily locate his/her information. To guide user to the right cluster of documents, cluster labels should ...
