Database Internals
A Deep Dive into How
Distributed Data Systems Work

Alex Petrov

Beijing Boston Farnham Sebastopol Tokyo


Database Internals
by Alex Petrov
Copyright © 2019 Oleksandr Petrov. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional
sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Mike Loukides
Development Editor: Michele Cronin
Production Editor: Christopher Faucher
Copyeditor: Kim Cofer
Proofreader: Sonia Saruba
Indexer: Judith McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

October 2019: First Edition

Revision History for the First Edition


2019-09-12: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492040347 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Database Internals, the cover image,
and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher’s views.
While the publisher and the author have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the author disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.

978-1-492-04034-7
[MBP]
To Pieter Hintjens, from whom I got my first ever signed book:
an inspiring distributed systems programmer, author, philosopher, and friend.
Table of Contents

Preface

Part I. Storage Engines

1. Introduction and Overview
DBMS Architecture
Memory- Versus Disk-Based DBMS
Durability in Memory-Based Stores
Column- Versus Row-Oriented DBMS
Row-Oriented Data Layout
Column-Oriented Data Layout
Distinctions and Optimizations
Wide Column Stores
Data Files and Index Files
Data Files
Index Files
Primary Index as an Indirection
Buffering, Immutability, and Ordering
Summary

2. B-Tree Basics
Binary Search Trees
Tree Balancing
Trees for Disk-Based Storage
Disk-Based Structures
Hard Disk Drives
Solid State Drives
On-Disk Structures
Ubiquitous B-Trees
B-Tree Hierarchy
Separator Keys
B-Tree Lookup Complexity
B-Tree Lookup Algorithm
Counting Keys
B-Tree Node Splits
B-Tree Node Merges
Summary

3. File Formats
Motivation
Binary Encoding
Primitive Types
Strings and Variable-Size Data
Bit-Packed Data: Booleans, Enums, and Flags
General Principles
Page Structure
Slotted Pages
Cell Layout
Combining Cells into Slotted Pages
Managing Variable-Size Data
Versioning
Checksumming
Summary

4. Implementing B-Trees
Page Header
Magic Numbers
Sibling Links
Rightmost Pointers
Node High Keys
Overflow Pages
Binary Search
Binary Search with Indirection Pointers
Propagating Splits and Merges
Breadcrumbs
Rebalancing
Right-Only Appends
Bulk Loading
Compression
Vacuum and Maintenance
Fragmentation Caused by Updates and Deletes
Page Defragmentation
Summary

5. Transaction Processing and Recovery
Buffer Management
Caching Semantics
Cache Eviction
Locking Pages in Cache
Page Replacement
Recovery
Log Semantics
Operation Versus Data Log
Steal and Force Policies
ARIES
Concurrency Control
Serializability
Transaction Isolation
Read and Write Anomalies
Isolation Levels
Optimistic Concurrency Control
Multiversion Concurrency Control
Pessimistic Concurrency Control
Lock-Based Concurrency Control
Summary

6. B-Tree Variants
Copy-on-Write
Implementing Copy-on-Write: LMDB
Abstracting Node Updates
Lazy B-Trees
WiredTiger
Lazy-Adaptive Tree
FD-Trees
Fractional Cascading
Logarithmic Runs
Bw-Trees
Update Chains
Taming Concurrency with Compare-and-Swap
Structural Modification Operations
Consolidation and Garbage Collection
Cache-Oblivious B-Trees
van Emde Boas Layout
Summary

7. Log-Structured Storage
LSM Trees
LSM Tree Structure
Updates and Deletes
LSM Tree Lookups
Merge-Iteration
Reconciliation
Maintenance in LSM Trees
Read, Write, and Space Amplification
RUM Conjecture
Implementation Details
Sorted String Tables
Bloom Filters
Skiplist
Disk Access
Compression
Unordered LSM Storage
Bitcask
WiscKey
Concurrency in LSM Trees
Log Stacking
Flash Translation Layer
Filesystem Logging
LLAMA and Mindful Stacking
Open-Channel SSDs
Summary

Part I Conclusion

Part II. Distributed Systems

8. Introduction and Overview
Concurrent Execution
Shared State in a Distributed System
Fallacies of Distributed Computing
Processing
Clocks and Time
State Consistency
Local and Remote Execution
Need to Handle Failures
Network Partitions and Partial Failures
Cascading Failures
Distributed Systems Abstractions
Links
Two Generals’ Problem
FLP Impossibility
System Synchrony
Failure Models
Crash Faults
Omission Faults
Arbitrary Faults
Handling Failures
Summary

9. Failure Detection
Heartbeats and Pings
Timeout-Free Failure Detector
Outsourced Heartbeats
Phi-Accrual Failure Detector
Gossip and Failure Detection
Reversing Failure Detection Problem Statement
Summary

10. Leader Election
Bully Algorithm
Next-In-Line Failover
Candidate/Ordinary Optimization
Invitation Algorithm
Ring Algorithm
Summary

11. Replication and Consistency
Achieving Availability
Infamous CAP
Use CAP Carefully
Harvest and Yield
Shared Memory
Ordering
Consistency Models
Strict Consistency
Linearizability
Sequential Consistency
Causal Consistency
Session Models
Eventual Consistency
Tunable Consistency
Witness Replicas
Strong Eventual Consistency and CRDTs
Summary

12. Anti-Entropy and Dissemination
Read Repair
Digest Reads
Hinted Handoff
Merkle Trees
Bitmap Version Vectors
Gossip Dissemination
Gossip Mechanics
Overlay Networks
Hybrid Gossip
Partial Views
Summary

13. Distributed Transactions
Making Operations Appear Atomic
Two-Phase Commit
Cohort Failures in 2PC
Coordinator Failures in 2PC
Three-Phase Commit
Coordinator Failures in 3PC
Distributed Transactions with Calvin
Distributed Transactions with Spanner
Database Partitioning
Consistent Hashing
Distributed Transactions with Percolator
Coordination Avoidance
Summary

14. Consensus
Broadcast
Atomic Broadcast
Virtual Synchrony
Zookeeper Atomic Broadcast (ZAB)
Paxos
Paxos Algorithm
Quorums in Paxos
Failure Scenarios
Multi-Paxos
Fast Paxos
Egalitarian Paxos
Flexible Paxos
Generalized Solution to Consensus
Raft
Leader Role in Raft
Failure Scenarios
Byzantine Consensus
PBFT Algorithm
Recovery and Checkpointing
Summary

Part II Conclusion

A. Bibliography

Index

Preface

Distributed database systems are an integral part of most businesses and the vast
majority of software applications. These applications provide logic and a user
interface, while database systems take care of data integrity, consistency, and redundancy.
Back in 2000, if you were to choose a database, you would have just a few options,
and most of them would be within the realm of relational databases, so differences
between them would be relatively small. Of course, this does not mean that all
databases were completely the same, but their functionality and use cases were very
similar.
Some of these databases have focused on horizontal scaling (scaling out), which improves
performance and increases capacity by running multiple database instances that act
as a single logical unit: Gamma Database Machine Project, Teradata, Greenplum,
Parallel DB2, and many others. Today, horizontal scaling remains one of the most
important properties that customers expect from databases. This can be explained by the
rising popularity of cloud-based services. It is often easier to spin up a new instance
and add it to the cluster than to scale vertically (scale up) by moving the database to
a larger, more powerful machine. Migrations can be long and painful, potentially
incurring downtime.
Around 2010, a new class of eventually consistent databases started appearing, and
terms such as NoSQL, and later, big data grew in popularity. Over the last 15 years,
the open source community, large internet companies, and database vendors have
created so many databases and tools that it’s easy to get lost trying to understand use
cases, details, and specifics.
The Dynamo paper [DECANDIA07], published by the team at Amazon in 2007, had
so much impact on the database community that within a short period it inspired
many variants and implementations. The most prominent of them were Apache
Cassandra, created at Facebook; Project Voldemort, created at LinkedIn; and Riak,
created by former Akamai engineers.
Today, the field is changing again: after the time of key-value stores, NoSQL, and
eventual consistency, we have started seeing more scalable and performant databases,
able to execute complex queries with stronger consistency guarantees.

Audience of This Book


In conversations at technical conferences, I often hear the same question: “How can I
learn more about database internals? I don’t even know where to start.” Most of the
books on database systems do not go into details of storage engine implementation,
and cover the access methods, such as B-Trees, on a rather high level. There are very
few books that cover more recent concepts, such as different B-Tree variants and
log-structured storage, so I usually recommend reading papers.
Everyone who reads papers knows that it’s not that easy: you often lack context, the
wording might be ambiguous, there’s little or no connection between papers, and
they’re hard to find. This book contains concise summaries of important database
systems concepts and can serve as a guide for those who’d like to dig in deeper, or as a
cheat sheet for those already familiar with these concepts.
Not everyone wants to become a database developer, but this book will help people
who build software that uses database systems: software developers, reliability
engineers, architects, and engineering managers.
If your company depends on any infrastructure component, be it a database, a
messaging queue, a container platform, or a task scheduler, you have to read the project
changelogs and mailing lists to stay in touch with the community and be up to date
with the most recent happenings in the project. Understanding terminology and
knowing what’s inside will enable you to extract more information from these sources
and use your tools more productively to troubleshoot, identify, and avoid potential
risks and bottlenecks. Having an overview and a general understanding of how
database systems work will help in case something goes wrong. Using this knowledge,
you’ll be able to form a hypothesis, validate it, find the root cause, and present it to
other project maintainers.
This book is also for curious minds: for the people who like learning things without
immediate necessity, those who spend their free time hacking on something fun,
creating compilers, writing homegrown operating systems, text editors, computer
games, learning programming languages, and absorbing new information.
The reader is assumed to have some experience with developing backend systems and
working with database systems as a user. Having some prior knowledge of different
data structures will help to digest material faster.

Why Should I Read This Book?
We often hear people describing database systems in terms of the concepts and
algorithms they implement: “This database uses gossip for membership propagation” (see
Chapter 12), “They have implemented Dynamo,” or “This is just like what they’ve
described in the Spanner paper” (see Chapter 13). Or, if you’re discussing the
algorithms and data structures, you can hear something like “ZAB and Raft have a lot in
common” (see Chapter 14), “Bw-Trees are like the B-Trees implemented on top of
log-structured storage” (see Chapter 6), or “They are using sibling pointers like in
Blink-Trees” (see Chapter 5).
We need abstractions to discuss complex concepts, and we can’t have a discussion
about terminology every time we start a conversation. Having shortcuts in the form
of common language helps us to move our attention to other, higher-level problems.
One of the advantages of learning the fundamental concepts, proofs, and algorithms
is that they never grow old. Of course, there will always be new ones, but new
algorithms are often created after finding a flaw or room for improvement in a classical
one. Knowing the history helps to understand differences and motivation better.
Learning about these things is inspiring. You see the variety of algorithms, see how
our industry was solving one problem after the other, and get to appreciate that work.
At the same time, learning is rewarding: you can almost feel how multiple puzzle
pieces move together in your mind to form a full picture that you will always be able
to share with others.

Scope of This Book


This is neither a book about relational database management systems nor about
NoSQL ones, but about the algorithms and concepts used in all kinds of database
systems, with a focus on a storage engine and the components responsible for
distribution.
Some concepts, such as query planning, query optimization, scheduling, the
relational model, and a few others, are already covered in several great textbooks on
database systems. Some of these concepts are usually described from the user’s
perspective, but this book concentrates on the internals. You can find some pointers
to useful literature in the Part II Conclusion and in the chapter summaries. In these
books you’re likely to find answers to many database-related questions you might
have.
Query languages aren’t discussed, since there’s no single common language among
the database systems mentioned in this book.
To collect material for this book, I studied over 15 books, more than 300 papers,
countless blog posts, source code, and the documentation for several open source
databases. The rule of thumb for whether or not to include a particular concept in the
book was the question: “Do the people in the database industry and research circles
talk about this concept?” If the answer was “yes,” I added the concept to the long list
of things to discuss.

Structure of This Book


There are some examples of extensible databases with pluggable components (such as
[SCHWARZ86]), but they are rather rare. At the same time, there are plenty of
examples where databases use pluggable storage. Similarly, we rarely hear database vendors
talking about query execution, while they are very eager to discuss the ways their
databases preserve consistency.
The most significant distinctions between database systems are concentrated around
two aspects: how they store and how they distribute the data. (Other subsystems can
at times also be of importance, but are not covered here.) The book is arranged into
parts that discuss the subsystems and components responsible for storage (Part I) and
distribution (Part II).
Part I discusses node-local processes and focuses on the storage engine, the central
component of the database system and one of the most significant distinctive factors.
First, we start with the architecture of a database management system and present
several ways to classify database systems based on the primary storage medium and
layout.
We continue with storage structures and try to understand how disk-based structures
are different from in-memory ones, introduce B-Trees, and cover algorithms for
efficiently maintaining B-Tree structures on disk, including serialization, page layout,
and on-disk representations. Later, we discuss multiple variants to illustrate the power
of this concept and the diversity of data structures influenced and inspired by
B-Trees.
Last, we discuss several variants of log-structured storage, which is commonly used for
implementing file and storage systems, along with the motivation and reasons for using them.
Part II is about how to organize multiple nodes into a database cluster. We start with
the importance of understanding the theoretical concepts for building fault-tolerant
distributed systems, how distributed systems are different from single-node
applications, and which problems, constraints, and complications we face in a distributed
environment.
After that, we dive deep into distributed algorithms. Here, we start with algorithms
for failure detection, helping to improve performance and stability by noticing and
reporting failures and avoiding the failed nodes. Since many algorithms discussed
later in the book rely on understanding the concept of leadership, we introduce
several algorithms for leader election and discuss their suitability.

Figure 1-1. Architecture of a database management system

Upon receipt, the transport subsystem hands the query over to a query processor,
which parses, interprets, and validates it. Later, access control checks are performed,
as they can be done fully only after the query is interpreted.
The parsed query is passed to the query optimizer, which first eliminates impossible
and redundant parts of the query, and then attempts to find the most efficient way to
execute it based on internal statistics (index cardinality, approximate intersection size,
etc.) and data placement (which nodes in the cluster hold the data and the costs
associated with its transfer). The optimizer handles both relational operations required
for query resolution, usually presented as a dependency tree, and optimizations, such
as index ordering, cardinality estimation, and choosing access methods.
The query is usually presented in the form of an execution plan (or query plan): a
sequence of operations that have to be carried out for its results to be considered
complete. Since the same query can be satisfied using different execution plans that
can vary in efficiency, the optimizer picks the best available plan.
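The plan-selection step described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular database implements its optimizer: the Plan type, the pick_plan function, the two candidate plans, and the cost figures are all hypothetical and stand in for estimates an optimizer would derive from its internal statistics.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate execution plan: ordered operations plus an estimated cost."""
    operations: tuple[str, ...]
    estimated_cost: float  # hypothetical unit, e.g. expected page reads


def pick_plan(candidates: list[Plan]) -> Plan:
    """Return the candidate with the lowest estimated cost."""
    if not candidates:
        raise ValueError("no candidate plans")
    return min(candidates, key=lambda plan: plan.estimated_cost)


# Two hypothetical plans that would produce the same result for one query.
full_scan = Plan(("scan table", "filter rows"), estimated_cost=10_000.0)
index_lookup = Plan(("seek index", "fetch matching rows"), estimated_cost=120.0)

best = pick_plan([full_scan, index_lookup])
print(best.operations)
```

In this sketch the choice is only as good as the cost estimates, which is why the statistics and data placement mentioned earlier matter to the optimizer.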

Another random document with
no related content on Scribd:
Figure 51. William Sellers

This paper had as great influence in America as Whitworth’s paper


of 1841 had in England. A committee was appointed to investigate
the question and recommend a standard. On this committee, among
others, were William B. Bement, C. T. Parry of the Baldwin
Locomotive Works, S. V. Merrick, J. H. Towne, and Coleman Sellers.
Early in the next year the committee reported in favor of the Sellers
standard, the Franklin Institute communicated their findings to other
societies, and recommended the general adoption of the system
throughout the country. The Sellers’ thread was adopted by the
United States Government for all government work in 1868, by the
Pennsylvania Railroad in 1869, the Master Car Builders’ Association
in 1872, and soon became practically universal. After exhaustive
investigation the Sellers’ form of thread was adopted in 1898 by the
International Congress for the standardization of screw threads, at
Zurich, and is now in general use on the continent of Europe.[209]
[209] For the discussion of the Sellers’ screw thread and the
circumstances surrounding its adoption, see: Journal of the Franklin
Institute, Vol. LXXVII, p. 344; Vol. LXXIX, pp. 53, 111; Vol. CXXIII, p. 261;
Vol. CXXV, p. 185.

In 1868 William Sellers organized the Edgemoor Iron Company


which furnished the iron work for the principal Centennial buildings
and all the structural work of the Brooklyn Bridge. In the
development of this business, he led the way in the distinctly
American methods and machinery by which the building of bridges
has been, to a great extent, put upon a manufacturing basis. This
involved the design and introduction of hydraulic machinery, large
multiple punches, riveters, cranes, boring machines, etc.
The excellence of his machinery soon brought him into contact
with government engineers and throughout his life his influence in
the War and Navy Departments was great. In 1890 the Navy
Department called for bids on an eight-foot lathe, with a total length
of over 128 feet, to bore and turn sixteen-inch cannon for the Naval
Gun Factory at Washington. Sellers disapproved of the design and
refused to bid on it. He proposed an alternative one of his own,
argued its merits in person before the Board of Engineers, and
secured its adoption and a contract for it. This great lathe, weighing
over 500,000 pounds, has attracted the attention of engineers from
all parts of the world. In 1873 Mr. Sellers reorganized the William
Butcher Steel Works as the Midvale Steel Company and became its
president. Under his management the company grew rapidly, and
later became a leader in production of heavy ordnance.
It was here that Frederick W. Taylor began in 1880 his work on the
art of cutting metals, which resulted in modern high-speed tool steels
and a general re-design of machine tools. These experiments,
covering a period of twenty-six years, cost upwards of $200,000. Mr.
Taylor has frequently acknowledged his indebtedness in this work to
the patience and courage of Mr. Sellers, who was then an old man
and might have been expected to oppose radical change. It was he
who made the work possible, however, and he supported Taylor
unwaveringly in the face of constant protests.[210] Mr. Sellers was a
man of commanding presence, direct but gracious in manner, who
won and held the respect and loyalty of all about him. His judgment
was almost unerring and he dominated each of the great
establishments he built up.
[210] F. W. Taylor: Paper on the “Art of Cutting Metals,” Trans. A. S. M. E.,
Vol. XXVIII, p. 34.

The firm of William Sellers & Company had another master mind
in that of Dr. Coleman Sellers, a second cousin of William
Sellers.[211] He was born in Philadelphia in 1827, his father, Coleman
Sellers, being also an inventor and mechanic. Like Nasmyth he
spent his school holidays in his father’s shop, which was at
Cardington. In 1846, when he was nineteen years old, he went to
Cincinnati and worked in the Globe Rolling Mill, operated by his elder
brothers, where the first locomotives for the Panama Railroad were
built; and in two years he became superintendent. In 1851 he
became foreman of the works of James and Jonathan Niles, who
were then in Cincinnati and building locomotives. Six years later he
returned to Philadelphia, became chief engineer of William Sellers &
Company, and remained with them for over thirty years, becoming a
partner in 1873. During these years he designed a wide range of
machinery, which naturally covered much the same field as that of
William Sellers, but his familiarity with locomotive work especially
fitted him for the design of railway tools. His designs were original,
correct and refined. The Sellers coupling was his invention and he
did much to introduce the modern systems of power transmission.
[211] See Trans. A. S. M. E., Vol. XXIX, p. 1163; Cassier’s Magazine,
August, 1903, p. 352; Journal of the Franklin Institute, Vol. CXLIX, p. 5.

Doctor Sellers was a good physicist, an expert photographer,


telegrapher, microscopist, and a professor in the Franklin Institute,
his lectures always drawing large audiences. Like William Sellers, he
was a member of most of the great engineering and scientific
societies, here and abroad; and he was president of the American
Society of Mechanical Engineers, of which he was a charter
member. He was received with the greatest distinction in his visits to
Europe. In 1886 impaired health compelled his relinquishing regular
work and he resigned his position of engineer for William Sellers &
Company, being succeeded by his son, the present president of the
company. His last great work was in connection with the power
development of Niagara Falls. He was engineer for the Cataract
Construction Company and served on the commission which
determined the types of turbines and generators and the methods of
power transmission finally adopted. Among the others on this
commission were Lord Kelvin, Colonel Turretini, the great Swiss
engineer, and Professor Unwin, and its report forms the foundation
of modern large hydro-electric work. William Sellers & Company has
a unique distinction among the builders of machine tools in having
had the leadership of two such men as William and Coleman Sellers.
William B. Bement, the son of a Connecticut farmer and
blacksmith, was born at Bradford, N. H., in 1817. His education was
obtained in the district schools and in his father’s blacksmith shop.
His mechanical aptitude was so clear that he was apprenticed to
Moore & Colby, manufacturers of woolen and cotton machinery at
Peterboro, N. H. His progress at first was rapid. Within two years he
became foreman, and on the withdrawal of one of the partners, was
admitted into the firm. He continued there three years, already giving
much thought to machine tools, for which he saw the rising need. In
1840 he went to Manchester and entered the Amoskeag shop when
it was just finished, remaining there two years as a foreman and
contractor under William A. Burke, to whom we have referred
elsewhere. From there Bement went to take charge of a shop for
manufacturing woolen machinery at Mishawaka, Ind. Unfortunately it
was burned to the ground while Bement had gone back to New
Hampshire for his family, so that when he returned with them he
found himself without employment and with only ten dollars in hand.
For the time being he worked as a blacksmith and gunsmith, and
made an engine lathe for himself in the shop of the St. Joseph Iron
Company, which gave him permission to use their tools in return for
the use of his patterns to make a similar machine for themselves.
Much of the work in making this lathe was done by hand as there
was no planer within many hundred miles. The St. Joseph Iron
Company, seeing his work, offered him the charge of their shop, to
which he agreed, provided the plant were enlarged and equipped
with proper tools. This was done, but just as everything was
completed this plant also was burned down. Bement had plans for
another shop ready the following day, went into the woods with
others, cut the necessary timber, and a new shop was soon
completed. He remained there for three years, constructing a variety
of machine tools, one of which was a gear cutter said to have been
the first one built in the West, or used beyond Cleveland.
Figure 52. Coleman Sellers
Figure 53. William B. Bement

He returned to New England as a contractor in the Lowell Machine
Shop under Burke, who had gone there from the Amoskeag Mills in
1845. On account of Bement’s resourcefulness and skill in
designing, Burke induced him to relinquish his contracts and take
charge of their designing, which he did for three years, his residence
at Lowell covering in all about six years.
In 1851 Elijah D. Marshall, who had established a business of
engraving rolls for printing calicos in 1848 and had a small shop at
Twentieth and Callowhill Streets in Philadelphia, offered Bement a
partnership. He moved to Philadelphia in September of that year,
and with Marshall and Gilbert A. Colby, a nephew, he began the
manufacture of machine tools under the name of Marshall, Bement &
Colby, thus starting only a year or so after Sellers. Marshall was a
large man, dignified and deliberate in speech. Bement was strong,
vigorous, a born designer, a remarkably rapid draftsman, and had a
capacity for work rarely equalled. Colby was also a man of
considerable mechanical ability, with advanced business ideas. Their
shop consisted of a single three-storied, stone, whitewashed
building, 40 by 90 feet. Their entire machine shop was on the first
floor, with a 10- by 12-foot room for an office. The engine, boiler and
blacksmith shop were in small outbuildings. Part of the second floor
was rented to another factory and the rest was sometimes used for
religious meetings, while the third floor was used for engraving
printing rolls. Their tools were few and crude; among them were a
36-inch lathe with a wooden bed and iron straps for ways, and a
48-inch by 14-foot planer with ornate Doric uprights. Marshall and Colby
soon retired, the latter going to Niles, Mich., where he was very
successful. James Dougherty, an expert foundryman, and George C.
Thomas entered the firm, which became Bement & Dougherty, the
plant being known as the “Industrial Works.” Mr. Thomas contributed
considerable capital, and a new shop and a foundry were built. At
the same time they installed a planer 10 feet wide by 8 feet high, to
plane work 45 feet long, a notable tool for that day.
After a few years of struggle, the plant began to grow rapidly and
at one time was the largest of its kind in the country. Bement and
Sellers were among the first to concentrate wholly on tool building.
They confined themselves to work of the highest quality. Both made
much heavier tools, as we have said, than the New England
builders, their only competitors, and in a short time had established
great reputations. Bement relied little on patent protection, trusting to
quality and constant improvement. Thomas retired from the
partnership in 1856 and Dougherty in 1870; and Clarence S. Bement
joined the firm, which became William B. Bement & Son. John M.
Shrigley became a partner in 1875, William P. Bement in 1879, and
Frank Bement in 1888.
Frederick B. Miles was an employee of Bement & Dougherty who
established a tool business under the name of Ferris & Miles, which
afterward became the Machine Tool Works. While head of these
works, Miles greatly improved the steam hammer, particularly its
valve mechanism, and many details of what is known as the Bement
hammer were invented by Miles. In 1885 the Machine Tool Works
consolidated with William B. Bement & Son, forming Bement, Miles &
Company. Mr. Miles was an accomplished engineer and designer,
with the unusual equipment of six languages at his command, an
asset of value in the firm’s foreign business. William Bement, Senior,
died in 1897, and in 1900 the business became a part of the
Niles-Bement-Pond Company. Mr. Miles retired at that time and has not
since been active in the tool business.[212]
[212] Most of the foregoing details in regard to the Bement & Miles Works
have been obtained from Mr. Clarence S. Bement and Mr. W. T. Hagman,
their present general manager.

Although Bement and Sellers contributed more to the art of tool
building than any of the other Philadelphia mechanics, some of these
others ought to be mentioned. Matthias W. Baldwin, a native of New
Jersey, began as a jeweler’s apprentice. In partnership with David H.
Mason he began making bookbinders’ tools, to which he added in
1822 the engraving of rolls for printing cotton goods and later of bank
notes. From the invention and manufacture of a variety of tools used
in that business they were led gradually into the machine tool
business, the building of hydraulic presses, calender rolls, steam
engines, and finally locomotives. In 1830 Baldwin built a model
locomotive for the Peale Museum which led to an order from the
Philadelphia & Germantown Railroad for an engine which was
completed in 1832 and placed on the road in January, 1833. An
advertisement of that time says: “The locomotive engine built by Mr.
M. W. Baldwin of this city will depart daily, when the weather is fair,
with a train of passenger cars. On rainy days horses will be attached
in the place of the locomotive.”
From this beginning has sprung the Baldwin Locomotive Works,
which employs approximately 20,000 men. In 1834 they built five
locomotives; in 1835, fourteen; in 1836, forty. Their one thousandth
locomotive was built in 1861; the five thousandth in 1880 and the
forty thousandth in 1913. These works have naturally greatly
influenced the neighboring tool makers. From the beginning, both
Bement and Sellers specialized on railway machinery and they have
always built a class of tools larger than those manufactured in New
England.
The Southwark Foundry was established in 1836, first as a
foundry only, but a large machine shop was soon added. The owners
were S. V. Merrick, who became the first president of the
Pennsylvania Railroad Company, and John Henry Towne, who was
the engineering partner. The firm designed and built steam engines
and other heavy machinery and introduced the steam hammer into
the United States under arrangement with James Nasmyth. From the
designs of Capt. John Ericsson they built the engines for the
“Princeton,” the first American man-of-war propelled by a screw, and
later were identified with the Porter-Allen steam engine. Mr. Towne
withdrew from the firm about 1848, and the firm name became
successively Merrick & Son, Merrick & Sons, Henry G. Morris, and
finally the Southwark Foundry & Machine Company.
I. P. Morris & Company came from Levi Morris & Company,
founded in 1828, and for many years were engaged in a similar
work. In 1862 Mr. J. H. Towne, above referred to, was admitted to
the firm as the engineering partner, and the firm name then became
I. P. Morris, Towne & Company, until about 1869 when Mr. Towne
withdrew. At his withdrawal the firm name was restored to its original
form, I. P. Morris & Company. It is now a department of the Cramp
Ship Building Company. During the Civil War the works were
occupied largely in building engines and boilers for government
vessels, and blast furnace and sugar mill machinery. During this
period Henry R. Towne, son of J. H. Towne, entered the works as an
apprentice, served in the drawing room and shops, and finally was
placed in charge of the erection at the navy yards of Boston and
Kittery of the engines, boilers, etc., built for two of the
double-turreted monitors. Returning to Philadelphia, he was made assistant
superintendent of the works.
J. H. Towne was a mechanical engineer of eminence in his day,
whose work as a designer showed unusual thoroughness and finish.
He was a warm friend and admirer of both William and Coleman
Sellers, and through his influence, Henry R. Towne was at one time
a student apprentice in the shops of William Sellers & Company,
acquiring there an experience which had a marked influence on his
future work. Both of the firms with which J. H. Towne was connected
built machine tools for themselves and for others, especially of the
heavier and larger kinds, and thus were among the early tool
builders. I. P. Morris & Company, about 1860, designed and built for
their own use what was then the largest vertical boring mill in this
country.[213]
[213] From correspondence with Mr. Henry R. Towne.

It may surprise some to learn that the well-known New England
firm, the Yale & Towne Manufacturing Company in Stamford, Conn.,
is a descendant of these Philadelphia companies. It was organized
in October, 1868, by Linus Yale, Jr., and Henry R. Towne, who were
brought together by William Sellers. Mr. Yale died in the following
December. This company, under the direction and control of Mr.
Towne, has had a wide influence on the lock and hardware industry
in this country. While the products of the Yale & Towne
Manufacturing Company have always consisted chiefly of locks and
related articles, they have added since 1876 the manufacture of
chain blocks, electric hoists, and, during a considerable period, two
lines allied to tool building, namely, cranes and testing machines.
This company was the pioneer crane builder of this country,
organizing a department for this purpose as early as 1878, and
developing a large business in this field, which was sold in 1894 to
the Brown Hoisting Machine Company of Cleveland, Ohio. The
building of testing machines was undertaken in 1882, to utilize the
inventions of Mr. A. H. Emery, and was continued until 1887, when
this business was sold to William Sellers & Company, for the same
reason that the crane business was sold; namely, that both were
incongruous with the other and principal products of the company.
In recent years the Bilgram Machine Works, under the leadership
of Hugo Bilgram, an expert Philadelphia mechanic, has made
valuable contributions to the art of accurate gear cutting.
In the cities between New York and Philadelphia, and here and
there in the smaller towns of Pennsylvania, are several tool builders
of influence. Gould & Eberhardt in Newark is one of the oldest firms
in the business, having been established in 1833. Ezra Gould, its
founder, learned his trade at Paterson, and started in for himself at
Newark in a single room, 16 feet square. Within a few years the
Gould Machine Company was organized, the business moved to its
present location, and a line of lathes, planers and drill presses was
manufactured. To these they added fire engines. Ulrich Eberhardt
started as an apprentice in 1858 and became a partner in 1877, the
firm name becoming E. Gould & Eberhardt, and later Gould &
Eberhardt. Mr. Gould retired in 1891, and died in 1901. Mr. Eberhardt
also died in 1901; the business has since been incorporated and is
now under the management of his three sons. They employ about
400 men in the manufacture of gear and rack cutting machinery and
shapers.
The Pond Machine Tool Company, which moved from Worcester
to Plainfield, N. J., in 1888, was founded by Lucius W. Pond.[214] It is
a large and influential shop and one of the four plants of the
Niles-Bement-Pond Company. Their output is chiefly planers, boring mills
and large lathes.
[214] See p. 222.

The Landis Tool Company, of Waynesboro, Pa., builders of
grinding machinery, springs from the firm of Landis Brothers,
established in 1890 by F. F. and A. B. Landis. One was
superintendent and the other a tool maker in a small plant building
portable engines and agricultural machinery. A small Brown &
Sharpe grinding machine was purchased for use in these works. Mr.
A. B. Landis became interested in the design of a machine more
suited to their particular work, and from this has developed the
Landis grinder.
CHAPTER XX
THE WESTERN TOOL BUILDERS
Prior to 1880 practically all of the tool building in the United States
was done east of the Alleghenies. The few tools built here and there
in Ohio and Indiana were mostly copies of eastern ones and their
quality was not high. In fact, there were few shops in the West
equipped to do accurate work. “Chordal’s Letters,” published first in
the American Machinist and later in book form,[215] give an excellent
picture of the western machine shop in the transition stage from
pioneer conditions to those of the present day.
[215] Henry W. See: “Extracts from Chordal’s Letters”; McGraw-Hill Book
Co., N. Y. 12th Edition. 1909.

Good tool building appeared in Ohio in the early eighties, and
within ten years its competition was felt by the eastern tool builders.
The first western centers were Cleveland, Cincinnati and Hamilton.
Of these, Cleveland seems to have been the first to build tools of the
highest grade.
We have already noted that the Pratt & Whitney shop in Hartford
furnished Cleveland with a number of its foremost tool builders. The
oldest of these and perhaps the best known is the Warner & Swasey
Company. This company has the distinction, shared with only one
other, of having furnished two presidents of the American Society of
Mechanical Engineers. Oddly enough the other company is also a
Cleveland firm, the Wellman, Seaver, Morgan Company, builders of
coal- and ore-handling machinery, and of steel mill equipment.
Worcester R. Warner, of the Warner & Swasey Company, was
born at Cummington, Mass., in 1846. Although a farmer’s son and
denied a college education, he had access in his own home to an
admirable library, which he used to great advantage. When nineteen
years old he went to Boston and learned mechanical drawing in the
office of George B. Brayton. Shortly afterwards he was transferred to
the shop at Exeter, N. H., where he first met Ambrose Swasey. Mr.
Swasey was born at Exeter, also in 1846, went to the traditional “little
red schoolhouse,” and learned his trade as a machinist in the shop
to which Warner came. In 1870 they went together to Hartford,
entered the Pratt & Whitney shop as journeymen mechanics, and in
a short time had become foremen and contractors. Mr. Swasey soon
gained a reputation for accurate workmanship and rare ability in the
solution of complex mechanical problems. He had charge of the gear
department, and invented and developed a new process of
generating spur gear teeth, which was given in a paper before the
American Society of Mechanical Engineers.[216] Mr. Warner, also,
became one of the company’s most trusted mechanics, was head of
the planing department, and had charge of the Pratt & Whitney
exhibit at the Centennial Exposition in Philadelphia.
[216] Trans. A. S. M. E., Vol. XII, p. 265.

In 1881 they left Hartford and went first to Chicago, intending to
build engine lathes, each putting $5000 into the venture; but finding
difficulty in obtaining good workmen there, they moved in about a
year to Cleveland, where they have remained. Their first order was
for twelve turret lathes, and they have built this type of machine ever
since. At various times they have built speed lathes, die-sinking
machines, horizontal boring mills, and hand gear-cutters, but they
now confine their tool building to hand-operated turret lathes. They
have never built automatics.
Figure 54. Worcester R. Warner
Figure 55. Ambrose Swasey
The building of astronomical instruments was not in their original
scheme, but Mr. Warner’s taste for astronomy and Mr. Swasey’s skill
in intricate and delicate mechanical problems, led them to take up
this work. These instruments, usually designed by astronomers and
instrument makers, were in general much too light; at least the large
ones were. From their long experience as tool builders, Warner and
Swasey realized that strength and rigidity are quite as essential as
accuracy of workmanship where great precision is required. The
design of a large telescope carrying a lens weighing over 500
pounds at the end of a steel tube forty or sixty feet long, and
weighing five or six tons, which must be practically free from flexure
and vibration and under intricate and accurate control, becomes
distinctly an engineering problem. To this problem both Mr. Warner
and Mr. Swasey brought engineering skill and experience of the
highest order.
When the trustees of the Lick Observatory called in 1886 for
designs for the great 36-inch telescope, Warner & Swasey submitted
one which provided for much heavier mountings than had ever been
used before, and heavier construction throughout. They were
awarded the contract and the instrument was built and installed
under Mr. Swasey’s personal supervision. It is located on the very
top of Mount Hamilton in California, 4200 feet above sea-level; and
to give room for the observatory 42,000 tons of rock had to be
removed. The great instrument, weighing with its mountings more
than forty tons, “was transported in sections, over a newly made
mountain road, sometimes in a driving snowstorm, with the wind
blowing from sixty to eighty miles an hour.”[217]
[217] Cassier’s Magazine, March, 1897, p. 403.

As is well known, the instrument was a brilliant success. The
Warner & Swasey Company has since designed and built the
mountings for the United States Naval Observatory telescope, the
40-inch Yerkes telescope, the 72-inch reflecting telescope for the
Canadian Government, and the 60-inch reflecting telescope for the
National Observatory at Cordoba, Argentina, the largest in use in the
southern hemisphere. In addition to this large work, the firm has built
meridian circles, transits and other instruments for astronomical
work, range finders for the United States Government, and
introduced the prismatic binocular into this country.
In connection with this astronomical work Mr. Swasey designed
and built a dividing engine capable of dividing circles of 40 inches in
diameter with an error of less than one second of arc. A second of
arc subtends about one-third of an inch at the distance of one mile.
Although the graduations on the inlaid silver band of this machine
are so fine that they can scarcely be seen with the naked eye, the
width of each line is twelve times the maximum error in the automatic
graduations which the machine produces.
Although their reputation as telescope builders is international,
Warner & Swasey are, and always have been, primarily tool builders.
They were not the first to build tools in the Middle West, but they
were the first to turn out work comparable in quality with that of the
best shops in the East.
The Warner & Swasey shop has had the advantage of other good
mechanics besides its proprietors. Walter Allen, an expert tool
designer, did his entire work with them, rising from apprentice to
works manager. Frank Kempsmith, originally a Brown & Sharpe
man, was at one time their superintendent. Lucas, of the Lucas
Machine Tool Company, was a foreman. George Bardons, who
served his apprenticeship with Pratt & Whitney, went west with
Warner and Swasey when they started in business and was their
superintendent; and John Oliver, a graduate of Worcester
Polytechnic, was their chief draftsman. The last two left Warner &
Swasey in 1891 and established the firm of Bardons & Oliver for
building lathes.
Another old Pratt & Whitney workman is A. W. Foote of the
Foote-Burt Company, builders of drilling machines. Unlike the others,
however, Foote did not work for Warner & Swasey.
The first multi-spindle automatic screw machines were
manufactured in Cleveland. The Cleveland automatic was developed
in the plant of the White Sewing Machine Company for their own
work, and its success led to the establishment of a separate
company for its manufacture. The Acme automatic was invented by
Reinholdt Hakewessel and E. C. Henn in Hartford. Mr. Hakewessel
was a Pratt & Whitney man and Mr. Henn a New Britain boy, who
had worked first in Lorain and Cincinnati and then for twelve years in
Hartford with Pratt & Cady, the valve manufacturers. In 1895 Henn
and Hakewessel began manufacturing bicycle parts in a little
Hartford attic, developing for this work a five-spindle automatic.
Seven years later the business was moved to Cleveland, where it
became the National-Acme Manufacturing Company, organized by
E. C. and A. W. Henn and W. D. B. Alexander, who came from the
Union Steel Screw Works. Their business of manufacturing
automatic screw machinery and screw machine products has grown
rapidly and is now one of the largest industries in Cleveland.
The White Sewing Machine Company and the Union Steel Screw
Works were among the first in Cleveland to use accurate methods
and to produce interchangeable work. It was at the Union Steel
Screw Works that James Hartness, of the Jones & Lamson Machine
Company, got his first training in accurate work. Their shop practice
was good and was due to Jason A. Bidwell, who came from the
American Tool Company of Providence.
The Standard Tool Company is an offspring of Bingham &
Company, Cleveland, and of the Morse Twist Drill Company of New
Bedford, Mass. From the Standard Tool Company has come the
Whitman-Barnes Company of Akron, and from that the Michigan
Twist Drill and Machine Company.
Newton & Cox was established in 1876, and built planers and
milling machines. Mr. Newton sold his share in the business to F. F.
Prentiss in 1880, went to Philadelphia, and started the Newton
Machine Tool Works. Cox & Prentiss later became the Cleveland
Twist Drill Company. They drifted into the drill business through not
being able to buy such drills as they required. They began making
drills first for themselves, then for their friends, and gradually took up
their manufacture, giving up the business in machine tools.
Cincinnati is said to have upwards of 15,000 men engaged in the
tool building industry, and to be the largest tool building center in the
world. There are approximately forty firms there engaged in this
work, many of them large and widely known.
This development, which has taken place within the past thirty-five
years, may possibly have sprung indirectly from the old river traffic.
Seventy years ago this traffic was large, and Cincinnati did the
greater part of the engine and boat building and repair work. When
the river trade vanished, the mechanics engaged in this work were
compelled to turn their attention to something else, and there may be
some significance in the coincidence of the rise of tool building with
the decline of the older industry.
There had been more or less manufacturing in Cincinnati for many
years, but little of it could be described as tool building. Miles
Greenwood established the Eagle Iron Works in 1832 on the site
now occupied by the Ohio Mechanics Institute. It comprised a
general machine shop, an iron foundry, brass foundries, and a
hardware factory which rivaled those of New England, employing in
all over 500 men. The hardware factory was important enough to
attract the special attention of the English commissioners who visited
this country in 1853.
In the fifties and early sixties, Niles & Company built steamboat
and stationary engines, locomotives and sugar machinery, and
employed from 200 to 300 men. This company was the forerunner of
the present Niles Tool Works in Hamilton. Lane & Bodley were
building woodworking machinery about the same time, and J. A. Fay
& Company, another firm building woodworking machinery, which
started in Keene, N. H., began work in Cincinnati in the early sixties.
The first builder of metal-working tools in Cincinnati was John
Steptoe; in fact, he is said to have been for many years the only tool
builder west of the Alleghenies. Steptoe came to this country from
Oldham, England, some time in the forties. It is said that he was a
foundling and that his name came from his having been left on a
doorstep. He was married before he came to Cincinnati, and had
served an apprenticeship of seven years, although he was so young
in appearance that no one would believe it. After working some time
for Greenwood, he started in business for himself, making a foot
power mortising machine and later a line of woodworking tools. The
first metal-working tool which he built was a copy of the Putnam
lathe. With Thomas McFarlan, another Englishman, he formed the
firm of Steptoe & McFarlan, and his shop, called the Western
Machine Works, employed by 1870 about 300 men. Their old
payrolls contain the names of William E. Gang of the William E.
